The source and destination addresses are included to allow channel
selection based on address alignment.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Pass a full set of flags to drivers' per-operation 'prep' routines.
Currently the only flag passed is DMA_PREP_INTERRUPT. The expectation is
that arch-specific async_tx_find_channel() implementations can exploit this
capability to find the best channel for an operation.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Reviewed-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
The tx_set_src and tx_set_dest methods were originally implemented to allow
an array of addresses to be passed down from async_xor to the dmaengine
driver while minimizing stack overhead. Removing these methods allows
drivers to have all transaction parameters available at 'prep' time, saves
two function pointers in struct dma_async_tx_descriptor, and reduces the
number of indirect branches..
A consequence of moving this data to the 'prep' routine is that
multi-source routines like async_xor need temporary storage to convert an
array of linear addresses into an array of dma addresses. In order to keep
the same stack footprint of the previous implementation the input array is
reused as storage for the dma addresses. This requires that
sizeof(dma_addr_t) be less than or equal to sizeof(void *). As a
consequence CONFIG_DMADEVICES now depends on !CONFIG_HIGHMEM64G. It also
requires that drivers be able to make descriptor resources available when
the 'prep' routine is polled.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Remove the unused ASYNC_TX_ASSUME_COHERENT flag. Async_tx is
meant to hide the difference between asynchronous hardware and synchronous
software operations, this flag requires clients to understand cache
coherency consequences of the async path.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
single list_head variable initialized with LIST_HEAD_INIT could almost
always can be replaced with LIST_HEAD declaration, this shrinks the code
and looks better.
Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
do_async_xor must be compiled away on !HAS_DMA archs.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (125 commits)
[CRYPTO] twofish: Merge common glue code
[CRYPTO] hifn_795x: Fixup container_of() usage
[CRYPTO] cast6: inline bloat--
[CRYPTO] api: Set default CRYPTO_MINALIGN to unsigned long long
[CRYPTO] tcrypt: Make xcbc available as a standalone test
[CRYPTO] xcbc: Remove bogus hash/cipher test
[CRYPTO] xcbc: Fix algorithm leak when block size check fails
[CRYPTO] tcrypt: Zero axbuf in the right function
[CRYPTO] padlock: Only reset the key once for each CBC and ECB operation
[CRYPTO] api: Include sched.h for cond_resched in scatterwalk.h
[CRYPTO] salsa20-asm: Remove unnecessary dependency on CRYPTO_SALSA20
[CRYPTO] tcrypt: Add select of AEAD
[CRYPTO] salsa20: Add x86-64 assembly version
[CRYPTO] salsa20_i586: Salsa20 stream cipher algorithm (i586 version)
[CRYPTO] gcm: Introduce rfc4106
[CRYPTO] api: Show async type
[CRYPTO] chainiv: Avoid lock spinning where possible
[CRYPTO] seqiv: Add select AEAD in Kconfig
[CRYPTO] scatterwalk: Handle zero nbytes in scatterwalk_map_and_copy
[CRYPTO] null: Allow setkey on digest_null
...
Currently the gcm(aes) tests have to be taken together with all other
algorithms. This patch makes it available by itself at number 106.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When setting the digest size xcbc tests to see if the underlying algorithm
is a hash. This is silly because we don't allow it to be a hash and we've
specifically requested for a cipher.
This patch removes the bogus test.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the underlying algorithm has a block size other than 16 we abort
without freeing it. In fact, we try to return the algorithm itself
as an error!
This patch plugs the leak and makes it return -EINVAL instead.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The axbuf buffer is used by test_aead and therefore should be zeroed
there instead of in test_hash.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This is the x86-64 version of the Salsa20 stream cipher algorithm. The
original assembly code came from
<http://cr.yp.to/snuffle/salsa20/amd64-3/salsa20.s>. It has been
reformatted for clarity.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch contains the salsa20-i586 implementation. The original
assembly code came from
<http://cr.yp.to/snuffle/salsa20/x86-pm/salsa20.s>. I have reformatted
it (added indents) so that it matches the other algorithms in
arch/x86/crypto.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces the rfc4106 wrapper for GCM just as we have an
rfc4309 wrapper for CCM. The purpose of the wrapper is to include part
of the IV in the key so that it can be negotiated by IPsec.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch makes chainiv avoid spinning by postponing requests on lock
contention if the user allows the use of asynchronous algorithms. If
a synchronous algorithm is requested then we behave as before.
This should improve IPsec performance on SMP when two CPUs attempt to
transmit over the same SA. Currently one of them will spin doing nothing
waiting for the other CPU to finish its encryption. This patch makes it
postpone the request and get on with other work.
If only one CPU is transmitting for a given SA, then we will process
the request synchronously as before.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that seqiv supports AEAD algorithms it needs to select the AEAD option.
Thanks to Erez Zadok for pointing out the problem.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It's better to return silently than crash and burn when someone feeds us
a zero length. In particular the null digest algorithm when used as part
of authenc will do that to us.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We need to allow setkey on digest_null if it is to be used directly by
authenc instead of through hmac.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a null blkcipher algorithm called ecb(cipher_null) for
backwards compatibility. Previously the null algorithm when used by
IPsec copied the data byte by byte. This new algorithm optimises that
to a straight memcpy which lets us better measure inherent overheads in
our IPsec code.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds 7 test vectors to tcrypt for CCM.
The test vectors are from rfc 3610.
There are about 10 more test vectors in RFC 3610
and 4 or 5 more in NIST. I can add these as time permits.
I also needed to set authsize. CCM has a prerequisite of
authsize.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds Counter with CBC-MAC (CCM) support.
RFC 3610 and NIST Special Publication 800-38C were referenced.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch makes crypto_alloc_aead always return algorithms that is
capable of generating their own IVs through givencrypt and givdecrypt.
All existing AEAD algorithms already do. New ones must either supply
their own or specify a generic IV generator with the geniv field.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support for using seqiv with AEAD algorithms. This is
useful for those AEAD algorithms that performs authentication before
encryption because the IV generated by the underlying encryption algorithm
won't be available for authentication.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch creates the infrastructure to help the construction of IV
generator templates that wrap around AEAD algorithms by adding an IV
generator to them. This is useful for AEAD algorithms with no built-in
IV generator or to replace their built-in generator.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some algorithms always require manual IV construction. For instance,
the generic CCM algorithm requires the first byte of the IV to be manually
constructed. Such algorithms are always used by other algorithms equipped
with their own IV generators and do not need IV generation per se.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch implements the givencrypt function for authenc. It simply
calls the givencrypt operation on the underlying cipher instead of encrypt.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the underlying givcrypt operations for aead and associated
support elements. The rationale is identical to that of the skcipher
givcrypt operations, i.e., sometimes only the algorithm knows how the
IV should be generated.
A new request type aead_givcrypt_request is added which contains an
embedded aead_request structure with two new elements to support this
operation. The new elements are seq and giv. The seq field should
contain a strictly increasing 64-bit integer which may be used by
certain IV generators as an input value. The giv field will be used
to store the generated IV. It does not need to obey the alignment
requirements of the algorithm because it's not used during the operation.
The existing iv field must still be available as it will be used to store
intermediate IVs and the output IV if chaining is desired.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This generator generates an IV based on a sequence number by xoring it
with a salt. This algorithm is mainly useful for CTR and similar modes.
This patch also sets it as the default IV generator for ctr.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch converts the gcm algorithm over to crypto_grab_skcipher
which is a prerequisite for IV generation.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the gcm_base template which takes a block cipher
parameter instead of cipher. This allows the user to specify a
specific CTR implementation.
This also fixes a leak of the cipher algorithm that was previously
looked up but never freed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch converts the authenc algorithm over to crypto_grab_skcipher
which is a prerequisite for IV generation.
This patch also changes authenc to set its ASYNC status depending on
the ASYNC status of the underlying skcipher.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch makes crypto_alloc_ablkcipher/crypto_grab_skcipher always
return algorithms that are capable of generating their own IVs through
givencrypt and givdecrypt. Each algorithm may specify its default IV
generator through the geniv field.
For algorithms that do not set the geniv field, the blkcipher layer will
pick a default. Currently it's chainiv for synchronous algorithms and
eseqiv for asynchronous algorithms. Note that if these wrappers do not
work on an algorithm then that algorithm must specify its own geniv or
it can't be used at all.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This generator generates an IV based on a sequence number by xoring it
with a salt and then encrypting it with the same key as used to encrypt
the plain text. This algorithm requires that the block size be equal
to the IV size. It is mainly useful for CBC.
It has one noteworthy property that for IPsec the IV happens to lie
just before the plain text so the IV generation simply increases the
number of encrypted blocks by one. Therefore the cost of this generator
is entirely dependent on the speed of the underlying cipher.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The chain IV generator is the one we've been using in the IPsec stack.
It simply starts out with a random IV, then uses the last block of each
encrypted packet's cipher text as the IV for the next packet.
It can only be used by synchronous ciphers since we have to make sure
that we don't start the encryption of the next packet until the last
one has completed.
It does have the advantage of using very little CPU time since it doesn't
have to generate anything at all.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch creates the infrastructure to help the construction of givcipher
templates that wrap around existing blkcipher/ablkcipher algorithms by adding
an IV generator to them.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If the underlying algorithm specifies a specific geniv algorithm then
we should use it for the cryptd version as well.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces the geniv field which indicates the default IV
generator for each algorithm. It should point to a string that is not
freed as long as the algorithm is registered.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Different block cipher modes have different requirements for intialisation
vectors. For example, CBC can use a simple randomly generated IV while
modes such as CTR must use an IV generation mechanisms that give a stronger
guarantee on the lack of collisions. Furthermore, disk encryption modes
have their own IV generation algorithms.
Up until now IV generation has been left to the users of the symmetric
key cipher API. This is inconvenient as the number of block cipher modes
increase because the user needs to be aware of which mode is supposed to
be paired with which IV generation algorithm.
Therefore it makes sense to integrate the IV generation into the crypto
API. This patch takes the first step in that direction by creating two
new ablkcipher operations, givencrypt and givdecrypt that generates an
IV before performing the actual encryption or decryption.
The operations are currently not exposed to the user. That will be done
once the underlying functionality has actually been implemented.
It also creates the underlying givcipher type. Algorithms that directly
generate IVs would use it instead of ablkcipher. All other algorithms
(including all existing ones) would generate a givcipher algorithm upon
registration. This givcipher algorithm will be constructed from the geniv
string that's stored in every algorithm. That string will locate a template
which is instantiated by the blkcipher/ablkcipher algorithm in question to
give a givcipher algorithm.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Note: From now on the collective of ablkcipher/blkcipher/givcipher will
be known as skcipher, i.e., symmetric key cipher. The name blkcipher has
always been much of a misnomer since it supports stream ciphers too.
This patch adds the function crypto_grab_skcipher as a new way of getting
an ablkcipher spawn. The problem is that previously we did this in two
steps, first getting the algorithm and then calling crypto_init_spawn.
This meant that each spawn user had to be aware of what type and mask to
use for these two steps. This is difficult and also presents a problem
when the type/mask changes as they're about to be for IV generators.
The new interface does both steps together just like crypto_alloc_ablkcipher.
As a side-effect this also allows us to be stronger on type enforcement
for spawns. For now this is only done for ablkcipher but it's trivial
to extend for other types.
This patch also moves the type/mask logic for skcipher into the helpers
crypto_skcipher_type and crypto_skcipher_mask.
Finally this patch introduces the function crypto_require_sync to determine
whether the user is specifically requesting a sync algorithm.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the necessary changes for GCM to be used with async
ciphers. This would allow it to be used with hardware devices that
support CTR.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As discussed previously, this patch moves the basic CTR functionality
into a chainable algorithm called ctr. The IPsec-specific variant of
it is now placed on top with the name rfc3686.
So ctr(aes) gives a chainable cipher with IV size 16 while the IPsec
variant will be called rfc3686(ctr(aes)). This patch also adjusts
gcm accordingly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
With the impending addition of the givcipher type, both blkcipher and
ablkcipher algorithms will use it to create givcipher objects. As such
it no longer makes sense to split the system between ablkcipher and
blkcipher. In particular, both ablkcipher.c and blkcipher.c would need
to use the givcipher type which has to reside in ablkcipher.c since it
shares much code with it.
This patch merges the two Kconfig options as well as the modules into one.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch fixes the request context alignment so that it is actually
aligned to the value required by the algorithm.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a new helper crypto_attr_alg_name which is basically the
first half of crypto_attr_alg. That is, it returns an algorithm name
parameter as a string without looking it up. The caller can then look it
up immediately or defer it until later.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
i get here:
----
LD vmlinux
SYSMAP System.map
SYSMAP .tmp_System.map
Building modules, stage 2.
MODPOST 226 modules
ERROR: "crypto_hash_type" [crypto/authenc.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
---
which fails because crypto_hash_type is declared in crypto/hash.c. You might wanna
fix it like so:
Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a simple speed test for salsa20.
Usage: modprobe tcrypt mode=206
Signed-of-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add common compression tester function
Modify deflate test case to use the common compressor test function
Signed-off-by: Zoltan Sogor <weth@inf.u-szeged.hu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This is a large test vector for Salsa20 that crosses the 4096-bytes
page boundary.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch fixes the multi-page processing bug that affects large test
vectors (the same bug that previously affected ctr.c).
There is an optimization for the case walk.nbytes == nbytes. Also we
now use crypto_xor() instead of adhoc XOR routines.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The abreq structure is currently allocated on the stack. This is broken
if the underlying algorithm is asynchronous. This patch changes it so
that it's taken from the private context instead which has been enlarged
accordingly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Unfortunately the generic chaining hasn't been ported to all architectures
yet, and notably not s390. So this patch restores the chainging that we've
been using previously which does work everywhere.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The scatterwalk infrastructure is used by algorithms so it needs to
move out of crypto for future users that may live in drivers/crypto
or asm/*/crypto.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch changes gcm/authenc to return EBADMSG instead of EINVAL for
ICV mismatches. This convention has already been adopted by IPsec.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The crypto_aead convention for ICVs is to include it directly in the
output. If we decided to change this in future then we would make
the ICV (if the algorithm has an explicit one) available in the
request itself.
For now no algorithm needs this so this patch changes gcm to conform
to this convention. It also adjusts the tcrypt aead tests to take
this into account.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the gcm(aes) tests have to be taken together with all other
ciphers. This patch makes it available by itself at number 35.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The previous code incorrectly included the hash in the verification which
also meant that we'd crash and burn when it comes to actually verifying
the hash since we'd go past the end of the SG list.
This patch fixes that by subtracting authsize from cryptlen at the start.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Having enckeylen as a template parameter makes it a pain for hardware
devices that implement ciphers with many key sizes since each one would
have to be registered separately.
Since the authenc algorithm is mainly used for legacy purposes where its
key is going to be constructed out of two separate keys, we can in fact
embed this value into the key itself.
This patch does this by prepending an rtnetlink header to the key that
contains the encryption key length.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As it is authsize is an algorithm paramter which cannot be changed at
run-time. This is inconvenient because hardware that implements such
algorithms would have to register each authsize that they support
separately.
Since authsize is a property common to all AEAD algorithms, we can add
a function setauthsize that sets it at run-time, just like setkey.
This patch does exactly that and also changes authenc so that authsize
is no longer a parameter of its template.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since alignment masks are always one less than a power of two, we can
use binary or to find their maximum.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
These utilities implemented in lib/hexdump.c are more handy, please use this.
Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add test vectors to tcrypt for AES in CBC mode for key sizes 192 and 256.
The test vectors are copied from NIST SP800-38A.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a large AES CTR mode test vector. The test vector is
4100 bytes in size. It was generated using a C++ program that called
Crypto++.
Note that this patch increases considerably the size of "struct
cipher_testvec" and hence the size of tcrypt.ko.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the number of entries in a cipher test vector template is
limited by TVMEMSIZE/sizeof(struct cipher_testvec). This patch
circumvents the problem by pointing cipher_tv to each entry in the
template, rather than the template itself.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the data spans across a page boundary, CTR may incorrectly process
a partial block in the middle because the blkcipher walking code may
supply partial blocks in the middle as long as the total length of the
supplied data is more than a block. CTR is supposed to return any unused
partial block in that case to the walker.
This patch fixes this by doing exactly that, returning partial blocks to
the walker unless we received less than a block-worth of data to start
with.
This also allows us to optimise the bulk of the processing since we no
longer have to worry about partial blocks until the very end.
Thanks to Tan Swee Heng for fixes and actually testing this :)
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add GCM/GMAC support to cryptoapi.
GCM (Galois/Counter Mode) is an AEAD mode of operations for any block cipher
with a block size of 16. The typical example is AES-GCM.
Signed-off-by: Mikko Herranen <mh1@iki.fi>
Reviewed-by: Mika Kukkonen <mika.kukkonen@nsn.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add AEAD support to tcrypt, needed by GCM.
Signed-off-by: Mikko Herranen <mh1@iki.fi>
Reviewed-by: Mika Kukkonen <mika.kukkonen@nsn.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Analogously to camellia7 patch, move
"absorb kw2 to other subkeys" and "absorb kw4 to other subkeys"
code parts into camellia_setup_tail(). This further reduces
source and object code size at the cost of two brances
in key setup code.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move "key XOR is end of F-function" code part into
camellia_setup_tail(), it is sufficiently similar
between camellia_setup128 and camellia_setup256.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
unifies encrypt/decrypt routines for different key lengths.
This reduces module size by ~25%, with tiny (less than 1%)
speed impact.
Also collapses encrypt/decrypt into more readable
(visually shorter) form using macros.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove unused macro params.
Use (u8)(expr) instead of (expr) & 0xff,
helps gcc to realize how to use simpler commands.
Move CAMELLIA_FLS macro closer to encrypt/decrypt routines.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch replaces the custom xor in CBC with the generic crypto_xor.
It changes the operations for in-place encryption slightly to avoid
calling crypto_xor with tmpbuf since it is not necessarily aligned.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All common block ciphers have a block size that's a power of 2. In fact,
all of our block ciphers obey this rule.
If we require this then CBC can be optimised to avoid an expensive divide
on in-place decryption.
I've also changed the saving of the first IV in the in-place decryption
case to the last IV because that lets us use walk->iv (which is already
aligned) for the xor operation where alignment is required.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
With the addition of more stream ciphers we need to curb the proliferation
of ad-hoc xor functions. This patch creates a generic pair of functions,
crypto_inc and crypto_xor which does big-endian increment and exclusive or,
respectively.
For optimum performance, they both use u32 operations so alignment must be
as that of u32 even though the arguments are of type u8 *.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Up until now we have ablkcipher algorithms have been identified as
type BLKCIPHER with the ASYNC bit set. This is suboptimal because
ablkcipher refers to two things. On the one hand it refers to the
top-level ablkcipher interface with requests. On the other hand it
refers to and algorithm type underneath.
As it is you cannot request a synchronous block cipher algorithm
with the ablkcipher interface on top. This is a problem because
we want to be able to eventually phase out the blkcipher top-level
interface.
This patch fixes this by making ABLKCIPHER its own type, just as
we have distinct types for HASH and DIGEST. The type it associated
with the algorithm implementation only.
Which top-level interface is used for synchronous block ciphers is
then determined by the mask that's used. If it's a specific mask
then the old blkcipher interface is given, otherwise we go with the
new ablkcipher interface.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch converts the crypto scatterwalk code to use the generic
scatterlist chaining rather the version specific to crypto.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Resubmitting this patch which extends sha256_generic.c to support SHA-224 as
described in FIPS 180-2 and RFC 3874. HMAC-SHA-224 as described in RFC4231
is then supported through the hmac interface.
Patch includes test vectors for SHA-224 and HMAC-SHA-224.
SHA-224 chould be chosen as a hash algorithm when 112 bits of security
strength is required.
Patch generated against the 2.6.24-rc1 kernel and tested against
2.6.24-rc1-git14 which includes fix for scatter gather implementation for HMAC.
Signed-off-by: Jonathan Lynch <jonathan.lynch@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The setkey() function can be shared with the generic algorithm.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
NO other block mode is M by default.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The setkey() function can be shared with the generic algorithm.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch exports four tables and the set_key() routine. This ressources
can be shared by other AES implementations (aes-x86_64 for instance).
The decryption key has been turned around (deckey[0] is the first piece
of the key instead of deckey[keylen+20]). The encrypt/decrypt functions
are looking now identical (except they are using different tables and
key).
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds countersize to CTR mode.
The template is now ctr(algo,noncesize,ivsize,countersize).
For example, ctr(aes,4,8,4) indicates the counterblock
will be composed of a salt/nonce that is 4 bytes, an iv
that is 8 bytes and the counter is 4 bytes.
When noncesize + ivsize < blocksize, CTR initializes the
last block - ivsize - noncesize portion of the block to
zero. Otherwise the counter block is composed of the IV
(and nonce if necessary).
If noncesize + ivsize == blocksize, then this indicates that
user is passing in entire counterblock. Thus countersize
indicates the amount of bytes in counterblock to use as
the counter for incrementing. CTR will increment counter
portion by 1, and begin encryption with that value.
Note that CTR assumes the counter portion of the block that
will be incremented is stored in big endian.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move huge unrolled pieces of code (3 screenfuls) at the end of
128/256 key setup routines into common camellia_setup_tail(),
convert it to loop there.
Loop is still unrolled six times, so performance hit is very small,
code size win is big.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Optimize GETU32 to use 4-byte memcpy (modern gcc will convert
such memcpy to single move instruction on i386).
Original GETU32 did four byte fetches, and shifted/XORed those.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Rename some macros to shorter names: CAMELLIA_RR8 -> ROR8,
making it easier to understand that it is just a right rotation,
nothing camellia-specific in it.
CAMELLIA_SUBKEY_L() -> SUBKEY_L() - just shorter.
Move be32 <-> cpu conversions out of en/decrypt128/256 and into
camellia_en/decrypt - no reason to have that code duplicated twice.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move code blocks around so that related pieces are closer together:
e.g. CAMELLIA_ROUNDSM macro does not need to be separated
from the rest of the code by huge array of constants.
Remove unused macros (COPY4WORD, SWAP4WORD, XOR4WORD[2])
Drop SUBL(), SUBR() macros which only obscure things.
Same for CAMELLIA_SP1110() macro and KEY_TABLE_TYPE typedef.
Remove useless comments:
/* encryption */ -- well it's obvious enough already!
void camellia_encrypt128(...)
Combine swap with copying at the beginning/end of encrypt/decrypt.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently twofish cipher key setup code
has unrolled loops - approximately 70-100
instructions are repeated 40 times.
As a result, twofish module is the biggest module
in crypto/*.
Unrolling produces x2.5 more code (+18k on i386), and speeds up key
setup by 7%:
unrolled: twofish_setkey/sec: 41128
loop: twofish_setkey/sec: 38148
CALC_K256: ~100 insns each
CALC_K192: ~90 insns
CALC_K: ~70 insns
Attached patch removes this unrolling.
$ size */twofish_common.o
text data bss dec hex filename
37920 0 0 37920 9420 crypto.org/twofish_common.o
13209 0 0 13209 3399 crypto/twofish_common.o
Run tested (modprobe tcrypt reports ok). Please apply.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This three defines are used in all AES related hardware.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
HIFN driver update to use DES weak key checks (exported in this patch).
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch creates include/crypto/des.h for common macros shared between
DES implementations.
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch implements CTR mode for IPsec.
It is based off of RFC 3686.
Please note:
1. CTR turns a block cipher into a stream cipher.
Encryption is done in blocks, however the last block
may be a partial block.
A "counter block" is encrypted, creating a keystream
that is xor'ed with the plaintext. The counter portion
of the counter block is incremented after each block
of plaintext is encrypted.
Decryption is performed in same manner.
2. The CTR counterblock is composed of,
nonce + IV + counter
The size of the counterblock is equivalent to the
blocksize of the cipher.
sizeof(nonce) + sizeof(IV) + sizeof(counter) = blocksize
The CTR template requires the name of the cipher
algorithm, the sizeof the nonce, and the sizeof the iv.
ctr(cipher,sizeof_nonce,sizeof_iv)
So for example,
ctr(aes,4,8)
specifies the counterblock will be composed of 4 bytes
from a nonce, 8 bytes from the iv, and 4 bytes for counter
since aes has a blocksize of 16 bytes.
3. The counter portion of the counter block is stored
in big endian for conformance to rfc 3686.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As it is crypto_remove_spawn may try to unregister an instance which is
yet to be registered. This patch fixes this by checking whether the
instance has been registered before attempting to remove it.
It also removes a bogus cra_destroy check in crypto_register_instance as
1) it's outside the mutex;
2) we have a check in __crypto_register_alg already.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It seems that newer versions of gcc have regressed in their abilities to
analyse initialisations. This patch moves the initialisations up to avoid
the warnings.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Not architecture specific code should not #include <asm/scatterlist.h>.
This patch therefore either replaces them with
#include <linux/scatterlist.h> or simply removes them if they were
unused.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch moves the sg_init_table out of the timing loops for hash
algorithms so that it doesn't impact on the speed test results.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
hmac_setkey(), hmac_init(), and hmac_final() have
a singular on-stack scatterlist. Initialit is
using sg_init_one() instead of using sg_set_buf().
Signed-off-by: David S. Miller <davem@davemloft.net>
Crypto now uses SG helper functions. Fix hmac_digest to use those
functions correctly and fix the oops associated with it.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most drivers need to set length and offset as well, so may as well fold
those three lines into one.
Add sg_assign_page() for those two locations that only needed to set
the page, where the offset/length is set outside of the function context.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Convert the subdirectory "crypto" to UTF-8. The files changed are
<crypto/fcrypt.c> and <crypto/api.c>.
Signed-off-by: John Anthony Kazos Jr. <jakj@j-a-k-j.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
There are currently several SHA implementations that all define their own
initialization vectors and size values. Since this values are idential
move them to a header file under include/crypto.
Signed-off-by: Jan Glauber <jang@de.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Loading the crypto algorithm by the alias instead of by module directly
has the advantage that all possible implementations of this algorithm
are loaded automatically and the crypto API can choose the best one
depending on its priority.
Additionally it ensures that the generic implementation as well as the
HW driver (if available) is loaded in case the HW driver needs the
generic version as fallback in corner cases.
Also remove the probe for sha1 in padlock's init code.
Quote from Herbert:
The probe is actually pointless since we can always probe when
the algorithm is actually used which does not lead to dead-locks
like this.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Loading the crypto algorithm by the alias instead of by module directly
has the advantage that all possible implementations of this algorithm
are loaded automatically and the crypto API can choose the best one
depending on its priority.
Additionally it ensures that the generic implementation as well as the
HW driver (if available) is loaded in case the HW driver needs the
generic version as fallback in corner cases.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Loading the crypto algorithm by the alias instead of by module directly
has the advantage that all possible implementations of this algorithm
are loaded automatically and the crypto API can choose the best one
depending on its priority.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the helper blkcipher_walk_virt_block which is similar to
blkcipher_walk_virt but uses a supplied block size instead of the block
size of the block cipher. This is useful for CTR where the block size is
1 but we still want to walk by the block size of the underlying cipher.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that the block size is no longer a multiple of the alignment, we need to
increase the kmalloc amount in blkcipher_next_slow to use the aligned block
size.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a comment to explain why we compare the cra_driver_name of
the algorithm being registered against the cra_name of a larval as opposed
to the cra_driver_name of the larval.
In fact larvals have only one name, cra_name which is the name that was
requested by the user. The test here is simply trying to find out whether
the algorithm being registered can or can not satisfy the larval.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Previously we assumed for convenience that the block size is a multiple of
the algorithm's required alignment. With the pending addition of CTR this
will no longer be the case as the block size will be 1 due to it being a
stream cipher. However, the alignment requirement will be that of the
underlying implementation which will most likely be greater than 1.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We do not allow spaces in algorithm names or parameters. Thanks to Joy Latten
for pointing this out.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As Joy Latten points out, inner algorithm parameters will miss the closing
bracket which will also cause the outer algorithm to terminate prematurely.
This patch fixes that also kills the WARN_ON if the number of parameters
exceed the maximum as that is a user error.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
XTS currently considered to be the successor of the LRW mode by the IEEE1619
workgroup. LRW was discarded, because it was not secure if the encyption key
itself is encrypted with LRW.
XTS does not have this problem. The implementation is pretty straightforward,
a new function was added to gf128mul to handle GF(128) elements in ble format.
Four testvectors from the specification
http://grouper.ieee.org/groups/1619/email/pdf00086.pdf
were added, and they verify on my system.
Signed-off-by: Rik Snel <rsnel@cube.dyndns.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use max in blkcipher_get_spot() instead of open coding it.
Signed-off-by: Ingo Oeser <ioe-lkml@rameria.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When scatterwalk is built as a module digest.c was broken because it
requires the crypto_km_types structure which is in scatterwalk. This
patch removes the crypto_km_types structure by encoding the logic into
crypto_kmap_type directly.
In fact, this even saves a few bytes of code (not to mention the data
structure itself) on i386 which is about the only place where it's
needed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the authenc algorithm which constructs an AEAD algorithm
from an asynchronous block cipher and a hash. The construction is done
by concatenating the encrypted result from the cipher with the output
from the hash, as is used by the IPsec ESP protocol.
The authenc algorithm exists as a template with four parameters:
authenc(auth, authsize, enc, enckeylen).
The authentication algorithm, the authentication size (i.e., truncating
the output of the authentication algorithm), the encryption algorithm,
and the encryption key length. Both the size field and the key length
field are in bytes. For example, AES-128 with SHA1-HMAC would be
represented by
authenc(hmac(sha1), 12, cbc(aes), 16)
The key for the authenc algorithm is the concatenation of the keys for
the authentication algorithm with the encryption algorithm. For the
above example, if a key of length 36 bytes is given, then hmac(sha1)
would receive the first 20 bytes while the last 16 would be given to
cbc(aes).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the function scatterwalk_map_and_copy which reads or
writes a chunk of data from a scatterlist at a given offset. It will
be used by authenc which would read/write the authentication data at
the end of the cipher/plain text.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The scatterwalk code is only used by algorithms that can be built as
a module. Therefore we can move it into algapi.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since not everyone needs a queue pointer and those who need it can
always get it from the context anyway the queue pointer in the
common alg object is redundant.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch ensures that kernel.h and slab.h are included for
the setkey_unaligned function. It also breaks a couple of
long lines.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support for having multiple parameters to
a template, separated by a comma. It also adds support
for integer parameters in addition to the current algorithm
parameter type.
This will be used by the authenc template which will have
four parameters: the authentication algorithm, the encryption
algorithm, the authentication size and the encryption key
length.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds crypto_aead which is the interface for AEAD
(Authenticated Encryption with Associated Data) algorithms.
AEAD algorithms perform authentication and encryption in one
step. Traditionally users (such as IPsec) would use two
different crypto algorithms to perform these. With AEAD
this comes down to one algorithm and one operation.
Of course if traditional algorithms were used we'd still
be doing two operations underneath. However, real AEAD
algorithms may allow the underlying operations to be
optimised as well.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support for the SEED cipher (RFC4269).
This patch have been used in few VPN appliance vendors in Korea for
several years. And it was verified by KISA, who developed the
algorithm itself.
As its importance in Korean banking industry, it would be great
if linux incorporates the support.
Signed-off-by: Hye-Shik Chang <perky@FreeBSD.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Other options requiring specific block cipher algorithms already have
the appropriate select's.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix dma_wait_for_async_tx to not loop forever in the case where a
dependency chain is longer than two entries. This condition will not
happen with current in-kernel drivers, but fix it for future drivers.
Found-by: Saeed Bishara <saeed.bishara@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The previous patch had the conditional inverted. This patch fixes it
so that we return the original position if it does not straddle a page.
Thanks to Bob Gilligan for spotting this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The function blkcipher_get_spot tries to return a buffer of
the specified length that does not straddle a page. It has
an off-by-one bug so it may advance a page unnecessarily.
What's worse, one of its callers doesn't provide a buffer
that's sufficiently long for this operation.
This patch fixes both problems. Thanks to Bob Gilligan for
diagnosing this problem and providing a fix.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
setkey_unaligned() commited in ca7c39385c
overwrites unallocated memory in the following memset() because
I used the wrong buffer length.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Andrew Morton:
[async_memcpy] is very wrong if both ASYNC_TX_KMAP_DST and
ASYNC_TX_KMAP_SRC can ever be set. We'll end up using the same kmap
slot for both src add dest and we get either corrupted data or a BUG.
Evgeniy Polyakov:
Btw, shouldn't it always be kmap_atomic() even if flag is not set.
That pages are usual one returned by alloc_page().
So fix the usage of kmap_atomic and kill the ASYNC_TX_KMAP_DST and
ASYNC_TX_KMAP_SRC flags.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simple and stupid - just use the same code from another place in the kernel.
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The async_tx api provides methods for describing a chain of asynchronous
bulk memory transfers/transforms with support for inter-transactional
dependencies. It is implemented as a dmaengine client that smooths over
the details of different hardware offload engine implementations. Code
that is written to the api can optimize for asynchronous operation and the
api will fit the chain of operations to the available offload resources.
I imagine that any piece of ADMA hardware would register with the
'async_*' subsystem, and a call to async_X would be routed as
appropriate, or be run in-line. - Neil Brown
async_tx exploits the capabilities of struct dma_async_tx_descriptor to
provide an api of the following general format:
struct dma_async_tx_descriptor *
async_<operation>(..., struct dma_async_tx_descriptor *depend_tx,
dma_async_tx_callback cb_fn, void *cb_param)
{
struct dma_chan *chan = async_tx_find_channel(depend_tx, <operation>);
struct dma_device *device = chan ? chan->device : NULL;
int int_en = cb_fn ? 1 : 0;
struct dma_async_tx_descriptor *tx = device ?
device->device_prep_dma_<operation>(chan, len, int_en) : NULL;
if (tx) { /* run <operation> asynchronously */
...
tx->tx_set_dest(addr, tx, index);
...
tx->tx_set_src(addr, tx, index);
...
async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
} else { /* run <operation> synchronously */
...
<operation>
...
async_tx_sync_epilog(flags, depend_tx, cb_fn, cb_param);
}
return tx;
}
async_tx_find_channel() returns a capable channel from its pool. The
channel pool is organized as a per-cpu array of channel pointers. The
async_tx_rebalance() routine is tasked with managing these arrays. In the
uniprocessor case async_tx_rebalance() tries to spread responsibility
evenly over channels of similar capabilities. For example if there are two
copy+xor channels, one will handle copy operations and the other will
handle xor. In the SMP case async_tx_rebalance() attempts to spread the
operations evenly over the cpus, e.g. cpu0 gets copy channel0 and xor
channel0 while cpu1 gets copy channel 1 and xor channel 1. When a
dependency is specified async_tx_find_channel defaults to keeping the
operation on the same channel. A xor->copy->xor chain will stay on one
channel if it supports both operation types, otherwise the transaction will
transition between a copy and a xor resource.
Currently the raid5 implementation in the MD raid456 driver has been
converted to the async_tx api. A driver for the offload engines on the
Intel Xscale series of I/O processors, iop-adma, is provided in a later
commit. With the iop-adma driver and async_tx, raid456 is able to offload
copy, xor, and xor-zero-sum operations to hardware engines.
On iop342 tiobench showed higher throughput for sequential writes (20 - 30%
improvement) and sequential reads to a degraded array (40 - 55%
improvement). For the other cases performance was roughly equal, +/- a few
percentage points. On a x86-smp platform the performance of the async_tx
implementation (in synchronous mode) was also +/- a few percentage points
of the original implementation. According to 'top' on iop342 CPU
utilization drops from ~50% to ~15% during a 'resync' while the speed
according to /proc/mdstat doubles from ~25 MB/s to ~50 MB/s.
The tiobench command line used for testing was: tiobench --size 2048
--block 4096 --block 131072 --dir /mnt/raid --numruns 5
* iop342 had 1GB of memory available
Details:
* if CONFIG_DMA_ENGINE=n the asynchronous path is compiled away by making
async_tx_find_channel a static inline routine that always returns NULL
* when a callback is specified for a given transaction an interrupt will
fire at operation completion time and the callback will occur in a
tasklet. if the the channel does not support interrupts then a live
polling wait will be performed
* the api is written as a dmaengine client that requests all available
channels
* In support of dependencies the api implicitly schedules channel-switch
interrupts. The interrupt triggers the cleanup tasklet which causes
pending operations to be scheduled on the next channel
* Xor engines treat an xor destination address differently than a software
xor routine. To the software routine the destination address is an implied
source, whereas engines treat it as a write-only destination. This patch
modifies the xor_blocks routine to take a an explicit destination address
to mirror the hardware.
Changelog:
* fixed a leftover debug print
* don't allow callbacks in async_interrupt_cond
* fixed xor_block changes
* fixed usage of ASYNC_TX_XOR_DROP_DEST
* drop dma mapping methods, suggested by Chris Leech
* printk warning fixups from Andrew Morton
* don't use inline in C files, Adrian Bunk
* select the API when MD is enabled
* BUG_ON xor source counts <= 1
* implicitly handle hardware concerns like channel switching and
interrupts, Neil Brown
* remove the per operation type list, and distribute operation capabilities
evenly amongst the available channels
* simplify async_tx_find_channel to optimize the fast path
* introduce the channel_table_initialized flag to prevent early calls to
the api
* reorganize the code to mimic crypto
* include mm.h as not all archs include it in dma-mapping.h
* make the Kconfig options non-user visible, Adrian Bunk
* move async_tx under crypto since it is meant as 'core' functionality, and
the two may share algorithms in the future
* move large inline functions into c files
* checkpatch.pl fixes
* gpl v2 only correction
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
The async_tx api tries to use a dma engine for an operation, but will fall
back to an optimized software routine otherwise. Xor support is
implemented using the raid5 xor routines. For organizational purposes this
routine is moved to a common area.
The following fixes are also made:
* rename xor_block => xor_blocks, suggested by Adrian Bunk
* ensure that xor.o initializes before md.o in the built-in case
* checkpatch.pl fixes
* mark calibrate_xor_blocks __init, Adrian Bunk
Cc: Adrian Bunk <bunk@stusta.de>
Cc: NeilBrown <neilb@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Evgeniy's hifn driver and probably mine don't use ablkcipher->queue at all.
The show method of ablkcipher will access this field without checking if it
is valid.
Signed-off-by: Sebastian Siewior <linux-crypto@ml.breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
setkey() in {cipher,blkcipher,ablkcipher,hash}.c does not respect the
requested alignment by the algorithm. This patch fixes it. The extra
memory is allocated by kmalloc() with GFP_ATOMIC flag.
Signed-off-by: Sebastian Siewior <linux-crypto@ml.breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Right now when a larval matures or when it dies of an error we
only wake up one waiter. This would cause other waiters to timeout
unnecessarily. This patch changes it to use complete_all to wake
up all waiters.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use menuconfigs instead of menus, so the whole menu can be disabled at once
instead of going through all options.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Make sure that cryptd is marked as nonfreezable and does not hold up the
freezer.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The function crypto_mod_put first frees the algorithm and then drops
the reference to its module. Unfortunately we read the module pointer
which after freeing the algorithm and that pointer sits inside the
object that we just freed.
So this patch reads the module pointer out before we free the object.
Thanks to Luca Tettamanti for reporting this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The return value of crypto_hash_final isn't checked in test_hash_cycles.
This patch corrects this. Thanks to Eric Sesterhenn for reporting this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
By the time kthread_run returns the param may have already been freed
so writing the returned thread_struct pointer to param is wrong.
In fact, we don't need it in param anyway so this patch simply puts it
on the stack.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the cryptd module which is a template that takes a
synchronous software crypto algorithm and converts it to an asynchronous
one by executing it in a kernel thread.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As it is whenever a new algorithm with the same name is registered
users of the old algorithm will be removed so that they can take
advantage of the new algorithm. This presents a problem when the
new algorithm is not equivalent to the old algorithm. In particular,
the new algorithm might only function on top of the existing one.
Hence we should not remove users unless they can make use of the
new algorithm.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch allows the use of nested templates by allowing the use of
brackets inside a template parameter.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the mid-level interface for asynchronous block ciphers.
It also includes a generic queueing mechanism that can be used by other
asynchronous crypto operations in future.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch passes the type/mask along when constructing instances of
templates. This is in preparation for templates that may support
multiple types of instances depending on what is requested. For example,
the planned software async crypto driver will use this construct.
For the moment this allows us to check whether the instance constructed
is of the correct type and avoid returning success if the type does not
match.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch converts the tcrypt module to use the asynchronous block cipher
interface. As all synchronous block ciphers can be used through the async
interface, tcrypt is still able to test them.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the frontend interface for asynchronous block ciphers.
In addition to the usual block cipher parameters, there is a callback
function pointer and a data pointer. The callback will be invoked only
if the encrypt/decrypt handlers return -EINPROGRESS. In other words,
if the return value of zero the completion handler (or the equivalent
code) needs to be invoked by the caller.
The request structure is allocated and freed by the caller. Its size
is determined by calling crypto_ablkcipher_reqsize(). The helpers
ablkcipher_request_alloc/ablkcipher_request_free can be used to manage
the memory for a request.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The proc functions were incorrectly marked as used rather than unused.
They may be unused if proc is disabled.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
After 13 years of use, it looks like my email address is finally going
to disappear. While this is likely to drop the amount of incoming spam
greatly ;-), it may also affect more appropriate messages, so let's
update my email address in various places. In addition, Host AP mailing
list is subscribers-only and linux-wireless can also be used for
discussing issues related to this driver which is now shown in
MAINTAINERS.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
On platforms where flush_dcache_page is needed we're currently flushing
the next page right than the one we've just processed. This patch fixes
the off-by-one error.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the scatterwalk_copychunks loop, We should be advancing by
len_this_page and not nbytes. The latter is the total length.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch fixes loading the tcrypt module while deflate isn't available
at all (isn't build).
Signed-off-by: Sebastian Siewior <linux-crypto@ml.breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the loop in scatterwalk_copychunks(), if walk->offset is zero,
then scatterwalk_pagedone rounds that up to the nearest page boundary:
walk->offset += PAGE_SIZE - 1;
walk->offset &= PAGE_MASK;
which is a no-op in this case, so we don't advance to the next element
of the scatterlist array:
if (walk->offset >= walk->sg->offset + walk->sg->length)
scatterwalk_start(walk, sg_next(walk->sg));
and we end up copying the same data twice.
It appears that other callers of scatterwalk_{page}done first advance
walk->offset, so I believe that's the correct thing to do here.
This caused a bug in NFS when run with krb5p security, which would
cause some writes to fail with permissions errors--for example, writes
of less than 8 bytes (the des blocksize) at the start of a file.
A git-bisect shows the bug was originally introduced by
5c64097aa0, first in 2.6.19-rc1.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds the code of Camellia code for testing module.
Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the main code of Camellia cipher algorithm.
Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the Kconfig entry for Camellia.
Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support for multiple frontend types for each backend
algorithm by passing the type and mask through to the backend type
init function.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The crypto_comp conversion missed the last remaining crypto_alloc_tfm
call. This patch replaces it with crypto_alloc_comp.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a crypto module to provide FCrypt encryption as used by RxRPC.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add PCBC crypto template support as used by RxRPC.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds tests for SHA384 HMAC and SHA512 HMAC to the tcrypt module. Test data was taken from
RFC4231. This patch is a follow-up to the discovery (bug 7646) that the kernel SHA384 HMAC
implementation was not generating proper SHA384 HMACs.
Signed-off-by: Andrew Donofrio <linuxbugzilla@kriptik.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Using blkcipher/hash crypto operations in hard IRQ context can lead
to random memory corruption due to the reuse of kmap_atomic slots.
Since crypto operations were never meant to be used in hard IRQ
contexts, this patch checks for such usage and returns an error
before kmap_atomic is performed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch moves the config options for the s390 crypto instructions
to the standard "Hardware crypto devices" menu. In addition some
cleanup has been done: use a flag for supported keylengths, add a
warning about machien limitation, return ENOTSUPP in case the
hardware has no support, remove superfluous printks and update
email addresses.
Signed-off-by: Jan Glauber <jan.glauber@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove useless includes of linux/io.h, don't even try to build iomap_copy
on uml (it doesn't have readb() et.al., so...)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The SHA384 block size should be 128 bytes, not 96 bytes. This was
spotted by Andrew Donofrio.
Fortunately the block size isn't actually used anywhere so this typo
has had no real impact.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Main module, this implements the Liskov Rivest Wagner block cipher mode
in the new blockcipher API. The implementation is based on ecb.c.
The LRW-32-AES specification I used can be found at:
http://grouper.ieee.org/groups/1619/email/pdf00017.pdf
It implements the optimization specified as optional in the
specification, and in addition it uses optimized multiplication
routines from gf128mul.c.
Since gf128mul.[ch] is not tested on bigendian, this cipher mode
may currently fail badly on bigendian machines.
Signed-off-by: Rik Snel <rsnel@cube.dyndns.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
A lot of cypher modes need multiplications in GF(2^128). LRW, ABL, GCM...
I use functions from this library in my LRW implementation and I will
also use them in my ABL (Arbitrary Block Length, an unencumbered (correct
me if I am wrong, wide block cipher mode).
Elements of GF(2^128) must be presented as u128 *, it encourages automatic
and proper alignment.
The library contains support for two different representations of GF(2^128),
see the comment in gf128mul.h. There different levels of optimization
(memory/speed tradeoff).
The code is based on work by Dr Brian Gladman. Notable changes:
- deletion of two optimization modes
- change from u32 to u64 for faster handling on 64bit machines
- support for 'bbe' representation in addition to the, already implemented,
'lle' representation.
- move 'inline void' functions from header to 'static void' in the
source file
- update to use the linux coding style conventions
The original can be found at:
http://fp.gladman.plus.com/AES/modes.vc8.19-06-06.zip
The copyright (and GPL statement) of the original author is preserved.
Signed-off-by: Rik Snel <rsnel@cube.dyndns.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch removes the following no longer used functions:
- api.c: crypto_alg_available()
- digest.c: crypto_digest_init()
- digest.c: crypto_digest_update()
- digest.c: crypto_digest_final()
- digest.c: crypto_digest_digest()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
On Tue, Nov 14, 2006 at 01:41:25AM -0800, Andrew Morton wrote:
>...
> Changes since 2.6.19-rc5-mm2:
>...
> git-cryptodev.patch
>...
> git trees
>...
This patch makes some needlessly global code static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This is core code of XCBC.
XCBC is an algorithm that forms a MAC algorithm out of a cipher algorithm.
For example, AES-XCBC-MAC is a MAC algorithm based on the AES cipher
algorithm.
Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pass the work_struct pointer to the work function rather than context data.
The work function can use container_of() to work out the data.
For the cases where the container of the work_struct may go away the moment the
pending bit is cleared, it is made possible to defer the release of the
structure by deferring the clearing of the pending bit.
To make this work, an extra flag is introduced into the management side of the
work_struct. This governs auto-release of the structure upon execution.
Ordinarily, the work queue executor would release the work_struct for further
scheduling or deallocation by clearing the pending bit prior to jumping to the
work function. This means that, unless the driver makes some guarantee itself
that the work_struct won't go away, the work function may not access anything
else in the work_struct or its container lest they be deallocated.. This is a
problem if the auxiliary data is taken away (as done by the last patch).
However, if the pending bit is *not* cleared before jumping to the work
function, then the work function *may* access the work_struct and its container
with no problems. But then the work function must itself release the
work_struct by calling work_release().
In most cases, automatic release is fine, so this is the default. Special
initiators exist for the non-auto-release case (ending in _NAR).
Signed-Off-By: David Howells <dhowells@redhat.com>
Since cryptomgr is the only way to construct algorithm instances
for now it makes sense to let the templates depend on it as
otherwise it may be left off inadvertently.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch makes crypto_alloc_base() return proper return value.
- If kzalloc() failure happens within __crypto_alloc_tfm(),
crypto_alloc_base() returns NULL. But crypto_alloc_base()
is supposed to return error code as pointer. So this patch
makes it return -ENOMEM in that case.
- crypto_alloc_base() is suppose to return -EINTR, if it is
interrupted by signal. But it may not return -EINTR.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The error return values are truncated by unlikely so we need to
save it first. Thanks to Kyle Moffett for spotting this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The crypto_hash_update call in hmac_init gave the number 1
instead of the length of the sg list in bytes. This is a
missed conversion from the digest => hash change.
As tcrypt only tests crypto_hash_digest it didn't catch this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds the crypto_comp type to complete the compile-time checking
conversion. The functions crypto_has_alg and crypto_has_cipher, etc. are
also added to replace crypto_alg_available.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch removes the old HMAC implementation now that nobody uses it
anymore.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch converts tcrypt to use the new HMAC template rather than the
hard-coded version of HMAC. It also converts all digest users to use
the new cipher interface.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch rewrites HMAC as a crypto template. This means that HMAC is no
longer a hard-coded part of the API. It's now a template that generates
standard digest algorithms like any other.
The old HMAC is preserved until all current users are converted.
The same structure can be used by other MACs such as AES-XCBC-MAC.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing digest user interface is inadequate for support asynchronous
operations. For one it doesn't return a value to indicate success or
failure, nor does it take a per-operation descriptor which is essential
for the issuing of requests while other requests are still outstanding.
This patch is the first in a series of steps to remodel the interface
for asynchronous operations.
For the ease of transition the new interface will be known as "hash"
while the old one will remain as "digest".
This patch also changes sg_next to allow chaining.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Mark the parts of the cipher interface that have been replaced by
block ciphers as deprecated. Thanks to Andrew Morton for suggesting
doing this before removing them completely.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds block cipher algorithms for S390. Once all users of the
old cipher type have been converted the existing CBC/ECB non-block cipher
operations will be removed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds two block cipher algorithms, CBC and ECB. These
are implemented as templates on top of existing single-block cipher
algorithms. They invoke the single-block cipher through the new
encrypt_one/decrypt_one interface.
This also optimises the in-place encryption and decryption to remove
the cost of an IV copy each round.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the new type of block ciphers. Unlike current cipher
algorithms which operate on a single block at a time, block ciphers
operate on an arbitrarily long linear area of data. As it is block-based,
it will skip any data remaining at the end which cannot form a block.
The block cipher has one major difference when compared to the existing
block cipher implementation. The sg walking is now performed by the
algorithm rather than the cipher mid-layer. This is needed for drivers
that directly support sg lists. It also improves performance for all
algorithms as it reduces the total number of indirect calls by one.
In future the existing cipher algorithm will be converted to only have
a single-block interface. This will be done after all existing users
have switched over to the new block cipher type.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch prepares the scatterwalk code for use by the new block cipher
type.
Firstly it halves the size of scatter_walk on 32-bit platforms. This
is important as we allocate at least two of these objects on the stack
for each block cipher operation.
It also exports the symbols since the block cipher code can be built as
a module.
Finally there is a hack in scatterwalk_unmap that relies on progress
being made. Unfortunately, for hardware crypto we can't guarantee
progress to be made since the hardware can fail.
So this also gets rid of the hack by not advancing the address returned
by scatterwalk_map.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds two new operations for the simple cipher that encrypts or
decrypts a single block at a time. This will be the main interface after
the existing block operations have moved over to the new block ciphers.
It also adds the crypto_cipher type which is currently only used on the
new operations but will be extended to setkey as well once existing users
have been converted to use block ciphers where applicable.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the crypto_type structure which will be used for all new
crypto algorithm types, beginning with block ciphers.
The primary purpose of this abstraction is to allow different crypto_type
objects for crypto algorithms of the same type, in particular, there will
be a different crypto_type objects for asynchronous algorithms.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The sleeping flag used to determine whether crypto_yield can actually
yield is really a per-operation flag rather than a per-tfm flag. This
patch changes crypto_yield to take a flag directly so that we can start
using a per-operation flag instead the tfm flag.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Up until now all crypto transforms have been of the same type, struct
crypto_tfm, regardless of whether they are ciphers, digests, or other
types. As a result of that, we check the types at run-time before
each crypto operation.
This is rather cumbersome. We could instead use different C types for
each crypto type to ensure that the correct types are used at compile
time. That is, we would have crypto_cipher/crypto_digest instead of
just crypto_tfm. The appropriate type would then be required for the
actual operations such as crypto_digest_digest.
Now that we have the type/mask fields when looking up algorithms, it
is easy to request for an algorithm of the precise type that the user
wants. However, crypto_alloc_tfm currently does not expose these new
attributes.
This patch introduces the function crypto_alloc_base which will carry
these new parameters. It will be renamed to crypto_alloc_tfm once
all existing users have been converted.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the asynchronous flag and changes all existing users to
only look up algorithms that are synchronous.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the helpers crypto_get_attr_alg and crypto_alloc_instance
which can be used by simple one-argument templates like hmac to process
input parameters and allocate instances.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch makes IV operations on ECB fail through nocrypt_iv rather than
calling BUG(). This is needed to generalise CBC/ECB using the template
mechanism.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that crc32c has been fixed to conform with standard digest semantics,
we can use test_hash for it. I've turned the last test into a chunky
test.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the final result location is unaligned, we store the digest in a
temporary buffer before copying it to the final location. Currently
that buffer sits on the stack. This patch moves it to an area in the
tfm, just like the CBC IV buffer.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that the tfm is passed directly to setkey instead of the ctx, we no
longer need to pass the &tfm->crt_flags pointer.
This patch also gets rid of a few unnecessary checks on the key length
for ciphers as the cipher layer guarantees that the key length is within
the bounds specified by the algorithm.
Rather than testing dia_setkey every time, this patch does it only once
during crypto_alloc_tfm. The redundant check from crypto_digest_setkey
is also removed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The convention for setkey is that once it is set it should not change,
in particular, init must not wipe out the key set by it. In fact, init
should always be used after setkey before any digestion is performed.
The only user of crc32c that sets the key is tcrypt. This patch adds
the necessary init calls there.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Crypto modules should be loadable by their .cra_driver_name, so
we should make MODULE_ALIAS()es with these names. This patch adds
aliases for SHA1 and SHA256 only as that's what we need for
PadLock-SHA driver.
Signed-off-by: Michal Ludvig <michal@logix.cz>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Spawns lock a specific crypto algorithm in place. They can then be used
with crypto_spawn_tfm to allocate a tfm for that algorithm. When the base
algorithm of a spawn is deregistered, all its spawns will be automatically
removed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch also adds the infrastructure to pick an algorithm based on
their type. For example, this allows you to select the encryption
algorithm "aes", instead of any algorithm registered under the name
"aes". For now this is only accessible internally. Eventually it
will be made available through crypto_alloc_tfm.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The cryptomgr module is a simple manager of crypto algorithm instances.
It ensures that parameterised algorithms of the type tmpl(alg) (e.g.,
cbc(aes)) are always created.
This is meant to satisfy the needs for most users. For more complex
cases such as deeper combinations or multiple parameters, a netlink
module will be created which allows arbitrary expressions to be parsed
in user-space.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a notifier chain for algorithm/template registration events.
This will be used to register compound algorithms such as cbc(aes). In
future this will also be passed onto user-space through netlink.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
A crypto_template generates a crypto_alg object when given a set of
parameters. this patch adds the basic data structure fo templates
and code to handle their registration/deregistration.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The crypto API is made up of the part facing users such as IPsec and the
low-level part which is used by cryptographic entities such as algorithms.
This patch splits out the latter so that the two APIs are more clearly
delineated. As a bonus the low-level API can now be modularised if all
algorithms are built as modules.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Up until now we've relied on module reference counting to ensure that the
crypto_alg structures don't disappear from under us. This was good enough
as long as each crypto_alg came from exactly one module.
However, with parameterised crypto algorithms a crypto_alg object may need
two or more modules to operate. This means that we need to count the
references to the crypto_alg object directly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The functions crypto_alg_get and crypto_alg_put operates on the crypto
modules rather than the algorithms. Therefore it makes sense to call
them crypto_mod_get and crypto_alg_put respectively.
This is needed because we need to have real algorithm reference counters
for parameterised algorithms as they can be unregistered from below by
when their parameter algorithms are themselves unregistered.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a proper driver name and priority to the generic c
implemtation to allow coexistance of c and assembler modules.
Signed-off-by: Joachim Fritschi <jfritschi@freenet.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch splits up the twofish crypto routine into a common part ( key
setup ) which will be uses by all twofish crypto modules ( generic-c , i586
assembler and x86_64 assembler ) and generic-c part. It also creates a new
header file which will be used by all 3 modules.
This eliminates all code duplication.
Correctness was verified with the tcrypt module and automated test scripts.
Signed-off-by: Joachim Fritschi <jfritschi@freenet.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It makes no sense to build tcrypt into the kernel. In fact, now that
the driver init function's return status is being checked, it is in
fact harmful to do so.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds speed tests (benchmarks) for digest algorithms.
Tests are run with different buffer sizes (16 bytes, ... 8 kBytes)
and with each buffer multiple tests are run with different update()
sizes (e.g. hash 64 bytes buffer in four 16 byte updates).
There is no correctness checking of the result and all tests and
algorithms use the same input buffer.
Signed-off-by: Michal Ludvig <michal@logix.cz>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Intentionaly return -EAGAIN from module_init() to ensure
it doesn't stay loaded in the kernel. The module does all
its work from init() and doesn't offer any runtime
functionality => we don't need it in the memory, do we?
Signed-off-by: Michal Ludvig <michal@logix.cz>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We already allow asynchronous removal of existing algorithm modules. By
allowing the replacement of existing algorithms, we can replace algorithms
without having to wait for for all existing users to complete.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We do need to change these names now and even more so in future with
instantiated algorithms. So let's stop lying to the compiler and get
rid of the const modifiers.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the hooks cra_init/cra_exit which are called during a tfm's
construction and destruction respectively. This will be used by the instances
to allocate child tfm's.
For now this lets us get rid of the coa_init/coa_exit functions which are
used for exactly that purpose (unlike the dia_init function which is called
for each transaction).
In fact the coa_exit path is currently buggy as it may get called twice
when an error is encountered during initialisation.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix a few omissions in passing TFM instead of CTX to algorithms.
Signed-off-by: Michal Ludvig <michal@logix.cz>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Up until now algorithms have been happy to get a context pointer since
they know everything that's in the tfm already (e.g., alignment, block
size).
However, once we have parameterised algorithms, such information will
be specific to each tfm. So the algorithm API needs to be changed to
pass the tfm structure instead of the context pointer.
This patch is basically a text substitution. The only tricky bit is
the assembly routines that need to get the context pointer offset
through asm-offsets.h.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Various digest algorithms operate one block at a time and therefore
keep a temporary buffer of partial blocks. This buffer does not need
to be initialised since there is a counter which indicates what is and
isn't valid in it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some hash modules load/store data words directly. The digest layer
should pass properly aligned buffer to update()/final() method. This
patch also add cra_alignmask to some hash modules.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
On 64-bit platform, reading 64-bit keys (which is supposed to be
32-bit aligned) at a time will result in unaligned access.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The AES setkey routine writes 64 bytes to the E_KEY area even though
there are only 60 bytes there. It is in fact safe since E_KEY is
immediately follwed by D_KEY which is initialised afterwards. However,
doing this may trigger undefined behaviour and makes Coverity unhappy.
So by combining E_KEY and D_KEY into one array we sidestep this issue
altogether.
This problem was reported by Adrian Bunk.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Force 32-bit alignment on keys in tcrypt test vectors. Also rearrange the
structure to prevent unnecessary padding.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The "des3_ede" and "serpent" lack cra_alignmask.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
this patch converts crypto/ to kzalloc usage.
Compile tested with allyesconfig.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since tfm contexts can contain arbitrary types we should provide at least
natural alignment (__attribute__ ((__aligned__))) for them. In particular,
this is needed on the Xscale which is a 32-bit architecture with a u64 type
that requires 64-bit alignment. This problem was reported by Ronen Shitrit.
The crypto_tfm structure's size was 44 bytes on 32-bit architectures and
80 bytes on 64-bit architectures. So adding this requirement only means
that we have to add an extra 4 bytes on 32-bit architectures.
On i386 the natural alignment is 16 bytes which also benefits the VIA
Padlock as it no longer has to manually align its context structure to
128 bits.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
A bunch of asm/bug.h includes are both not needed (since it will get
pulled anyway) and bogus (since they are done too early). Removed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Many cipher implementations use 4-byte/8-byte loads/stores which require
alignment on some architectures. This patch explicitly sets the alignment
requirements for them.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The cipher code path may allocate up to two blocks of data on the stack.
Therefore we need to place limits on the maximum block size.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since the temporary buffer is used as an argument to cia_decrypt, it must be
aligned by cra_alignmask. This bug was found by linux@horizon.com.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch avoids shifting the count left and right needlessly for each
call to sha1_update(). It instead can be done only once at the end in
sha1_final().
Keeping the previous test example (sha1_update() successively called with
len=64), a 1.3% performance increase can be observed on i386, or 0.2% on
ARM. The generated code is also smaller on ARM.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch gives more descriptive names to the variables i and j.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The current code unconditionally copy the first block for every call to
sha1_update(). This can be avoided if there is no pending partial block.
This is always the case on the first call to sha1_update() (if the length
is >= 64 of course.
Furthermore, temp does need to be called if sha_transform is never invoked.
Also consolidate the sha_transform calls into one to reduce code size.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As the Crypto API now allows multiple implementations to be registered
for the same algorithm, we no longer have to play tricks with Kconfig
to select the right AES implementation.
This patch sets the driver name and priority for all the AES
implementations and removes the Kconfig conditions on the C implementation
for AES.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This is the first step on the road towards asynchronous support in
the Crypto API. It adds support for having multiple crypto_alg objects
for the same algorithm registered in the system.
For example, each device driver would register a crypto_alg object
for each algorithm that it supports. While at the same time the
user may load software implementations of those same algorithms.
Users of the Crypto API may then select a specific implementation
by name, or choose any implementation for a given algorithm with
the highest priority.
The priority field is a 32-bit signed integer. In future it will be
possible to modify it from user-space.
This also provides a solution to the problem of selecting amongst
various AES implementations, that is, aes vs. aes-i586 vs. aes-padlock.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
A lot of crypto code needs to read/write a 32-bit/64-bit words in a
specific gender. Many of them open code them by reading/writing one
byte at a time. This patch converts all the applicable usages over
to use the standard byte order macros.
This is based on a previous patch by Denis Vlasenko.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Sanitize some s390 Kconfig options. We have ARCH_S390, ARCH_S390X,
ARCH_S390_31, 64BIT, S390_SUPPORT and COMPAT. Replace these 6 options by
S390, 64BIT and COMPAT.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add new test vectors to the AES test suite for AES CBC and AES with plaintext
larger than AES blocksize.
Signed-off-by: Jan Glauber <jan.glauber@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add support for the hardware accelerated AES crypto algorithm.
Signed-off-by: Jan Glauber <jan.glauber@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add support for the hardware accelerated sha256 crypto algorithm.
Signed-off-by: Jan Glauber <jan.glauber@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replace all references to z990 by s390 in the in-kernel crypto files in
arch/s390/crypto. The code is not specific to a particular machine (z990) but
to the s390 platform. Big diff, does nothing..
Signed-off-by: Jan Glauber <jan.glauber@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The cipher code relies on the fact that the block size is a multiple
of the required alignment. So we should check this at the time of
algorith registration. We also ensure that the block size is bounded
by the page size.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch rewrites various occurences of &sg[0] where sg is an array
of length one to simply sg.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch uses sg_set_buf/sg_init_one in some places where it was
duplicated.
Signed-off-by: David Hardeman <david@2gen.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The boundary check in the standard multi-block cipher processors are
broken when nbytes is not a multiple of bsize. In those cases it will
always process an extra block.
This patch corrects the check so that it processes at most nbytes of
data.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The crypto layer currently uses in_atomic() to determine whether it is
allowed to sleep. This is incorrect since spin locks don't always cause
in_atomic() to return true.
Instead of that, this patch returns to an earlier idea of a per-tfm flag
which determines whether sleeping is allowed. Unlike the earlier version,
the default is to not allow sleeping. This ensures that no existing code
can break.
As usual, this flag may either be set through crypto_alloc_tfm(), or
just before a specific crypto operation.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The XTEA implementation was incorrect due to a misinterpretation of
operator precedence. Because of the wide-spread nature of this
error, the erroneous implementation will be kept, albeit under the
new name of XETA.
Signed-off-by: Aaron Grothe <ajgrothe@yahoo.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
`gcc -W' likes to complain if the static keyword is not at the beginning of
the declaration. This patch fixes all remaining occurrences of "inline
static" up with "static inline" in the entire kernel tree (140 occurrences in
47 files).
While making this change I came across a few lines with trailing whitespace
that I also fixed up, I have also added or removed a blank line or two here
and there, but there are no functional changes in the patch.
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I've made a new implementation of DES to replace the old one in the kernel.
It provides faster encryption on all tested processors apart from the original
Pentium, and key setup is many times faster.
Speed relative to old kernel implementation
Processor des_setkey des_encrypt des3_ede_setkey des3_ede_encrypt
Pentium
120Mhz 6.8 0.82 7.2 0.86
Pentium III
1.266Ghz 5.6 1.19 5.8 1.34
Pentium M
1.3Ghz 5.7 1.15 6.0 1.31
Pentium 4
2.266Ghz 5.8 1.24 6.0 1.40
Pentium 4E
3Ghz 5.4 1.27 5.5 1.48
StrongARM 1110
206Mhz 4.3 1.03 4.4 1.14
Athlon XP
2Ghz 7.8 1.44 8.1 1.61
Athlon 64
2Ghz 7.8 1.34 8.3 1.49
Signed-off-by: Dag Arne Osvik <da@osvik.no>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The iv field in des_ctx/des3_ede_ctx/serpent_ctx has never been used.
This was noticed by Dag Arne Osvik.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implementation:
===============
The encrypt/decrypt code is based on an x86 implementation I did a while
ago which I never published. This unpublished implementation does
include an assembler based key schedule and precomputed tables. For
simplicity and best acceptance, however, I took Gladman's in-kernel code
for table generation and key schedule for the kernel port of my
assembler code and modified this code to produce the key schedule as
required by my assembler implementation. File locations and Kconfig are
kept similar to the i586 AES assembler implementation.
It may seem a little bit strange to use 32 bit I/O and registers in the
assembler implementation but this gives the best code size. My
implementation takes one instruction more per round compared to
Gladman's x86 assembler but it doesn't require any stack for local
variables or saved registers and it is less serialized than Gladman's
code.
Note that all comparisons to Gladman's code were done after my code was
implemented. I did only use FIPS PUB 197 for the implementation so my
implementation is independent work.
If anybody has a better assembler solution for x86_64 I'll be pleased to
have my code replaced with the better solution.
Testing:
========
The implementation passes the in-kernel crypto testing module and I'm
running it without any problems on my laptop where it is mainly used for
dm-crypt.
Microbenchmark:
===============
The microbenchmark was done in userspace with similar compile flags as
used during kernel compile.
Encrypt/decrypt is about 35% faster than the generic C implementation.
As the generic C as well as my assembler implementation are both table
I don't really expect that there is much room for further
improvements though I'll be glad to be corrected here.
The key schedule is about 5% slower than the generic C implementation.
This is due to the fact that some more work has to be done in the key
schedule routine to fit the schedule to the assembler implementation.
Code Size:
==========
Encrypt and decrypt are together about 2.1 Kbytes smaller than the
generic C implementation which is important with regard to L1 cache
usage. The key schedule routine is about 100 bytes larger than the
generic C implementation.
Data Size:
==========
There's no difference in data size requirements between the assembler
implementation and the generic C implementation.
License:
========
Gladmans's code is dual BSD/GPL whereas my assembler code is GPLv2 only
(I'm not going to change the license for my code). So I had to change
the module license for the x86_64 aes module from 'Dual BSD/GPL' to
'GPL' to reflect the most restrictive license within the module.
Signed-off-by: Andreas Steinmetz <ast@domdv.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
As far as I'm aware there's a general concensus that functions that are
responsible for freeing resources should be able to cope with being passed
a NULL pointer. This makes sense as it removes the need for all callers to
check for NULL, thus elliminating the bugs that happen when some forget
(safer to just check centrally in the freeing function) and it also makes
for smaller code all over due to the lack of all those NULL checks.
This patch makes it safe to pass the crypto_free_tfm() function a NULL
pointer. Once this patch is applied we can start removing the NULL checks
from the callers.
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even though cit_iv is now always aligned, the user can still supply an
unaligned iv through crypto_cipher_encrypt_iv/crypto_cipher_decrypt_iv.
This patch will check the alignment of the user-supplied iv and copy
it if necessary.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch ensures that cit_iv is aligned according to cra_alignmask
by allocating it as part of the tfm structure. As a side effect the
crypto layer will also guarantee that the tfm ctx area has enough space
to be aligned by cra_alignmask. This allows us to remove the extra
space reservation from the Padlock driver.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes a needlessly global function static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VIA Padlock device requires the input and output buffers to
be aligned on 16-byte boundaries. This patch adds the alignmask
attribute for low-level cipher implementations to indicate their
alignment requirements.
The mid-level crypt() function will copy the input/output buffers
if they are not aligned correctly before they are passed to the
low-level implementation.
Strictly speaking, some of the software implementations require
the buffers to be aligned on 4-byte boundaries as they do 32-bit
loads. However, it is not clear whether it is better to copy
the buffers or pay the penalty for unaligned loads/stores.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds hooks for cipher algorithms to implement multi-block
ECB/CBC operations directly. This is expected to provide significant
performance boots to the VIA Padlock.
It could also be used for improving software implementations such as
AES where operating on multiple blocks at a time may enable certain
optimisations.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VIA Padlock device is able to perform much better when multiple
blocks are fed to it at once. As this device offers an exceptional
throughput rate it is worthwhile to optimise the infrastructure
specifically for it.
We shift the existing page-sized fast path down to the CBC/ECB functions.
We can then replace the CBC/ECB functions with functions provided by the
underlying algorithm that performs the multi-block operations.
As a side-effect this improves the performance of large cipher operations
for all existing algorithm implementations. I've measured the gain to be
around 5% for 3DES and 15% for AES.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Checking a pointer for NULL before calling kfree() on it is redundant.
This patch removes such checks from crypto/
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
After using this facility for a while to test my changes to the
cipher crypt() layer, I realised that I should've listend to Dave
and made this thing use CPU cycle counters :) As it is it's too
jittery for me to feel safe about relying on the results.
So here is a patch to make it use CPU cycles by default but fall
back to jiffies if the user specifies a non-zero sec value.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing keys used in the speed tests do not pass the 3DES quality check.
This patch makes it use the template keys instead.
Other algorithms can supply template keys through the same interface if needed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Reyk Floeter <reyk@vantronix.net>
I recently had the requirement to do some benchmarking on cryptoapi, and
I found reyk's very useful performance test patch [1].
However, I could not find any discussion on why that extension (or
something providing a similar feature but different implementation) was
not merged into mainline. If there was such a discussion, can someone
please point me to the archive[s]?
I've now merged the old patch into 2.6.12-rc1, the result can be found
attached to this email.
[1] http://lists.logix.cz/pipermail/padlock/2004/000010.html
Signed-off-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
It seems that bad code tends to get copied (see test_cipher_speed). So let's
kill this idiom before it spreads any further.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netlink gfp_any() problem made me double-check the uses of in_softirq()
in crypto/*. It seems to me that we should be checking in_atomic() instead
of in_softirq() in crypto_yield. Otherwise people calling the crypto ops
with spin locks held or preemption disabled will get burnt, right?
Signed-off-by: David S. Miller <davem@davemloft.net>
We want to make possible, for the user, to enable the i586 AES implementation.
This requires a restructure.
- Add a CONFIG_UML_X86 to notify that we are building a UML for i386.
- Rename CONFIG_64_BIT to CONFIG_64BIT as is used for all other archs
- Tell crypto/Kconfig that UML_X86 is as good as X86
- Tell it that it must exclude not X86_64 but 64BIT, which will give the
same results.
- Tell kbuild to descend down into arch/i386/crypto/ to build what's needed.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In the deflate_[compress|uncompress|pcompress] functions we call the
zlib_[in|de]flateReset function at the beginning. This is OK. But when we
unload the deflate module we don't call zlib_[in|de]flateEnd to free all
the zlib internal data. It looks like a bug for me. Please, consider the
attached patch.
Signed-off-by: Artem B. Bityuckiy <dedekind@infradead.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!