Age | Commit message (Collapse) | Author |
|
This gives a speedup of 17% to 66%, depending on the input size (larger
inputs see the larger speedup). Even the "optimized" multiply seems slow
enough to cause a real slowdown, especially for large inputs, where it is
called a lot.
|
Even though aes::hazmat::cipher_round uses AES-NI instructions under the
hood, simply loading the data (and the keys!) takes a significant amount
of time. Sadly, the aes crate exposes no way to re-use the "loaded" keys.
By implementing aes4/aes10 directly with _mm_aesenc, we can keep the
keys properly aligned.
The software backend stays as a fallback and uses the software
implementation of the aes crate.
This gives a ~70% speedup.
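
A minimal sketch of the idea, not the actual code from this commit: the round keys are loaded into `__m128i` registers once and then re-used for every block. The names `LoadedKeys` and `hw_rounds` are hypothetical, and the real aes4/aes10 key order is not shown here. Where the commit falls back to the software backend, this sketch simply returns `None`.

```rust
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::{__m128i, _mm_aesenc_si128, _mm_loadu_si128, _mm_storeu_si128};

/// Round keys that have been loaded into SIMD registers once (hypothetical type).
#[cfg(target_arch = "x86_64")]
struct LoadedKeys<const N: usize> {
    keys: [__m128i; N],
}

#[cfg(target_arch = "x86_64")]
impl<const N: usize> LoadedKeys<N> {
    /// Loads the keys, or returns None if the CPU lacks AES-NI.
    fn new(raw: [[u8; 16]; N]) -> Option<Self> {
        if !std::is_x86_feature_detected!("aes") {
            return None;
        }
        // SAFETY: SSE2 is part of the x86_64 baseline and each key is 16 readable bytes.
        let keys = raw.map(|k| unsafe { _mm_loadu_si128(k.as_ptr() as *const __m128i) });
        Some(Self { keys })
    }

    /// Applies N AES rounds (one `aesenc` per key) to a single block.
    fn encrypt(&self, block: [u8; 16]) -> [u8; 16] {
        // SAFETY: `new` only succeeds when AES-NI was detected at runtime.
        unsafe { self.encrypt_inner(block) }
    }

    #[target_feature(enable = "aes")]
    unsafe fn encrypt_inner(&self, block: [u8; 16]) -> [u8; 16] {
        unsafe {
            let mut state = _mm_loadu_si128(block.as_ptr() as *const __m128i);
            // The keys are already in registers; nothing is re-loaded per block.
            for key in &self.keys {
                state = _mm_aesenc_si128(state, *key);
            }
            let mut out = [0u8; 16];
            _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, state);
            out
        }
    }
}

/// Runs N rounds over all blocks with hardware AES, or returns None when it
/// is unavailable (a caller would then use a software implementation).
fn hw_rounds<const N: usize>(keys: [[u8; 16]; N], blocks: &[[u8; 16]]) -> Option<Vec<[u8; 16]>> {
    #[cfg(target_arch = "x86_64")]
    {
        if let Some(loaded) = LoadedKeys::new(keys) {
            return Some(blocks.iter().map(|b| loaded.encrypt(*b)).collect());
        }
    }
    let _ = (keys, blocks);
    None
}
```

The keys are converted once in `new`, so the per-block work is only the data load, the rounds, and the store.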
|
We only ever use this function for small factors, either 2 (in
Block::exp), or 0-7 (in e, after the modulo 8). Therefore, for those
small values, we hard-code how they are computed by manually unrolling
the loop/recursion.
This gives around 30% more throughput.
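
As an illustration only, the unrolling could look like the sketch below. It assumes blocks are handled as `u128` values and that doubling is the usual GF(2^128) doubling with the reduction constant 0x87; the actual block representation and the names `double`, `mul_generic` and `mul_small` are assumptions, not taken from the code.

```rust
/// Doubling in GF(2^128), assuming the common reduction constant 0x87.
fn double(x: u128) -> u128 {
    let carry = x >> 127; // 1 if the top bit is shifted out, else 0
    (x << 1) ^ (carry * 0x87)
}

/// Generic double-and-add multiplication by a scalar (the loop being avoided).
fn mul_generic(x: u128, mut n: u32) -> u128 {
    let mut acc = 0u128;
    let mut power = x;
    while n != 0 {
        if n & 1 == 1 {
            acc ^= power;
        }
        power = double(power);
        n >>= 1;
    }
    acc
}

/// Hard-coded results for the factors 0-7, falling back to the loop otherwise.
fn mul_small(x: u128, n: u8) -> u128 {
    match n {
        0 => 0,
        1 => x,
        2 => double(x),
        3 => double(x) ^ x,
        4 => double(double(x)),
        5 => double(double(x)) ^ x,
        6 => {
            let d = double(x);
            double(d) ^ d
        }
        7 => {
            let d = double(x);
            double(d) ^ d ^ x
        }
        _ => mul_generic(x, n as u32),
    }
}
```

Each arm needs at most two doublings and two XORs, with no loop counter or branch on the bits of the factor.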
|
adds more performance benefit
|
doesn't change performance, but is nicer to read
|
speeds up encryption by a bit
|
This vastly speeds up the encipher/decipher functions, as we no longer
keep computing key_i * (1 << exponent) over and over again.
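
A sketch of this kind of caching, under the same assumptions as a typical GF(2^128) doubling (reduction constant 0x87, blocks as `u128`): the values key * (1 << exponent) are computed once and looked up afterwards. `DoublingTable` and `gf128_double` are hypothetical names, and the real code may organize the cache differently.

```rust
/// Doubling in GF(2^128), assuming the common reduction constant 0x87.
fn gf128_double(x: u128) -> u128 {
    let carry = x >> 127;
    (x << 1) ^ (carry * 0x87)
}

/// Precomputed values key * 2^e for e in 0..=max_exp (hypothetical helper).
struct DoublingTable {
    powers: Vec<u128>,
}

impl DoublingTable {
    /// Builds the table with one doubling per entry.
    fn new(key: u128, max_exp: usize) -> Self {
        let mut powers = Vec::with_capacity(max_exp + 1);
        let mut current = key;
        for _ in 0..=max_exp {
            powers.push(current);
            current = gf128_double(current);
        }
        Self { powers }
    }

    /// Returns key * 2^exponent without recomputing any doublings.
    /// Panics if `exponent` is larger than the `max_exp` given to `new`.
    fn get(&self, exponent: usize) -> u128 {
        self.powers[exponent]
    }
}
```

Building the table costs one doubling per entry; after that, each lookup in the encipher/decipher path is a single index operation.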