path: root/src/lib.rs
Age | Commit message | Author
2025-04-11 | use simd instructions | Daniel Schadt
(requires nightly compiler)
2025-04-11 | move hot comparison out of E::eval | Daniel Schadt
Most of the time, especially in the hot loop, we're falling into the lower branch with j != -1. Doing this check in advance gives around 10% speedup. Now, the code for j == -1 is directly in e(), as we never use E::new(-1, ...) anyway.
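The restructuring described above can be sketched roughly as follows. This is only an illustration of hoisting the `j == -1` check out of the evaluation type: `slow_path`, `fast_path`, the field layout of `E`, and all signatures are hypothetical stand-ins, not the code from this repository.

```rust
// Hypothetical stand-ins for the two code paths; the real computations differ.
fn slow_path(x: u64) -> u64 {
    x.rotate_left(7)
}

fn fast_path(j: u32, x: u64) -> u64 {
    x ^ u64::from(j)
}

/// Before: every call re-checks `j == -1`, even inside the hot loop.
pub fn eval_branchy(j: i32, x: u64) -> u64 {
    if j == -1 {
        slow_path(x)
    } else {
        fast_path(j as u32, x)
    }
}

/// After: `E` can only represent `j != -1`, so `eval` is branch-free.
pub struct E {
    j: u32,
}

impl E {
    pub fn new(j: u32) -> Self {
        E { j }
    }

    pub fn eval(&self, x: u64) -> u64 {
        fast_path(self.j, x)
    }
}

/// The rare `j == -1` case is handled once, up front, in the free function.
pub fn e(j: i32, x: u64) -> u64 {
    if j == -1 {
        return slow_path(x);
    }
    E::new(j as u32).eval(x)
}
```

A hot loop that already knows `j != -1` can then construct `E` directly and never pay for the comparison.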
2025-04-10 | only have a single AesImpl instance | Daniel Schadt
When I first wrote the aesenc/aes4/aes10 functions, I didn't yet know how they were going to be used, so I stuck to the spec as much as possible. As it turns out, they are always used with the same keys, so it's enough to "initialize" the AES once and then re-use it for multiple E computations. It's also beginning to look a lot like all of those functions should actually be methods, which is something we can fix in the future (and unite decipher/encipher). Anyway, the speedup here is around 38% for the 1KiB benchmark, and 4% for the 16KiB benchmark.
2025-04-10 | pre-multiply keys | Daniel Schadt
This can give a speedup from 17% to 66%, depending on the input size (larger speedup for larger inputs). It seems like even the "optimized" multiply is slow enough to really cause a slowdown, especially for large inputs where it is called a lot.
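One way to picture pre-multiplication is a small lookup table of key multiples that is built once and then indexed. The sketch below assumes blocks are treated as big-endian `u128` values in GF(2^128) with reduction constant `0x87`. The table size of 8 and the names `KeyTable`, `mul_small`, and `double` are illustrative assumptions, not taken from this repository.

```rust
/// Multiply by 2 in GF(2^128), assuming the polynomial x^128 + x^7 + x^2 + x + 1.
pub fn double(x: u128) -> u128 {
    let carry = x >> 127; // 0 or 1: the bit shifted out at the top
    (x << 1) ^ (carry * 0x87)
}

/// Multiply by a small integer via double-and-add (the "slow" multiply).
pub fn mul_small(x: u128, mut i: u32) -> u128 {
    let mut acc = 0u128;
    let mut cur = x;
    while i != 0 {
        if i & 1 == 1 {
            acc ^= cur; // addition in GF(2^128) is XOR
        }
        cur = double(cur);
        i >>= 1;
    }
    acc
}

/// Hypothetical table holding key * i for i in 0..8, computed once.
pub struct KeyTable {
    multiples: [u128; 8],
}

impl KeyTable {
    pub fn new(key: u128) -> Self {
        let mut multiples = [0u128; 8];
        for (i, slot) in multiples.iter_mut().enumerate() {
            *slot = mul_small(key, i as u32);
        }
        KeyTable { multiples }
    }

    /// Table lookup replaces a multiplication in the hot path.
    /// Panics if `i >= 8`.
    pub fn get(&self, i: usize) -> u128 {
        self.multiples[i]
    }
}
```

The multiplication cost is paid once in `KeyTable::new`, so per-block work reduces to an array index.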
2025-04-10 | implement aes4 and aes10 with native instructions | Daniel Schadt
Even though aes::hazmat::cipher_round uses AES-NI instructions under the hood, simply loading the data (and the keys!) takes a significant amount of time. Sadly, the aes crate exposes no way to re-use the "loaded" keys. By implementing aes4/aes10 directly with _mm_aesenc, we can keep the keys properly aligned. We still keep a software backend as fallback, using the software implementation of the aes crate. This gives a ~70% speedup.
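For readers unfamiliar with the intrinsic, here is a minimal sketch of a single AES round via `_mm_aesenc_si128` from `std::arch::x86_64`, with runtime feature detection. The function names and the `Option` return are assumptions for illustration. The sketch reloads the key on every call, so it does not show the key re-use that the commit is about, and a `None` result stands in for where a software fallback would go.

```rust
/// One AES encryption round (SubBytes, ShiftRows, MixColumns, AddRoundKey).
/// Returns `None` when the target is not x86_64 or the CPU lacks AES-NI.
pub fn aesenc_native(state: [u8; 16], round_key: [u8; 16]) -> Option<[u8; 16]> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("aes") {
            // SAFETY: AES-NI support was checked at runtime just above.
            return Some(unsafe { aesenc_impl(state, round_key) });
        }
    }
    // Silence unused-variable warnings on targets without the native path.
    let _ = (&state, &round_key);
    None
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "aes")]
unsafe fn aesenc_impl(state: [u8; 16], round_key: [u8; 16]) -> [u8; 16] {
    use std::arch::x86_64::{__m128i, _mm_aesenc_si128, _mm_loadu_si128, _mm_storeu_si128};
    let mut out = [0u8; 16];
    unsafe {
        // Unaligned loads: the byte arrays carry no 16-byte alignment guarantee.
        let s = _mm_loadu_si128(state.as_ptr() as *const __m128i);
        let k = _mm_loadu_si128(round_key.as_ptr() as *const __m128i);
        let r = _mm_aesenc_si128(s, k);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, r);
    }
    out
}
```

Holding the round keys as `__m128i` values across calls, instead of as byte arrays, would avoid the repeated loads described above.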
2025-04-10 | don't pass arrays of keys to aes4 and aes10 | Daniel Schadt
2025-04-10 | precompute e(0, 0, key) | Daniel Schadt
adds more performance benefit
2025-04-10 | rewrite aesenc to work in-place | Daniel Schadt
speeds up encryption by a bit
2025-04-09 | change aez_prf to write into a buffer | Daniel Schadt
2025-04-09 | expose non-vec API | Daniel Schadt
2025-04-09 | rewrite algorithm to work in-place | Daniel Schadt
2025-04-09 | speed up computation of successive e values | Daniel Schadt
This vastly speeds up the encipher/decipher functions, as we no longer keep computing key_i * (1 << exponent) over and over again.
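The idea of not recomputing `key_i * (1 << exponent)` from scratch can be sketched as carrying a running value and doubling it once per step. The sketch assumes blocks are big-endian `u128` values in GF(2^128) with reduction constant `0x87`. The function names are illustrative and not taken from this repository.

```rust
/// Multiply by 2 in GF(2^128), assuming the polynomial x^128 + x^7 + x^2 + x + 1.
pub fn double(x: u128) -> u128 {
    let carry = x >> 127; // 0 or 1: the bit shifted out at the top
    (x << 1) ^ (carry * 0x87)
}

/// Naive: recompute key * 2^exponent from scratch (O(exponent) per call).
pub fn mul_pow2(key: u128, exponent: u32) -> u128 {
    let mut cur = key;
    for _ in 0..exponent {
        cur = double(cur);
    }
    cur
}

/// Incremental: each successive value costs a single doubling of the last one.
pub fn successive_multiples(key: u128, n: usize) -> Vec<u128> {
    let mut out = Vec::with_capacity(n);
    let mut cur = key;
    for _ in 0..n {
        out.push(cur);
        cur = double(cur);
    }
    out
}
```

Computing the first `n` values naively takes on the order of `n^2` doublings in total, while the incremental version takes `n`.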
2025-04-09 | speed up zero appendage | Daniel Schadt
2025-04-09 | fix overflow for long messages | Daniel Schadt
2025-04-08 | add test case for empty message | Daniel Schadt
2025-04-08 | use constant_time_eq in decryption function | Daniel Schadt
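As background on why a constant-time comparison matters when verifying decrypted data, the sketch below accumulates all byte differences instead of returning at the first mismatch. It is a hand-rolled illustration, not the implementation of the `constant_time_eq` crate, which additionally guards against compiler optimizations. The name `ct_eq_sketch` is made up here.

```rust
/// Compare two byte slices without an early exit on the first differing byte.
/// The length check is not secret-dependent in typical use (lengths are public).
pub fn ct_eq_sketch(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        // OR together every XOR difference; stays 0 only if all bytes match.
        diff |= x ^ y;
    }
    diff == 0
}
```

A plain `a == b` may stop at the first mismatch, which can leak through timing how many leading bytes of a forged message were correct.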
2025-04-08 | revert test case reporting | Daniel Schadt
2025-04-08 | add documentation | Daniel Schadt
2025-04-05 | use proper Block struct and operator overloading | Daniel Schadt
2025-04-04 | first working version! | Daniel Schadt