Age  Commit message  Author
2025-04-22  push version to 0.2.0 (HEAD, v0.2.0, master)  Daniel Schadt
2025-04-22  add notes about fuzzing to readme  Daniel Schadt
2025-04-17  add keywords/categories/badges  Daniel Schadt
2025-04-17  fuzz against slow aez-ref, not fast aez-ni  Daniel Schadt
Two reasons: First, this allows us to test more of the algorithm, as the (slow) reference implementation supports multiple associated data items, large values for tau, ... Second, this avoids the segfault crash, which is a limit of the fast implementation (the assumption there is that data is aligned properly, and even a read out-of-bounds will not cause a segfault).
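The idea of checking one implementation against a slower reference can be sketched without any fuzzing framework. The example below is only an illustration: `double_fast`, `double_reference` and `first_mismatch` are hypothetical names, and a simple GF(2^128) doubling stands in for the real cipher, since the actual fuzz target and the aez-ref bindings are not shown here.

```rust
/// Slow byte-wise stand-in: doubles a big-endian 128-bit block in GF(2^128).
fn double_reference(block: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        // Bit shifted in from the next (less significant) byte.
        let carry_in = if i < 15 { block[i + 1] >> 7 } else { 0 };
        out[i] = (block[i] << 1) | carry_in;
    }
    // Reduce with 0x87 if the top bit was shifted out.
    if block[0] >> 7 == 1 {
        out[15] ^= 0x87;
    }
    out
}

/// Fast stand-in: the same operation on a single u128.
fn double_fast(block: [u8; 16]) -> [u8; 16] {
    let x = u128::from_be_bytes(block);
    let reduction = if x >> 127 == 1 { 0x87 } else { 0 };
    ((x << 1) ^ reduction).to_be_bytes()
}

/// One xorshift64 step, used only to derive deterministic inputs.
fn next_u64(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Returns the first input on which the two implementations disagree.
fn first_mismatch(iterations: usize, seed: u64) -> Option<[u8; 16]> {
    let mut state = seed.max(1); // xorshift must not start at zero
    for _ in 0..iterations {
        let mut input = [0u8; 16];
        input[..8].copy_from_slice(&next_u64(&mut state).to_be_bytes());
        input[8..].copy_from_slice(&next_u64(&mut state).to_be_bytes());
        if double_fast(input) != double_reference(input) {
            return Some(input);
        }
    }
    None
}
```

Calling `first_mismatch(100_000, 0x5eed)` returns `None` as long as both stand-ins agree. A real fuzzer would replace the xorshift generator with coverage-guided inputs.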
2025-04-16  fuzz against aez crate  Daniel Schadt
I just want to ensure that we get the same encrypted values as the reference (which seems fine), but for some reason, I get a lot of crashes in aez:

AddressSanitizer:DEADLYSIGNAL
=================================================================
==15467==ERROR: AddressSanitizer: SEGV on unknown address 0x7b34b0420000 (pc 0x6371fcd8f682 bp 0x7ffceb91abf0 sp 0x7ffceb91a950 T0)
==15467==The signal is caused by a READ memory access.
    #0 0x6371fcd8f682 in _mm_loadu_si128 /usr/lib/gcc/x86_64-pc-linux-gnu/14.2.1/include/emmintrin.h:706:10
    #1 0x6371fcd8f682 in loadu /home/daniel/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/aez-0.0.7/aez5-impls/aesni/encrypt.c:107:46
    #2 0x6371fcd8f682 in cipher_aez_core /home/daniel/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/aez-0.0.7/aez5-impls/aesni/encrypt.c:572:32
    #3 0x6371fcd8d581 in aez::Aez::encrypt::h56048920113a17d9 /home/daniel/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/aez-0.0.7/src/lib.rs:118:13

The crash
2025-04-15  slightly speed up aez_prf  Daniel Schadt
It doesn't matter much because we rarely expect tau > 16, but if somebody decides to use aez as a way to generate a lot of pseudorandom bytes, then oh well. With this change, we make better use of SIMD block xor'ing if available.
2025-04-15  add documentation about feature flags  Daniel Schadt
2025-04-15  make portable_simd optional  Daniel Schadt
2025-04-11  merge {de,en}cipher_aez_{tiny,core}  Daniel Schadt
2025-04-11  add decryption benchmark  Daniel Schadt
2025-04-11  optimize e(-1) call  Daniel Schadt
2025-04-11  add comment about AES NI instructions  Daniel Schadt
2025-04-11  don't always allocate a vec for tweaks  Daniel Schadt
2025-04-11  use simd instructions  Daniel Schadt
(requires nightly compiler)
2025-04-11  roll Block::mul back up  Daniel Schadt
I unrolled this earlier to speed up the computation for the commonly used factors, but now we're precomputing the values anyway, so there's no reason to keep the code ugly.
2025-04-11  manually compute Block ^ Block  Daniel Schadt
This gives around 30% speedup, presumably because casting to the int is more expensive than I thought. This operation is used so frequently in the hot loop that even a tiny speedup can add up quickly.
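A minimal sketch of the two XOR variants this message contrasts. It assumes `Block` wraps a `[u8; 16]`, which is a guess about the crate's internals. `xor_via_u128` is a hypothetical name for the integer-cast approach. The sketch only shows that both produce the same result and makes no claim about which is faster on a given compiler.

```rust
use std::ops::BitXor;

/// Assumed layout: a 16-byte block. The real struct may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Block(pub [u8; 16]);

impl BitXor for Block {
    type Output = Block;

    /// "Manual" XOR: combine the two blocks byte by byte.
    fn bitxor(self, rhs: Block) -> Block {
        let mut out = [0u8; 16];
        for i in 0..16 {
            out[i] = self.0[i] ^ rhs.0[i];
        }
        Block(out)
    }
}

/// Alternative that round-trips through a 128-bit integer.
pub fn xor_via_u128(a: Block, b: Block) -> Block {
    let x = u128::from_ne_bytes(a.0) ^ u128::from_ne_bytes(b.0);
    Block(x.to_ne_bytes())
}
```

Both functions are interchangeable from the caller's point of view, so the choice can be made purely on benchmark results.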
2025-04-11  move hot comparison out of E::eval  Daniel Schadt
Most of the time, especially in the hot loop, we're falling into the lower branch with j != -1. Doing this check in advance gives around 10% speedup. Now, the code for j == -1 is directly in e(), as we never use E::new(-1, ...) anyway.
2025-04-10  only have a single AesImpl instance  Daniel Schadt
When I first wrote the aesenc/aes4/aes10 functions, I didn't know yet how they were going to be used, so I stuck to the spec as much as possible. As it turns out, they are always used with the same keys, so it's enough to "initialize" the AES once and then re-use it for multiple E computations. It's also beginning to look a lot like all of those functions should actually be methods, which is something we can fix in the future (and unite decipher/encipher). Anyway, the speedup here is around 38% for the 1KiB benchmark, and 4% for the 16KiB benchmark.
2025-04-10  pre-multiply keys  Daniel Schadt
This can give a speedup from 17% to 66%, depending on the input size (larger speedup for larger inputs). It seems like even the "optimized" multiply is slow enough to really cause a slowdown, especially for large inputs where it is called a lot.
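One way to picture pre-multiplication is a small lookup table holding n·key for n = 0..7, built once and then indexed in the hot loop. This is a sketch under assumptions: `KeyMultiples` is a hypothetical type, blocks are modelled as big-endian `u128` values, and doubling uses the common GF(2^128) reduction constant 0x87. The crate's actual table layout is not shown in this log.

```rust
/// Doubling in GF(2^128), assuming the 0x87 reduction constant.
fn double(x: u128) -> u128 {
    let reduction = if x >> 127 == 1 { 0x87 } else { 0 };
    (x << 1) ^ reduction
}

/// Hypothetical table of the multiples 0*key through 7*key.
pub struct KeyMultiples {
    table: [u128; 8],
}

impl KeyMultiples {
    /// Builds the table once, using only doublings and XORs.
    pub fn new(key: u128) -> Self {
        let mut table = [0u128; 8];
        table[1] = key;
        for n in 2..8 {
            table[n] = if n % 2 == 0 {
                double(table[n / 2]) // even: n*key = 2 * ((n/2)*key)
            } else {
                table[n - 1] ^ key // odd: n*key = (n-1)*key + key
            };
        }
        KeyMultiples { table }
    }

    /// Looks up (n mod 8)*key without any multiplication.
    pub fn get(&self, n: usize) -> u128 {
        self.table[n % 8]
    }
}
```

The table costs six doublings at setup, after which every lookup is a plain array index.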
2025-04-10  implement aes4 and aes10 with native instructions  Daniel Schadt
Even though aes::hazmat::cipher_round uses aes-ni instructions under the hood, simply loading the data (and the keys!) takes a significant amount of time. Sadly, the aes crate exposes no way to re-use the "loaded" keys. By implementing aes4/aes10 directly with _mm_aesenc, we can keep the keys properly aligned. We still keep the software backend as fallback, using the software implementation of the aes crate. This gives a ~70% speedup.
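The following sketch shows the general shape of loading round keys once and re-using them across many blocks with the `_mm_aesenc_si128` intrinsic. It is not the crate's aes4/aes10: the function names are hypothetical, the round keys are taken as given, and the AEZ-specific key ordering is left out. Without AES-NI it reports `false` instead of falling back to software.

```rust
/// Applies one `aesenc` round per key to every block, in place.
/// Returns false (leaving the blocks untouched) when AES-NI is unavailable.
pub fn aes_rounds_in_place(blocks: &mut [[u8; 16]], round_keys: &[[u8; 16]]) -> bool {
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("aes") {
            // SAFETY: the `aes` CPU feature was detected just above.
            unsafe { aes_rounds_ni(blocks, round_keys) };
            return true;
        }
    }
    false
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "aes")]
unsafe fn aes_rounds_ni(blocks: &mut [[u8; 16]], round_keys: &[[u8; 16]]) {
    use std::arch::x86_64::{__m128i, _mm_aesenc_si128, _mm_loadu_si128, _mm_storeu_si128};

    unsafe {
        // Load every round key into a 128-bit value once, up front.
        let mut keys: Vec<__m128i> = Vec::with_capacity(round_keys.len());
        for k in round_keys {
            keys.push(_mm_loadu_si128(k.as_ptr() as *const __m128i));
        }

        // Re-use the loaded keys for every block.
        for block in blocks.iter_mut() {
            let mut state = _mm_loadu_si128(block.as_ptr() as *const __m128i);
            for &k in &keys {
                state = _mm_aesenc_si128(state, k);
            }
            _mm_storeu_si128(block.as_mut_ptr() as *mut __m128i, state);
        }
    }
}
```

A quick sanity check: one round with an all-zero key on an all-zero block yields sixteen 0x63 bytes, because the S-box maps 0 to 0x63 and MixColumns leaves a column of equal bytes unchanged.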
2025-04-10  unroll Block::mul  Daniel Schadt
We only ever use this function for small factors, either 2 (in Block::exp), or 0-7 (in e, after the modulo 8). Therefore, for those small values, we hard-code how they are computed by manually unrolling the loop/recursion. This gives around 30% more throughput.
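To illustrate what hard-coding the small factors can look like, here is a sketch that models a block as a big-endian `u128` and assumes the usual GF(2^128) doubling with reduction constant 0x87. `mul_generic` and `mul_small` are hypothetical names. The real `Block::mul` may be structured differently.

```rust
/// Doubling in GF(2^128), assuming the 0x87 reduction constant.
fn double(x: u128) -> u128 {
    let reduction = if x >> 127 == 1 { 0x87 } else { 0 };
    (x << 1) ^ reduction
}

/// Generic recursive form: n*x = 2*(floor(n/2)*x) + (n mod 2)*x.
fn mul_generic(n: u32, x: u128) -> u128 {
    match n {
        0 => 0,
        1 => x,
        _ => {
            let half = double(mul_generic(n / 2, x));
            if n % 2 == 1 { half ^ x } else { half }
        }
    }
}

/// Unrolled form: factors 0..=7 are spelled out, larger ones fall back.
fn mul_small(n: u32, x: u128) -> u128 {
    match n {
        0 => 0,
        1 => x,
        2 => double(x),
        3 => double(x) ^ x,
        4 => double(double(x)),
        5 => double(double(x)) ^ x,
        6 => double(double(x) ^ x),
        7 => double(double(x) ^ x) ^ x,
        _ => mul_generic(n, x),
    }
}
```

The unrolled arms follow directly from expanding the recursion, so both functions agree for every factor.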
2025-04-10  don't pass arrays of keys to aes4 and aes10  Daniel Schadt
2025-04-10  precompute e(0, 0, key)  Daniel Schadt
adds more performance benefit
2025-04-10  rewrite Block::clip  Daniel Schadt
doesn't change performance, but is nicer to read
2025-04-10  add first benchmark  Daniel Schadt
2025-04-10  rewrite aesenc to work in-place  Daniel Schadt
speeds up encryption by a bit
2025-04-09  add repository link (v0.1.0)  Daniel Schadt
2025-04-09  change aez_prf to write into a buffer  Daniel Schadt
2025-04-09  add first fuzz binary  Daniel Schadt
2025-04-09  expose non-vec API  Daniel Schadt
2025-04-09  rewrite algorithm to work in-place  Daniel Schadt
2025-04-09  speed up computation of successive e values  Daniel Schadt
This vastly speeds up the encipher/decipher functions, as we no longer keep computing key_i * (1 << exponent) over and over again.
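The saving described here can be sketched as carrying a running value and doubling it once per step, instead of recomputing key * 2^exponent from scratch each time. The sketch models blocks as big-endian `u128` values with the 0x87 reduction constant, and both function names are hypothetical.

```rust
/// Doubling in GF(2^128), assuming the 0x87 reduction constant.
fn double(x: u128) -> u128 {
    let reduction = if x >> 127 == 1 { 0x87 } else { 0 };
    (x << 1) ^ reduction
}

/// Naive form: recompute key * 2^exponent with `exponent` doublings.
fn times_power_of_two(key: u128, exponent: u32) -> u128 {
    let mut value = key;
    for _ in 0..exponent {
        value = double(value);
    }
    value
}

/// Incremental form: yields key, 2*key, 4*key, ... with one doubling per step.
fn successive_multiples(key: u128) -> impl Iterator<Item = u128> {
    std::iter::successors(Some(key), |&value| Some(double(value)))
}
```

Producing the first k values takes k - 1 doublings with the iterator, compared to roughly k²/2 when each value is recomputed independently.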
2025-04-09  speed up multiplication  Daniel Schadt
2025-04-09  speed up zero appendage  Daniel Schadt
2025-04-09  fix overflow for long messages  Daniel Schadt
2025-04-08  add test case for empty message  Daniel Schadt
2025-04-08  use constant_time_eq in decryption function  Daniel Schadt
2025-04-08  revert test case reporting  Daniel Schadt
2025-04-08  add documentation  Daniel Schadt
2025-04-05  use proper Block struct and operator overloading  Daniel Schadt
2025-04-04  first working version!  Daniel Schadt