I just want to ensure that we get the same encrypted values as the
reference (which seems fine), but for some reason, I get a lot of
crashes in aez:
AddressSanitizer:DEADLYSIGNAL
=================================================================
==15467==ERROR: AddressSanitizer: SEGV on unknown address 0x7b34b0420000 (pc 0x6371fcd8f682 bp 0x7ffceb91abf0 sp 0x7ffceb91a950 T0)
==15467==The signal is caused by a READ memory access.
#0 0x6371fcd8f682 in _mm_loadu_si128 /usr/lib/gcc/x86_64-pc-linux-gnu/14.2.1/include/emmintrin.h:706:10
#1 0x6371fcd8f682 in loadu /home/daniel/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/aez-0.0.7/aez5-impls/aesni/encrypt.c:107:46
#2 0x6371fcd8f682 in cipher_aez_core /home/daniel/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/aez-0.0.7/aez5-impls/aesni/encrypt.c:572:32
#3 0x6371fcd8d581 in aez::Aez::encrypt::h56048920113a17d9 /home/daniel/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/aez-0.0.7/src/lib.rs:118:13
The crash
|
It doesn't matter much because we barely expect tau > 16, but if
somebody decides to use aez as a way to generate a lot of pseudorandom
bytes, then oh well.
With this change, we make better use of SIMD block xor'ing if available.
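As a rough illustration of the block-xor idea (not the crate's actual code), two 16-byte blocks can be xor'd in one operation by reinterpreting them as u128; with SIMD available, the same pattern maps directly onto vector registers:

```rust
// Illustrative sketch only: xor two 16-byte blocks via u128. The names are
// hypothetical; the crate reportedly uses SIMD where available, and this
// portable version just demonstrates the same whole-block xor idea.
fn xor_block(a: &[u8; 16], b: &[u8; 16]) -> [u8; 16] {
    let x = u128::from_ne_bytes(*a) ^ u128::from_ne_bytes(*b);
    x.to_ne_bytes()
}

fn main() {
    let a = [0xffu8; 16];
    let b = [0x0fu8; 16];
    assert_eq!(xor_block(&a, &b), [0xf0u8; 16]);
}
```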
|
(requires nightly compiler)
|
I unrolled this earlier to speed up the computation for the commonly used factors, but now that we're precomputing the values anyway, there's no reason to keep the code ugly.
|
This gives around a 30% speedup, presumably because casting to an int is more expensive than I thought. This operation is used so frequently in the hot loop that even a tiny speedup adds up quickly.
|
Most of the time, especially in the hot loop, we're falling into the
lower branch with j != -1. Doing this check in advance gives around 10%
speedup.
Now, the code for j == -1 is directly in e(), as we never use E::new(-1,
...) anyway.
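The restructuring can be sketched roughly like this; all names and the stand-in arithmetic are hypothetical, and only the control flow mirrors what the commit describes (the rare j == -1 case handled once up front, so the hot path carries no branch):

```rust
// Purely illustrative sketch: the j == -1 special case is handled once in e(),
// so the inner function can assume j >= 0. The arithmetic is a stand-in, not
// the crate's actual E computation.
fn e(j: i32, i: u32, key: u128) -> u128 {
    if j == -1 {
        // rare case, inlined directly; E::new(-1, ..) is never constructed
        key ^ u128::from(i)
    } else {
        e_nonneg(j as u32, i, key)
    }
}

// Hot path: j >= 0 was checked once up front, so no branch is needed here.
fn e_nonneg(j: u32, i: u32, key: u128) -> u128 {
    key ^ (u128::from(j) << 32) ^ u128::from(i)
}

fn main() {
    assert_eq!(e(-1, 3, 8), 11);
    assert_eq!(e(2, 1, 0), (2u128 << 32) | 1);
}
```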
|
When I first wrote the aesenc/aes4/aes10 functions, I didn't know yet how they were going to be used, so I stuck to the spec as much as possible. As it turns out, they are always used with the same keys, so it's enough to "initialize" the AES once and then re-use it for multiple E computations.
It's also beginning to look a lot like all of those functions should actually be methods, which is something we can fix in the future (and unify decipher/encipher).
Anyway, the speedup here is around 38% for the 1KiB benchmark, and 4%
for the 16KiB benchmark.
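Structurally, the change amounts to something like the following sketch: the round keys are derived once at construction time and then reused for every call. The key schedule and round function here are stand-ins (xor/rotate), not real AES, and all names are hypothetical.

```rust
// Hedged structural sketch of "initialize the AES once, reuse many times".
struct Aes4 {
    round_keys: [u128; 4], // derived once, reused for every encrypt() call
}

impl Aes4 {
    fn new(key: u128) -> Self {
        let mut round_keys = [0u128; 4];
        for (r, rk) in round_keys.iter_mut().enumerate() {
            // stand-in key schedule, NOT the real AES key expansion
            *rk = key.rotate_left(r as u32 * 8) ^ r as u128;
        }
        Aes4 { round_keys }
    }

    // Four stand-in rounds, each using a precomputed round key.
    fn encrypt(&self, mut block: u128) -> u128 {
        for rk in self.round_keys {
            block = (block ^ rk).rotate_left(7);
        }
        block
    }
}

fn main() {
    let a = Aes4::new(42);
    // deterministic, and injective (every step is a bijection)
    assert_eq!(a.encrypt(1), a.encrypt(1));
    assert_ne!(a.encrypt(1), a.encrypt(2));
}
```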
|
This can give a speedup from 17% to 66%, depending on the input size
(larger speedup for larger inputs). It seems like even the "optimized"
multiply is slow enough to really cause a slowdown, especially for large
inputs where it is called a lot.
|
Even though aes::hazmat::cipher_round uses AES-NI instructions under the hood, simply loading the data (and the keys!) takes a significant amount of time. Sadly, aes exposes no way to re-use the "loaded" keys.
By implementing aes4/aes10 directly with _mm_aesenc, we can keep the
keys properly aligned.
We still keep the software backend as fallback, using the software
implementation of the aes crate.
This gives a ~70% speedup.
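The technique the commit refers to can be demonstrated with the intrinsic directly; this is an illustration, not the crate's actual aes4/aes10 code, and the function name is hypothetical:

```rust
// Hedged sketch: call the hardware AES round instruction via core::arch, so
// keys can live in __m128i values instead of being re-loaded on every call.
#[cfg(target_arch = "x86_64")]
fn aesenc_once(state: [u8; 16], round_key: [u8; 16]) -> Option<[u8; 16]> {
    if !is_x86_feature_detected!("aes") {
        return None; // a software fallback would handle this case
    }
    use core::arch::x86_64::*;
    // SAFETY: we just checked that the `aes` CPU feature is available.
    unsafe {
        let s = _mm_loadu_si128(state.as_ptr() as *const __m128i);
        let k = _mm_loadu_si128(round_key.as_ptr() as *const __m128i);
        // one AES round: ShiftRows, SubBytes, MixColumns, xor with round key
        let r = _mm_aesenc_si128(s, k);
        let mut out = [0u8; 16];
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, r);
        Some(out)
    }
}

#[cfg(target_arch = "x86_64")]
fn main() {
    if let Some(out) = aesenc_once([0u8; 16], [0u8; 16]) {
        // One round over the all-zero state with an all-zero key gives 0x63 in
        // every byte: SubBytes(0) = 0x63, and ShiftRows/MixColumns leave a
        // uniform state unchanged.
        assert_eq!(out, [0x63u8; 16]);
    }
}

#[cfg(not(target_arch = "x86_64"))]
fn main() {}
```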
|
We only ever use this function for small factors, either 2 (in
Block::exp), or 0-7 (in e, after the modulo 8). Therefore, for those
small values, we hard-code how they are computed by manually unrolling
the loop/recursion.
This gives around 30% more throughput.
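The unrolling idea can be sketched as follows. Multiply-by-2 here is generic GF(2^128) doubling with the common 0x87 reduction polynomial (as used in OCB/XTS); the crate's exact byte order and field representation may differ, and the names are illustrative.

```rust
// Doubling in GF(2^128): shift left, fold the carried-out bit back in via the
// reduction polynomial (x^128 + x^7 + x^2 + x + 1, i.e. 0x87).
fn double(x: u128) -> u128 {
    let carry = x >> 127;
    (x << 1) ^ (carry * 0x87)
}

// Manually unrolled small multiples, mirroring the "hard-code factors 0..=7"
// idea: each case is a fixed chain of doublings and xors, no loop or recursion.
fn mul_small(factor: u8, x: u128) -> u128 {
    match factor {
        0 => 0,
        1 => x,
        2 => double(x),
        3 => double(x) ^ x,
        4 => double(double(x)),
        5 => double(double(x)) ^ x,
        6 => double(double(x) ^ x),
        7 => double(double(x) ^ x) ^ x,
        _ => unreachable!("only factors 0..=7 are ever needed"),
    }
}

fn main() {
    assert_eq!(double(1), 2);
    assert_eq!(double(1u128 << 127), 0x87); // reduction kicks in on overflow
    assert_eq!(mul_small(7, 1), 7);
}
```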
|
adds more performance benefit
|
doesn't change performance, but is nicer to read
|
speeds up encryption by a bit
|
This vastly speeds up the encipher/decipher functions, as we no longer
keep computing key_i * (1 << exponent) over and over again.
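The precomputation can be sketched like this: build a table of repeated GF(2^128) doublings of the key once, then look powers up in O(1). The `double` stand-in uses the common 0x87 reduction polynomial, and the struct/method names are hypothetical, not the crate's API.

```rust
// GF(2^128) doubling with the common 0x87 reduction polynomial (stand-in).
fn double(x: u128) -> u128 {
    let carry = x >> 127;
    (x << 1) ^ (carry * 0x87)
}

// Precomputed table: powers[e] == key * 2^e in GF(2^128), built once.
struct KeyPowers {
    powers: Vec<u128>,
}

impl KeyPowers {
    fn new(key: u128, max_exponent: usize) -> Self {
        let mut powers = Vec::with_capacity(max_exponent + 1);
        let mut cur = key;
        for _ in 0..=max_exponent {
            powers.push(cur);
            cur = double(cur);
        }
        KeyPowers { powers }
    }

    // O(1) lookup instead of `exponent` doublings per block.
    fn get(&self, exponent: usize) -> u128 {
        self.powers[exponent]
    }
}

fn main() {
    let p = KeyPowers::new(1, 8);
    assert_eq!(p.get(0), 1);
    assert_eq!(p.get(3), 8); // 1 * 2^3, no reduction needed for small values
}
```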