I don't like that the author seems to be confusing Blake3's MAC mode with HMAC. Just because it's a keyed hash that's safe to use as a MAC (and acts as a PRF) doesn't mean it's HMAC: there's no double hashing, nor any use of the HMAC-specific padding constants. HMAC is a particular standard, not any hash-based MAC.
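To make the distinction concrete, here is a minimal sketch of what HMAC actually does, next to a plain keyed hash. Python's standard library doesn't ship Blake3, so keyed BLAKE2b is used here purely as a stand-in for "a hash with a native keyed mode"; the function names are made up for illustration.

```python
import hashlib
import hmac


def hmac_sha256_manual(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA-256 spelled out: two hash passes plus the ipad/opad constants."""
    block_size = 64  # SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()  # over-long keys are hashed first
    key = key.ljust(block_size, b"\x00")    # then zero-padded to one block
    ipad = bytes(b ^ 0x36 for b in key)     # inner padding constant
    opad = bytes(b ^ 0x5C for b in key)     # outer padding constant
    inner = hashlib.sha256(ipad + msg).digest()
    return hashlib.sha256(opad + inner).digest()


def keyed_hash_mac(key: bytes, msg: bytes) -> bytes:
    """A native keyed hash: one pass, no ipad/opad, no nesting.

    BLAKE2b is only a stand-in here for Blake3's keyed mode.
    """
    return hashlib.blake2b(msg, key=key, digest_size=32).digest()


if __name__ == "__main__":
    key, msg = b"k" * 32, b"attack at dawn"
    # The manual construction matches the standard library's HMAC.
    print(hmac_sha256_manual(key, msg) == hmac.new(key, msg, hashlib.sha256).digest())
```

Both are perfectly good MACs; only the first one is HMAC.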
I agree that XChaCha20+Blake3 is a good construction.
It's also "easy" to make an XChaCha20-Blake3-SIV construction, by using the Blake3 MAC of the plaintext as the nonce for XChaCha20 as well as the tag. That makes it a deterministic, key-committing authenticated encryption with associated data (AEAD) scheme. If you stick a 256-bit random nonce in the AAD, then you have a full Authenticated Hedged Encryption with Associated Data (AHEAD) system. The disadvantage is (as always with SIV-like constructions) that it's two-pass, so not suitable for streaming encryption, but still good for quite a few uses and a bit safer than non-deterministic AEADs.
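A rough sketch of the shape of that SIV composition, for anyone who hasn't seen one. Big caveats: neither Blake3 nor XChaCha20 is in Python's standard library, so keyed BLAKE2b stands in for the Blake3 MAC and a BLAKE2b-in-counter-mode keystream stands in for XChaCha20. The function names, the AAD length prefix, and the truncation of the tag to 24 bytes (XChaCha20's nonce size) are all my assumptions, not a vetted design.

```python
import hashlib
import hmac


def _mac(mac_key: bytes, aad: bytes, plaintext: bytes) -> bytes:
    # Stand-in for the Blake3 MAC. The AAD is length-prefixed so that
    # (aad, plaintext) pairs cannot be confused with one another.
    h = hashlib.blake2b(key=mac_key, digest_size=32)
    h.update(len(aad).to_bytes(8, "little"))
    h.update(aad)
    h.update(plaintext)
    return h.digest()


def _keystream_xor(enc_key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Stand-in for XChaCha20: XOR the data with a keyed-hash keystream.
    out = bytearray()
    for offset in range(0, len(data), 64):
        counter = (offset // 64).to_bytes(8, "little")
        block = hashlib.blake2b(nonce + counter, key=enc_key, digest_size=64).digest()
        out.extend(a ^ b for a, b in zip(data[offset:offset + 64], block))
    return bytes(out)


def siv_encrypt(enc_key: bytes, mac_key: bytes, aad: bytes, plaintext: bytes):
    # Pass 1: MAC the plaintext. The tag doubles as the synthetic nonce.
    tag = _mac(mac_key, aad, plaintext)
    # Pass 2: encrypt under a nonce taken from the tag (24 bytes, as for XChaCha20).
    return tag, _keystream_xor(enc_key, tag[:24], plaintext)


def siv_decrypt(enc_key: bytes, mac_key: bytes, aad: bytes, tag: bytes, ciphertext: bytes) -> bytes:
    plaintext = _keystream_xor(enc_key, tag[:24], ciphertext)
    # Recompute the MAC over the recovered plaintext and compare in constant time.
    if not hmac.compare_digest(tag, _mac(mac_key, aad, plaintext)):
        raise ValueError("authentication failed")
    return plaintext
```

The two-pass cost is visible in `siv_encrypt`: the whole plaintext has to be MACed before the first byte can be encrypted.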
That's a valid point on wording. I tried to tread carefully with the "effectively an HMAC" phrasing, since this post was written for a more general audience and I didn't want to go too far into the details. I wrote it to have something to point to when the inevitable questions roll in.
Balancing the lies-to-children is always a hard job.
What's your perspective on reusing the same key between XChaCha20 and Blake3 invocations? While not necessarily a footgun in practice, reusing the same key between two contexts is generally avoided as a matter of good cryptographic hygiene.
You could either require a single larger key that's (effectively) a key for each cipher concatenated with one another, or use one 256-bit key from which you derive independent keys via a pass through the hashing function. I personally prefer something closer to the former, since it requires less cryptography in general and there aren't many situations where you're sweating the size difference between 512-bit and 256-bit keys.
I suspect most systems implementing XChaCha20+Blake3 just reuse the key, which is probably fine... but it leaves a bad taste in my mouth and seems like just another potential risk that's not all that difficult to sidestep. And we're going to feel really dumb one day if it turns out there is a way to exploit that kind of key reuse after all.
In the library I wrote this post for, the encryption and MAC keys are separate. Both are randomly generated and then persisted to disk encrypted with an Argon2-derived key (with some tertiary validation), which is what I generally recommend. The use of Blake3 as the MAC _should_ avoid most of the problem with using a key that's derived from a password, but I very much like not leaving that door open.
Sure, that's what I was getting at with "a pass through the hashing function" though I could have been more specific. Again, I slightly prefer just using a single larger two-part key (or two separate keys) if reasonable, mostly out of a desire to require the minimum necessary cryptographic constructs.
But the point of my question wasn't necessarily to debate those two approaches as much as it was to politely bring up the detail of using independent keys for each cipher. I'm not sure I'd necessarily call it a footgun as much as I would call it proper hygiene.
I guess I'm just saying it would be super weird to see a modern cryptosystem share keys for any pair of constructions, because the standard pattern here is to start with some root secret (usually a DH agreement, maybe with a PSK mixed in) and then just HKDF out all the specific secrets needed to drive the rest of the system. I don't know of something that could really blow up if you used a Blake3 keyed MAC with the same key as ChaCha20, but if you saw that in a real design, you'd assume other things were wrong with it, right?
HKDF requires HMAC, and you've got Blake3's KDF mode as well. Personally I'd get 2 keys using Blake3's KDF, then use one for Blake3's MAC and the other for XChaCha20.
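Something like the following, as a sketch. Blake3 isn't in Python's standard library, so keyed BLAKE2b over a context label stands in for Blake3's KDF mode; the context strings and function names are made up for illustration.

```python
import hashlib


def derive_subkey(root_key: bytes, context: bytes) -> bytes:
    # Stand-in KDF: keyed BLAKE2b over a context label.
    # With real Blake3 you would use its dedicated key-derivation mode instead.
    return hashlib.blake2b(context, key=root_key, digest_size=32).digest()


def derive_cipher_and_mac_keys(root_key: bytes):
    # Distinct, hard-coded context strings give independent subkeys.
    enc_key = derive_subkey(root_key, b"example-app v1 XChaCha20 encryption key")
    mac_key = derive_subkey(root_key, b"example-app v1 Blake3 MAC key")
    return enc_key, mac_key
```

The same root key always yields the same pair, and neither subkey is ever used directly by both primitives.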
I'm getting myself into a lot of trouble today randomly throwing around terms, as if HKDF meant "any secure hash-based KDF, like HKDF but you know with Blake3".
Yep. I know you know better, but since it's essentially the same carelessness I first complained about I figured I should be consistent and complain again!
One of the nice things about Blake3 is that it does have a KDF mode built in. And a MAC mode. And there's a (stalled) proposal to build in an AHEAD mode, though IMO it needs more academic security analysis (IIRC that's part of why it stalled). It might not be as interestingly innovative as the sponge construction Keccak introduced, but it's a very versatile primitive with excellent software performance.
No, you're 100% right to call it out! I went into this thinking that much of this discussion was pedantic and missed the point that a generic composition of a good stream cipher and a hash MAC has benefits, and I leave it mostly agreeing that the formalisms that people seemed hung up on are indeed pretty important.
The formalisms are important. I wish they weren't, and that this stuff was all ready to go and easy to use correctly. But sadly there's no such commonly available system; everything has tradeoffs, and many of them are subtle.
The SIV modes are great; they're much easier to use. When used in a full AHEAD construction (where you stick a random nonce in the AAD of a deterministic AEAD) you get nearly a "best of both worlds" non-deterministic encryption, but without the catastrophic failure properties of something like GCM. But they're inherently two-pass, so the user might have to deal with "chunking" their data, which can be annoying if they're streaming, etc. And since there are two passes over the plaintext with two different algorithms (one for the MAC to make the SIV, one to encrypt), a side-channel attacker gets two "traces" to observe, which can make some attacks more powerful.
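The "random nonce in the AAD" trick is small enough to sketch generically. Here `det_encrypt` and `det_decrypt` are hypothetical callables for whatever deterministic AEAD you have (taking the AAD and the message); the names and the layout of the prefixed AAD are my assumptions.

```python
import secrets
from typing import Callable, Tuple

# Hypothetical interface for a deterministic AEAD:
# (aad, plaintext) -> sealed   and   (aad, sealed) -> plaintext
DetEncrypt = Callable[[bytes, bytes], bytes]
DetDecrypt = Callable[[bytes, bytes], bytes]

HEDGE_LEN = 32  # 256-bit random value, as suggested above


def hedged_encrypt(det_encrypt: DetEncrypt, aad: bytes, plaintext: bytes) -> Tuple[bytes, bytes]:
    # Fresh randomness goes into the AAD, so the synthetic IV (and hence the
    # ciphertext) differs per message even when the plaintext repeats.
    r = secrets.token_bytes(HEDGE_LEN)
    # r has a fixed length, so r + aad parses unambiguously.
    return r, det_encrypt(r + aad, plaintext)


def hedged_decrypt(det_decrypt: DetDecrypt, r: bytes, aad: bytes, sealed: bytes) -> bytes:
    # The receiver needs r alongside the ciphertext; it is not secret.
    return det_decrypt(r + aad, sealed)
```

If the random number generator fails and `r` repeats, this degrades to the deterministic scheme (equal messages become visible as equal) rather than failing catastrophically.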