RFC 9861: KangarooTwelve and TurboSHAKE (ietf.org)


These Keccak-derived hashes are my personal favorite hashes. The Keccak sponge function is a more advanced mathematical foundation for hashing than what we had before. The individual mixing functions were also carefully chosen to do very different things. My favorite attribute, however, is that all the cryptanalysis done on SHA-3 carries over directly to these hashes (unlike BLAKE's follow-ons).


Unfortunately, a state of 25 64-bit words isn't likely to get CPU intrinsics anytime soon.


That is less than 4 of the 32 software-visible vector registers of an AMD Zen 4 or Zen 5 CPU, or of the future Intel CPUs that will reintroduce AVX-512: the 1600-bit Keccak state (25 × 64 bits) occupies just over three 512-bit registers.

There is no difficulty in defining AVX-512 instructions that would operate on a hash state of this size.

The actual number of 64-bit registers in a modern CPU is well above one thousand, and the SHA-3 functions are very efficient to implement in hardware, so adding instructions for these hashes would have a very modest cost.


Keccak is the core of the SHA-3 standard. If speed is a concern, there are hand-tuned assembly versions and hardware implementations out there.


Can anyone share real-world examples of where and why one would use these, please?


The forms of Keccak that were initially standardized in SHA-3 were secure but significantly slower than possible.

As a consequence, for many applications where speed is important, the existing very efficient implementations of BLAKE2b-512, or of the faster but less secure BLAKE3, have been preferred instead, or SHA-256 or SHA-512 on the CPUs where those are implemented in hardware.

However, it is also possible to use Keccak in modes of operation where it is as fast as or faster than any other comparable hash, e.g. by using parallelizable tree hashing, like the BLAKE derivatives. Previously these modes were less well known, because they were not standardized and because the existing reference implementations were less polished than those of the BLAKE derivatives.

Now that they are included in standards like this RFC, one can hope that these good, secure hashes will become more widely available.

Recent ARM-based CPUs have instructions for the core functions of Keccak, while on AMD/Intel CPUs with SHA-512 Keccak is rather fast even without dedicated instructions. Therefore on such CPUs KangarooTwelve and TurboSHAKE can be very fast right now, when using an appropriate implementation.

For instance, I use BLAKE2b-512 for file integrity checking, frequently (i.e. at least a few times per day) running it over hundreds of GB or many TB of data. Now that I have an AVX-512-capable CPU, i.e. a Zen 5, I should experiment with an optimized KangarooTwelve implementation, because it should be much faster on such a CPU.
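
For reference, the shape of that kind of bulk integrity check, as a minimal Python sketch using hashlib's BLAKE2b (the helper name and chunk size are just illustrative, not the actual tool I use):

    import hashlib

    def blake2b_file_digest(path: str, chunk_size: int = 1 << 20) -> str:
        # Hash the file in chunks so multi-GB/TB files are never held in memory.
        h = hashlib.blake2b(digest_size=64)  # BLAKE2b-512
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                h.update(chunk)
        return h.hexdigest()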


If you want to try an optimized AVX-512 implementation of KangarooTwelve on the command line, you can `cargo install k12sum`. On my machine it's neck-and-neck with `b3sum --no-mmap` (which does not use threads).

Edit: Oh it looks like another option, `KeccakSum`, was released a couple months ago? https://github.com/XKCP/K12/commit/5271b58c990c1ac33c1097b4e...


Note that the tweaks that turned BLAKE into BLAKE2 and BLAKE3 mean the cryptanalysis done during the SHA-3 hashing contest is no longer directly applicable to them. That doesn't mean they are not well respected and probably fine; it's just that we have less confidence in them than in K12 and its sibling Keccak hashes.

I would love to hear how the benchmarking goes!


EDIT: Typo

I noticed too late that I incorrectly wrote "AMD/Intel CPUs with SHA-512", but I meant "AMD/Intel CPUs with AVX-512".

(There are a few recent Intel CPUs that have SHA-512, i.e. Lunar Lake and the desktop variant of Arrow Lake, i.e. Arrow Lake S, but this has nothing to do with Keccak.)


Anywhere you need a high-assurance, high-speed hash function. And because of the sponge design, it can be the heart of lots of cryptographic protocols.


The edit window passed, so let me add: where and why one would use an extensible-output function in particular.


This is useful when you use a secure hash function as a key derivation function for some communication protocol or file or disk encryption method. This is one of the most frequent use cases for a secure hash function.

In such cases, you hash some random value or a salted password or a shared secret value obtained by some kind of Diffie-Hellman exchange, and you must generate a variable number of secret keys, depending on the target protocol. You typically need to generate at least an encryption key and an authentication key, but frequently you need to generate even more keys, e.g. in a communication protocol you may use different keys for the 2 directions of communication, or you may change the keys after a certain amount of time or of transmitted data.

When you have only a hash function with a fixed output size, e.g. SHA-512, you must transform it into a hash function with extensible output size to get all the keys that you need.

Typically this is done by repeatedly hashing the secret value, each time concatenated with some distinct data, e.g. the value of a counter. As with any cryptographic operation, every detail of the implementation matters and it is easy to make mistakes.
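
A minimal sketch of that counter-based construction, in Python with a fixed-output hash (SHA-512 here; the function name and parameters are just illustrative):

    import hashlib

    def expand_keys_counter(secret: bytes, n_keys: int, key_len: int = 32) -> list[bytes]:
        # Hash the secret concatenated with a counter, once per key.
        # Every detail (counter encoding, domain separation, truncation)
        # has to be chosen and analyzed by hand.
        return [
            hashlib.sha512(secret + i.to_bytes(4, "big")).digest()[:key_len]
            for i in range(n_keys)
        ]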

If you already have a hash function that has been designed to provide extensible output, then you can use it as is, and you do not have to bother with designing an ad-hoc method for extending the size of its output, nor with analyzing its correctness.
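
With an extensible-output function, the same derivation collapses into a single call; in this sketch SHAKE256 from Python's hashlib stands in for TurboSHAKE/KangarooTwelve, which are not in the standard library:

    import hashlib

    def expand_keys_xof(secret: bytes, n_keys: int, key_len: int = 32) -> list[bytes]:
        # One hash invocation produces the whole key stream, which is then
        # simply sliced into the individual keys.
        stream = hashlib.shake_256(secret).digest(n_keys * key_len)
        return [stream[i * key_len:(i + 1) * key_len] for i in range(n_keys)]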


Thanks for the information.

This sounds like something I would use HKDF for. But, to your point, it's nice to be able to build the design from fewer primitives, and it's likely more performant, too.
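
For comparison, HKDF's Expand step (RFC 5869) is an HMAC feedback chain built on a fixed-output hash; a rough Python sketch, just to show the extra moving parts relative to a single XOF call:

    import hashlib
    import hmac

    def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
        # T(i) = HMAC(PRK, T(i-1) || info || counter byte), concatenated
        # until enough output key material has been produced.
        okm, block, counter = b"", b"", 1
        while len(okm) < length:
            block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
            okm += block
            counter += 1
        return okm[:length]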


It enables you to build a cipher.



