
Have a look at rendezvous hashing (https://en.wikipedia.org/wiki/Rendezvous_hashing). It's simpler and more general than 'consistent hashing'. E.g. you don't have to muck around with virtual nodes. Everything just works out, even for small numbers of targets.
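
A minimal sketch of the idea in Go (the names and the FNV hash are my choices, nothing from the article): hash each key together with every node, and the node with the highest score owns the key.

    package main

    import (
        "fmt"
        "hash/fnv"
    )

    // score hashes the (node, key) pair; the node with the
    // highest score for a key owns that key.
    func score(node, key string) uint64 {
        h := fnv.New64a()
        h.Write([]byte(node))
        h.Write([]byte{0}) // separator so ("ab","c") != ("a","bc")
        h.Write([]byte(key))
        return h.Sum64()
    }

    // owner returns the node responsible for key.
    func owner(key string, nodes []string) string {
        best, bestScore := "", uint64(0)
        for _, n := range nodes {
            if s := score(n, key); best == "" || s > bestScore {
                best, bestScore = n, s
            }
        }
        return best
    }

    func main() {
        fmt.Println(owner("user-42", []string{"a", "b", "c"}))
    }

Adding or removing a node only reassigns the keys that node wins (or owned); every other key stays put.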

It's also easier to come up with an exact weighted version of rendezvous hashing. See https://en.wikipedia.org/wiki/Rendezvous_hashing#Weighted_re... for the weighted variant.
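
The weighted version only changes the score: per that Wikipedia section, map the hash to a number u in (0,1) and score -w/ln(u), so node i wins a share of keys proportional to its weight w_i. A sketch, reusing score() from above:

    import "math"

    // wscore maps the 64-bit hash into (0,1) and scores -w/ln(u);
    // as before, the node with the highest score owns the key.
    func wscore(node, key string, weight float64) float64 {
        u := (float64(score(node, key)>>11) + 0.5) / (1 << 53) // in (0,1)
        return -weight / math.Log(u)
    }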

Faintly related: if you are into load balancing, you might also want to look into the 'power of 2 choices'. See e.g. https://www.eecs.harvard.edu/~michaelm/postscripts/mythesis.... or this HN discussion at https://news.ycombinator.com/item?id=37143376

The basic idea is that you can vastly improve on random assignment for load balancing by instead picking two servers at random, and assigning to the less loaded one.
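
A toy simulation in Go showing the effect (the sizes and counts are arbitrary):

    package main

    import (
        "fmt"
        "math/rand"
    )

    // pick chooses two distinct servers at random and returns
    // the less loaded of the two. Assumes at least two servers.
    func pick(load []int) int {
        i := rand.Intn(len(load))
        j := rand.Intn(len(load) - 1)
        if j >= i {
            j++ // shift so j != i
        }
        if load[j] < load[i] {
            return j
        }
        return i
    }

    func main() {
        load := make([]int, 10)
        for k := 0; k < 10000; k++ {
            load[pick(load)]++
        }
        fmt.Println(load) // noticeably flatter than one random choice
    }

With one random choice the gap above the average load grows roughly like a square root; with two choices it drops to O(log log n), which is the surprising part of the result.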

It's an interesting topic in itself, but there are also ways to combine it with consistent hashing / rendezvous hashing.



I also second that rendezvous hashing suggestion. The article mentions that it takes O(n) time, where n is the number of nodes. I made a library[1] that makes rendezvous hashing more practical for larger numbers of nodes (or weight shares), bringing it to O(1) amortized running time with a bit of a tradeoff: distributed elements are pre-aggregated into clusters (slots) before being passed through HRW.

[1]: https://pkg.go.dev/github.com/SenseUnit/ahrw
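
To make the tradeoff concrete, here's a rough sketch of the slot idea (my own toy code, not the library's actual API), reusing owner() from the HRW sketch upthread: run HRW once per slot up front, then route each key with one cheap hash into the precomputed table.

    // Build once: O(slots * n). After that, each key lookup is O(1).
    func buildSlots(nodes []string, numSlots int) []string {
        slots := make([]string, numSlots) // slot index -> node
        for i := range slots {
            slots[i] = owner(fmt.Sprintf("slot-%d", i), nodes)
        }
        return slots
    }

    // routeKey hashes the key straight to a slot.
    func routeKey(key string, slots []string) string {
        h := fnv.New64a()
        h.Write([]byte(key))
        return slots[h.Sum64()%uint64(len(slots))]
    }

The tradeoff being that keys now move in slot-sized chunks rather than individually when the node set changes.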


Does it really matter? Here n is a very small number, almost a constant. I'd assume iterating over the n nodes is negligible compared to the rest of the work done per request to a node.


Yes, different applications have different trade-offs.


> if you are into load balancing, you might also want to look into the 'power of 2 choices'.

You can do that better if you don't use a random number for the hash but instead flip a coin (well, check a bit of the hash of a hash), to make sure hash expansion works well.

This trick means that when you go from N to N+1 buckets, the keys that move all land in the new N+1 bucket instead of being rearranged across all of them.
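
For reference, a published algorithm with exactly this property is Lamping & Veach's jump consistent hash; a Go port (this is their algorithm, not necessarily the same construction as the trick described above):

    // Going from n to n+1 buckets moves ~1/(n+1) of the keys,
    // and every moved key lands in the new bucket n.
    func jumpHash(key uint64, numBuckets int32) int32 {
        var b, j int64 = -1, 0
        for j < int64(numBuckets) {
            b = j
            key = key*2862933555777941757 + 1
            j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
        }
        return int32(b)
    }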

I saw this two decades ago, and after seeing your comment I felt like getting Claude to recreate what I remembered from back then and write a fake paper [1] out of it.

See the MSB check in the implementation.

That said, consistent hashes can split ranges by traffic, not just popularity, so back when I worked on this, the Membase protocol used capacity & traffic load to split the virtual buckets across real machines.

Hot partition rebalancing is hard with a fixed algorithm.

[1] - https://github.com/t3rmin4t0r/magic-partitioning/blob/main/M...


> This trick means that when you go from N to N+1 buckets, the keys that move all land in the new N+1 bucket instead of being rearranged across all of them.

Isn't that how rendezvous hashing (and consistent hashing) already work?



