
Have a look at rendezvous hashing (https://en.wikipedia.org/wiki/Rendezvous_hashing). It's simpler and more general than 'consistent hashing'. E.g. you don't have to muck around with virtual nodes. Everything just works out, even for small numbers of targets.
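
A minimal sketch of the idea in Go (the names and the FNV hash are my choices, nothing from the article): hash each key together with every node, and the node with the highest score owns the key.

    package main

    import (
        "fmt"
        "hash/fnv"
    )

    // score hashes the (node, key) pair; the node with the
    // highest score for a key owns that key.
    func score(node, key string) uint64 {
        h := fnv.New64a()
        h.Write([]byte(node))
        h.Write([]byte{0}) // separator so ("ab","c") != ("a","bc")
        h.Write([]byte(key))
        return h.Sum64()
    }

    // owner returns the node responsible for key.
    func owner(key string, nodes []string) string {
        best, bestScore := "", uint64(0)
        for _, n := range nodes {
            if s := score(n, key); best == "" || s > bestScore {
                best, bestScore = n, s
            }
        }
        return best
    }

    func main() {
        fmt.Println(owner("user-42", []string{"a", "b", "c"}))
    }

Adding or removing a node only reassigns the keys that node wins (or owned); every other key stays put.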

It's also easier to come up with an exact weighted version of rendezvous hashing. See https://en.wikipedia.org/wiki/Rendezvous_hashing#Weighted_re... for the weighted variant.
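
The weighted version only changes the score: per that Wikipedia section, map the hash to a number u in (0,1) and score -w/ln(u), so node i wins a share of keys proportional to its weight w_i. A sketch, reusing score() from above:

    import "math"

    // wscore maps the 64-bit hash into (0,1) and scores -w/ln(u);
    // as before, the node with the highest score owns the key.
    func wscore(node, key string, weight float64) float64 {
        u := (float64(score(node, key)>>11) + 0.5) / (1 << 53) // in (0,1)
        return -weight / math.Log(u)
    }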

Faintly related: if you are into load balancing, you might also want to look into the 'power of 2 choices'. See e.g. https://www.eecs.harvard.edu/~michaelm/postscripts/mythesis.... or this HN discussion at https://news.ycombinator.com/item?id=37143376

The basic idea is that you can vastly improve on random assignment for load balancing by instead picking two servers at random, and assigning to the less loaded one.
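
A toy simulation in Go showing the effect (the sizes and counts are arbitrary):

    package main

    import (
        "fmt"
        "math/rand"
    )

    // pick chooses two distinct servers at random and returns
    // the less loaded of the two. Assumes at least two servers.
    func pick(load []int) int {
        i := rand.Intn(len(load))
        j := rand.Intn(len(load) - 1)
        if j >= i {
            j++ // shift so j != i
        }
        if load[j] < load[i] {
            return j
        }
        return i
    }

    func main() {
        load := make([]int, 10)
        for k := 0; k < 10000; k++ {
            load[pick(load)]++
        }
        fmt.Println(load) // noticeably flatter than one random choice
    }

With one random choice the gap above the average load grows roughly like a square root; with two choices it drops to O(log log n), which is the surprising part of the result.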

It's an interesting topic in itself, but there are also ways to combine it with consistent hashing / rendezvous hashing.



I also second that rendezvous hashing suggestion. The article mentions that it takes O(n) time, where n is the number of nodes. I made a library[1] that makes rendezvous hashing more practical for larger numbers of nodes (or weight shares), bringing it to O(1) amortized running time with a bit of a tradeoff: distributed elements are pre-aggregated into clusters (slots) before being passed through HRW.

[1]: https://pkg.go.dev/github.com/SenseUnit/ahrw
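
To make the tradeoff concrete, here's a rough sketch of the slot idea (my own toy code, not the library's actual API), reusing owner() from the HRW sketch upthread: run HRW once per slot up front, then route each key with one cheap hash into the precomputed table.

    // Build once: O(slots * n). After that, each key lookup is O(1).
    func buildSlots(nodes []string, numSlots int) []string {
        slots := make([]string, numSlots) // slot index -> node
        for i := range slots {
            slots[i] = owner(fmt.Sprintf("slot-%d", i), nodes)
        }
        return slots
    }

    // routeKey hashes the key straight to a slot.
    func routeKey(key string, slots []string) string {
        h := fnv.New64a()
        h.Write([]byte(key))
        return slots[h.Sum64()%uint64(len(slots))]
    }

The tradeoff being that keys now move in slot-sized chunks rather than individually when the node set changes.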


Does it really matter? Here n is a very small number, almost a constant. I'd assume iterating over the n nodes is negligible compared to the rest of the work done per request to a node.


Yes, different applications have different trade-offs.


> if you are into load balancing, you might also want to look into the 'power of 2 choices'.

You can do that better if you don't use a random number for the hash but instead flip a coin (well, check a bit of the hash of a hash), to make sure hash expansion works well.

This trick means that when you go from N to N+1 buckets, the keys that move all land in the new N+1 bucket instead of being rearranged across all of them.
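
For reference, a published algorithm with exactly this property is Lamping & Veach's jump consistent hash; a Go port (this is their algorithm, not necessarily the same construction as the trick described above):

    // Going from n to n+1 buckets moves ~1/(n+1) of the keys,
    // and every moved key lands in the new bucket n.
    func jumpHash(key uint64, numBuckets int32) int32 {
        var b, j int64 = -1, 0
        for j < int64(numBuckets) {
            b = j
            key = key*2862933555777941757 + 1
            j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
        }
        return int32(b)
    }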

I saw this two decades ago, and after seeing your comment I felt like getting Claude to recreate what I remembered from back then and write a fake paper [1] out of it.

See the MSB check in the implementation.

That said, consistent hashes can split ranges by traffic, not just popularity, so back when I worked on this, the Membase protocol used capacity & traffic load to split the virtual buckets across real machines.

Hot partition rebalancing is hard with a fixed algorithm.

[1] - https://github.com/t3rmin4t0r/magic-partitioning/blob/main/M...


> This trick means that when you go from N to N+1 buckets, the keys that move all land in the new N+1 bucket instead of being rearranged across all of them.

Isn't that how rendezvous hashing (and consistent hashing) already work?



