Instead of doing all this complicated thing, how about simply following a Raft-l...

nvanbenschoten · on July 19, 2022

I don't think this scheme provides the "monotonic reads" property discussed in the blog post. Specifically, it would be possible for a reader to observe a new value from r2 (who received a timely heartbeat), then to later observe an older value from r3 (who received a delayed heartbeat). This would be a violation of linearizability, which mandates that operations appear to take place atomically, regardless of which replica is consulted behind the scenes. This is important because linearizability is compositional, so users of CockroachDB and internal systems within CockroachDB can both use global tables as a building block without needing to design around subtle race conditions.
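
To make the anomaly concrete, here is a minimal sketch in Python. It is a toy model and not CockroachDB's implementation. It assumes followers serve local reads up to a commit index that they learn from leader heartbeats, which is a guess at the proposed scheme since the parent comment is truncated here. The names `Replica`, `heartbeat`, and `run_scenario` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy follower that serves local reads up to its known commit index."""
    name: str
    log: list = field(default_factory=list)  # entries are (index, value) pairs
    commit_index: int = 0                    # highest index known to be committed

    def append(self, index, value):
        # The entry is in the log but not yet readable.
        self.log.append((index, value))

    def heartbeat(self, leader_commit_index):
        # A heartbeat tells the follower how far it may serve reads.
        self.commit_index = max(self.commit_index, leader_commit_index)

    def read(self):
        # Return the newest value at or below the commit index.
        visible = [value for index, value in self.log if index <= self.commit_index]
        return visible[-1] if visible else None


def run_scenario():
    r2, r3 = Replica("r2"), Replica("r3")

    # Index 1 ("old") is replicated and committed on both followers.
    for replica in (r2, r3):
        replica.append(1, "old")
        replica.heartbeat(1)

    # Index 2 ("new") reaches both logs, but only r2 gets a timely heartbeat.
    for replica in (r2, r3):
        replica.append(2, "new")
    r2.heartbeat(2)

    first = r2.read()   # the reader is routed to r2
    second = r3.read()  # later, the same reader is routed to r3

    r3.heartbeat(2)     # the delayed heartbeat arrives too late
    return first, second


first, second = run_scenario()
print(first, second)  # new old
```

The same reader sees "new" and then "old", so its reads go backwards in time even though every replica holds the entry in its log.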

However, for the sake of discussion, this is an interesting point on the design spectrum! A scheme that provides read-your-writes but not monotonic reads is essentially what you would get if you took global tables as described in this blog post, but then never had read-only transactions commit-wait. It's a trade-off we considered early in the design of this work and one that we may consider exposing in the future for select use cases. Here's the relevant section of the original design proposal, if you're interested: https://github.com/cockroachdb/cockroach/blob/master/docs/RF....

brickbrd · on July 21, 2022

Thanks. Yes, this explanation is something I can agree with. It does not provide monotonic reads.

remram · on July 20, 2022

> until that write op has been applied to the log of all the replicas, not just the quorum

That removes all the fault tolerance. What do you do if you never get the acknowledgement from all replicas?

brickbrd · on July 21, 2022

That question doesn't make much sense. If you have a quorum, then eventually repairs will kick in and the write will get replicated everywhere.

So it can tolerate fewer than N/2 failures, just like any other consensus system, because this is basically Raft.
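
The two positions in this exchange can be put side by side with a little arithmetic. The sketch below is an illustration under stated assumptions, not code from any real system. It assumes a majority quorum over N replicas, and that a "wait for all replicas" write can only be acknowledged while every replica is reachable. The function names are hypothetical.

```python
def majority(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def max_tolerated_failures(n):
    """Failures a majority-quorum group of n replicas can survive."""
    return n - majority(n)


def quorum_write_can_commit(n, failed):
    """A quorum write commits as long as a majority is still alive."""
    return n - failed >= majority(n)


def all_replica_write_can_ack(n, failed):
    """Assumed rule: a wait-for-all write needs every replica to respond."""
    return failed == 0


for n in (3, 4, 5):
    print(n, majority(n), max_tolerated_failures(n))

# With 5 replicas and 2 down, the log still makes progress through the
# quorum, but a write that waits for all replicas cannot be acknowledged
# until the failed replicas are repaired or replaced.
print(quorum_write_can_commit(5, 2), all_replica_write_can_ack(5, 2))
```

Under these assumptions the committed log stays available with a minority of replicas down, while the acknowledgement of an individual write is delayed until repairs catch every replica up.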