
_benedict
Joined 684 karma

  1. I can’t speak for the authors, but I have been lucky enough to be collaborating with them on behalf of the Apache Cassandra project, to refine and prove the correctness of the Accord protocol - a derivative of EPaxos we have integrated into the database.

    It would be fantastic if such a project could be pursued for this variant, which has the distinction of being the only “real world” implementation.

    Either way, thank you for the original EPaxos paper - it has been a privilege to convert its intuitions into a practical system.

  2. “Paxos” is a term that can mean many different things, so it’s better not to get too attached to any one meaning, especially across different contexts.

    Multi Paxos is commonly used (especially in industry) as shorthand for multi-decree Paxos (in contrast to single-decree Paxos), but “Paxos” most often refers to the family of protocols, all of which are typically implemented with a leader. It is confusing, of course, because single-decree Paxos is used to implement EPaxos (and its derivatives).

    It’s worth noting also that Lamport is (supposedly) on the record as having intended “Paxos” to refer to the protocol incorporating the leader optimisation.

  3. I remember there being sufficient documentary evidence in the entrance/shop/museum bit to conclude it was most likely created by the very people who “discovered” it, to serve as a tourist attraction.
  4. Thank you for linking the source material; unfortunately, it badly contradicts you. It clearly shows that the _very first_ list of ten suggested search terms contained (pretty heavily) sexualised suggestions.
  5. No, the ruling expressly refers to the list as non-exhaustive, but given the other related references to misconduct (including the use of inappropriate language) it was not reasonable to infer that this example was gross misconduct.
  6. Worth noting, not quite "everyone" does this. Cassandra uses "leaderless" (single decree) paxos, which has some advantages and some disadvantages (for instance, 1RT WAN reads from any region).

    I agree with you that Paxos is simpler than Raft. The problem with Paxos IMO is that Lamport's original paper is impenetrable; lots of later writing is easier to understand, including those that describe more complex protocols. The intuitions are actually pretty straightforward, and transfer to all of the extensions to Paxos (which are not as straightforwardly compatible with Raft).

    Raft may have helped more people get comfortable with distributed consensus, and sped its adoption, but being a sort of dangling branch of the tech tree I wonder if this may have stalled progress beyond it.
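
To make the "intuitions are pretty straightforward" point concrete, here is a minimal in-memory sketch of single-decree Paxos in which any node may act as proposer. It is an illustration only, not Cassandra's implementation: the names `Acceptor` and `propose` are made up for this example, there is no networking or failure handling, and ballots are assumed to be unique per proposer.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                           # highest ballot promised so far
    accepted: Optional[Tuple[int, str]] = None   # (ballot, value) last accepted

    def prepare(self, ballot: int):
        """Phase 1: promise to ignore anything below `ballot`."""
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted
        return False, None

    def accept(self, ballot: int, value: str) -> bool:
        """Phase 2: accept unless a higher ballot has been promised."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False


def propose(acceptors: List[Acceptor], ballot: int, value: str) -> Optional[str]:
    """Run one round from any node; return the chosen value, or None on failure."""
    quorum = len(acceptors) // 2 + 1
    replies = [a.prepare(ballot) for a in acceptors]
    promises = [prior for ok, prior in replies if ok]
    if len(promises) < quorum:
        return None  # a higher ballot is in flight; retry with a larger one
    # Safety rule: adopt the value accepted under the highest ballot, if any.
    prior_accepts = [p for p in promises if p is not None]
    if prior_accepts:
        value = max(prior_accepts, key=lambda p: p[0])[1]
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= quorum else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, ballot=1, value="a")
second = propose(acceptors, ballot=2, value="b")  # must re-propose "a"
stale = propose(acceptors, ballot=1, value="c")   # rejected: ballot too low
```

The only rule that takes some thought is the adoption step: once a value may have been chosen, later proposers must carry it forward rather than their own.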

  7. Not even a coincidence really, it's a very different kind of system. It's an implementation of Hermes with network layer integration. Hermes is designed with very different goals in mind, specifically within-DC consensus with minimal failures (with the caveat that I am not intimately familiar with it):

    - Every replica must acknowledge a write, which is undesirable in a WAN setting, due to having to wait for replies from the furthest region

    - At most one concurrent "read-modify-write" operation may succeed, so peak throughput is limited by request latency

    - Failure of any replica requires reconfiguration for any request to succeed (equivalent to leader election), so the leaderless property here does not improve tail latencies; indeed, they are likely harmed by exposing your workload to more required reconfigurations

    Cassandra is designed for multiple (usually quite far apart) DC deployments that want to maximise availability and minimise latency, and where failure is expected. Here a quorum system is typically preferable for request latency.
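
A rough sketch of the latency argument above, under stated assumptions: the round-trip times are hypothetical numbers for a five-replica, three-region deployment, and the function names are made up for illustration. Waiting for every replica costs the slowest round trip, while a majority quorum costs only the round trip to the quorum-th fastest replica. If only one read-modify-write can complete at a time, throughput for those operations is bounded by the reciprocal of that latency.

```python
from typing import List


def write_latency_all(rtts_ms: List[float]) -> float:
    """Every replica must acknowledge: wait for the slowest one."""
    return max(rtts_ms)


def write_latency_quorum(rtts_ms: List[float]) -> float:
    """A majority must acknowledge: wait for the quorum-th fastest reply."""
    quorum = len(rtts_ms) // 2 + 1
    return sorted(rtts_ms)[quorum - 1]


def peak_serial_ops_per_sec(latency_ms: float) -> float:
    """Upper bound when operations must complete one at a time."""
    return 1000.0 / latency_ms


# Hypothetical RTTs from a coordinator: two local replicas, two in a nearby
# region, and one in a distant region.
rtts = [1.0, 1.0, 40.0, 40.0, 150.0]

all_ms = write_latency_all(rtts)        # dominated by the furthest region
quorum_ms = write_latency_quorum(rtts)  # third-fastest of five replicas
print(all_ms, quorum_ms)
print(peak_serial_ops_per_sec(all_ms), peak_serial_ops_per_sec(quorum_ms))
```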

  8. Revolutionary may be an overstatement, it just affords different system characteristics. There's plenty of literature on the topic though, starting generally with EPaxos [1]. The protocol that we are developing for Apache Cassandra is called Accord [2], and forms the basis of our new distributed transaction feature [3]. I will note that the whitepaper linked in [3] is a bit out of date, and there was a bug in the protocol specification at that time. We hope to publish an updated paper in a proper venue in the near future.

    [1] https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf
    [2] https://github.com/apache/cassandra-accord
    [3] https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-15...

  9. Do you anywhere elaborate what you mean by leaderless, and how this affects the semantics and guarantees you offer?

    So far as I understand, both Kafka and Pulsar use (leader-based) consensus protocols to deliver some of their features and guarantees, so to match these you must either have developed a leaderless consensus protocol, or modified the guarantees you offer, or else still rely on a leader-based consensus protocol?

    From one of your other answers, you mention you rely on Apache BookKeeper, which appears to be leader-based?

    I ask because I am aware of only one industry leaderless consensus protocol under development (and I am working on it), and it is always fun to hear about related work.

  10. > If you really want to get silly, x2iedn.32xl is 128 vCPU / 4 TiB RAM, and you get 3.8 TiB of local NVMe

    This doesn't affect availability - except insofar as unavailability might be caused by insufficient capacity, which is not the typical definition.

    > Depending on your availability SLOs, of course

    Yes, exactly. Which is the point the GP was making. You generally make the trade-off in question not for performance, but because you have SLOs demanding higher availability. If you do not have these SLOs, then of course you don't want to make that trade-off.

  11. I think you may have misunderstood the GP and are perhaps misusing terminology. You cannot meaningfully scale vertically to improve write availability, and if you care about availability a single machine (and often a primary/secondary setup) is insufficient.

    Even if you only care about scaling reads, eventually the 1:N write:read replica ratio will become too costly to maintain, and long before you reach that point you likely sacrifice real-time isolation guarantees to maintain your write availability and throughput.

  12. This doesn’t seem to provide higher write availability, and if the read replicas are consistent with the write replica this design must surely degrade write availability as it improves read availability, since the write replica must update all the read replicas.

    This also doesn’t appear to describe a higher durability design at all by normal definitions (in the context of databases at least) if it’s async…?
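
A back-of-the-envelope sketch of that trade-off, assuming each node is independently available with probability `p` (a simplification; real failures are correlated) and using made-up function names. If the write replica must synchronously update every read replica, a write needs all nodes up, whereas a read needs only one.

```python
import math


def write_availability_sync_all(p: float, read_replicas: int) -> float:
    """Writes succeed only if the write replica and every read replica are up."""
    return p ** (read_replicas + 1)


def read_availability_any(p: float, read_replicas: int) -> float:
    """Reads succeed if at least one of the nodes is up."""
    return 1.0 - (1.0 - p) ** (read_replicas + 1)


p = 0.99  # hypothetical per-node availability
for n in (0, 2, 5):
    print(n, write_availability_sync_all(p, n), read_availability_any(p, n))
```

With these assumptions, every added consistent read replica raises read availability and lowers write availability.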

  13. Fair enough!
  14. Ha, as it happens there's documentary evidence online from Diego himself, that he was not influenced by VSR.

    https://groups.google.com/g/raft-dev/c/cBNLTZT2q8o

  15. A good example to illustrate this perhaps is Babbage. He invented the computer first, but nobody using computers today was influenced by him, impressive though his achievements were! Nor would we say that computers are a kind of Babbage “analytical engine”. We say they are a kind of computer.
  16. I don’t interpret those words that way. I see that as a recognition of the VSR paper, as had been recently highlighted in VSR revisited at the time of publication. I guess you would have to ask the author if VSR had actually influenced his work. It’s certainly possible, but it’s not the inference I would make from that snippet.

    The paper references Paxos something like 100 times, versus 3 for VSR. It defines itself as a more understandable alternative to Paxos, so it was certainly influenced both by the existence and relevance of Paxos, and also in opposition to its apparent difficulty.

  17. I know it was published first, we’ve talked about this before :)

    But I’m not sure that what was published first decides what’s a variant of what. I would say that, given the breadth of research into variants of Paxos and the ways it can be modified, it is most meaningful today to say they’re all variants of Paxos.

    VSR, having had little to no research or industry application until recently, has a pretty weak claim. It does not appear to have influenced either Paxos or Raft. Raft was influenced by Paxos, and even VSR revisited discusses it in relation to these protocols.

  18. TigerBeetle uses VSR, which is basically a variant of MultiPaxos/Raft.
  19. I’m not sure how a written constitution that is anyway interpreted by “the prevailing political elite class” is functionally much different?
  20. I think the problem is that exploit can mean multiple things, and it’s obviously true that companies want to exploit everything in the non-pejorative sense.

    The problem is transforming the word’s meaning in the next sentence to imply they use the resource/personnel unfairly, which is demonstrably untrue as you point out (though it’s certainly the case that for some companies both meanings apply)

