This post makes a great case for how universal logs are in data systems. It was strange to me that there was no log-as-a-service with the qualities that make it suitable for building higher-level systems like durable execution: conditional appends (as called out by the post!), support for very large numbers of logs, high throughput with strict ordering, and generally a simple serverless experience like object storage. This led to https://s2.dev/ which is now available in preview.
It was interesting to learn how Restate links events for a key, with key-level logical logs multiplexed over partitioned physical logs. I imagine this is implemented with a leader per physical log, so you can consistently maintain an index. A log service supporting conditional appends allows such a leader to act like the log is local to it, despite offering replicated durability.
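To make the fencing idea concrete, here is a minimal sketch with a toy in-memory log (hypothetical, not the actual S2 API): an append succeeds only if the caller's expected tail sequence matches, so a deposed leader whose view of the log is stale gets its writes rejected.

```python
class SharedLog:
    """Toy shared log supporting conditional appends (illustration only)."""

    def __init__(self):
        self.records = []

    def append_if(self, expected_seq, record):
        # Succeed only if the log tail is where the caller last saw it.
        if len(self.records) != expected_seq:
            return None  # lost the race: someone else appended first
        self.records.append(record)
        return expected_seq  # sequence number assigned to this record

log = SharedLog()
assert log.append_if(0, "leader-1 write") == 0
# A stale leader that still believes the tail is at 0 is fenced out:
assert log.append_if(0, "stale leader write") is None
assert log.append_if(1, "leader-1 next write") == 1
```

Because failed appends are rejected by the log itself, the leader can maintain its key index purely from appends it knows succeeded, as if the log were local.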
Leadership can be an important optimization for most systems, but shared logs also make multi-writer systems pretty easy. We blogged about this pattern: https://s2.dev/blog/kv-store
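A toy version of that multi-writer pattern (hypothetical shapes, not the API from the blog post): any number of writers replay the log to materialize state, and a conditional append turns a read-check-write into optimistic concurrency control, with no leader needed.

```python
class SharedLog:
    """Toy shared log with conditional appends (illustration only)."""

    def __init__(self):
        self.records = []

    def append_if(self, expected_seq, record):
        if len(self.records) != expected_seq:
            return None  # another writer appended first
        self.records.append(record)
        return expected_seq

def kv_get(log, key):
    """Materialize the latest value for a key by replaying the log."""
    value = None
    for k, v in log.records:
        if k == key:
            value = v
    return value

def kv_cas(log, key, expected, new):
    """Compare-and-swap without a leader: any writer may call this.
    The conditional append makes read-check-write atomic; if another
    writer sneaks in, our append fails and we re-read and retry."""
    while True:
        tail = len(log.records)
        if kv_get(log, key) != expected:
            return False  # CAS precondition no longer holds
        if log.append_if(tail, (key, new)) is not None:
            return True

log = SharedLog()
assert kv_cas(log, "balance", None, 100)   # create
assert kv_cas(log, "balance", 100, 250)    # update succeeds
assert not kv_cas(log, "balance", 100, 0)  # stale expectation fails
```

The log's total order is what resolves conflicts between concurrent writers; everything else is just replay.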
> It was strange to me that there was no log-as-service with the qualities that make it suitable for building higher-level systems like durable execution
Indeed. We are trying to democratize that secret sauce. Since it is backed by object storage, the latencies are not what AWS enjoys with its internal Journal service, but we intend to get there with an NVMe-based tier later. In the meantime, there is a large existing market for event streaming where a "truly serverless" (https://erikbern.com/2021/04/19/software-infrastructure-2.0-...) API has been missing.
Very exciting. This is the future. I am working on a very similar concept. Every database is a log at its core, yet the log, the highest-performance part of the system, is buried behind many layers of much lower-performing cruft. Edge persistence with log-per-user application patterns opens up so many possibilities.
I just want a recognized standard format for write-ahead logs. Start with replicating data between OLTP and OLAP databases with minimal glue code, then move other systems like Kafka to a similar structure, then new things we haven't thought of yet.
The structure of the write-ahead log needs to differ between systems. For Postgres, the WAL is a record of physical writes with new blocks. It can't be used without knowing the Postgres disk format, and I don't think it can be used to reconstruct logical changes.
Using a standard format, converting everything into logical data, would be significantly slower. It is important that the WAL be fast, because it is the bottleneck for transactions. It would make more sense to have a separate change streaming service.
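A change streaming service would carry logical events rather than engine-specific disk blocks. A sketch of what such a portable event might look like (hypothetical shape, loosely inspired by what logical decoding plugins emit, not any actual standard):

```python
import json

def logical_change(op, table, key, new_row=None):
    """Encode a logical change event, decoupled from any storage engine's
    page format. Physical WAL records, by contrast, are opaque bytes tied
    to the engine's on-disk layout."""
    return json.dumps({
        "op": op,        # "insert" | "update" | "delete"
        "table": table,
        "key": key,
        "row": new_row,  # full new row image; None for deletes
    })

event = logical_change("update", "accounts", 42, {"id": 42, "balance": 99})
decoded = json.loads(event)
assert decoded["op"] == "update"
```

The conversion cost lives in the streaming service, keeping the hot WAL path in the engine's native format.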
This is why we didn't actually call it logs-as-a-service, but streams :P I meant to refer to the log abstraction this post talks about; see the links therein. Observability events are but one kind of data you may want as a stream of durable records.