The database only supports CRUD. So while the CDC stream is the truth, it's very low level. We build higher-level event types (as in event sourcing) for the same reason we build any higher-level abstraction: it gives us a language in which to talk about business rules. Kleppmann makes this point in his book and it was something of an aha moment for me.
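To make the gap concrete, here's a minimal sketch (all names and shapes hypothetical): the CDC stream tells you a row changed, while the domain event says what happened in business terms.

```python
# Sketch of the abstraction gap between CDC and domain events
# (hypothetical shapes, not any particular CDC tool's format).

# What the CDC stream gives us: a low-level row change.
cdc_record = {
    "op": "update",
    "table": "accounts",
    "key": {"id": 42},
    "before": {"balance": 100},
    "after": {"balance": 75},
}

# What the application actually means: a business-level event.
domain_event = {
    "type": "PaymentMade",
    "account_id": 42,
    "amount": 25,
    "currency": "GBP",
}
```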
It is a bit circular: CRUD's elements (create, read, update, and delete) were chosen to represent the core features of a persistence layer.
https://www.postgresql.org/docs/current/transaction-iso.html
I'm pretty sure journalled filesystems recycle the journal. There are log-structured filesystems, but they aren't used much beyond low-level flash storage.
If a transaction log is replayed, then an identical set of relations will be obtained. Ergo, the log is the prime form of the database.
It’s that simple.
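A minimal sketch of that claim, assuming a toy log of insert/update/delete entries keyed by primary key (real WALs record physical or logical changes, but the principle is the same):

```python
# Replaying a toy transaction log rebuilds the relation exactly.

def replay(log):
    """Apply log entries in order to reconstruct the table as a dict."""
    table = {}
    for entry in log:
        op, key = entry["op"], entry["key"]
        if op in ("insert", "update"):
            table[key] = entry["row"]
        elif op == "delete":
            table.pop(key, None)
    return table

log = [
    {"op": "insert", "key": 1, "row": {"name": "alice"}},
    {"op": "insert", "key": 2, "row": {"name": "bob"}},
    {"op": "update", "key": 1, "row": {"name": "alicia"}},
    {"op": "delete", "key": 2},
]

# The replayed state is fully determined by the log.
assert replay(log) == {1: {"name": "alicia"}}
```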
This is time-consuming, so we optimized it by creating "base versions" every month: a client only needs to download the latest base version and then apply the deltas since then...
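Roughly the shape of that optimisation, as a sketch (the structures here are hypothetical): start from the most recent base snapshot and fold in only the deltas recorded after it.

```python
# Sketch of base-version-plus-deltas sync (hypothetical structures).

def apply_delta(state, delta):
    """Apply one delta (upserts and deletions) to a snapshot dict."""
    state = dict(state)
    state.update(delta.get("upserts", {}))
    for key in delta.get("deletes", []):
        state.pop(key, None)
    return state

def sync(base_version, deltas_since_base):
    """Client-side reconstruction: latest base, then each delta in order."""
    state = base_version
    for delta in deltas_since_base:
        state = apply_delta(state, delta)
    return state

base = {"doc-1": "v3", "doc-2": "v1"}
deltas = [
    {"upserts": {"doc-3": "v1"}},
    {"upserts": {"doc-1": "v4"}, "deletes": ["doc-2"]},
]

assert sync(base, deltas) == {"doc-1": "v4", "doc-3": "v1"}
```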
Forensic accounting, incidentally, is what happens when something has gone badly wrong and outside accountants have to go back through the old ledgers, and maybe the old invoices and payments, and reconstruct the books. FTX had to do that after the bankruptcy to find out where the money went and where it was supposed to go.
Often written to tape, for obvious reasons.
It’s curious that over those projections, we then build event stores for CQRS/ES systems, ledgers, etc., with their own projections mediated by application code.
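In sketch form (the event shapes are hypothetical), that pattern is an append-only event store with application code folding the stream into a read-side projection:

```python
# Minimal event-store-plus-projection sketch (hypothetical event shapes).

events = []  # append-only store

def append(event):
    events.append(event)

def project_balances(events):
    """Fold the event stream into a read model: account -> balance."""
    balances = {}
    for e in events:
        if e["type"] == "Deposited":
            balances[e["account"]] = balances.get(e["account"], 0) + e["amount"]
        elif e["type"] == "Withdrawn":
            balances[e["account"]] = balances.get(e["account"], 0) - e["amount"]
    return balances

append({"type": "Deposited", "account": "a1", "amount": 100})
append({"type": "Withdrawn", "account": "a1", "amount": 30})

assert project_balances(events) == {"a1": 70}
```

And that projection typically lands back in a relational table, whose changes flow out through the database's own log again.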
But look underneath too. The journalled filesystem on which the database resides also has a log representation, and under that, a modern SSD uses an adaptive log structure to balance writes across flash blocks.
It’s been a long time since we wrote an application event stream linearly straight to media, and although I appreciate the separate concerns that each of these layers addresses, I’d probably struggle to justify them all from first principles to even a slightly more Socratic version of myself.