Are you suggesting everybody should use event sourcing instead of using transactions? Because the general consensus on event sourcing is "don't use it unless you absolutely have to."
If you maintained financial records the way that database example code says you should, you'd be a terrible accountant.
Standard practice for accountancy is to record each transaction twice: once for the debited account and again for the credited account. You even create accounts for expenses and income, which are explicitly allowed to go negative (otherwise you couldn't record income). This is known as double-entry accounting, and it's equivalent to what you deride as "event sourcing".
To put it back into the language of databases, double-entry accounting is the third normal form of accounting. You wouldn't store a "sum total of blog posts on a website" row in your database schema now would you? Why would you have a "sum total of transactions" or single entries for your transactions that only show that money came in but not where it went?
I think you are objecting to a poorly chosen example and that is okay. It _is_ a poorly chosen example.
The exact pedantry that you are making is itself poorly chosen: in double-entry bookkeeping you would say that you record the transaction proper either once or n + 1 times, depending on how pedantic you are being. (Double-entry bookkeeping discovered the feature that in filesystems is called “journaling”: you always write first to the transaction log, which is the “1” if you answer 1, then you update the n account ledger pages involved in the transaction with their individual deltas plus the transaction log ID.) A real bank also allows accounts to go negative all the time, which is another criticism of the example.
The “event sourcing” is actually just the transaction log, which is not double-entry, which is why I say the pedantry is poorly chosen. The ledger, in combination with “closing the books for the year,” actually instantiates a generic construction for persisting data structures, see the lectures at [1], where you take periodic snapshots of a value to limit your worst-case reconstruction time while storing a tractable buffer of deltas. The point of the ledger, then, is to denormalize the data so that it is efficient to answer the question “how did this client's balance change over time?” without seeking over the whole transaction log.
[1] https://courses.csail.mit.edu/6.851/spring21/lectures/
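To make the journal-versus-ledger split concrete, here is a minimal Postgres-flavored sketch; the table and column names are illustrative, not taken from the article or the thread:

-- Journal: the append-only transaction log, one row per business transaction.
create table journal (
    txn_id      bigserial primary key,
    occurred_at timestamptz not null default now(),
    description text
);

-- Ledger postings: one row per affected account, referencing the journal entry.
-- By convention the deltas for a given txn_id sum to zero (not enforced here).
create table postings (
    txn_id     bigint  not null references journal (txn_id),
    account_id bigint  not null,
    delta      numeric not null,
    primary key (txn_id, account_id)
);

-- A transfer of 100 from account 1 to account 2 is one journal row plus two postings.
with t as (
    insert into journal (description)
    values ('transfer 100 from account 1 to account 2')
    returning txn_id
)
insert into postings (txn_id, account_id, delta)
select txn_id, 1, -100 from t
union all
select txn_id, 2,  100 from t;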
Unfortunately financial systems are the one example everyone uses for transactions, and also the most obvious place where event sourcing is close to mandatory.
Event sourcing as a system-wide architecture is very difficult to pull off well, but it works perfectly fine when applied to a narrow use case.
It can be as simple as a Postgres append only table keyed by an (id, version) tuple with a JSON column for the change event. You don’t need to pull in the whole enterprise pattern with aggregates, projections, etc to get the essence of event sourcing.
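A minimal sketch of that shape, with illustrative names; the (id, version) primary key is what keeps concurrent appends safe:

create table account_events (
    id      bigint  not null,   -- stream/aggregate id, e.g. the account
    version integer not null,   -- monotonically increasing per id
    event   jsonb   not null,   -- the change event itself
    primary key (id, version)   -- duplicate versions are rejected, so two writers
                                -- can't both append version N for the same id
);

-- Append the next event; fails with a unique violation if someone else
-- appended version 5 for this account first.
insert into account_events (id, version, event)
values (42, 5, '{"type": "deposited", "amount": 100}');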
I suspect the parent suggested using something like LSM trees as the storage layer. They are append-only. Event sourcing usually operates at much higher abstraction levels.
(P.S.: personally, I don't believe in event sourcing at all. It's a great idea that just doesn't seem to work in practice.)
I like this model, but I think the goal was to enforce a constraint (non-negative balance) and fail the operation, reporting that result to the requester. How do you do that in the monotonic style?
In theory, in event sourcing, a reader would enumerate every proposed transaction, in order, and skip any invalid ones (or ones invalidated by a separate invalidation stream). This is expensive, so a materialized Balances table might be created and maintained - not by the database per se, but by a dedicated stream ingesting process. The user might also asynchronously see a set of failed operations - synchronous is harder, but doable if you block on polling for the stream processing result, or have an “await key” primitive. Or just wait a second or two to refresh the page.
There’s a lot of machinery involved, and a subtle logic change or error could invalidate balances for your entire client base. The transactional approach, with event sourcing kept as an audit trail that humans use to verify correctness, is often a better path.
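For concreteness, a minimal sketch of the materialized Balances idea in Postgres SQL, assuming an append-only account_events table like the one sketched earlier and treating every event as a signed amount; a real system would have a dedicated ingesting process folding from a checkpoint rather than refolding one account at a time:

-- Projection maintained by the stream-ingesting process, not by the database itself.
create table balances (
    account_id   bigint primary key,
    balance      numeric not null,
    last_version integer not null   -- checkpoint: last event folded in
);

-- Refold account 42 from its events (simplified: every event carries a signed amount).
insert into balances (account_id, balance, last_version)
select id,
       sum((event->>'amount')::numeric),
       max(version)
from account_events
where id = 42
group by id
on conflict (account_id)
do update set balance      = excluded.balance,
              last_version = excluded.last_version;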
This episode of Signals & Threads discusses a system (an exchange for OTC securities) which is both event sourced and which has constraints. It works in the same way you've described modulo some implementation details. https://signalsandthreads.com/state-machine-replication-and-...
Most of us are never going to work with a system with the scale and SLA to necessitate this approach, but it's an enjoyable interview nonetheless.
A separate invalidation stream sounds like a really bad idea and is asking for race conditions. Just put the invalidation events on the same stream with the information necessary to revert the original transaction (such as the balance to restore in addition to stating the original transaction is invalidated); it's okay to denormalize on the same stream in these situations.
This is not just an ignorant idea but a stupid one. Just because you can't figure out the right way to do it in vanilla SQL doesn't mean it can't be done [1]. This strawman is just a different form of the NoSQL strawman [2].
Edit: Ah, looks like I've struck a nerve with the event sourcing Datomic fanboys. Enjoy using your golden hammer and turning every problem into a nail!
By the way, if you ever make your way into my organization and waste my time with this nonsense, I WILL fire you. Nobody should tolerate your desire to solve already-solved problems with poorer-quality solutions you have selected for their novelty. At that point, I'll kindly ask you to explain to me what a write-ahead log is and what it does, because if you can't, I'll happily explain it for you.
[1] https://brandur.org/idempotency-keys
[2] https://www.youtube.com/watch?v=b2F-DItXtZs
select balance from accounts where name = 'A' for update;
> Now, other transactions won’t be able to read this row until our transaction is committed or rolled back (for update can only be used inside a transaction).
This is true for SERIALIZABLE isolation. Depending on the configured isolation level (e.g. READ COMMITTED), other concurrent transactions can read (but not update) the locked row, which can lead to write skew.
ETA: I misread, I thought you were referring to the table lock section, but you were referring to the row lock section. You're correct, FOR UPDATE locks do not prevent concurrent reads, which could cause a concurrency problem if you're not very careful (and Murphy's law applies).
---
The article is correct. That section contains this snippet:
lock table accounts;
The default lock mode for LOCK is ACCESS EXCLUSIVE, which will prevent concurrent reads: https://www.postgresql.org/docs/current/explicit-locking.htm...
> ACCESS EXCLUSIVE (AccessExclusiveLock)
> Conflicts with locks of all modes [...]. This mode guarantees that the holder is the only transaction accessing the table in any way.
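For reference, the pattern the article's snippet implies looks roughly like this sketch; the balance check and error handling are elided:

begin;
lock table accounts;   -- ACCESS EXCLUSIVE by default: even plain SELECTs on accounts now block
update accounts set balance = balance - 100 where name = 'A';
update accounts set balance = balance + 100 where name = 'B';
commit;                -- blocked readers proceed once the lock is released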
Just so everyone's in agreement. Yes, FOR UPDATE blocks concurrent SELECT...FOR UPDATE (as well as SELECT...FOR SHARE/IN SHARE MODE). What the article said though was reads:
> Now, other transactions won’t be able to read this row until our transaction is committed or rolled back (for update can only be used inside a transaction).
Finally, if transaction isolation is SERIALIZABLE, then a row lock can block normal SELECT reads. This is a consequence of SERIALIZABLE isolation preventing write skew. Transactions can be run in parallel, but there has to be some ordering of the transactions that produces the same result as if each transaction was run one at a time.
https://www.postgresql.org/docs/current/explicit-locking.htm...
FOR UPDATE locks the row against other locking transactions (including another SELECT ... FOR UPDATE).
The weaker form, which only blocks updates and deletes, is called FOR SHARE.
Transaction isolation levels do not make a difference here.
So the article is correct that SELECT FOR UPDATE ensures another concurrent SELECT FOR UPDATE transaction never acquires the same row (it will block), though nothing prevents other non-locking selects from querying this row concurrently.
You can think of it like ordinary locks: only the threads that explicitly take the same lock get the mutual exclusion guarantees. If another thread does not acquire the lock and tries to access the critical section, it will be able to do so.
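A small two-session illustration of that behavior in Postgres (session labels are just comments; default READ COMMITTED isolation):

-- Session 1
begin;
select balance from accounts where name = 'A' for update;   -- row lock acquired

-- Session 2
select balance from accounts where name = 'A';               -- plain read: returns immediately
select balance from accounts where name = 'A' for update;    -- blocks until session 1 ends
update accounts set balance = 0 where name = 'A';            -- would also block

-- Session 1
commit;                                                       -- session 2's blocked statements proceed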
If you have external side effects and your transaction fails, undo whatever you did in the external system. Don't just roll back your database while forgetting to "request your money back" from that system.
And remember, don't use transactions when you cannot compensate for the failure. For example, if you transfer money to an external bank, you probably can't just "undo" the sending of money. Your bank is on the hook for that money, so if, for example, the account you withdrew money from would now be negative ... you probably want it to still be negative if a failure happens ... instead of sitting at zero after rolling back the transaction. Otherwise, that other bank is going to come asking for the money you said you sent them.
I'd caution against holding a transaction open for the duration of an API call to a foreign system. API calls can take a very long time, especially if you are retrying on timeouts. Long-lived transactions are dangerous because they prevent you from releasing locks, which interferes with concurrent transactions and with vacuuming. I've seen it bring down production before.
Consider, for example, the following scenario: you are experiencing higher than normal load, leading you to send a lot of requests to this other system. Their scaling is flawed, and your traffic actually brings them down. This leads to timeouts and long transaction times. Now you are holding lots of locks for a very long time, and your database is under heavy load - you're in danger of overloading your database and going down.
I'd suggest committing a record representing your intention to integrate with this other system, executing your API call, and then updating your state (assuming the call is successful). (For the example above, circuit breakers would also be an important mitigation.)
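A sketch of that sequencing, with an illustrative external_calls table; the names and statuses are assumptions, not from the comment above:

create table external_calls (
    id     text primary key,
    kind   text not null,
    status text not null check (status in ('pending', 'succeeded', 'failed'))
);

-- Transaction 1: record the intent and commit, releasing all locks.
begin;
insert into external_calls (id, kind, status) values ('call-123', 'refund', 'pending');
commit;

-- Outside any transaction: make the (possibly slow, retried) API call to the
-- foreign system, ideally passing 'call-123' as an idempotency key.

-- Transaction 2: record the outcome (and apply any resulting state changes).
begin;
update external_calls set status = 'succeeded' where id = 'call-123';
commit;

-- A scheduled sweeper can retry or reconcile rows stuck in 'pending'.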
This is a major pain point when dealing with Stripe. Say you want to write a record in your local db (to record a cancellation) and then call the Stripe API to cancel the subscription: if the API call fails, you can roll back your db changes by failing the transaction.
If the API call succeeds, but your transaction fails to write to the db and rolls back, then what?
You then have to rely on the Stripe webhooks to "sync" your local db state with what Stripe has. But don't forget that Stripe webhooks have no guaranteed delivery order, so you should ignore the event payload, fetch the current state of the Stripe object the event is associated with, and make sure the handling of these events is idempotent.
It's not wise to rely solely on Stripe being the source of truth, but there is no way to really implement two-phase commit with external services like this. Maybe you fire off an event into a pubsub or queue to handle the actual writing to your db, but what if the webhook gets to your service first?
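One common way to get the idempotency half of that, sketched in SQL; the processed_webhooks and subscriptions tables and the Stripe ids are illustrative assumptions:

create table processed_webhooks (
    stripe_event_id text primary key,
    processed_at    timestamptz not null default now()
);

-- Claim the event and apply the change only if this is the first time we've seen it.
with claimed as (
    insert into processed_webhooks (stripe_event_id)
    values ('evt_123')
    on conflict (stripe_event_id) do nothing   -- replayed event: claims nothing
    returning stripe_event_id
)
update subscriptions s
set status = 'canceled'
from claimed                                   -- empty on replay, so no rows are touched
where s.stripe_subscription_id = 'sub_456';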
You record a want-to-cancel state, and then only after Stripe acknowledges the cancellation you transition from want-to-cancel to cancelled. State machines.
Shit is hard.
- active state transitions (webhooks)
- reconciliation (api)
You can do reconciliation without active state transitions, but it usually takes a while (slow and scheduled). However, it's a requirement with webhooks because you don't know if they are in order, delayed, or just broken. Thus, you need to reconcile your state every so often.
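A sketch of that want-to-cancel state machine in SQL, with illustrative table, column, and state names:

-- Local intent, committed before the Stripe call is made.
update subscriptions
set status = 'want_to_cancel'
where id = 7 and status = 'active';

-- After Stripe acknowledges the cancellation (API response, webhook,
-- or a reconciliation pass that fetched the current Stripe object):
update subscriptions
set status = 'canceled'
where id = 7 and status in ('active', 'want_to_cancel');

-- Reconciliation job: find rows stuck in the intermediate state and re-check them.
select id
from subscriptions
where status = 'want_to_cancel'
  and updated_at < now() - interval '1 hour';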
Don't look before you jump, instead look after you jump, because then you can rollback and undo the hurt :)
You can do this with deferred triggers, for example. Here's where a COMMIT trigger would be handy -- something that runs when the client's COMMIT begins executing (though other triggers could still run if the COMMIT trigger were to execute any DML).
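For the non-negative-balance case specifically, a deferred constraint trigger gets close to commit-time behavior in Postgres. A sketch (a plain CHECK (balance >= 0) is simpler if you don't need the deferral):

create or replace function check_balance() returns trigger as $$
begin
    if new.balance < 0 then
        raise exception 'balance of % would go negative', new.name;
    end if;
    return null;   -- return value is ignored for AFTER row triggers
end;
$$ language plpgsql;

create constraint trigger balance_non_negative
after insert or update on accounts
deferrable initially deferred        -- the check runs when COMMIT is issued, not per statement
for each row
execute function check_balance();    -- EXECUTE FUNCTION needs Postgres 11+; older versions use EXECUTE PROCEDURE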
For the row-locking example, wouldn’t it be better to include both A and B in the SELECT, so that the FOR UPDATE can lock both records? Even if using strict serialization, I wonder if FOR UPDATE-ing both rows would help stop (or delay) other conflicting transactions.
To avoid deadlock, you need to ensure that locking happens in the same order in all concurrent transactions. So account A must be locked before account B and never in the opposite order. If account B is the account being debited, and therefore the account that needs a sufficient balance, then account A must still be locked first. If account A is the account to be debited, then explicitly locking account B is not necessary; it will be locked when it's updated.
I think I agree, but the message was hard to parse. My summary would be: always lock A and B in the same order (e.g. smallest account number first) regardless of which is being reduced or increased. In addition, an UPDATE statement counts as a lock, so whichever account you touch second doesn't need a SELECT ... FOR UPDATE.
If transactions 1 and 2 are both running the example and the ordering is something like:
txn1 = db.begin()
# txn1 takes an exclusive row lock on row A here (exclusive rather than a shared read lock because we said FOR UPDATE)
currBalance1 = txn1.query("select balance from accounts where name = 'A' for update")
… <snipped some of the example code> …
txn1.execute("update accounts set balance = balance - $amount where name = 'A'")
# Concurrently txn2 begins; so far txn1 has only locked row A
txn2 = db.begin()
# txn2 takes an exclusive lock on row B
currBalance2 = txn2.query("select balance from accounts where name = 'B' for update")
# This is the first point where txn1 needs a lock on row B, so it blocks here; when txn2 later tries to lock row A, the two transactions are deadlocked
txn1.execute("update accounts set balance = balance + $amount where name = 'B'")
The solution is what the parent suggested: take the exclusive locks on rows A and B in a single statement. If transaction 2 is locking B and C, you can't afford two separate FOR UPDATE calls, because B might get locked just after you've locked A and before you lock B:
select balance from accounts where name = 'A' or name = 'B' for update
When it comes to row locks, there isn't a same time. Which row gets locked first if left unspecified can lead to deadlocks. All I was suggesting was:
select balance from accounts where name = 'A' for update
select balance from accounts where name = 'B' for update
where the two statements are run with smaller 'name' first, or
select balance from accounts where name = 'A' or name = 'B' order by name for update
Of course, I'd use the primary key (likely an id rather than name).
By locking the accounts in smallest->largest name (or id), you can afford to do 2 separate for update calls. Even when done in a single statement specify the canonical ordering of locks to avoid deadlocks. For example, 3 transactions locking (A, B), (B, C), (C, A):
tx1: lock A, lock B
tx2: lock B, lock C
tx3: lock A, lock C
This is the lock order regardless of which of A/B/C is being debited or credited. As a final detail, make sure the ORDER BY you specify is on a unique index, or else there could be range locks on the index and you won't strictly be using row locks.
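Putting that together, a transfer that locks both rows in canonical order before updating either might look like this sketch (using name as the article does; in practice you'd order by the primary key):

begin;
-- Lock both rows in one statement, in a canonical order, regardless of
-- which account is being debited.
select name, balance
from accounts
where name in ('A', 'B')
order by name
for update;

-- Check the debited account's balance in application code, then:
update accounts set balance = balance - 100 where name = 'A';
update accounts set balance = balance + 100 where name = 'B';
commit;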
In fairness, it's a great example. Moving money between accounts requires updating the state of at least two records. Moreover, nearly everyone working with databases has one or at least understands how a bank account should work. And as far as relevant examples go, I can't think of anything else where the correctness of the operation is quite as critical.
It's a deceptive example because banks are distributed systems (in that they interoperate with other banks/financial systems, and often have distributed internal database systems across multiple products too). Regular db transactions are insufficient for ensuring consistency across a distributed system, primarily because one of the systems may fail to commit their transaction after the other has succeeded. Even if there is only a single database, protecting data against concurrency issues is still non-trivial even with transactions.
I mean, if your argument is that banks don't use the primitives of a standard RDBMS because their needs aren't serviced by a standard RDBMS, of course it's a bad example because you're talking about how to do something in SQL which isn't done in SQL. But that's not a "bank accounts" problem, that's a "how banks with a non trivial number of systems work" problem. If you were implementing a system for managing bank accounts in a single RDBMS, regular transactions would work.
The problem is that the industry standard of banking is eventually consistent. Whilst having atomic transactions is necessary, it's not sufficient: Using banks as a teaching metaphor for RDBMS systems is like using animals as a teaching metaphor for polymorphism. From a naive point of view it works, but it does more harm than good because students walk away thinking that they have a solution to a problem that isn't actually sufficient for it.
There is no one true way that bank accounts work. I work for a big bank and I've come to accept that there are valid cases for UPDATING and even DELETING transactions in your Core Banking system. It is true that your general ledger is append-only, but that doesn't apply to what happens before you close your books.
I don’t understand the title of the article. The article essentially talks about what PostgreSQL's Repeatable Read isolation level cannot do and how to remedy that with Serializable isolation. But both need locks at some level, and trade off performance for weaker/stronger consistency guarantees.
It's not a great article, as some of the clarifying comments and discussions are surfacing. It is an important topic, though. Often, reading the database documentation directly, particularly the Transaction Isolation and Locking sections, will be more precise, more complete, and specific to the database/version you are using.
Why did the author introduce the weird broken "move a negative amount of money" behavior and then never follow up on its brokenness? That was too distracting.
Just append an update to an immutable data store, and program the logic to unfold from the latest records.
See Datomic: https://docs.datomic.com/pro/getting-started/brief-overview....
What engineers would call a "lock" is referred to as a "latch" in DBMS internals terminology.
A "lock", then, is a higher-level abstraction that functions as a logical lock rather than a fine-grained physical one.
Locks are generally used to control access to database objects like rows, pages, tables, etc, to ensure data integrity and transactional consistency.
"Latches" are low-level synchronization primitives used to protect access to internal DB structures.
The balance would still be wrong for the whole range where balance > 0. Two (positive) deposits would not store the correct ending balance.
I just wish it were applicable to something else.