Transactions and ACID
Transactions group operations together so they either all succeed or all fail.
ACID Properties
| Property | Definition | Without It |
|---|---|---|
| Atomicity | All or nothing | Partial updates on crash |
| Consistency | Data stays valid | Broken invariants |
| Isolation | Concurrent transactions do not interfere | Race conditions |
| Durability | Committed data survives crashes | Data loss |
Atomicity
If a transaction fails halfway through, the database rolls back everything. Consider a money transfer that debits one account and credits another. If the server crashes after the debit but before the credit, the entire transaction is rolled back. Without atomicity, the first account would lose money that the second account never receives.
Consistency
The database moves from one valid state to another. Application logic and database constraints define what "valid" means.
Isolation
Two transactions running concurrently do not interfere with each other. Different isolation levels provide different guarantees.
Durability
Once a commit confirmation is received, that data is safe. Databases achieve this with write-ahead logging (WAL): writes go to a log on disk before the actual data files. On crash recovery, the database replays the log.
Isolation Levels
| Level | Dirty Reads | Non-Repeatable Reads | Phantom Reads |
|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible |
| Read Committed | Prevented | Possible | Possible |
| Repeatable Read | Prevented | Prevented | Possible |
| Serializable | Prevented | Prevented | Prevented |
Anomaly Types
Dirty Read: Reading data from another transaction that has not committed yet. If that transaction rolls back, the read data never existed. For example, Transaction 1 updates a balance to 100 but has not committed. Transaction 2 reads that balance and sees 100. Then Transaction 1 rolls back, meaning that value of 100 never actually existed. Transaction 2 now has invalid data.
Non-Repeatable Read: Reading the same row twice yields different values because another transaction modified it. Transaction 1 reads a balance of 100. Transaction 2 then updates that balance to 200 and commits. When Transaction 1 reads the same balance again, it now sees 200 instead of 100.
Phantom Read: Running the same query twice yields different rows because another transaction inserted or deleted rows. Transaction 1 counts rows where age exceeds 20 and gets 5. Transaction 2 inserts a new row with age 25 and commits. When Transaction 1 runs the same count again, it now returns 6 due to the phantom row.
Isolation Level Selection
| Level | Performance | Safety | Default In |
|---|---|---|---|
| Read Uncommitted | Fastest | Lowest | Not recommended |
| Read Committed | Good | Adequate for most applications | PostgreSQL, Oracle |
| Repeatable Read | Moderate | Good | MySQL |
| Serializable | Slowest | Highest | Use when required |
Most applications work correctly with Read Committed. Use Serializable for financial transactions or scenarios where race conditions would be critical.
Isolation Implementation
Two-Phase Locking (2PL)
Traditional approach using locks.
- Shared lock: Multiple transactions can read, but no writes allowed
- Exclusive lock: Only one transaction can access
Two phases:
- Growing: Acquire all required locks
- Shrinking: Release locks (only after commit or abort)
Deadlocks can occur when transactions wait for each other. The database must detect and terminate one transaction.
Snapshot Isolation (MVCC)
Modern approach using multiple versions of each row.
- Writers do not block readers
- Readers do not block writers
- Each transaction sees a consistent snapshot from when it started
PostgreSQL, Oracle, and MySQL InnoDB use MVCC.
Serializable Snapshot Isolation (SSI)
Combines benefits of both approaches:
- Allow transactions to run concurrently
- Detect conflicts at commit time
- Abort conflicting transactions
Optimistic concurrency: assume conflicts are rare, detect them when they occur.
Distributed Transactions
Two-Phase Commit (2PC)
Phase 1 - Prepare:
- Coordinator asks all participants: "Can you commit?"
- Each participant prepares (writes to disk) and votes yes/no
Phase 2 - Commit:
- If everyone voted yes: Coordinator tells everyone to commit
- If anyone voted no: Everyone aborts
Limitations:
- If the coordinator crashes after Phase 1, participants are stuck waiting
- Participants hold locks during the entire process
- Single point of failure
2PC is correct but slow. Use when distributed ACID guarantees are required.
Saga Pattern
For microservices, Sagas are often more practical:
- Each service performs its local transaction
- If a step fails, run compensating transactions in reverse
For example, a travel booking saga might book a hotel, then book a flight, then book a car. If the car booking fails, the saga runs compensating transactions in reverse order: cancel the flight reservation, then cancel the hotel reservation.
Trade-off: Provides eventual consistency, not ACID. More practical for loosely-coupled services.
Common Problems and Solutions
Lost Updates
Two transactions read a value, modify it, and write back. One update is lost. For example, both Transaction 1 and Transaction 2 read a balance of 100. Transaction 1 adds 50 and writes 150. Transaction 2 adds 20 and writes 120. The final balance is 120, but it should be 170 since both additions should have been applied.
Solutions:
- Atomic operations: Use database operations that combine read and write, such as incrementing the balance directly in the update statement
- Explicit locking: Lock the row when reading so other transactions must wait
- Compare-and-set: Include the original value as a condition in the update, failing if another transaction changed it
Write Skew
Two transactions read overlapping data, make decisions based on it, then write without seeing each other.
Example: Two doctors both check if at least one doctor is on call. Both see yes. Both go off call. Now no one is on call.
Solution: SELECT FOR UPDATE to lock the rows being used for decisions.
Best Practices
-
Keep transactions short. Long transactions hold locks and increase contention.
-
Never wait for user input inside a transaction.
-
Understand the isolation level in use. Know which anomalies are possible.
-
Design idempotent operations. If a transaction may be retried, ensure running it twice does not corrupt data.
-
Evaluate consistency requirements. Social media likes can be eventually consistent. Bank transfers cannot.
Use Case Recommendations
| Scenario | Recommendation |
|---|---|
| E-commerce checkout | Strong consistency for inventory |
| Social media likes | Eventual consistency acceptable |
| Financial transfers | Serializable isolation |
| Analytics queries | Snapshot isolation (avoid blocking writes) |
| Microservices | Saga pattern, accept eventual consistency |
| Single database | Regular transactions with appropriate isolation |