OLTP – Phase 10 Transactions, Concurrency, REPL, and Server
The database is fully functional — it can store data, run SQL, use indexes, and survive crashes. But it’s missing three things that make a database a database:
- Transactions: grouping multiple operations into an atomic unit (BEGIN/COMMIT/ROLLBACK)
- Concurrency control: preventing multiple connections from corrupting each other’s data
- An interface: a way for users and applications to actually connect and run queries
This phase ties everything together. The WAL from Phase 9 gave us durability (the “D” in ACID). Now we add atomicity (“A”) via transactions, and isolation (“I”) via locking.
In PostgreSQL, these are xact.c (transactions), lock.c (lock manager), psql (REPL), and postmaster.c (server).
Full Source Code
The code referenced in this post can be found in https://gitlab.com/kimserey.lam/db-learn.
Transactions
A transaction groups multiple operations into a single atomic unit: either all changes commit, or none of them do.
Transaction Manager
1
2
3
4
5
6
7
8
9
10
11
type Transaction struct {
ID uint64
State TxnState // Active, Committed, Aborted
Records []*wal.LogRecord // tracked for undo on abort
}
type TxnManager struct {
nextID uint64 // starts at 1; ID 0 = auto-commit
walMgr *wal.WALManager
active map[uint64]*Transaction
}
The lifecycle:
1
2
3
4
BEGIN → TxnManager.Begin() → assigns unique ID, state = Active
INSERT → writes WAL record with TxnID, tracks record on Transaction
INSERT → same
COMMIT → writes LogCommit to WAL, fsyncs, state = Committed
Or, on rollback:
1
2
3
4
BEGIN → TxnManager.Begin()
INSERT → tracked on Transaction.Records
INSERT → tracked on Transaction.Records
ROLLBACK → undo records in REVERSE order, write LogAbort to WAL
Why Reverse Order for Undo
Consider this transaction:
1
2
3
4
BEGIN;
INSERT INTO users VALUES (1, 'alice'); -- slot 0
INSERT INTO users VALUES (2, 'bob'); -- slot 1
ROLLBACK;
We must undo slot 1 first, then slot 0. If we went forward (undo slot 0, then slot 1), the page state would be wrong — deleting slot 0 first could shift the meaning of slot 1.
Auto-Commit
Without an explicit BEGIN, each statement runs with TxnID=0. No WAL commit record, no locks, no undo tracking. This is exactly the Phase 9 behavior — full backward compatibility.
TxnID Flows Through Table Methods
The Table’s mutation methods now accept a *txn.Transaction:
1
2
func (t *Table) InsertRow(tx *txn.Transaction, values []tuple.Value) error
func (t *Table) DeleteRow(tx *txn.Transaction, pageNum uint32, slot uint16) error
Inside, the transaction ID is written into the WAL record, and the record is tracked on the Transaction for potential undo:
1
2
3
4
serialize values → find page → insert tuple
→ WAL.Append(LogInsert, TxnID=tx.ID)
→ tx.TrackRecord(rec)
→ page.SetLSN(lsn) → unpin dirty
Pass nil for auto-commit (TxnID defaults to 0, no tracking).
Undo Methods
When aborting a transaction, we need to reverse changes without writing new WAL records (the LogAbort record covers it). Two dedicated methods handle this:
1
2
func (t *Table) UndoInsert(pageNum, slotIndex) error // deletes the tuple
func (t *Table) UndoDelete(pageNum, data) error // re-inserts old data
These go directly through the buffer pool — no WAL append, no LSN update. The Session provides these as a callback to the TxnManager’s Abort method.
Recovery With Transactions
Phase 9’s recovery only did redo (replay forward). With transactions, we also need to undo incomplete transactions — those that were active when the crash happened (no COMMIT or ABORT in the WAL).
Recovery now has two passes:
1
2
3
4
5
6
7
8
9
10
Pass 1 (Redo — forward):
For each record:
- LogCommit/LogAbort: note the TxnID as resolved
- Data records: redo if page LSN < record LSN (same as before)
Pass 2 (Undo — backward):
For each record in REVERSE:
- Skip if TxnID=0 (auto-commit) or TxnID is committed/aborted
- LogInsert: undo by deleting the tuple
- LogDelete: undo by re-inserting the old data
This handles the crash scenario:
1
2
3
4
5
6
7
8
9
10
Timeline:
1. BEGIN (TxnID=1)
2. INSERT alice → WAL record, page modified
3. INSERT bob → WAL record, page modified
4. ⚡ CRASH — no COMMIT record in WAL
Recovery:
Pass 1 (redo): replays both inserts (page might already have them)
Pass 2 (undo): TxnID=1 has no commit → undo bob, then undo alice
Result: table is empty, as if the transaction never happened ✓
Lock Manager
Prevents concurrent transactions from corrupting each other’s data. Uses table-level locking with two modes:
1
2
SharedLock: multiple readers allowed simultaneously
ExclusiveLock: single writer, blocks all other locks
Compatibility matrix:
1
2
3
| Shared | Exclusive
Shared | ✓ allowed | ✗ blocked
Exclusive | ✗ blocked | ✗ blocked
Implementation uses sync.Cond for blocking:
1
2
3
4
5
type lockEntry struct {
cond *sync.Cond
shared map[uint64]bool // txnIDs holding shared locks
exclusive uint64 // txnID holding exclusive, 0 = none
}
When a lock can’t be acquired, the goroutine calls cond.Wait() and sleeps until another goroutine releases and calls cond.Broadcast().
Lock upgrade is supported: if a transaction holds the only shared lock on a resource, it can upgrade to exclusive without deadlocking.
When Locks Are Acquired
The Session acquires locks based on the statement type, but only within an explicit transaction:
1
2
3
4
SELECT → SharedLock on table
INSERT → ExclusiveLock on table
UPDATE → ExclusiveLock on table
DELETE → ExclusiveLock on table
Auto-commit statements (no BEGIN) don’t acquire locks — they’re single operations and don’t need isolation from themselves.
All locks are released on COMMIT or ROLLBACK via ReleaseAll(txnID).
Session
A Session is per-connection state that wraps the Executor with transaction and lock awareness:
1
2
3
4
5
6
7
type Session struct {
catalog *catalog.Catalog
executor *Executor // shared, stateless engine
txnMgr *txn.TxnManager // shared across all sessions
lockMgr *concurrency.LockManager // shared across all sessions
currentTxn *txn.Transaction // this session's active transaction
}
The REPL creates one Session. The TCP server creates one Session per connection. The Catalog, TxnManager, and LockManager are shared — only currentTxn is per-session.
1
2
3
Session.Execute("BEGIN") → creates transaction, stores in currentTxn
Session.Execute("INSERT..") → acquires lock, delegates to executor with txn
Session.Execute("COMMIT") → commits via TxnManager, releases locks
If a connection drops mid-transaction, Session.Cleanup() aborts the transaction and releases all locks.
REPL
Interactive command-line interface, similar to psql:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
$ ./toydb --data /tmp/mydb
toydb — a learning database engine
Type SQL statements, or \q to exit.
toydb> CREATE TABLE users (id INT, name VARCHAR(50));
CREATE TABLE
toydb> INSERT INTO users VALUES (1, 'alice');
INSERT 0 1
toydb> BEGIN;
BEGIN
toydb(txn)> INSERT INTO users VALUES (2, 'bob');
INSERT 0 1
toydb(txn)> ROLLBACK;
ROLLBACK
toydb> SELECT * FROM users;
id | name
----+-------
1 | alice
(1 row)
The prompt changes to toydb(txn)> when inside a transaction — a visual cue that changes haven’t been committed yet.
SELECT results are formatted as psql-style ASCII tables with column headers, separator lines, and a row count footer.
TCP Server
A network server that accepts multiple concurrent client connections:
1
2
$ ./toydb --server --data /tmp/mydb --port 5433
toydb server listening on :5433
Each connection gets its own goroutine and Session. Clients send SQL as text lines; the server responds with a simple text protocol:
1
2
3
4
5
6
7
8
Client: INSERT INTO users VALUES (1, 'alice');\n
Server: OK INSERT 0 1\n.\n
Client: SELECT * FROM users;\n
Server: COLUMNS id|name\nROW 1|alice\nOK (1 row)\n.\n
Client: bad sql\n
Server: ERROR: line 1: expected statement...\n.\n
The .\n sentinel marks the end of a response, so clients know when to stop reading. Errors are prefixed with ERROR:.
Entry Point
1
2
3
4
Usage: toydb [flags]
--data DIR data directory (default "data")
--server run as TCP server instead of REPL
--port PORT server port (default "5433")
The startup sequence:
1
2
3
4
1. OpenCatalog(dataDir) → opens heap files, WAL, runs recovery
2. NewTxnManager(walMgr) → transaction manager
3. NewLockManager() → lock manager
4. Launch REPL or TCP server with shared catalog/txnMgr/lockMgr
On shutdown (REPL exit or server stop), Catalog.Close() flushes all dirty pages and truncates the WAL.
How the Pieces Connect
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
┌─────────────┐
│ cmd/toydb │
│ main.go │
└──────┬──────┘
│
┌────────────┴────────────┐
│ │
┌─────┴─────┐ ┌──────┴──────┐
│ REPL │ │ TCP Server │
│ (1 session)│ │ (N sessions)│
└─────┬─────┘ └──────┬──────┘
│ │
└────────────┬────────────┘
│
┌──────┴──────┐
│ Session │ per-connection state
│ (currentTxn)│
└──────┬──────┘
│
┌────────────┼────────────┐
│ │ │
┌─────┴─────┐ ┌───┴───┐ ┌─────┴──────┐
│ Executor │ │TxnMgr │ │ LockMgr │
│ (parse+ │ │(BEGIN, │ │ (Acquire, │
│ execute) │ │COMMIT, │ │ Release) │
└─────┬─────┘ │ABORT) │ └────────────┘
│ └───┬───┘
┌─────┴─────┐ │
│ Catalog │ │
│ (tables) │ │
└─────┬─────┘ │
│ │
┌─────┴─────┐ ┌───┴───┐
│Buffer Pool│ │ WAL │
└───────────┘ └───────┘
Limitations
-
Table-level locking: locks the entire table, not individual rows. This is simple but means two transactions can’t write to different rows of the same table concurrently. PostgreSQL uses row-level locking.
-
No deadlock detection: if two transactions acquire locks in conflicting order, they’ll deadlock forever. PostgreSQL detects deadlocks with a timeout and aborts one transaction.
-
No MVCC: reads block on writes and vice versa. PostgreSQL uses Multi-Version Concurrency Control so readers never block writers — each transaction sees a snapshot of the data.
-
No isolation levels: there’s only one behavior (serializable via locking). PostgreSQL supports READ COMMITTED, REPEATABLE READ, and SERIALIZABLE.
-
Single-statement transactions only for DDL: CREATE TABLE and CREATE INDEX don’t participate in transactions — they always auto-commit.
-
No prepared statements or parameterized queries: the TCP protocol sends raw SQL text.
PostgreSQL Comparison
| Feature | Our implementation | PostgreSQL |
|---|---|---|
| Transactions | BEGIN/COMMIT/ROLLBACK with WAL | Full ACID with savepoints, nested transactions |
| Locking | Table-level shared/exclusive | Row-level with multiple lock modes |
| Concurrency | Lock-based blocking | MVCC (readers never block writers) |
| Deadlock | Not detected | Detected + auto-abort |
| Protocol | Simple text lines | PostgreSQL wire protocol (binary) |
| Connections | goroutine per connection | Process per connection (or pooled) |
| REPL | Basic line input | psql with readline, tab completion, \d commands |
Files Created/Modified in Phase 10
| File | Status | Purpose |
|---|---|---|
txn/transaction.go |
NEW | Transaction, TxnManager |
txn/transaction_test.go |
NEW | 6 tests: begin, commit, abort, undo order, double-commit |
concurrency/lockmanager.go |
NEW | LockManager with shared/exclusive locks |
concurrency/lockmanager_test.go |
NEW | 7 tests: compatibility, blocking, release, upgrade |
executor/session.go |
NEW | Per-connection state with transaction awareness |
executor/session_test.go |
NEW | 7 tests: begin/commit, rollback, auto-commit, cleanup |
repl/repl.go |
NEW | Interactive REPL with ASCII table formatting |
server/server.go |
NEW | TCP server, goroutine per connection |
server/protocol.go |
NEW | Text wire protocol encoding |
cmd/toydb/main.go |
NEW | Entry point with –server/–data/–port flags |
parser/token.go |
MODIFIED | Added BEGIN, COMMIT, ROLLBACK tokens |
parser/ast.go |
MODIFIED | Added BeginStmt, CommitStmt, RollbackStmt |
parser/parser.go |
MODIFIED | Parse BEGIN/COMMIT/ROLLBACK |
catalog/table.go |
MODIFIED | txn parameter on mutations, UndoInsert/UndoDelete |
catalog/catalog.go |
MODIFIED | WALManager() getter |
executor/executor.go |
MODIFIED | ExecuteStmt with txn context |
wal/recovery.go |
MODIFIED | Added undo pass for uncommitted transactions |
What We Built
Ten phases, from raw bytes on disk to a working database with SQL, indexing, crash recovery, transactions, and a network server. The complete stack:
1
2
3
4
5
6
7
8
9
10
Phase 1: Pages and heap files (raw storage)
Phase 2: Tuple serialization (row format)
Phase 3: Tables and catalog (schema management)
Phase 4: Sequential scan and filtering (query execution)
Phase 5: Buffer pool (memory management)
Phase 6: SQL parser (human interface)
Phase 7: Query executor (SQL in, results out)
Phase 8: B-tree index (fast lookups)
Phase 9: Write-ahead log (crash recovery / durability)
Phase 10: Transactions, locks, REPL (ACID + user interface)
Observations
- This final phase is where the database starts feeling complete because it adds transactional behavior, concurrency control, and user-facing interfaces.
- Transaction support introduces the unit of work around multiple changes, with commit and rollback deciding whether those changes become durable or are undone.
- Concurrency is kept deliberately simple with shared and exclusive table-level locks. That is enough to explain the core idea without introducing the full complexity of row-level MVCC.
- The REPL and the server matter because they make all previous phases accessible. Without them, the database engine exists mostly as an internal library rather than something you can directly interact with.