OLTP – Phase 7 Query Executor (The Integration Phase)
Before this phase, the parser and storage engine were separate systems. The parser could turn SQL into an AST, and the storage engine could read/write rows — but they didn’t talk to each other. Queries were still built manually in Go code.
Phase 7 connects everything. The Executor takes a SQL string, parses it, looks up the table in the catalog, converts the AST to an execution plan, runs it, and returns results.
After this phase, you have a working database:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
"SELECT name FROM users WHERE age > 25;"
│
▼
Executor.Execute()
│
┌──────────┴──────────┐
▼ ▼
parser.Parse() catalog.GetTable()
│ │
▼ ▼
AST (Phase 6) Table (Phase 3)
│ │
▼ ▼
convertExpr() SeqScanNode (Phase 4)
│ │
▼ ▼
executor.Expression BufferPool (Phase 5)
│
▼
HeapFile (Phase 1)
In PostgreSQL, this is execMain.c and pquery.c.
Here is the roadmap for the phases to come:
- Phase 7: Query executor
- Phase 8: B-tree index
- Phase 9: Write-ahead log
- Phase 10: Transactions, concurrency, REPL, and server
Full Source Code
The code referenced in this post can be found in https://gitlab.com/kimserey.lam/db-learn.
The Executor
1
2
3
4
5
type Executor struct {
Catalog *catalog.Catalog
}
func (e *Executor) Execute(sql string) (*Result, error)
One method does everything. It parses the SQL, dispatches by statement type, and returns a Result:
1
2
3
4
5
6
7
8
9
10
11
12
switch s := stmt.(type) {
case *parser.CreateTableStmt:
return e.executeCreateTable(s)
case *parser.InsertStmt:
return e.executeInsert(s)
case *parser.SelectStmt:
return e.executeSelect(s)
case *parser.UpdateStmt:
return e.executeUpdate(s)
case *parser.DeleteStmt:
return e.executeDelete(s)
}
The Result struct
1
2
3
4
5
type Result struct {
Columns []string // column names (SELECT only)
Rows [][]tuple.Value // row data (SELECT only)
Message string // status message (DDL/DML)
}
SELECT returns Columns and Rows. Everything else returns a Message:
"CREATE TABLE""INSERT 0 3"(inserted 3 rows)"UPDATE 1"(updated 1 row)"DELETE 2"(deleted 2 rows)
Important: the AST is never evaluated
The AST is a blueprint, not something that runs. The executor reads it and then discards it. Structural pieces (table names, column names, type names) are used directly — the executor just pulls the string out of the AST node and passes it to the catalog or schema. Dynamic pieces (WHERE clauses, SET values) are converted into executor expressions, which ARE evaluated against rows. The AST itself has no Eval method and never touches data.
How each statement executes
CREATE TABLE
1
CREATE TABLE users (id INT, name VARCHAR(50), age INT);
- Convert AST column definitions to
tuple.ColumnDef(map type names like “INT” totuple.TypeInt32) - Build a
tuple.Schema - Call
catalog.CreateTable(name, schema)— creates the heap file on disk
INSERT
1
INSERT INTO users VALUES (1, 'alice', 30);
- Look up the table in the catalog (to get the schema)
- For each row of values, convert AST literals to
tuple.Valuebased on the column type - Call
table.InsertRow(values)— serializes and stores via the buffer pool
The conversion uses the schema to produce the right type:
IntLit{1}+ column type INT →IntValue(int32(1))IntLit{1}+ column type BIGINT →BigIntValue(int64(1))StringLit{"alice"}→StringValue("alice")
SELECT
1
SELECT name FROM users WHERE age > 28;
- Look up the table
- Determine column indexes for projection (
name→ index 1) - Convert the WHERE clause AST to executor expressions
- Build a
SeqScanNodewith the filter and projection - Iterate the scan, collect all matching rows into the Result
For SELECT *, no projection — all columns are returned.
UPDATE
1
UPDATE users SET age = 31 WHERE name = 'alice';
This is the most complex statement because it modifies existing data:
Phase 1 — Scan: Walk through every page and slot. For each row, deserialize it, evaluate the WHERE filter. If it matches, build the new row (copy original values, replace the SET columns), and save the location (page number + slot).
Phase 2 — Apply: For each collected update, delete the old tuple from its page and insert the new row via InsertRow. The new row may land on a different page.
The two-phase approach avoids modifying data while scanning it, which could cause rows to be visited twice or skipped.
DELETE
1
DELETE FROM users WHERE id = 2;
Walk through every page and slot. For matching rows, call page.DeleteTuple(slot) which marks the slot as deleted (zeroes the item pointer). The page is marked dirty so the change gets flushed to disk.
Unlike UPDATE, DELETE can modify in place during the scan because it only removes rows — it doesn’t add new ones that could affect the scan.
Converting AST to executor expressions
The parser produces parser.BinaryExpr{Op: ">"}. The executor needs executor.CompareExpr{Op: OpGt}. The convertExpr function bridges this gap:
1
2
3
4
5
parser.ColumnRef{"age"} → executor.ColumnRef{Name: "age"}
parser.IntLit{25} → executor.Literal{Value: IntValue(25)}
parser.BinaryExpr{Op: ">"} → executor.CompareExpr{Op: OpGt}
parser.BinaryExpr{Op: "AND"} → executor.LogicalExpr{Op: OpAnd}
parser.NotExpr{...} → executor.NotExpr{...}
The conversion is recursive — it walks the AST tree and builds the executor expression tree with the same shape.
What this looks like in action
From the integration test:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
SQL: CREATE TABLE users (id INT, name VARCHAR(50), age INT);
→ CREATE TABLE
SQL: INSERT INTO users VALUES (1, 'alice', 30);
→ INSERT 0 1
SQL: SELECT * FROM users;
Columns: [id name age]
Row 0: id=1, name=alice, age=30
Row 1: id=2, name=bob, age=25
Row 2: id=3, name=carol, age=35
(3 rows)
SQL: SELECT name FROM users WHERE age > 28;
Columns: [name]
Row 0: name=alice
Row 1: name=carol
(2 rows)
SQL: UPDATE users SET age = 31 WHERE name = 'alice';
→ UPDATE 1
SQL: DELETE FROM users WHERE id = 2;
→ DELETE 1
SQL: SELECT * FROM users;
Columns: [id name age]
Row 0: id=3, name=carol, age=35
Row 1: id=1, name=alice, age=31
(2 rows)
SQL string in, structured results out. The full stack is working: parsing → catalog lookup → expression conversion → sequential scan with filtering → projection → results.
Notice that after UPDATE and DELETE, alice appears after carol. That’s because UPDATE deletes the old row and inserts a new one at the end. The ordering changed, but the data is correct. A real database would use ORDER BY to control output order.
How Phase 7 connects all previous phases
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
Phase 7: Executor.Execute("SELECT name FROM users WHERE age > 28")
│
├── parser.Parse() ← Phase 6
│ → SelectStmt{Columns, From, Where}
│
├── catalog.GetTable("users") ← Phase 3
│ → Table{Schema, Pool, FileID}
│
├── convertExpr(Where)
│ parser.BinaryExpr{">"} ← Phase 6 AST
│ → executor.CompareExpr{OpGt} ← Phase 4 expression
│
├── SeqScanNode.Execute() ← Phase 4
│ └── TableIterator.Next()
│ └── Pool.FetchPage() ← Phase 5
│ └── HeapFile.ReadPage() ← Phase 1
│ └── Deserialize() ← Phase 2
│
└── Result{Columns: ["name"], Rows: [...]}
Every phase contributes. The Executor is the glue that connects them.
PostgreSQL comparison
PostgreSQL’s executor is far more sophisticated:
- Query planner: between parsing and execution, PostgreSQL has an optimizer that chooses between sequential scan, index scan, hash join, merge join, etc. We always use sequential scan.
- Prepared statements: PostgreSQL can parse once and execute many times with different parameters. We parse every time.
- MVCC for UPDATE: PostgreSQL doesn’t delete-and-reinsert. It marks the old tuple as invisible and creates a new version, allowing concurrent readers to still see the old data. We do simple delete-and-insert.
- Transaction context: every PostgreSQL query runs inside a transaction. We have no transactions yet.
- Type coercion: PostgreSQL automatically converts between compatible types (int vs bigint, text vs varchar). We require exact type matches.
Files created in Phase 7
| File | Purpose | Lines |
|---|---|---|
executor/result.go |
Result struct for query output | ~12 |
executor/executor.go |
Executor: parse → dispatch → execute → result | ~290 |
executor/integration_test.go |
9 tests: full SQL pipeline, errors, edge cases | ~250 |
What’s next
Phase 8 (B-Tree Index) adds an index data structure that enables O(log n) lookups instead of the O(n) sequential scan we use for everything today. For WHERE id = 42 on a million-row table, an index scan reads 3-4 pages instead of all of them.
Observations
- This is the integration phase where the parser output finally becomes something executable against the storage engine.
- The SQL text is parsed into an AST, but the AST is never evaluated directly. It is only a blueprint.
- The executor dispatches based on statement type: some operations act through the catalog, while others read or mutate rows in tables.
- The dynamic pieces of the AST, especially expressions, are converted into executor expressions that can actually be evaluated against row values.
- That separation is useful because it keeps parsing and execution loosely coupled: the parser describes intent, and the executor decides how to carry it out.