Mar 13th, 2020 - written by Kimserey with .
Few months ago we talked about the Basket of Apple problem where we represented a tree of rules to distribute apples in baskets. Trees are quite common in software, a simpler example is the example of managers and employees where employees can be managers and managers themselves can have managers. In today’s post, we will look at how we can represent a tree structure in SQLite and how we can retrieve the nodes of a tree using Common Table Expressions (CTE).
Taking the example of managers and employees, we create the following table:
1
2
3
4
5
CREATE TABLE employee (
id INT PRIMARY KEY,
name TEXT NOT NULL,
manager_id INT REFERENCES employee(id)
)
We have a table of employees which might have managers. If an employee has a manager the manager_id
will contain the id
.
We can then insert the following data:
1
2
3
4
5
6
7
8
9
INSERT INTO employee (id, name, manager_id) VALUES (1, 'kim', NULL);
INSERT INTO employee (id, name, manager_id) VALUES (2, 'tom', 1);
INSERT INTO employee (id, name, manager_id) VALUES (3, 'sam', 1);
INSERT INTO employee (id, name, manager_id) VALUES (4, 'yoan', NULL);
INSERT INTO employee (id, name, manager_id) VALUES (5, 'jon', 2);
INSERT INTO employee (id, name, manager_id) VALUES (6, 'pier', 2);
INSERT INTO employee (id, name, manager_id) VALUES (7, 'zoe', 4);
INSERT INTO employee (id, name, manager_id) VALUES (8, 'ian', 1);
INSERT INTO employee (id, name, manager_id) VALUES (9, 'loic', 4);
This results in the following table content:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
> SELECT * FROM employee
+----+------+------------+
| id | name | manager_id |
+----+------+------------+
| 1 | kim | <null> |
| 2 | tom | 1 |
| 3 | sam | 1 |
| 4 | yoan | <null> |
| 5 | jon | 2 |
| 6 | pier | 2 |
| 7 | zoe | 4 |
| 8 | ian | 1 |
| 9 | loic | 4 |
+----+------+------------+
Now that we have a table filled with employees, we are able to retrieve employees separately but if we want to retrieve all the nodes for a specific root, we need to perform a recursive query where we select employees starting from the first node and recursively selection child nodes. A recursive query in SQLite can be executed using CTE.
With nodes stored in the database, we want to be able to retrieve all nodes and children nodes given a root.
For example givenid = 1
, we want to retrieve all children and children of children, etc…
Getting children of children is inherently recursive.
To perform the recursion, we would:
This assumes that no infinite loop are present in the nodes.
A CTE
query is done using the WITH RECURSIVE
expression in SQLite. The composition of the query can be done as followed:
1
2
3
4
5
6
7
8
WITH RECURSIVE [cte_table_name](columns...) AS (
[initial_select]
UNION or UNION ALL
[recursive_select - where cte_table_name can be used to join]
)
[query_on_cte_table]
The WITH RECURSIVE ... AS ()
creates a temporary table containing the result of the recursive operation, which can then be queried on as a regular table. The initial_select
is the anchor selection setting the first roots for recursion, while the recursive_select
is the selection executed during the recursion. The overal selection is a compound select where subsequent queries are assembled using either UNION
or UNION ALL
. The difference between UNION
and UNION ALL
would be weither duplicate row are removed or kept.
Following this construct we can come up with a recursive query to get all direct and indirect reports of (1, kim)
:
1
2
3
4
5
6
7
8
9
10
11
12
13
WITH RECURSIVE cte_employee (id, name, manager_id) AS (
SELECT e.id, e.name, e.manager_id
FROM employee e
WHERE e.id = 1
UNION ALL
SELECT e.id, e.name, e.manager_id
FROM employee e
JOIN cte_employee c ON c.id = e.manager_id
)
SELECT * FROM cte_employee;
This will yield the following result:
1
2
3
4
5
6
7
8
9
10
+----+------+------------+
| id | name | manager_id |
+----+------+------------+
| 1 | kim | <null> |
| 2 | tom | 1 |
| 3 | sam | 1 |
| 8 | ian | 1 |
| 5 | jon | 2 |
| 6 | pier | 2 |
+----+------+------------+
We can see that with a single query we are able to get the complete tree.
CTE
uses a queue plus a single value table to execute the recursion. It takes the following approach to construct its content:
initial_select
is executed, the result of the select is put in the table and in a queue,recursive_select
is executed with the CTE
table acting as if it was a table of a single value (the dequeued value),recursive_select
is saved on the table and placed in the queue,Taking our example we can decompose the steps with:
initial_select
is executed, and returns (1, kim)
- places it in a queue and in the content table,(1, kim)
, executes recursive_select
acting as if cte_employee
only contains (1, kim)
, returns (2, tom), (3, sam), (8, ian)
- places the 3 values in the queue and in the content table,(2, tom)
, executes recursive_select
acting as if cte_employee
only contains (2, tom)
, returns (5, jon), (6, pier)
- places the 2 values in the queue and in the content table,(3, sam)
, similarly executes the recursive_select
, returns no result
,(8, ian)
, then (5, jon)
and finally (6, pier)
,And that concludes today’s post!
In today’s post we looked at how we could represent tree structures in SQLite tables and how we could retrieve nodes from our table using a recursive call done with Common Table Expression. We started by looking at what a tree would look like in a SQLite table, with roots and nodes, then we used a recursive CTE
query to get the nodes where we looked at how we could define a recursive query and the approach SQLite takes to construct its content. Hope you liked this post and I see you on the next one!