Sep 11th, 2020 - written by Kimserey with .
Window functions are calcultions done accross a set of rows in relation to the current row. In Postgres, we can use window functions with the keywords OVER
to calculate useful aggregate functions like average, ranking or count over a partition of the data. In today’s post we will look at example of window function usage.
A window function can be specified with any aggregate function combined with OVER (PARTITION BY ... ORDER BY ...)
.
The aggregate functions in Postgres are:
avg()
count()
max()
min()
sum()
rank()
(only available as window function)For example, the following query:
1
SELECT id, name, salary, dept, avg(salary) OVER (PARTITION BY dept)::numeric(10, 2) FROM employees LIMIT 15;
would return the each employee data in the employees
table together with the average salary where that employee is in.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
+------+-----------+----------+----------------------+---------+
| id | name | salary | dept | avg |
|------+-----------+----------+----------------------+---------|
| 51 | Danyelle | 6881 | Accounting | 5791.18 |
| 44 | Caril | 6239 | Accounting | 5791.18 |
| 4 | Irma | 5168 | Accounting | 5791.18 |
| 74 | Dilly | 6327 | Accounting | 5791.18 |
| 6 | Nico | 7158 | Accounting | 5791.18 |
| 70 | Hillyer | 2831 | Accounting | 5791.18 |
| 34 | Ingrid | 1108 | Accounting | 5791.18 |
| 29 | Ailbert | 5617 | Accounting | 5791.18 |
| 58 | Erica | 7492 | Accounting | 5791.18 |
| 56 | Sigfried | 8222 | Accounting | 5791.18 |
| 55 | Nady | 6660 | Accounting | 5791.18 |
| 20 | Aimil | 6431 | Business Development | 5985.33 |
| 26 | Benyamin | 2937 | Business Development | 5985.33 |
| 8 | Stephenie | 7913 | Business Development | 5985.33 |
| 39 | Sela | 5120 | Business Development | 5985.33 |
+------+-----------+----------+----------------------+---------+
We can see that the avg
computed was partitioned by departments.
The PARTITION BY
is optional, if not provided the window is the whole table:
1
SELECT id, name, salary, dept, avg(salary) OVER ()::numeric(10, 2) FROM employees LIMIT 15;
avg
will then be the average for the whole table salary:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
+------+-----------+----------+--------------------------+---------+
| id | name | salary | dept | avg |
|------+-----------+----------+--------------------------+---------|
| 1 | Geri | 5449 | Sales | 5719.95 |
| 2 | Alane | 4709 | Research and Development | 5719.95 |
| 3 | Shirleen | 9051 | Marketing | 5719.95 |
| 4 | Irma | 5168 | Accounting | 5719.95 |
| 5 | Juliane | 1496 | Human Resources | 5719.95 |
| 6 | Nico | 7158 | Accounting | 5719.95 |
| 7 | Emilio | 9248 | Product Management | 5719.95 |
| 8 | Stephenie | 7913 | Business Development | 5719.95 |
| 9 | Melita | 6807 | Marketing | 5719.95 |
| 10 | Christie | 1344 | Research and Development | 5719.95 |
| 11 | Tera | 7291 | Human Resources | 5719.95 |
| 12 | Odele | 5033 | Legal | 5719.95 |
| 13 | Goldia | 4283 | Services | 5719.95 |
| 14 | Gianina | 8526 | Research and Development | 5719.95 |
| 15 | Beryle | 5572 | Engineering | 5719.95 |
+------+-----------+----------+--------------------------+---------+
We can also specify ORDER BY
:
1
SELECT id, name, salary, dept, avg(salary) OVER (PARTITION BY dept ORDER BY salary)::numeric(10, 2) FROM employees LIMIT 15;
which result in:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
+------+----------+----------+----------------------+---------+
| id | name | salary | dept | avg |
|------+----------+----------+----------------------+---------|
| 34 | Ingrid | 1108 | Accounting | 1108.00 |
| 70 | Hillyer | 2831 | Accounting | 1969.50 |
| 4 | Irma | 5168 | Accounting | 3035.67 |
| 29 | Ailbert | 5617 | Accounting | 3681.00 |
| 44 | Caril | 6239 | Accounting | 4192.60 |
| 74 | Dilly | 6327 | Accounting | 4548.33 |
| 55 | Nady | 6660 | Accounting | 4850.00 |
| 51 | Danyelle | 6881 | Accounting | 5103.88 |
| 6 | Nico | 7158 | Accounting | 5332.11 |
| 58 | Erica | 7492 | Accounting | 5548.10 |
| 56 | Sigfried | 8222 | Accounting | 5791.18 |
| 31 | Elysia | 2431 | Business Development | 2431.00 |
| 26 | Benyamin | 2937 | Business Development | 2684.00 |
| 39 | Sela | 5120 | Business Development | 3496.00 |
| 76 | Dannye | 6205 | Business Development | 4173.25 |
+------+----------+----------+----------------------+---------+
ORDER BY
will restrict the window frame to the start of the partition up through the current row, plus any following rows equal to the current row.
For example, for id: 34
, the window frame is just id: 34, salary: 1108
hence the average is 1108
.
Then we advance one row, id: 70
, the window frame is then 1108, 2831
which makes an avergage of 1969.50
.
And as we continue to advance, we add more rows until the last row which will compute the overal average for the partition.
In the average example, the ORDER BY
didn’t really make sense but when we want to apply ranking, the order by allows to determine the rank.
1
SELECT id, name, salary, dept, rank() OVER (PARTITION BY dept ORDER BY salary) FROM employees LIMIT 15;
This will compute the rank of employees per salary for each departments:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
+------+----------+----------+----------------------+--------+
| id | name | salary | dept | rank |
|------+----------+----------+----------------------+--------|
| 34 | Ingrid | 1108 | Accounting | 1 |
| 70 | Hillyer | 2831 | Accounting | 2 |
| 4 | Irma | 5168 | Accounting | 3 |
| 29 | Ailbert | 5617 | Accounting | 4 |
| 44 | Caril | 6239 | Accounting | 5 |
| 74 | Dilly | 6327 | Accounting | 6 |
| 55 | Nady | 6660 | Accounting | 7 |
| 51 | Danyelle | 6881 | Accounting | 8 |
| 6 | Nico | 7158 | Accounting | 9 |
| 58 | Erica | 7492 | Accounting | 10 |
| 56 | Sigfried | 8222 | Accounting | 11 |
| 31 | Elysia | 2431 | Business Development | 1 |
| 26 | Benyamin | 2937 | Business Development | 2 |
| 39 | Sela | 5120 | Business Development | 3 |
| 76 | Dannye | 6205 | Business Development | 4 |
+------+----------+----------+----------------------+--------+
Similarly, we can use the window frame to compute count:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
SELECT id, name, salary, dept, count(*) OVER (PARTITION BY dept) FROM employees LIMIT 15;
+------+-----------+----------+----------------------+---------+
| id | name | salary | dept | count |
|------+-----------+----------+----------------------+---------|
| 51 | Danyelle | 6881 | Accounting | 11 |
| 44 | Caril | 6239 | Accounting | 11 |
| 4 | Irma | 5168 | Accounting | 11 |
| 74 | Dilly | 6327 | Accounting | 11 |
| 6 | Nico | 7158 | Accounting | 11 |
| 70 | Hillyer | 2831 | Accounting | 11 |
| 34 | Ingrid | 1108 | Accounting | 11 |
| 29 | Ailbert | 5617 | Accounting | 11 |
| 58 | Erica | 7492 | Accounting | 11 |
| 56 | Sigfried | 8222 | Accounting | 11 |
| 55 | Nady | 6660 | Accounting | 11 |
| 20 | Aimil | 6431 | Business Development | 9 |
| 26 | Benyamin | 2937 | Business Development | 9 |
| 8 | Stephenie | 7913 | Business Development | 9 |
| 39 | Sela | 5120 | Business Development | 9 |
+------+-----------+----------+----------------------+---------+
Or we could omit the PARTITION BY
to return the total count:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
SELECT id, name, salary, dept, count(*) OVER () FROM employees LIMIT 15;
+------+-----------+----------+--------------------------+---------+
| id | name | salary | dept | count |
|------+-----------+----------+--------------------------+---------|
| 1 | Geri | 5449 | Sales | 100 |
| 2 | Alane | 4709 | Research and Development | 100 |
| 3 | Shirleen | 9051 | Marketing | 100 |
| 4 | Irma | 5168 | Accounting | 100 |
| 5 | Juliane | 1496 | Human Resources | 100 |
| 6 | Nico | 7158 | Accounting | 100 |
| 7 | Emilio | 9248 | Product Management | 100 |
| 8 | Stephenie | 7913 | Business Development | 100 |
| 9 | Melita | 6807 | Marketing | 100 |
| 10 | Christie | 1344 | Research and Development | 100 |
| 11 | Tera | 7291 | Human Resources | 100 |
| 12 | Odele | 5033 | Legal | 100 |
| 13 | Goldia | 4283 | Services | 100 |
| 14 | Gianina | 8526 | Research and Development | 100 |
| 15 | Beryle | 5572 | Engineering | 100 |
+------+-----------+----------+--------------------------+---------+
We can also see how the window frame make sense with sum:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
SELECT id, name, salary, dept, sum(salary) OVER (PARTITION BY dept) FROM employees LIMIT 15;
+------+-----------+----------+----------------------+-------+
| id | name | salary | dept | sum |
|------+-----------+----------+----------------------+-------|
| 51 | Danyelle | 6881 | Accounting | 63703 |
| 44 | Caril | 6239 | Accounting | 63703 |
| 4 | Irma | 5168 | Accounting | 63703 |
| 74 | Dilly | 6327 | Accounting | 63703 |
| 6 | Nico | 7158 | Accounting | 63703 |
| 70 | Hillyer | 2831 | Accounting | 63703 |
| 34 | Ingrid | 1108 | Accounting | 63703 |
| 29 | Ailbert | 5617 | Accounting | 63703 |
| 58 | Erica | 7492 | Accounting | 63703 |
| 56 | Sigfried | 8222 | Accounting | 63703 |
| 55 | Nady | 6660 | Accounting | 63703 |
| 20 | Aimil | 6431 | Business Development | 53868 |
| 26 | Benyamin | 2937 | Business Development | 53868 |
| 8 | Stephenie | 7913 | Business Development | 53868 |
| 39 | Sela | 5120 | Business Development | 53868 |
+------+-----------+----------+----------------------+-------+
which gives the total sum per department, whereas if we use ORDER BY
:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
SELECT id, name, salary, dept, sum(salary) OVER (PARTITION BY dept ORDER BY salary) FROM employees LIMIT 15;
+------+----------+----------+----------------------+-------+
| id | name | salary | dept | sum |
|------+----------+----------+----------------------+-------|
| 34 | Ingrid | 1108 | Accounting | 1108 |
| 70 | Hillyer | 2831 | Accounting | 3939 |
| 4 | Irma | 5168 | Accounting | 9107 |
| 29 | Ailbert | 5617 | Accounting | 14724 |
| 44 | Caril | 6239 | Accounting | 20963 |
| 74 | Dilly | 6327 | Accounting | 27290 |
| 55 | Nady | 6660 | Accounting | 33950 |
| 51 | Danyelle | 6881 | Accounting | 40831 |
| 6 | Nico | 7158 | Accounting | 47989 |
| 58 | Erica | 7492 | Accounting | 55481 |
| 56 | Sigfried | 8222 | Accounting | 63703 |
| 31 | Elysia | 2431 | Business Development | 2431 |
| 26 | Benyamin | 2937 | Business Development | 5368 |
| 39 | Sela | 5120 | Business Development | 10488 |
| 76 | Dannye | 6205 | Business Development | 16693 |
+------+----------+----------+----------------------+-------+
which will gives a cumulative result of the salary, yielding the total on the last row.