Feb 6th, 2019 - written by Kimserey with .
In programming, we often hear about Closures. Closures are present in any languages possessing functions as first class citizen. This includes functional languages and also widespread languages such as C#, Python and JavaScript. Today we will look at the origin of closures and understand what they are.
As a side note, closures seem to be one of the favorite interviewers questions. I am not sure why. I believe that knowing the term Closure isn’t important as long as we are aware of the behavior in the language. On the other hand, I believe it becomes important when we want to have a deeper understanding of how programming languages work as documentations are likely to refer to them.
Closures first appeared in the λ-calculus (Lambda calculus). The Lambda calculus was developed in the 30s by Alonzo Church. At the time, programming wasn’t a thing and the Lambda calculus was a mathematical notation developped to reason about the foundations of mathematics.
The λ-calculus is composed by three rules:
x
or hello
;(λx.x)
or (λx.y)
where x
is a parameter and y
is a result - (notice that the expression body can contain external variables like y
);(f x)
where x
is applied to f
.The interesting and puzzling part, is that the λ-calculus with only three rules turned out to be able to represent any computer programs built today. In other words, it is Turing complete.
As we might have noticed, the constructs forming the λ-calculus are present in common languages like C#, JavaScript or Python hence making them Turing complete as well.
Turing completeness
Alan Turing, who was Alonzo Church Ph.D student, developed the Turing Machine. Similarly to the λ-calculus, any computer algorithms able to be represented with the λ-calculus can be implemented on a Turing machine. Turing completeness is a term used to describe the capability of a system to describe any computer algorithm.
For our examples, we will be using Racket which is a language derived from Scheme, a dialect of Lisp. Racket provides an elegant way of representating lambdas and function applications.
Following the λ-calculus, we can define the lambda λx.x+10
, a lambda adding 10
to an initial parameter, as such in Racket:
1
(λ (x) (+ x 10))
The value of x
within the expression body is bound to the parameter. In other words, x
is said to be a bound variable and in the λ-calculus terms, λx
is known as the binder.
Function application (f x)
with lambdas work by substituting f
with the lambda and x
with the parameter value.
1
((λ (x) (+ x 10)) 10)
Here we applied 10
to our lambda and got 20
as result as x
is substituted by 10
in the expression body.
In λ-calculus, lambdas accept only one parameter. Multiparameter functions can be modeled with functions returning functions.
For example the following lambda:
1
(λ (x y) (+ x y))
Can be rewritten in term of λ-calculus λx.λy.x+y
:
1
2
3
(λ (x)
(λ (y) (+ x y))
)
This procedure takes x
as parameter and return a function taking y
as parameter. It can then be applied as followed, resulting in 20
.
1
2
3
4
(((λ (x)
(λ (y) (+ x y))
) 10)
10)
So far the only variables introduced in λx.λy.x+y
or in λx.x+10
were contained within the scope of the lambdas. Such lambdas are called closed lambdas (and also called combinators).
In contrast, open lambdas are lambdas containing variables from outside their enclosing. The inner lambda used λy.x+y
is open as x
is defined outside of its enclosing. x
is called a free variable.
If we were to take the lambda to the top level, we would get the following error:
1
2
(λ (y) (+ x y))
; x: unbound identifier in: x
x: unbound identifier in: x
, the compiler tells us that x
is unbound.
What we can also do is partially apply the lambda by fixing the first parameter and creating a procedure:
1
2
3
4
(define add-ten
((λ (x)
(λ (y) (+ x y)))
10))
We create a procedure called add-ten
by fixing x
to 10
. Then we can call add-ten
from anywhere and the argument provided will be topped by 10
.
1
(add-ten 50)
x
is no longer within the scope of the expression but the result is still 60
, 10+50
.
In the following example, even while redefining x
, add-ten
behavior is unchanged:
1
2
3
(let ([x 20])
(add-ten 100)
)
This results in 110
even after redefining x
as 20
. x
is in the scope of add-ten
but not in the let
scope. This type of scope is called lexical scope (or static scope).
The lexical environment of add-ten
makes x
available.
Now that we understand all the terms needed, we can go back to our add-ten
definition:
1
2
3
4
(define add-ten
((λ (x)
(λ (y) (+ x y)))
10))
As part of the creation of the add-ten
lambda, two steps occurred:
10
was applied to the lambda (λ (x) (λ (y) (+ x y))
;And in term of program, a closure was created:
10
was captured as x=10
in the lexical environment;A closure is a record containing the lambda plus its environment at the time it was created.
The closure allows the lambda to be executed at any point in time with its environment providing bindings for all its variables. By providing the lexical environment to an open lambda, we can bind its free variables and therefore close a lambda.
Today we looked into the definition of a closure. It brought us to look into the λ-calculus passing by its definition and its rules. We also looked at what Turing completeness was and why common languages like C# or JavaScript are so called Turing complete. Lastly we ended up looking into different examples of lambda definition while always comparing it with the λ-calculus notation and we extracted from those example the definition of a closure. I hope you liked this post and I see you on the next one!