Oct 15th, 2021 - written by Kimserey.
The Python itertools module provides a set of iterator building blocks that can be combined into new iterators, which apply some transformation while iterating over the underlying sequences. For example, cycle lets us cycle through a sequence indefinitely, and groupby provides a new iterator yielding one group on each iteration. In today’s post, we will look at some of the functions provided by itertools, with examples. Unless an import is shown, the IPython sessions below assume the function has already been imported from itertools (for example, from itertools import count).
The first type of iterators we will be looking at are the infinite iterators.
count()
count provides a sequential infinite integer iterator:
In [5]: for i in count():
   ...:     if i > 5:
   ...:         break
   ...:
   ...:     print(i)
0
1
2
3
4
5
We can see that we can keep iterating infinitely until we decide to break out of the loop.
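count also accepts optional start and step arguments. As a small sketch (the variable name stepped is only for illustration), we can bound the infinite iterator with islice instead of a manual break:

```python
from itertools import count, islice

# count(start, step): begin at 10 and move forward by 5 each time.
# islice(..., 4) takes only the first four values, so nothing runs forever.
stepped = list(islice(count(10, 5), 4))
print(stepped)  # [10, 15, 20, 25]
```

Using islice keeps the stopping condition next to the iterator rather than inside the loop body.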
cycle()
cycle provides a way to cycle through an iterable indefinitely:
In [6]: i = 0
   ...:
   ...: for c in cycle(['A', 'B']):
   ...:     if i > 5:
   ...:         break
   ...:
   ...:     print(c)
   ...:     i += 1
A
B
A
B
A
B
We can see that we cycle through A, B until we break out of the loop.
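A common way to avoid the manual counter is to pair cycle with a finite iterable through zip, which stops at the shortest input. The sketch below uses made-up task and worker names purely for illustration:

```python
from itertools import cycle

# Hypothetical data: five tasks shared between two workers.
tasks = ["build", "test", "deploy", "notify", "archive"]
workers = ["A", "B"]

# zip stops when tasks is exhausted, which cuts the infinite cycle short.
assignments = list(zip(tasks, cycle(workers)))
print(assignments)
# [('build', 'A'), ('test', 'B'), ('deploy', 'A'), ('notify', 'B'), ('archive', 'A')]
```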
repeat()
repeat yields the same provided value over and over:
In [8]: i = 0
   ...:
   ...: for c in repeat('A'):
   ...:     if i > 5:
   ...:         break
   ...:
   ...:     print(c)
   ...:     i += 1
A
A
A
A
A
A
We can see that ‘A’ is infinitely repeated until we break out of the loop.
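repeat also takes an optional second argument that limits the number of repetitions, and it is handy for supplying a constant argument to map. A minimal sketch, with variable names chosen only for this example:

```python
from itertools import repeat

# With a second argument, repeat is finite.
three_as = list(repeat('A', 3))
print(three_as)  # ['A', 'A', 'A']

# map stops at the shortest input, so the infinite repeat(2) is safe here:
# this computes pow(0, 2), pow(1, 2), ..., pow(4, 2).
squares = list(map(pow, range(5), repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]
```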
The next type of iterators we will be looking at are the iterators terminating on the shortest input sequence.
accumulate()
accumulate yields the accumulated result at each iteration (by default, a running sum):
In [3]: for c in accumulate([1, 2, 3, 4]):
   ...:     print(c)
1
3
6
10
Here we use it to get an iterator over the running total, but we could also use it with strings to concatenate on each iteration.
In [2]: for c in accumulate(['A', 'B', 'C']):
   ...:     print(c)
   ...:
A
AB
ABC
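Both examples above rely on the default addition. accumulate also accepts a two-argument function as its second parameter and, assuming Python 3.8 or later, an initial keyword. A short sketch with illustrative variable names:

```python
from itertools import accumulate
import operator

# Running product instead of running sum.
running_product = list(accumulate([1, 2, 3, 4], operator.mul))
print(running_product)  # [1, 2, 6, 24]

# Any two-argument function works, for example a running maximum.
running_max = list(accumulate([3, 1, 4, 1, 5], max))
print(running_max)  # [3, 3, 4, 4, 5]

# initial (Python 3.8+) seeds the accumulation and is yielded first.
with_initial = list(accumulate([1, 2, 3], initial=100))
print(with_initial)  # [100, 101, 103, 106]
```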
chain()
chain will chain the iterables into a single iterator:
In [4]: for c in chain(['A', 'B'], ['C', 'D']):
   ...:     print(c)
   ...:
A
B
C
D
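When the iterables are already stored in a single collection, chain.from_iterable can flatten them without unpacking. A small sketch (the nested list is made up for the example):

```python
from itertools import chain

# Hypothetical nested data; note that the empty list contributes nothing.
nested = [['A', 'B'], ['C'], [], ['D', 'E']]

# from_iterable takes one iterable of iterables and flattens one level.
flat = list(chain.from_iterable(nested))
print(flat)  # ['A', 'B', 'C', 'D', 'E']
```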
compress()
compress returns only the elements whose matching entry in the selector sequence (the second argument) is truthy:
In [6]: for c in compress(['A', 'B', 'C'], [1, 0, 1]):
   ...:     print(c)
   ...:
A
C
We only get A and C because the selector was 1 for them. We could also use True/False rather than 1/0.
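The selectors do not have to be written by hand; they can be computed from a second, parallel sequence. The names and scores below are invented for illustration:

```python
from itertools import compress

# Hypothetical parallel lists: one name per score.
names = ['ann', 'bob', 'cid', 'dee']
scores = [82, 45, 91, 60]

# The generator yields True/False selectors, one per name.
passed = list(compress(names, (score >= 60 for score in scores)))
print(passed)  # ['ann', 'cid', 'dee']
```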
dropwhile()
dropwhile applies the provided function to each element and drops elements for as long as the function returns True; starting from the first element for which it returns False, every element is returned:
In [4]: for c in dropwhile(lambda x: x < 5, [1, 3, 5, 10, 3]):
   ...:     print(c)
   ...:
5
10
3
We can see that we drop all elements while x < 5. One thing to note is that once we stop dropping, the iterator just iterates normally over the remaining elements. We can see that the last value 3 was still returned.
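One place where this behaviour is useful is skipping a leading header. The sketch below assumes a made-up list of lines in which comments start with "#":

```python
from itertools import dropwhile

# Hypothetical input: two header comments, then content.
lines = ["# header", "# generated", "a=1", "# not dropped", "b=2"]

# Only the leading run of comment lines is dropped; the later
# comment survives because dropping has already stopped.
body = list(dropwhile(lambda line: line.startswith("#"), lines))
print(body)  # ['a=1', '# not dropped', 'b=2']
```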
takewhile()
takewhile is the opposite of dropwhile and takes all elements until the provided function returns False:
In [8]: for c in takewhile(lambda x: x < 5, [1, 3, 5, 10]):
   ...:     print(c)
   ...:
1
3
Here we take elements while x < 5; after that we stop taking.
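One subtlety worth knowing: takewhile has to read the first failing element in order to test it, so that element is consumed from the underlying iterator. A small sketch reusing the same values (variable names are illustrative):

```python
from itertools import takewhile

source = iter([1, 3, 5, 10])

taken = list(takewhile(lambda x: x < 5, source))
print(taken)  # [1, 3]

# 5 was read and rejected by takewhile, so it is no longer in source.
rest = list(source)
print(rest)  # [10]
```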
filterfalse()
filterfalse is the opposite of the built-in filter: it returns the elements for which the provided function returns False:
In [10]: for c in filterfalse(lambda x: x == 5, [1, 3, 5, 10]):
    ...:     print(c)
    ...:
1
3
10
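Used together with the built-in filter, filterfalse gives a simple way to split a sequence in two with the same predicate. A minimal sketch, where is_even is a helper defined only for this example:

```python
from itertools import filterfalse

def is_even(n):
    """Illustrative predicate: True for even integers."""
    return n % 2 == 0

numbers = [1, 2, 3, 4, 5, 6]

evens = list(filter(is_even, numbers))       # predicate is True
odds = list(filterfalse(is_even, numbers))   # predicate is False
print(evens)  # [2, 4, 6]
print(odds)   # [1, 3, 5]
```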
groupby()
groupby groups consecutive elements using the provided key function:
In [10]: x = [(1, "hello"), (1, "bye"), (2, "Test")]
In [11]: from itertools import groupby
In [12]: for k, g in groupby(x, lambda v: v[0]):
    ...:     print(k, list(g))
    ...:
1 [(1, 'hello'), (1, 'bye')]
2 [(2, 'Test')]
We group by the first value of each tuple, and we can see that groupby yields two (key, group) pairs, one for each group.
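An important detail is that groupby only groups consecutive elements sharing the same key, so unsorted input usually needs to be sorted on that key first. The sketch below uses an invented word list to show the difference:

```python
from itertools import groupby
from operator import itemgetter

words = ["apple", "bob", "avocado", "banana"]
first_letter = itemgetter(0)  # key: first character of each word

# Unsorted input: a new group starts every time the key changes.
unsorted_keys = [key for key, _ in groupby(words, key=first_letter)]
print(unsorted_keys)  # ['a', 'b', 'a', 'b']

# Sorting on the same key first gives one group per key.
ordered = sorted(words, key=first_letter)
grouped = {key: list(group) for key, group in groupby(ordered, key=first_letter)}
print(grouped)  # {'a': ['apple', 'avocado'], 'b': ['bob', 'banana']}
```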
tee()
tee allows us to create several independent iterators from a single iterable.
In [17]: from itertools import count, tee
In [18]: itcount = count()
In [19]: for c in itcount:
    ...:     if c > 5:
    ...:         break
    ...:     print(c)
    ...:
0
1
2
3
4
5
In [20]: for c in itcount:
    ...:     if c > 10:
    ...:         break
    ...:     print(c)
    ...:
7
8
9
10
In [21]: count1, count2 = tee(itcount, 2)
In [22]: for c in count1:
    ...:     if c > 15:
    ...:         break
    ...:     print(c)
    ...:
12
13
14
15
In [23]: for c in count2:
    ...:     if c > 15:
    ...:         break
    ...:     print(c)
    ...:
12
13
14
15
We can see that after creating a count iterator, the values returned continue to increase as we continue iterating, even across different loops. Note that 6 and 11 are never printed: each was consumed by the iteration that triggered the break. If we need two separate iterators, we can use tee to create two or more iterators, which then allows us to iterate over them independently of each other.
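A classic use of tee is walking over a sequence in overlapping pairs by advancing one copy a step ahead of the other. The helper below, pairwise_sketch, is a name made up for this example (recent Python versions also ship itertools.pairwise):

```python
from itertools import tee

def pairwise_sketch(iterable):
    """Yield (s0, s1), (s1, s2), ... using two tee'd copies."""
    first, second = tee(iterable)
    next(second, None)  # advance the second copy by one element
    return zip(first, second)

pairs = list(pairwise_sketch([1, 2, 3, 4]))
print(pairs)  # [(1, 2), (2, 3), (3, 4)]
```

Once tee has been called, it is safer to use only the returned iterators and leave the original one alone, since advancing the original would make the copies skip values.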
zip_longest()
zip_longest allows us to zip iterables of different lengths: the shorter sequences are padded with a fill value, whereas zip stops at the shortest iterable.
In [31]: for c in zip_longest([1, 2, 3, 4, 5, 6], [10, 20, 30, 40], fillvalue=0):
    ...:     print(c)
(1, 10)
(2, 20)
(3, 30)
(4, 40)
(5, 0)
(6, 0)
5 and 6 didn’t have any counterpart to zip with in the other sequence, therefore they were paired with fillvalue=0.
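To make the contrast with zip concrete, here is a short side-by-side sketch. It also shows that when fillvalue is omitted, the padding value is None:

```python
from itertools import zip_longest

# zip stops as soon as the shorter input runs out.
short = list(zip('ABC', [1, 2]))
print(short)   # [('A', 1), ('B', 2)]

# zip_longest keeps going and pads with None by default.
padded = list(zip_longest('ABC', [1, 2]))
print(padded)  # [('A', 1), ('B', 2), ('C', None)]
```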
Lastly we will look at the combinatoric iterators.
product()
product provides the Cartesian product of the input iterables.
In [37]: for c in product('ABC', 'DEF'):
    ...:     print(c)
    ...:
('A', 'D')
('A', 'E')
('A', 'F')
('B', 'D')
('B', 'E')
('B', 'F')
('C', 'D')
('C', 'E')
('C', 'F')
We can see that we get all the products from the two sequences: AD, AE, AF, BD, etc.
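One way to read product is as a replacement for nested for loops, with the rightmost iterable advancing fastest. The sketch below checks that equivalence on small inputs and also shows the repeat keyword, which takes the product of an iterable with itself:

```python
from itertools import product

# Equivalent nested loops, written as a comprehension.
nested = [(a, b) for a in 'AB' for b in 'xy']
pairs = list(product('AB', 'xy'))
print(pairs == nested)  # True

# repeat=2 is the same as product([0, 1], [0, 1]).
bits = list(product([0, 1], repeat=2))
print(bits)  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```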
permutations()
permutations will return all the permutations possible given a sequence:
In [38]: for c in permutations('ABC'):
    ...:     print(c)
    ...:
('A', 'B', 'C')
('A', 'C', 'B')
('B', 'A', 'C')
('B', 'C', 'A')
('C', 'A', 'B')
('C', 'B', 'A')
We can also specify a length so that we get all permutations for a specific length.
In [39]: for c in permutations('ABC', 2):
    ...:     print(c)
    ...:
('A', 'B')
('A', 'C')
('B', 'A')
('B', 'C')
('C', 'A')
('C', 'B')
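Something to keep in mind is that permutations treats elements as distinct based on their position, not their value, so repeated values in the input produce repeated tuples. A small sketch with an input chosen to show this:

```python
from itertools import permutations

# 'AAB' contains the value 'A' twice, at positions 0 and 1.
perms = list(permutations('AAB', 2))
print(len(perms))  # 6, the same count as for 'ABC'

# Deduplicating by value leaves only three distinct tuples.
unique = sorted(set(perms))
print(unique)  # [('A', 'A'), ('A', 'B'), ('B', 'A')]
```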
combinations()
combinations will return all the combinations of a given length for the sequence:
In [45]: for c in combinations('ABC', 2):
    ...:     print(c)
    ...:
('A', 'B')
('A', 'C')
('B', 'C')
The difference between a permutation and a combination is that a permutation is an arrangement of the elements where the order matters, so AB and BA would be different permutations, while a combination is a selection where the order doesn’t matter, so AB and BA would be the same combination.
We can see that we have in total 3 combinations of length 2 for ABC while we have 6 permutations of length 2 for ABC.
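The relationship between the two can be checked in a few lines. The sketch below assumes Python 3.8 or later for math.comb and math.perm, and shows that ignoring order collapses the 6 permutations into the 3 combinations:

```python
from itertools import combinations, permutations
import math

combos = list(combinations('ABC', 2))
perms = list(permutations('ABC', 2))

# The counts match the usual formulas for n=3, r=2.
print(len(combos) == math.comb(3, 2))  # True (3)
print(len(perms) == math.perm(3, 2))   # True (6)

# Sorting each permutation and removing duplicates discards the order,
# which leaves exactly the combinations.
collapsed = sorted({tuple(sorted(p)) for p in perms})
print(collapsed == combos)  # True
```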
combinations_with_replacement()
“With replacement” means that after each pick, we can pick the same value again.
In [46]: for c in combinations_with_replacement('ABC', 2):
    ...:     print(c)
    ...:
('A', 'A')
('A', 'B')
('A', 'C')
('B', 'B')
('B', 'C')
('C', 'C')
So here we can have AA as a combination with replacement.
And that concludes today’s post! Today we looked at the itertools module, a Python module providing a set of iterator building blocks used to combine iterables and construct new iterables. We started by looking at infinite iterators, then moved on to iterators terminating on the shortest input sequence, and completed this post with combinatoric iterators. I hope you liked this post, and I’ll see you in the next one!