What is itertools?
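itertools is a module in the Python standard library that provides functions for building and combining iterators, which are usually consumed in a for loop. The official Python documentation describes it as: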
itertools – Functions creating iterators for efficient looping
It is useful for constructing different kinds of sequences from iterators.
itertools groupby
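groupby() splits an iterable into consecutive groups of elements that share the same key. It yields (key, group) pairs, where each group is itself an iterator. The key argument is an optional function that computes the grouping key for each element; if it is omitted, each element is used as its own key.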
Syntax: itertools.groupby(iterable, key=None)
Arguments:
| Argument | Requirement |
|---|---|
| iterable | required; any iterable, such as a string, tuple, list, or dictionary |
| key | optional; a function that computes the grouping key for each element |
From the official Python documentation:
| Iterator | Arguments | Results |
|---|---|---|
| groupby | iterable[, key] | sub-iterators grouped by value of key(v) |
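To make the key argument concrete, here is a minimal sketch that groups words by their length. The word list and the variable names are made up for illustration, and the list is already ordered so that words of equal length sit next to each other.

```python
import itertools

# Hypothetical sample data; words of equal length are already adjacent.
words = ["hi", "to", "cat", "dog", "bird"]

# key=len is called on every element; consecutive elements with the
# same result end up in the same group.
groups_by_length = [
    (length, list(group))
    for length, group in itertools.groupby(words, key=len)
]

print(groups_by_length)
# [(2, ['hi', 'to']), (3, ['cat', 'dog']), (4, ['bird'])]
```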
The general pattern for calling itertools.groupby() on an iterable, without a key:
import itertools
iterables = []
# no key function is passed, so each element is its own key
for key, group in itertools.groupby(iterables):
    print(key, list(group))
A string of characters
import itertools
chars = "aaaaabbcccdeee"
x = itertools.groupby(chars)
for key, group in x:
    print(key, list(group))
This will output:
a ['a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c', 'c', 'c']
d ['d']
e ['e', 'e', 'e']
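When only the size of each group matters, the groups can be counted instead of listed. The following is a small sketch of that idea using the same string; the name run_lengths is chosen here for illustration.

```python
import itertools

chars = "aaaaabbcccdeee"

# Count the elements of each consecutive group instead of storing them.
run_lengths = [
    (key, sum(1 for _ in group))
    for key, group in itertools.groupby(chars)
]

print(run_lengths)
# [('a', 5), ('b', 2), ('c', 3), ('d', 1), ('e', 3)]
```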
Always sort first
Always make sure the iterable is sorted by its grouping key before calling groupby(). Here is what happens with an unsorted iterable.
import itertools
chars = "aaaaabbcccdeeeccaa"
x = itertools.groupby(chars)
for key, group in x:
    print(key, list(group))
Here is the result
a ['a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c', 'c', 'c']
d ['d']
e ['e', 'e', 'e']
c ['c', 'c']
a ['a', 'a']
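In the output above, 'c' and 'a' each appear twice. This is because groupby() only groups consecutive elements: it starts a new group every time the key changes.
Let's sort the input first using the sorted() function.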
import itertools
chars = "aaaaabbcccdeeeccaa"
# sort first
chars = sorted(chars)
x = itertools.groupby(chars)
for key, group in x:
    print(key, list(group))
The only change is the added line that sorts the iterable:
chars = sorted(chars)
Here is the result we wanted after calling sorted(chars):
a ['a', 'a', 'a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c', 'c', 'c', 'c', 'c']
d ['d']
e ['e', 'e', 'e']
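When a key function is used, the data should be sorted with the same key function that is passed to groupby(). The sketch below assumes a made-up list of fruit names and a helper called first_letter, both introduced only for illustration.

```python
import itertools

# Hypothetical sample data, deliberately not sorted by first letter.
words = ["banana", "apple", "blueberry", "cherry", "avocado"]

def first_letter(word):
    """Grouping key: the first character of the word."""
    return word[0]

# Sort and group with the SAME key function so equal keys are adjacent.
ordered = sorted(words, key=first_letter)
grouped = {
    letter: list(group)
    for letter, group in itertools.groupby(ordered, key=first_letter)
}

print(grouped)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```

Because sorted() is stable, words that share a first letter keep their original relative order inside each group.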
Convert into a list
import itertools
print("\n>>> convert into list")
# iterables
animals = [('dog', 'German Shepherd'), ('dog', 'Bulldog'),
           ('dog', 'Chihuahua'), ('cat', 'Persian'),
           ('cat', 'Bengal'), ('cat', 'Siamese'),
           ('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]
# specify the key function using a lambda: group by the first tuple item
key_func = lambda x: x[0]
# group as an iterator
an_iterator = itertools.groupby(animals, key_func)
# loop over the iterator
for key, group in an_iterator:
    # convert each group into a list; a str and a list cannot be joined
    # with +, so pass them to print() as separate arguments
    print(key, ':', list(group))
Result -> list()
>>> convert into list
dog : [('dog', 'German Shepherd'), ('dog', 'Bulldog'), ('dog', 'Chihuahua')]
cat : [('cat', 'Persian'), ('cat', 'Bengal'), ('cat', 'Siamese')]
bird : [('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]
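Converting each group with list() inside the loop matters because, as the Python documentation notes, every group shares the underlying iterable with groupby(), so a group is no longer available once groupby() has moved on. The sketch below contrasts the two approaches on a short made-up string; the names eager, stored and deferred are chosen here for illustration.

```python
import itertools

chars = "aabbb"

# Eager: consume each group inside the loop, while it is still current.
eager = [(key, list(group)) for key, group in itertools.groupby(chars)]

# Deferred: keep the group iterators and consume them only after
# groupby() has already advanced past them.
stored = list(itertools.groupby(chars))
deferred = [(key, list(group)) for key, group in stored]

print(eager)     # [('a', ['a', 'a']), ('b', ['b', 'b', 'b'])]
print(deferred)  # elements are missing because the groups were read too late
```

If the grouped data is needed later, store list(group) rather than the group iterator itself.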
Convert into a dictionary
import itertools
print("\n>>> convert into dictionary")
# iterables
animals = [('dog', 'German Shepherd'), ('dog', 'Bulldog'),
           ('dog', 'Chihuahua'), ('cat', 'Persian'),
           ('cat', 'Bengal'), ('cat', 'Siamese'),
           ('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]
# specify the key function using a lambda: group by the first tuple item
key_func = lambda x: x[0]
# group as an iterator
an_iterator = itertools.groupby(animals, key_func)
# loop over the iterator
for key, group in an_iterator:
    # convert each group into a one-entry dictionary
    key_and_group = {key: list(group)}
    print(key_and_group)
Result -> dict()
>>> convert into dictionary
{'dog': [('dog', 'German Shepherd'), ('dog', 'Bulldog'), ('dog', 'Chihuahua')]}
{'cat': [('cat', 'Persian'), ('cat', 'Bengal'), ('cat', 'Siamese')]}
{'bird': [('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]}
The dictionary example groups animals of the same kind together. However, the key is still repeated inside every tuple of each group: 'dog', 'cat' and 'bird' appear again in each value.
Let's improve this by printing the key once per group and keeping only the second item of each tuple.
This can be done with a list comprehension.
import itertools
print("\n>>> using list comprehension")
animals = [('dog', 'German Shepherd'), ('dog', 'Bulldog'),
           ('dog', 'Chihuahua'), ('cat', 'Persian'),
           ('cat', 'Bengal'), ('cat', 'Siamese'),
           ('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]
# loop over the iterator
for key, group in itertools.groupby(animals, lambda x: x[0]):
    # keep only the second item of each tuple; use a separate name so
    # the animals list is not overwritten
    names = " and ".join([animal[1] for animal in group])
    print(key + "s: " + names + ".")
Here are the results of the list comprehension:
>>> using list comprehension
dogs: German Shepherd and Bulldog and Chihuahua.
cats: Persian and Bengal and Siamese.
birds: duck and penguin and peacock.
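The dictionary conversion and the list comprehension can also be combined to collect everything into a single dictionary that maps each animal to its names. This is only a sketch of one possible approach; the name names_by_animal is introduced here for illustration.

```python
import itertools

animals = [('dog', 'German Shepherd'), ('dog', 'Bulldog'),
           ('dog', 'Chihuahua'), ('cat', 'Persian'),
           ('cat', 'Bengal'), ('cat', 'Siamese'),
           ('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]

# Build one dictionary: the key appears once, and each value is a list
# holding only the second item of every tuple in that group.
names_by_animal = {
    key: [name for _, name in group]
    for key, group in itertools.groupby(animals, key=lambda pair: pair[0])
}

print(names_by_animal['dog'])
# ['German Shepherd', 'Bulldog', 'Chihuahua']
```

This works here because the animals list is already ordered by its first item; with unordered data, a repeated key would overwrite the earlier entry, so sort first.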