What is itertools?
itertools is a module in the Python standard library that provides functions for creating iterators, which are typically consumed with a for loop. According to the official Python documentation:
itertools – Functions creating iterators for efficient looping
It is useful for constructing different kinds of sequences using iterators.
itertools groupby
groupby() splits an iterable into consecutive groups of items that share the same key. An optional key function determines how the key is computed for each item.
The syntax:
Syntax: itertools.groupby(iterable, key=None)
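The arguments:
Argument | Requirement |
---|---|
iterable | required; any iterable, such as a string, tuple, list, or dictionary |
key | optional; a function that computes the key for each item (by default, the item itself is the key) |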
From the official Python documentation:
Iterator | Arguments | Results |
---|---|---|
groupby | iterable[, key] | sub-iterator grouped by value of key(v) |
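To make the two arguments concrete, here is a minimal sketch that passes a key function. The numbers list and the tens helper are made up for illustration, and the input is already in ascending order, so items with equal keys sit next to each other.

```python
import itertools

# Hypothetical sample data, already in ascending order.
numbers = [1, 5, 12, 18, 25, 31, 39]


def tens(n):
    """Key function: the tens bucket a number falls into."""
    return n // 10


# Each group is turned into a list while it is still the current group.
buckets = [(key, list(group)) for key, group in itertools.groupby(numbers, key=tens)]

for key, members in buckets:
    print(key, members)
```

Each printed line shows one key followed by the consecutive numbers that produced it.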
Construct a basic itertools.groupby() loop over an iterable, without a key:
import itertools
iterables = []
for key, group in itertools.groupby(iterables):
    print(key, list(group))
A string of characters
import itertools
chars = "aaaaabbcccdeee"
x = itertools.groupby(chars)
for key, group in x:
    print(key, list(group))
This will output:
a ['a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c', 'c', 'c']
d ['d']
e ['e', 'e', 'e']
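Because each group is a run of equal, adjacent items, the same loop can count the items in each run instead of listing them. A minimal sketch using the same string (the name runs is illustrative):

```python
import itertools

chars = "aaaaabbcccdeee"

# Count the items in each run by consuming the group iterator once.
runs = [(key, sum(1 for _ in group)) for key, group in itertools.groupby(chars)]

print(runs)
```

This gives one (character, count) pair per run, in the order the runs appear in the string.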
Always sort first
groupby() only groups items that are next to each other, so make sure the iterable is sorted by its grouping key before passing it in. Here is what happens with an unsorted iterable.
import itertools
chars = "aaaaabbcccdeeeccaa"
x = itertools.groupby(chars)
for key, group in x:
    print(key, list(group))
Here is the result
a ['a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c', 'c', 'c']
d ['d']
e ['e', 'e', 'e']
c ['c', 'c']
a ['a', 'a']
In the output above, c and a each appear twice because their occurrences are not adjacent in the input. Let's sort the input first using the sorted() function.
import itertools
chars = "aaaaabbcccdeeeccaa"
# sort first
chars = sorted(chars)
x = itertools.groupby(chars)
for key, group in x:
    print(key, list(group))
The only change is one added line that sorts the iterable:
chars = sorted(chars)
Here is the result we wanted after sorted(chars).
a ['a', 'a', 'a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c', 'c', 'c', 'c', 'c']
d ['d']
e ['e', 'e', 'e']
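When a key function is involved, the data generally needs to be sorted with that same key function before grouping. A minimal sketch, where the words list and the first_letter helper are made up for illustration:

```python
import itertools

# Hypothetical sample data in no particular order.
words = ["banana", "apple", "cherry", "avocado", "blueberry", "cranberry"]


def first_letter(word):
    """Key function shared by sorted() and groupby()."""
    return word[0]


# Sort by the key first so that words with the same first letter are adjacent.
ordered = sorted(words, key=first_letter)

# Group with the same key function that was used for sorting.
grouped = {key: list(group) for key, group in itertools.groupby(ordered, key=first_letter)}

print(grouped)
```

Using one function for both steps keeps the sort order and the grouping rule from drifting apart.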
Here is how to convert each group into a list
import itertools
print("\n>>> convert into list")
# iterables
animals = [('dog', 'German Shepherd'), ('dog', 'Bulldog'),
           ('dog', 'Chihuahua'), ('cat', 'Persian'),
           ('cat', 'Bengal'), ('cat', 'Siamese'), ('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]
# specify the key using a lambda function: the first element of each tuple
key_func = lambda x: x[0]
# group as an iterator
an_iterator = itertools.groupby(animals, key_func)
# loop over the iterator
for key, group in an_iterator:
    # convert the group into a list; print() separates its arguments with a space
    print(key + ' :', list(group))
Result -> list()
>>> convert into list
dog : [('dog', 'German Shepherd'), ('dog', 'Bulldog'), ('dog', 'Chihuahua')]
cat : [('cat', 'Persian'), ('cat', 'Bengal'), ('cat', 'Siamese')]
bird : [('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]
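The loop above calls list(group) immediately, and that matters. Each group is a lazy iterator that shares the underlying iterable with groupby(), so it is no longer usable once the outer loop has moved on to the next group. A minimal sketch of that behaviour on a short string (the variable names are illustrative):

```python
import itertools

chars = "aabbbc"

# Materialize each group while it is still the current one.
eager = [(key, list(group)) for key, group in itertools.groupby(chars)]

# Store the group iterators and read them only after the loop has finished.
stored = [(key, group) for key, group in itertools.groupby(chars)]
late_first = list(stored[0][1])

print(eager)
print(late_first)  # the first group has nothing left to yield
```

If the grouped items are needed later, store them as a list inside the loop, as the examples here do.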
Convert into a dictionary
import itertools
print("\n>>> convert into dictionary")
# iterables
animals = [('dog', 'German Shepherd'), ('dog', 'Bulldog'),
           ('dog', 'Chihuahua'), ('cat', 'Persian'),
           ('cat', 'Bengal'), ('cat', 'Siamese'), ('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]
# specify the key using a lambda function: the first element of each tuple
key_func = lambda x: x[0]
# group as an iterator
an_iterator = itertools.groupby(animals, key_func)
# loop over the iterator
for key, group in an_iterator:
    # build a one-entry dictionary for this group
    key_and_group = {key: list(group)}
    print(key_and_group)
Result -> dict()
>>> convert into dictionary
{'dog': [('dog', 'German Shepherd'), ('dog', 'Bulldog'), ('dog', 'Chihuahua')]}
{'cat': [('cat', 'Persian'), ('cat', 'Bengal'), ('cat', 'Siamese')]}
{'bird': [('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]}
The examples above group the same animals together. However, the key is still repeated inside every tuple for dog, cat and bird.
Let's improve this by keeping only the second value of each tuple and printing the key once per group.
This can be done with a list comprehension.
import itertools
print("\n>>> using list comprehension")
animals = [('dog', 'German Shepherd'), ('dog', 'Bulldog'),
           ('dog', 'Chihuahua'), ('cat', 'Persian'),
           ('cat', 'Bengal'), ('cat', 'Siamese'), ('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]
# loop over the iterator
for key, group in itertools.groupby(animals, lambda x: x[0]):
    # keep only the second element of each tuple and join the names
    names = " and ".join([animal[1] for animal in group])
    print(key + "s: " + names + ".")
Here are the results of the list comprehension.
>>> using list comprehension
dogs: German Shepherd and Bulldog and Chihuahua.
cats: Persian and Bengal and Siamese.
birds: duck and penguin and peacock.
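The same idea can collect everything into a single dictionary instead of printing one line per group. This is a minimal sketch that assumes the animals list is already ordered by animal type, as it is here; the name breeds_by_animal is illustrative.

```python
import itertools

animals = [('dog', 'German Shepherd'), ('dog', 'Bulldog'), ('dog', 'Chihuahua'),
           ('cat', 'Persian'), ('cat', 'Bengal'), ('cat', 'Siamese'),
           ('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]

# One dictionary entry per group: the key maps to the second element of each tuple.
breeds_by_animal = {
    kind: [name for _, name in group]
    for kind, group in itertools.groupby(animals, key=lambda pair: pair[0])
}

print(breeds_by_animal)
```

If the input were not already ordered, a later group with the same key would overwrite the earlier entry, so sorting first still applies.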