What is itertools?

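itertools is a module in the Python standard library that provides functions for building and combining iterators, which are typically consumed with a for loop. The official Python documentation describes it as: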

itertools – Functions creating iterators for efficient looping

It is useful for constructing different kinds of sequences lazily, using iterators.

itertools groupby

groupby() splits an iterable into groups of consecutive items that share the same key. By default each item is its own key. Optionally, pass a key function that computes the key for each item.

Knowing the syntax:

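Syntax: itertools.groupby(iterable, key=None)

The arguments are:

Arguments Requirement
iterable any iterable, such as a string, tuple, list, or dictionary
key optional; a function that computes the key for each item (default: the item itself)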

From the official Python documentation:

Iterator Arguments Results
groupby iterable[, key] sub-iterator grouped by value of key(v)
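
To make the key(v) wording concrete, here is a minimal sketch of how a key function changes the grouping. The numbers list and the parity lambda are made up for illustration; they do not come from the documentation.

```python
import itertools

# Hypothetical sample data, chosen only for illustration.
numbers = [1, 3, 5, 2, 4, 7]

# key(v) is computed for every item; consecutive items with the same
# key value end up in the same group. Here the key is the parity.
runs = [(k, list(g)) for k, g in itertools.groupby(numbers, key=lambda n: n % 2)]

for parity, group in runs:
    print(parity, group)
```

The key 1 shows up twice in the output, because 7 is not adjacent to the other odd numbers.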

Calling itertools.groupby() on an iterable without a key:

import itertools

iterables = []  # any iterable; an empty list produces no groups

for key, group in itertools.groupby(iterables):
    print(key, list(group))

A string of characters

import itertools

chars = "aaaaabbcccdeee"

x = itertools.groupby(chars)

for key, group in x:
    print(key, list(group))

This will output:

a ['a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c', 'c', 'c']
d ['d']
e ['e', 'e', 'e']
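
Because each group is an iterator, it does not have to be turned into a list. As a small sketch (the run_lengths name is only illustrative), the same string can be summarised by how many times each character repeats:

```python
import itertools

chars = "aaaaabbcccdeee"

# Count the items in each group instead of storing them.
run_lengths = [(key, sum(1 for _ in group)) for key, group in itertools.groupby(chars)]

print(run_lengths)
# [('a', 5), ('b', 2), ('c', 3), ('d', 1), ('e', 3)]
```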

Always sort first

groupby() only groups consecutive items, so if you want exactly one group per key, make sure the iterable is sorted by its grouping key before anything else. Here is what happens with an unsorted iterable.

import itertools

chars = "aaaaabbcccdeeeccaa"

x = itertools.groupby(chars)

for key, group in x:
    print(key, list(group))

Here is the result

a ['a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c', 'c', 'c']
d ['d']
e ['e', 'e', 'e']
c ['c', 'c']
a ['a', 'a']

Looking at the output above, the groups are cluttered: 'c' and 'a' each appear twice, because groupby() starts a new group every time the key changes.

Let's sort the characters first using the sorted() function.

import itertools

chars = "aaaaabbcccdeeeccaa"

# sort first
chars = sorted(chars)

x = itertools.groupby(chars)

for key, group in x:
    print(key, list(group))

The only change is one added line, which sorts the iterable:

chars = sorted(chars)

Here is the result we wanted after sorted(chars).

a ['a', 'a', 'a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c', 'c', 'c', 'c', 'c']
d ['d']
e ['e', 'e', 'e']
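
When a key function is used, the data should be sorted with that same function, otherwise equal keys may still end up in separate groups. A minimal sketch, using a made-up list of words grouped by length (the words, by_length and grouped names are assumptions for this example):

```python
import itertools

# Hypothetical sample data.
words = ["kiwi", "fig", "pear", "plum", "yam", "apple"]

# Sort and group with the same key function.
by_length = sorted(words, key=len)
grouped = {length: list(group) for length, group in itertools.groupby(by_length, key=len)}

print(grouped)
# {3: ['fig', 'yam'], 4: ['kiwi', 'pear', 'plum'], 5: ['apple']}
```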

Here is how to convert each group into a list

import itertools

print("\n>>> convert into list")

# iterables
animals = [
    ('dog', 'German Shepherd'), ('dog', 'Bulldog'), ('dog', 'Chihuahua'),
    ('cat', 'Persian'), ('cat', 'Bengal'), ('cat', 'Siamese'),
    ('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock'),
]

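# specify key using lambda function
# (named key_func so the loop variable below does not shadow it)
key_func = lambda x: x[0]

# group as an iterator
an_iterator = itertools.groupby(animals, key_func)

# loop the iterator
for key, group in an_iterator:
    # convert into list; a str and a list cannot be joined with +,
    # so pass them to print() as separate arguments
    print(key + ' :', list(group))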

Result -> list()

>>> convert into list
dog : [('dog', 'German Shepherd'), ('dog', 'Bulldog'), ('dog', 'Chihuahua')]
cat : [('cat', 'Persian'), ('cat', 'Bengal'), ('cat', 'Siamese')]
bird : [('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]

Convert into a dictionary

import itertools

print("\n>>> convert into dictionary")

# iterables
animals = [
    ('dog', 'German Shepherd'), ('dog', 'Bulldog'), ('dog', 'Chihuahua'),
    ('cat', 'Persian'), ('cat', 'Bengal'), ('cat', 'Siamese'),
    ('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock'),
]

# specify key using lambda function
# (named key_func so the loop variable below does not shadow it)
key_func = lambda x: x[0]

# group as an iterator
an_iterator = itertools.groupby(animals, key_func)

# loop the iterator
for key, group in an_iterator:
    # convert into dictionary
    key_and_group = {key : list(group)}
    print(key_and_group)

Result -> dict()

>>> convert into dictionary
{'dog': [('dog', 'German Shepherd'), ('dog', 'Bulldog'), ('dog', 'Chihuahua')]}
{'cat': [('cat', 'Persian'), ('cat', 'Bengal'), ('cat', 'Siamese')]}
{'bird': [('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock')]}

The examples above group the same kind of animal together. However, the key ('dog', 'cat' or 'bird') is still repeated inside every tuple of each group.

Let's improve this by printing the key once and keeping only the values (the names) from each group.

It can be done using a list comprehension.

import itertools

print("\n>>> using list comprehension")

animals = [
    ('dog', 'German Shepherd'), ('dog', 'Bulldog'), ('dog', 'Chihuahua'),
    ('cat', 'Persian'), ('cat', 'Bengal'), ('cat', 'Siamese'),
    ('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock'),
]

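# loop the iterator
for key, group in itertools.groupby(animals, lambda x: x[0]):
    # keep only the second element of each tuple, without
    # reusing (and overwriting) the name of the animals list
    names = " and ".join([item[1] for item in group])
    print(key + "s: " + names + ".")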

Here are the results of the list comprehension.

>>> using list comprehension
dogs: German Shepherd and Bulldog and Chihuahua.
cats: Persian and Bengal and Siamese.
birds: duck and penguin and peacock.
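
The dictionary and list comprehension ideas can also be combined. The following is one possible sketch, not the only way to do it: it builds a single dictionary that maps each animal to a list of its names. The breeds_by_animal name is an assumption made for this example, and the input is assumed to be already ordered by animal, as it is here.

```python
import itertools

animals = [
    ('dog', 'German Shepherd'), ('dog', 'Bulldog'), ('dog', 'Chihuahua'),
    ('cat', 'Persian'), ('cat', 'Bengal'), ('cat', 'Siamese'),
    ('bird', 'duck'), ('bird', 'penguin'), ('bird', 'peacock'),
]

# One dictionary for all groups: key -> list of names only.
breeds_by_animal = {
    animal: [name for _, name in group]
    for animal, group in itertools.groupby(animals, key=lambda pair: pair[0])
}

print(breeds_by_animal)
```

If the same animal appeared in two separate runs of the input, the later run would overwrite the earlier one in the dictionary, which is another reason to sort first.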