Python Collections Module

In this post, we will introduce the collections module in Python, which provides a range of container types for object storage. We first introduce the concept of a container in Python and then go through the available container types.

What is a Container in Python?

Containers in Python are objects that can hold an arbitrary number of other objects. They enable users to access these objects and iterate over them. Python has several built-in container types such as lists, sets, tuples, and dictionaries.
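
As a quick illustration of what holding, accessing, and iterating over objects looks like, here is a minimal sketch using only the built-in containers. All variable names are made up for this example.

```python
# Built-in containers; every name below is illustrative only.
numbers = [3, 1, 2]              # list: ordered and mutable
unique = {3, 1, 2, 3}            # set: unordered, duplicates are dropped
point = (4, 5)                   # tuple: ordered and immutable
ages = {"ann": 31, "bob": 27}    # dict: maps keys to values

# Containers support len(), membership tests, and iteration.
print(len(unique))     # 3
print(2 in numbers)    # True
print(point[0])        # 4
for name, age in ages.items():
    print(name, age)
```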

Containers in the Collections Module

In the following sections, we will look at the individual container types available in the collections module.

Python Counters

A Counter is a special type of dictionary in Python. It takes a collection of elements and builds a dictionary that maps each unique element to the number of times it occurs in the collection. You can pass a list (or any other iterable) of elements, a dictionary that maps elements to their counts, or the individual items and their counts as keyword arguments to Counter.

from collections import Counter

#Passing a list
print(Counter(['a', 'b', 'a', 'a', 'c', 'c'])) #Counter({'a': 3, 'c': 2, 'b': 1})

#Passing a dictionary
print(Counter({'a':3, 'c':1, 'b':2})) #Counter({'a': 3, 'b': 2, 'c': 1})

#Passing individual elements with their respective counts
print(Counter(a=3, b=2, c=1)) #Counter({'a': 3, 'b': 2, 'c': 1})

Each item functions like a key in a dictionary. Just like in a Python dictionary, you can access the counts by their keys.

c = Counter(['a', 'b', 'a', 'a', 'c', 'c'])
c['a']  # 3
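
One difference from a plain dictionary is worth a short sketch: a Counter reports a count of 0 for items it has never seen instead of raising a KeyError. The key 'z' below is just an arbitrary item that is not in the counter.

```python
from collections import Counter

c = Counter(['a', 'b', 'a', 'a', 'c', 'c'])

# A key that was never counted returns 0 rather than raising KeyError.
print(c['z'])      # 0

# Looking up a missing key does not add it to the counter.
print('z' in c)    # False

# Counts can be updated like ordinary dictionary values.
c['b'] += 1
print(c['b'])      # 2
```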

Retrieving Counter Elements

The Counter object has an elements method that returns an iterator over the elements, repeating each one as many times as its count. To display the elements, you need to explicitly convert the iterator into a list.

c = Counter(['a', 'b', 'a', 'a', 'c', 'c'])
list(c.elements()) #['a', 'a', 'a', 'b', 'c', 'c']

Finding the Most Common Elements

The most_common method allows you to retrieve the most common elements. It returns a list of tuples, each consisting of an item and its count. The list is sorted in descending order by count, so the tuple with the highest number of occurrences comes first.

c = Counter(['a', 'b', 'a', 'a', 'c', 'c'])
c.most_common() #[('a', 3), ('c', 2), ('b', 1)]
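
most_common also accepts an optional number of items to return. The following sketch, using the same counter as above, retrieves only the top two entries and then the least common one.

```python
from collections import Counter

c = Counter(['a', 'b', 'a', 'a', 'c', 'c'])

# Restrict the result to the two most frequent items.
top_two = c.most_common(2)
print(top_two)    # [('a', 3), ('c', 2)]

# The last entry of the full list is the least common item.
least = c.most_common()[-1]
print(least)      # ('b', 1)
```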

Subtracting Elements in a Counter Object

Using the subtract method, you can remove a specific number of instances of an object from the counter object. For example, in the following counter, I have three instances of ‘a’. I subtract two of them which leaves me with one.

c = Counter({'a':3, 'c':1, 'b':2})
c.subtract({'a' : 2})
print(c) #Counter({'b': 2, 'a': 1, 'c': 1})
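
Note that subtract changes the counter in place and does not stop at zero. The sketch below, with a made-up counter named stock, shows counts going negative and contrasts this with the - operator, which returns a new Counter that keeps only positive counts.

```python
from collections import Counter

# subtract() modifies the counter in place and allows zero or negative counts.
stock = Counter({'a': 3, 'b': 2})
stock.subtract({'a': 5, 'b': 2})
print(stock['a'])    # -2
print(stock['b'])    # 0

# The - operator builds a new Counter and drops counts that are not positive.
diff = Counter({'a': 3, 'b': 2}) - Counter({'a': 5, 'b': 1})
print(diff)          # Counter({'b': 1})
```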

Python OrderedDict

An OrderedDict in Python is a dictionary that remembers the order in which its items were added. Like a standard dictionary, it contains values that are indexed by custom keys.

Since the release of Python 3.7, standard dictionaries also maintain the order in which items were added. Some differences remain, though. OrderedDicts are order-sensitive when it comes to equality comparisons: two OrderedDicts that contain the same items in a different order do not compare equal, while two standard dictionaries do.
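
The difference in equality checks can be sketched as follows. The variable names are chosen for this example only.

```python
from collections import OrderedDict

# Same items, different insertion order.
od1 = OrderedDict([('a', 1), ('b', 2)])
od2 = OrderedDict([('b', 2), ('a', 1)])
print(od1 == od2)    # False: order matters between two OrderedDicts

d1 = {'a': 1, 'b': 2}
d2 = {'b': 2, 'a': 1}
print(d1 == d2)      # True: order is ignored between standard dictionaries

# Comparing an OrderedDict with a standard dictionary also ignores order.
print(od1 == d2)     # True
```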

You create an ordered dictionary with the OrderedDict constructor. After initialization, you can add items as you would to a standard dictionary.

from collections import OrderedDict

ordered_dict = OrderedDict()
ordered_dict['b'] = 2
ordered_dict['a'] = 1
ordered_dict['c'] = 3

print(ordered_dict)

OrderedDicts are iterable like normal dictionaries. A key, value pair always forms an item.

for key, value in ordered_dict.items():
  print(key)
  print(value)

Python DefaultDict

A defaultdict in Python works like a standard dictionary. The only difference is that it automatically creates a default value when the caller accesses a key that does not exist yet. A standard dictionary would raise a KeyError instead.

To create a default dictionary, import defaultdict from the collections module and call its constructor. The first argument is a default factory: a callable that takes no arguments and produces the default value for a missing key. If we want empty strings as defaults, we pass str as an argument. For integers, we pass int.

from collections import defaultdict

#creating an integer dictionary
dd_int = defaultdict(int)

#creating a string dictionary
dd_string = defaultdict(str)

After creation, you can add items to the default dictionary as you would to a normal dictionary. If you access a key that has no value yet, the default factory is called, and its result is stored under that key and returned. With int as the factory, the default value is 0. With str, the default value is an empty string. The factory only determines the default value; it does not restrict the types of the values you assign yourself. If you omit the factory, a missing key raises a KeyError just as in a standard dictionary.

dd_int = defaultdict(int)
dd_int[0] = 5
print(dd_int[1]) #0

dd_string = defaultdict(str)
dd_string["a"] = "hello"
print(dd_string["b"]) #prints an empty line
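
Because the factory can be any callable without arguments, list works as well. The following sketch groups a made-up list of words by their first letter, and also shows that merely reading a missing key inserts it into the dictionary.

```python
from collections import defaultdict

# Hypothetical sample data for this example.
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# Each missing key starts out as an empty list, so append works right away.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# Caveat: accessing a missing key creates it as a side effect.
print('z' in by_letter)    # False
by_letter['z']
print('z' in by_letter)    # True
```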

Python Deque

A deque (double-ended queue) is a list-like sequence in Python that is optimized for adding and removing items at both ends. Unlike a standard list, which stores its items in one contiguous array, a deque is built on a doubly linked list in which each node holds a reference to the previous node and the next node. (CPython links fixed-size blocks of items rather than individual items.) Insertions and deletions at the beginning and the end are therefore fast, taking roughly constant time, and they are thread-safe.

A deque in Python is based on a doubly linked list.

For this reason, we’ve used deques in the post on stacks in Python. In terms of speed, lists excel at indexed access to arbitrary positions, while deques excel at appends and pops at either end.

You create a deque by passing a list or any other iterable as the initial argument. Without an argument, the deque starts out empty.

from collections import deque

l = [1, 2, 3]
d = deque(l)
print(d) #deque([1, 2, 3])

You can add items to the end of a deque using append. To add items to the beginning of a deque, use appendleft.

d = deque([1, 2, 3])
d.append(4)
print(d) #deque([1, 2, 3, 4])

d.appendleft(5)
print(d) #deque([5, 1, 2, 3, 4])

Similarly, you can remove elements from the end or the beginning of a deque using the pop and popleft methods, respectively.

d.pop()
print(d) #deque([5, 1, 2, 3])

d.popleft()
print(d) #deque([1, 2, 3])
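
Combining these methods gives you both a queue and a stack. The sketch below is one possible way to use a deque for each; the variable names are illustrative.

```python
from collections import deque

# FIFO queue: add at the right end, remove from the left end.
queue = deque()
queue.append('first')
queue.append('second')
queue.append('third')
print(queue.popleft())    # first
print(queue.popleft())    # second

# LIFO stack: add and remove at the same (right) end.
stack = deque()
stack.append(1)
stack.append(2)
print(stack.pop())        # 2

# Popping from an empty deque raises IndexError, so check the length first.
empty = deque()
if len(empty) > 0:
    empty.pop()
```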

A deque also has a count method that counts the occurrences of a given element in the deque. In the following example, the deque contains two occurrences of the number 3.

d = deque([1, 2, 3, 3])

print(d.count(3)) #2

Lastly, you can delete all elements from a deque using the clear method.

d = deque([1, 2, 3])

d.clear()
print(d) #deque([])

Python Namedtuple

A namedtuple extends a normal tuple by defining names for every item in a tuple. The names are tied to the position in which items are inserted.

To create a namedtuple type, you pass the name of the type as the first argument to the namedtuple factory function. As the second argument, you list all field names as one comma-separated string.

For example, we can define a namedtuple called Car with three attributes as follows:

from collections import namedtuple

Car = namedtuple('Car', 'brand, model, color')

Note that the second argument is one string that consists of three attributes. After defining the namedtuple “Car”, we can create a concrete namedtuple of “type” car with the appropriate entries for each attribute.

Car = namedtuple('Car', 'brand, model, color')
car = Car('Tesla', 'S', 'Blue')
print(car) #Car(brand='Tesla', model='S', color='Blue')

Printing this tuple shows that each entry is now associated with the attribute in the same position.
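
The entries can then be read by name as well as by position. A minimal sketch using the Car type from above:

```python
from collections import namedtuple

Car = namedtuple('Car', 'brand, model, color')
car = Car('Tesla', 'S', 'Blue')

# Access by attribute name or by position.
print(car.brand)      # Tesla
print(car[1])         # S

# A namedtuple unpacks like a normal tuple.
brand, model, color = car
print(color)          # Blue

# The field names are available on the type itself.
print(Car._fields)    # ('brand', 'model', 'color')
```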

Replace an Item in NamedTuple

Single attribute values can be replaced using the _replace method. Because tuples are immutable, it returns a new namedtuple and leaves the original unchanged.

car = Car('Tesla', 'S', 'Blue')
new_car = car._replace(color='red')
print(new_car) #Car(brand='Tesla', model='S', color='red')

Create a NamedTuple from a List

Using the _make class method, we can also create a namedtuple from a list (or any other iterable) containing the car’s attributes.

list_car = ['Tesla', 'S', 'Blue']
car = Car._make(list_car)
print(car) #Car(brand='Tesla', model='S', color='Blue')

Convert a Named Tuple to a Dictionary

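A named tuple can also be converted to a dictionary using the _asdict method. On Python 3.8 and later, the result is a regular dict; earlier versions return an OrderedDict.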

Car = namedtuple('Car', 'brand, model, color')
car = Car('Tesla', 'S', 'Blue')
dict_car = car._asdict()
print(dict_car)

Python ChainMap

A ChainMap groups multiple dictionaries (or other mappings) into a single view that you can query like one dictionary. Internally, the underlying dictionaries are kept in a list that is searched in order. ChainMap is designed for mappings, so other collections such as lists should not be passed to it.

Create a Chain Map

To create a ChainMap, we pass several dictionaries to the ChainMap constructor.

from collections import ChainMap

d1 = {'a': 1, 'b': 2}
d2 = {'a': 1, 'b': 2}

cm = ChainMap(d1, d2)
print(cm) #ChainMap({'a': 1, 'b': 2}, {'a': 1, 'b': 2})

If you use a ChainMap with dictionaries, you can access and assign individual values using the dictionary’s keys directly on the ChainMap.

d1 = {'a': 1, 'b': 2}
d2 = {'a': 2, 'b': 3}
cm = ChainMap(d1, d2)

print(cm['b']) #2
cm['b'] = 5
print(cm) #ChainMap({'a': 1, 'b': 5}, {'a': 2, 'b': 3})

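Note that when several dictionaries have the same key, a lookup returns the value from the first dictionary that contains it. Assignments, updates, and deletions always operate on the first dictionary only, even if the key exists only in a later one.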
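
This lookup order makes a ChainMap a natural fit for layering overrides on top of defaults. The following sketch uses made-up settings dictionaries to show lookups falling through to a later dictionary while writes land in the first one.

```python
from collections import ChainMap

# Hypothetical configuration data for this example.
defaults = {'theme': 'light', 'font_size': 12}
user = {'theme': 'dark'}

# The first dictionary takes precedence during lookups.
settings = ChainMap(user, defaults)
print(settings['theme'])        # dark (found in user)
print(settings['font_size'])    # 12 (falls through to defaults)

# Writes always go to the first dictionary, even for keys found further back.
settings['font_size'] = 14
print(user)        # {'theme': 'dark', 'font_size': 14}
print(defaults)    # {'theme': 'light', 'font_size': 12}
```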

Add Items to a ChainMap

You can add a new dictionary to the front of a ChainMap using the new_child method. It returns a new ChainMap and leaves the original unchanged.

d1 = {'a': 1, 'b': 2}
d2 = {'c': 4, 'd': 3}
cm = ChainMap(d1)

cm2 = cm.new_child(d2)
print(cm2) #ChainMap({'c': 4, 'd': 3}, {'a': 1, 'b': 2})
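
Since the new dictionary is placed at the front, it can shadow keys in the existing ones without modifying them. Here is a small sketch of that behavior, including the parents property, which returns a ChainMap without the first dictionary. The names base, child, and scratch are illustrative.

```python
from collections import ChainMap

base = ChainMap({'a': 1, 'b': 2})
child = base.new_child({'b': 20, 'c': 30})

print(child['b'])       # 20 (shadowed by the new first dictionary)
print(child['a'])       # 1 (still found in the original dictionary)
print(base['b'])        # 2 (the original ChainMap is unchanged)

# parents skips the first dictionary and returns the rest of the chain.
print(child.parents)    # ChainMap({'a': 1, 'b': 2})

# Called without an argument, new_child adds an empty dictionary.
scratch = base.new_child()
scratch['a'] = 100
print(base['a'])        # 1
```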

Display Items in ChainMap

For displaying items in a ChainMap you have three options. You can show the underlying list of dictionaries using the maps attribute, you can display just the keys using the keys() method, or you can display just the values using the values() method.

d1 = {'a': 1, 'b': 2}
d2 = {'c': 4, 'd': 3}
cm = ChainMap(d1, d2)

# print the whole map
print(cm.maps) #[{'a': 1, 'b': 2}, {'c': 4, 'd': 3}]

# print keys
print(list(cm.keys())) #['c', 'd', 'a', 'b']

# print values
print(list(cm.values())) #[4, 3, 1, 2]

Reverse a ChainMap

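A ChainMap has no dedicated reverse method, but you can reverse its search order by assigning a reversed copy of the list to the maps attribute. Since reversed returns an iterator, it has to be converted back into a list first.

d1 = {'a': 1, 'b': 2}
d2 = {'c': 4, 'd': 3}
cm = ChainMap(d1, d2)

cm.maps = list(reversed(cm.maps))
print(cm) #ChainMap({'c': 4, 'd': 3}, {'a': 1, 'b': 2})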