Python Collections Module
In this post, we introduce the collections module in Python, which provides a range of specialized container types for storing objects. We first explain what a container is in Python and then go through the container types the module offers.
What is a Container in Python?
Containers in Python are objects that can hold an arbitrary number of other objects. They let you access the objects they hold and iterate over them. Python has several built-in container types such as lists, sets, tuples, and dictionaries.
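As a small illustration of what holding, accessing, and iterating over objects means in practice, the following sketch runs the same operations over the built-in container types. The variable names and values are made up for this example.

```python
# Built-in containers; names and values are illustrative only.
containers = {
    "list": [1, 2, 3],
    "tuple": (1, 2, 3),
    "set": {1, 2, 3},
    "dict": {"a": 1, "b": 2, "c": 3},
}

for name, container in containers.items():
    size = len(container)             # how many objects it holds
    members = [x for x in container]  # iteration (a dict iterates over its keys)
    print(name, size, members)

# Membership tests work on all of them
print(2 in containers["list"])    # True
print("a" in containers["dict"])  # True
```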
Containers in the Collections Module
In the following sections, we will look at the individual container types available in the collections module.
Python Counters
A Counter is a special type of dictionary in Python. It takes a collection of elements and creates a dictionary that maps each unique element to the number of times it occurs in the collection. You can pass a list (or any other iterable) of elements, a dictionary that maps elements to their counts, or the individual items and their counts as keyword arguments to the Counter constructor.
from collections import Counter

# Passing a list
print(Counter(['a', 'b', 'a', 'a', 'c', 'c']))

# Passing a dictionary
print(Counter({'a': 3, 'c': 1, 'b': 2}))

# Passing individual elements with their respective counts
print(Counter(a=3, b=2, c=1))
Each item functions like a key in a dictionary. Just like in a Python dictionary, you can access the counts by their keys.
c = Counter(['a', 'b', 'a', 'a', 'c', 'c'])
print(c['a'])  # 3
Retrieving Counter Elements
The Counter object has an elements() method that returns an iterator over the elements, repeating each one as many times as its count. To display the elements, you need to explicitly turn the iterator into a list.
c = Counter(['a', 'b', 'a', 'a', 'c', 'c'])
print(list(c.elements()))  # ['a', 'a', 'a', 'b', 'c', 'c']
Finding the Most Common Elements
The most_common() method allows you to retrieve the most common elements. It returns a list of tuples, each consisting of an item and its count. The list is sorted by count in descending order, so the tuple with the highest number of occurrences is listed first.
c = Counter(['a', 'b', 'a', 'a', 'c', 'c'])
print(c.most_common())  # [('a', 3), ('c', 2), ('b', 1)]
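If you only need the top few elements, most_common also accepts an optional number of items to return. The sketch below reuses the same sample data; the variable name top_two is chosen for this example only.

```python
from collections import Counter

c = Counter(['a', 'b', 'a', 'a', 'c', 'c'])

# All elements, most frequent first
print(c.most_common())   # [('a', 3), ('c', 2), ('b', 1)]

# Only the two most frequent elements
top_two = c.most_common(2)
print(top_two)           # [('a', 3), ('c', 2)]
```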
Subtracting Elements in a Counter Object
Using the subtract method, you can remove a specific number of instances of an object from the counter object. For example, in the following counter, I have three instances of ‘a’. I subtract two of them which leaves me with one.
c = Counter({'a': 3, 'c': 1, 'b': 2})
c.subtract({'a': 2})
print(c)  # Counter({'b': 2, 'a': 1, 'c': 1})
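One detail worth knowing is that subtract works in place and does not stop at zero. The following sketch, using made-up counts, contrasts it with the - operator, which returns a new Counter and keeps only positive counts.

```python
from collections import Counter

c = Counter({'a': 3, 'b': 2, 'c': 1})

# subtract() modifies the counter in place and keeps zero and negative counts
c.subtract({'a': 3, 'c': 2})
print(c['a'], c['b'], c['c'])  # 0 2 -1

# The - operator returns a new Counter and keeps only positive counts
d = Counter({'a': 3, 'b': 2, 'c': 1}) - Counter({'a': 3, 'c': 2})
print(d)                       # Counter({'b': 2})
```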
Python OrderedDict
An OrderedDict in Python is a dictionary that keeps its items in the order in which they were added. Like a standard dictionary, it contains values that are indexed with custom keys.
Since the release of Python 3.7, standard dictionaries also maintain the order in which items were added. Some differences remain, though. OrderedDicts are order-sensitive when it comes to equality comparisons. Two OrderedDicts that contain the same items in a different order do not pass an equality check, while two standard dictionaries do.
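The difference in equality checks can be sketched as follows. The variable names are chosen for this illustration only.

```python
from collections import OrderedDict

# Same items, different insertion order
od1 = OrderedDict([('a', 1), ('b', 2)])
od2 = OrderedDict([('b', 2), ('a', 1)])
print(od1 == od2)   # False: OrderedDict comparison is order-sensitive

d1 = {'a': 1, 'b': 2}
d2 = {'b': 2, 'a': 1}
print(d1 == d2)     # True: regular dicts ignore order when comparing

# Comparing an OrderedDict with a regular dict also ignores order
print(od1 == d2)    # True
```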
You create an ordered dictionary with the OrderedDict constructor. After initialization, you can add items as you would to a standard dictionary.
from collections import OrderedDict

ordered_dict = OrderedDict()
ordered_dict['b'] = 2
ordered_dict['a'] = 1
ordered_dict['c'] = 3
print(ordered_dict)
OrderedDicts are iterable like normal dictionaries. Each key-value pair forms an item.
for key, value in ordered_dict.items():
    print(key)
    print(value)
Python DefaultDict
A defaultdict in Python works like a standard dictionary. The only difference is that it automatically creates a default value when you access a key that does not exist yet. A standard dictionary would raise a KeyError instead.
To create a default dictionary, you import defaultdict from the collections module and create it with the constructor. The constructor takes a default factory, which is a callable that produces the default value, such as a data type. If we want missing keys to default to a string, we pass str as an argument. For integers, we pass int.
from collections import defaultdict

# creating an integer dictionary
dd_int = defaultdict(int)

# creating a string dictionary
dd_string = defaultdict(str)
After creation, you can add items to the default dictionary as you would to a normal dictionary. If you access a key on the defaultdict that has no value yet, the default factory is called and its result is stored under that key. When the factory is int, the default value is 0. When the factory is str, the default value is an empty string. For this reason, you specify the factory upon creation. Without one, the defaultdict raises a KeyError for missing keys just like a standard dictionary.
dd_int = defaultdict(int)
dd_int[0] = 5
print(dd_int[1])  # 0

dd_string = defaultdict(str)
dd_string["a"] = "hello"
print(dd_string["b"])  # prints an empty line, the default is ''
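The default factory does not have to be int or str. As a minimal sketch with made-up sample data, the following example uses list as the factory to group words by their first letter, and contrasts it with the KeyError a standard dictionary raises.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']  # sample data

# Standard dict: accessing a missing key raises KeyError
plain = {}
try:
    plain['a'].append('apple')
except KeyError:
    print('KeyError for missing key')

# defaultdict: list() is called to create the missing value on first access
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```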
Python Deque
A deque (double-ended queue) is a list-like container in Python. Contrary to a standard list, which stores its elements in one contiguous array, the deque manages its elements with an underlying doubly linked list. In CPython, this is a doubly linked list of fixed-size blocks, where each block holds a reference to the previous and the next block in memory. Insertions and deletions at the beginning and the end are fast and thread-safe.
For this reason, we’ve used deques in the post on stacks in Python. In terms of speed, lists excel at fixed-length operations such as access by index, while deques excel at appends and pops at either end.
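To get a feeling for the speed difference, here is a rough timing sketch that inserts items at the front of a list and of a deque. The size N and the helper function names are arbitrary choices for this illustration, and the measured times will vary from machine to machine.

```python
from collections import deque
from timeit import timeit

N = 20_000  # arbitrary size chosen for this illustration

def fill_list_front():
    items = []
    for i in range(N):
        items.insert(0, i)   # shifts all existing elements on every call
    return items

def fill_deque_front():
    items = deque()
    for i in range(N):
        items.appendleft(i)  # constant time at either end
    return items

list_time = timeit(fill_list_front, number=1)
deque_time = timeit(fill_deque_front, number=1)
print(f"list.insert(0, x): {list_time:.4f} s")
print(f"deque.appendleft:  {deque_time:.4f} s")
```

Both functions build the same sequence. Only the cost of inserting at the front differs.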
You create a deque by passing a list (or any other iterable) as an initial argument.
from collections import deque

l = [1, 2, 3]
d = deque(l)
print(d)  # deque([1, 2, 3])
You can add items to the end of a deque using append. To add items to the beginning of a deque, we use appendleft.
d = deque([1, 2, 3])
d.append(4)
print(d)  # deque([1, 2, 3, 4])
d.appendleft(5)
print(d)  # deque([5, 1, 2, 3, 4])
Similarly, you can remove elements from the end or the beginning of a deque using the pop and popleft methods, respectively.
# d is deque([5, 1, 2, 3, 4]) from the previous example
d.pop()
print(d)  # deque([5, 1, 2, 3])
d.popleft()
print(d)  # deque([1, 2, 3])
A deque also has a count method to count the occurrences of an individual element in the deque. In the following example, the deque contains two occurrences of the number 3.
d = deque([1, 2, 3, 3])
print(d.count(3))  # 2
Lastly, you can delete all elements from a deque using the clear method.
d = deque([1, 2, 3])
d.clear()
print(d)  # deque([])
Python Namedtuple
A namedtuple extends a normal tuple by defining names for every item in a tuple. The names are tied to the position in which items are inserted.
To create a namedtuple type, you pass the name of the type as the first argument to the namedtuple function. In the second argument, you list all field names as one comma-separated string.
For example, we can define a namedtuple called Car with three attributes as follows:
from collections import namedtuple

Car = namedtuple('Car', 'brand, model, color')
Note that the second argument is one string that consists of three attributes. After defining the namedtuple “Car”, we can create a concrete namedtuple of “type” car with the appropriate entries for each attribute.
Car = namedtuple('Car', 'brand, model, color')
car = Car('Tesla', 'S', 'Blue')
print(car)  # Car(brand='Tesla', model='S', color='Blue')
Printing this tuple shows that each entry is now associated with the attribute in the same position.
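Because each entry is tied to both a name and a position, you can read it either way. The following sketch shows both access styles and also illustrates that a namedtuple, like a normal tuple, cannot be modified in place.

```python
from collections import namedtuple

Car = namedtuple('Car', 'brand, model, color')
car = Car('Tesla', 'S', 'Blue')

print(car.brand)    # Tesla  (access by field name)
print(car[0])       # Tesla  (access by position, like a normal tuple)
print(Car._fields)  # ('brand', 'model', 'color')

# Unpacking works as with any tuple
brand, model, color = car

# Namedtuples are immutable: assigning to a field raises AttributeError
try:
    car.color = 'Red'
except AttributeError:
    print('cannot modify a namedtuple in place')
```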
Replace an Item in NamedTuple
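Single attribute values can be replaced using the _replace method. Because namedtuples are immutable, it returns a new namedtuple and leaves the original unchanged.
car = Car('Tesla', 'S', 'Blue')
new_car = car._replace(color='red')
print(new_car)  # Car(brand='Tesla', model='S', color='red')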
Create a NamedTuple from a List
Using the _make method, we can also create a namedtuple from a list containing the car’s attributes.
list_car = ['Tesla', 'S', 'Blue']
car = Car._make(list_car)
print(car)  # Car(brand='Tesla', model='S', color='Blue')
Convert a Named Tuple to a Dictionary
A named tuple can also be converted to a dictionary using the _asdict method.
Car = namedtuple('Car', 'brand, model, color')
car = Car('Tesla', 'S', 'Blue')
dict_car = car._asdict()
print(dict_car)
Python ChainMap
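A ChainMap groups multiple dictionaries (or other mappings) into a single view that can be used like one dictionary. Internally, it keeps the underlying mappings in a list. The constructor does not check its arguments, so it also accepts other collections such as lists, but key lookups that reach a non-mapping can fail, so ChainMaps are best used with dictionaries.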
Create a Chain Map
To create a ChainMap, we pass several dictionaries or other collections to the ChainMap constructor.
from collections import ChainMap

l = [1, 2, 3]
d1 = {'a': 1, 'b': 2}
d2 = {'a': 1, 'b': 2}
cm = ChainMap(l, d1, d2)
print(cm)
If you use a ChainMap with dictionaries, you can access and assign individual values using the dictionary’s keys directly on the ChainMap.
l = [1, 2, 3]
d1 = {'a': 1, 'b': 2}
d2 = {'a': 2, 'b': 3}
cm = ChainMap(d1, d2, l)
print(cm['b'])  # 2
cm['b'] = 5
print(cm)  # ChainMap({'a': 1, 'b': 5}, {'a': 2, 'b': 3}, [1, 2, 3])
Note that when several dictionaries have the same key, a lookup returns the value from the first dictionary that contains the key. Assignments always go to the first dictionary in the chain.
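A common way to picture this behavior is a set of user settings layered over defaults. The sketch below uses hypothetical dictionaries named user_settings and defaults to show which dictionary answers a lookup and which one receives a write.

```python
from collections import ChainMap

# Hypothetical settings dictionaries, named for this example only
user_settings = {'theme': 'dark'}
defaults = {'theme': 'light', 'language': 'en'}

settings = ChainMap(user_settings, defaults)

print(settings['theme'])     # dark  (found in the first mapping)
print(settings['language'])  # en    (falls through to the second mapping)

# Writes always go to the first mapping, even if the key only exists further down
settings['language'] = 'de'
print(user_settings)  # {'theme': 'dark', 'language': 'de'}
print(defaults)       # {'theme': 'light', 'language': 'en'}
```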
Add Items to a ChainMap
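You can add a new dictionary to a ChainMap using the new_child method. It returns a new ChainMap with the added dictionary at the front and leaves the original ChainMap unchanged.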
d1 = {'a': 1, 'b': 2}
d2 = {'c': 4, 'd': 3}
cm = ChainMap(d1)
cm2 = cm.new_child(d2)
print(cm2)  # ChainMap({'c': 4, 'd': 3}, {'a': 1, 'b': 2})
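Called without an argument, new_child puts a fresh empty dictionary at the front of the chain. The following sketch, with a variable name child chosen for illustration, shows that values written to the child shadow the originals without changing them.

```python
from collections import ChainMap

d1 = {'a': 1, 'b': 2}
cm = ChainMap(d1)

# new_child() without arguments adds an empty dict at the front
child = cm.new_child()
child['a'] = 100   # stored in the new front dict, shadows d1['a']

print(child['a'])  # 100
print(cm['a'])     # 1, the original ChainMap is unaffected
print(d1)          # {'a': 1, 'b': 2}
```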
Display Items in ChainMap
For displaying items in a ChainMap you have three options. You can display the underlying list of dictionaries using the maps attribute, you can display just the keys using the keys() method, or you can display just the values using the values() method.
d1 = {'a': 1, 'b': 2}
d2 = {'c': 4, 'd': 3}
cm = ChainMap(d1, d2)

# print the whole map
print(cm.maps)  # [{'a': 1, 'b': 2}, {'c': 4, 'd': 3}]

# print keys
print(list(cm.keys()))  # ['c', 'd', 'a', 'b']

# print values
print(list(cm.values()))  # [4, 3, 1, 2]
Reverse a ChainMap
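A ChainMap can also be reversed by reversing its maps attribute with the built-in reversed function. Since reversed returns a one-time iterator, convert the result back to a list before assigning it.
d1 = {'a': 1, 'b': 2}
d2 = {'c': 4, 'd': 3}
cm = ChainMap(d1, d2)
cm.maps = list(reversed(cm.maps))
print(cm)  # ChainMap({'c': 4, 'd': 3}, {'a': 1, 'b': 2})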