Priority Queue and Heapq in Python
In this post we learn how to create priority queues using Python. There are 3 main ways to implement and use a priority queue in Python:
- Create a list and keep it manually sorted
- Use a binary heap on the basis of Python’s heapq module
- Use the priority queue implementation from Python’s Queue package
The first way is really not recommended because it is inefficient and prone to errors. Therefore, we will look at the later two ways and also learn how to create our own priority queue class on the basis of a heap.
What is a Priority Queue?
A priority queue extends the queue by associating items with priorities. Items with higher priorities are dequeued first even if they are not at the front of the line. When two elements have the same priority, the queue serves them in a first-in-first-out (FIFO) order.
A priority queue generally supports at least three operations:
- add: adds an item to the end of the queue
- pop: retrieves the item with the highest priority that is first in the queue
- is empty: check whether the queue is empty
A priority queue represents an abstract data structure. Abstract data structures define the expected behavior such as that items should be ordered by priority. But they do not define an implementation. To implement a priority queue you need a concrete data structure such as a binary heap. In Python, a common way to implement a priority queue is the heapq module.
What is a Heap?
A heap implements a binary tree structure. A binary tree consists of a hierarchy of nodes where each parent node always has two child nodes. In a max heap, the node with the largest value sits on top. In a min heap, the node with the smallest value sits on top.
Whenever you add a new node with a smaller value than the largest nodes, it will percolate up to its position in the tree while the larger values will percolate down. For example, if we add 1 to a min heap with 2 at the top, one will percolate up.
After the reordering process has finished, the tree will look like this.
Note that the nodes at the same level do not need to come in strictly ascending order. For example, if 5 and 6 were exchanged, the heap property would still be maintained.
To implement a priority queue using a min-heap, we would assign priorities in ascending order. The item with the lowest value has the highest priority. Whenever we add a new value, the reordering process ensures that the new item is placed at the position corresponding to its priority in the heap. When polling items from the queue, we simply get the node on top of the heap.
Python Heapq
The heapq module implements a complete binary tree. If we have an unordered list of elements we can use the heapq module to turn it into a priority queue.
l = [5, 3, 1, 8, 6, 2, 7] heapq.heapify(l) l #[1, 3, 2, 8, 6, 5, 7]
Next, you can pop items off of the priority queue, which will reorder the heap so that the item with the next-highest priority will be next in line.
heapq.heappop(l) # 1 print(l) # [2, 3, 5, 8, 6, 7] heapq.heappop(l) # 2 print(l) # [3, 6, 5, 8, 7]
If you push an item onto the queue, it will be placed at the appropriate position in the heap.
heapq.heappush(l, 9) print(l) # [3, 6, 5, 8, 7, 9] heapq.heappush(l, 4) print(l) # [3, 6, 4, 8, 7, 9, 5]
In addition to pop and push, the heapq module also has a function heapreplace, which pops an item and immediately pushes a new item onto the heap.
l = [5, 3, 1, 8, 6, 2, 7] heapq.heapify(l) print(l) #[1, 3, 2, 8, 6, 5, 7] heapq.heapreplace(l, 4) print(l) #[2, 3, 4, 8, 6, 5, 7]
Heappushpop inverts the order of operations of heapreplace by pushing first and popping next.
l = [5, 3, 1, 8, 6, 2, 7] heapq.heapify(l) print(l) #[1, 3, 2, 8, 6, 5, 7] heapq.heappushpop(l, 4) print(l) #[2, 3, 4, 8, 6, 5, 7]
These two methods are usually more efficient in cases when you need to perform a push and a pop in succession or vice versa.
Custom Priority Queue based on Heapq
In the example above, we used the heap modules to turn a list into a priority queue. We had to explicitly manipulate the list using functions from the heap module. This is prone to errors because we can still interact with the list through the list interface, which can mess up our priority queue. Imagine we create a priority queue and then somebody appends an element using the list’s standard append function.
l = [5, 3, 8, 6, 2, 7] heapq.heapify(l) print(l) #[2, 3, 7, 6, 5, 8] l.append(1) print(l) #[2, 3, 7, 6, 5, 8, 1]
The number 1 is now at the end of the queue instead of at the beginning.
To avoid this problem, we can create a custom wrapper class around the list that makes use of the heap.
class PriorityQueue: def __init__(self, elements = None): if(elements == None): self.elements = list() elif type(elements) == list: heapq.heapify(elements) self.elements = elements def __str__(self): return str(self.elements) # for checking if the queue is empty def isEmpty(self): return len(self.elements) == 0 # for inserting an element in the queue def push(self, element): heapq.heappush(self.elements, element) # for popping an element based on Priority def pop(self): heapq.heappop(self.elements)
The priority queue class constructor takes an optional list with initial values as an argument. If there is an initial list of values, we heapify it to create the order for the priority queue and then assign it to the instance variable called elements.
#creating an empty priority queue q = PriorityQueue() #creating a priority queue with initial values q = PriorityQueue([5, 3, 1, 8, 6, 2, 7]) print(q) [1, 3, 2, 8, 6, 5, 7]#
Now we can apply push and pop operations. The queue always ensures that the lowest value (with the highest priority) is at the front.
q = PriorityQueue() q.push(5) q.push(4) q.push(1) q.push(2) q.push(5) print(q) #[1, 2, 4, 5, 5] q.pop() print(q) #[2, 5, 4, 5] q.pop() print(q) #[4, 5, 5]
Lastly, we can also check whether the queue is empty.
q = PriorityQueue() print(q.isEmpty()) #True q = PriorityQueue([5, 3, 1, 8, 6, 2, 7]) print(q.isEmpty()) #False
Python Priority Queue using the Queue Module
Last but not least, Python provides us with a readymade implementation of a priority queue in their queue module.
The Python priority queue from the queue module is based on a binary heap from the heapq module. Contrary to a direct implementation based on heapq, the Python priority queue ensures thread safety. Therefore, this implementation is preferable in multithreaded environments.
The interface differs slightly. Instead of push you use put and instead of pop you use get.
from queue import PriorityQueue q = PriorityQueue() q.put(5) q.put(4) q.put(1) q.put(2) q.put(5) print(q.get()) print(q.get()) print(q.get()) print(q.get()) print(q.get())
As you can see, the order of the items is the same as in our custom implementation. But you cannot pass an initial array of values.
Edit: As Luke van Hulle helpfully pointed out in the comments, you can also print the entire queue as follows:
print(q.queue) #[1, 2, 4, 5, 5]
Otherwise, you have to pop each item from the list individually as demonstrated above with the get method.
You should also check whether the queue is empty before attempting to retrieve an item. If there is no item currently in the queue, the get function will not terminate. After all, an item could arrive from a different thread. In a single-thread environment, this is not a problem since no item can arrive as long as the get function is running and therefore is occupying the only available thread.
if not q.empty(): q.get()