Binary Heaps


Hello people…! In this post I will talk about Binary Heaps, or, more casually called as Heaps. Binary Heaps are used to implement Priority Queues which are associated with Scheduling. Binary Heaps is a starting step to learn many advanced heap related Data Structures that are used in many algorithms. So let’s understand the need of the situation by taking up a scenario.

Let’s say that the CPU has to execute ‘n’ programs that take their own pre-defined time to get completed. Then one strategy to schedule them is the Shortest Remaining Processing Time (SRPT) first method. As the name suggests, we look at all the available programs, then pick up that program which takes the minimum time to get executed. After completing that program, then the next minimum is supposed to be picked up. So if we observe what operations are we doing on the data we have, they would be –

  • Finding the minimum of given elements.
  • Deleting the minimum element (after a program gets executed).
  • Inserting a new element (if a new program enters the scheduling).

So, we must choose such a Data Structure which takes the minimum to these operations so that we can come up with an efficient solution to the scenario. Binary Heap is one such Data Structure that does these operations in pretty fast time. Now, what exactly is the Binary Heap…? What does it look like…? The Binary Heap is actually a Binary Tree has two very important properties –

  • Structural Property – The value of every node in the Binary Tree should be less than or equal to the values of its children nodes.
  • Heap Property – The Binary Tree should be fully-filled up to the 2nd last level and the last level must be “left-filled”.

To understand these two properties let’s take up an example –

heaps1 (2)

So, as we can see, the last level is what is called “left-filled”, i.e., all the nodes are to the left and all the empty spaces to the right. The heap property is also depicted. The parent is smaller than both of its children. Now, having introduced this structure to you. Can you think where the smallest element of the heap would be….? At the top of course…! This is due to the very definition of the Heap Property. If the parent has the lower number, then the ancestor of all the nodes should surely have the least number. But this does not say that the leaves at the last level have the greatest number. Think about it for a while looking at the picture and you will understand it.

Now, the Binary Heap is although depicted as a tree, it is implemented as an array…! Just like the Fenwick Tree. The Binary Tree structure of the allows this. Look at the sketch below –

heaps2

As, you can see, the nodes of the heap are put into an array in the left-to-right order at each level. Why is this so…? This is the way we can access all the nodes in a heap. How…? If we take any element arr[i] –

  • Parent of Element at i → arr[ i2 ]
  • Left Child of Element at → arr[(2 ✕ i)]
  • Right Child of Element at → arr[(2 ✕ i) + 1]

Looking at the picture, we can verify the above statements. For example, take the node 9 (arr[6]) which is at level 3. Its parent is arr[ 92 ] which is arr[3], i.e., node 6 at level 2, as it is in the structure. The left child of arr[6] is arr[(2 ✕ 6)], which is, arr[12], node 20 and like this even the right child can be verified. The way to find the parent is actually floor( i2 ), but as both ‘i’ and 2 are integers the result comes as it is desired. Now, we must discuss the five important operations in a binary heap –

  • Getting minimum element
  • Inserting a new element
  • Deleting a new Element
  • Heapify
  • Build Heap

Minimum Element – As you can clearly see, the minimum element would be the topmost element of the heap which is the first element of the array. So this operation takes O(1) time.

Inserting an Element – Recall that the last level of the heap must be left-filled. So, where ever you add a node, the resulting structure should be such that the last level will have an extra element. We do this, by first adding the new element to the end, then to ensure that the Structural Property is satisfied we keep swapping the element with its parent until the parent node to be swapped is less than the inserted element. So, we kind of insert the element at the bottom and we “move up” the heap. This process is illustrated in the sketch below.

heaps3

This is how we insert a node into a heap. In the worst case we would have to travel all the way up which costs us (log n) swaps, which is the height of the tree. So, the insertion operation has a complexity of O(log n).

Deleting an Element – Deleting an element is also similar to Inserting the element. Structurally, when an element is deleted, the last level will lose the right-most node. So what we do is, we swap the last element and the node to be deleted, then remove the last element. And carry the swapped element up the heap just as we did in Insertion so that the Heap Property is not violated. This is illustrated in the sketch below –

heaps4

This is how the Deletion Operation is done in a Binary Heap. As you can clearly analyse that the worst case would be if we have to swap elements all the way up the Binary Heap, where the Deletion Operation would take O(log n) time.

Heapify – The heapify operation is as the name suggests, converting some “non-heap” thing into a heap. Well what this non-heap thing is that, it is a parent node, where the left and the right children are heaps, but the structure as a whole isn’t a heap. How is that possible…? This parent node we are talking about, its value could be greater than one or both of its children. In such a case, the smaller valued child is swapped with the parent and this is carried on down the tree until the heap property holds true for the whole structure. To understand this, observe the sketch below.

heap5

The process of heapify is clearly shown in the sketch, how we keep swapping the child nodes till the heap property is satisfied for the whole structure. This too takes O(log n) time in the worst case, when we would have to travel the whole till the leaves. Where is this process used anyway…? It is used to build a heap, which is our next discussion.

Build Heap – Build heap is constructing the heap from a given set of randomly arranged nodes in a binary tree. The build heap process uses the heapify to construct the heap “down-to-up”. Now, if you remember the principle of the heapify process, we were able to do that only because the left-subtree and the right-subtree were heaps. Think of the second last level in a heap, the left and the right nodes are leaves, i.e., they have no children, so, they are heaps by themselves. Why is a leaf by itself a heap…? This is because the left child and the right child don’t exist, so we can claim that the parent node (the leaf), satisfies the heap property. And for a leaf, the structural property also holds good because the second-last level, (the level of the leaf node), is fully filled, and the last level (the level of the leaf’s children), is left-filled (actually there are no nodes to be left-filled). Well, most probably, you won’t understand this in the first reading, give it another couple of readings and observe the picture below.

heaps6

I hope the process of the Build Heap is clear. At every level we use the heapify process to gradually build the heap “down-up”. So what is the asymptotic time complexity of Build Heap…? We use heapify for all the nodes except the nodes in the last level,

Nodes till the 2nd Last level = 12 ✕ (Total number of Nodes ⇒ N)
Complexity of the Heapify procedure = O(log N)
Complexity of Build Heap = O(N log N)

But we can argue that the Build Heap process will only take O(N). Now, look at the above sketch once again. If you feel the text is too small, open the image by right-clicking, its bigger than you think…! Now,

The (n + 1)4 nodes at level 3 took 1 swap ⇒ (n + 1)4 ✕ 1
The (n + 1)8 nodes at level 2 took 2 swaps ⇒ (n + 1)8 ✕ 2
The (n + 1)16 nodes at level 1 took 3 swaps ⇒ (n + 1)16 ✕ 3

As, you can see, we have a sort-of progression here. If we were to sum them up, taking (n + 1) common,

⇒ (N + 1) ✕ (14 + 28 + 316 + …)
⇒ (N + 1) ✕ Σ i2i
⇒ (N + 1) ✕ 2       { Σ i2i = 2 }
⇒ O(N)

Therefore, the Build Heap takes O(N) time. The principle behind this is that for most of the time Heapify works on smaller values of ‘N’. These are the most important operations of a Binary Heap. We have discussed the implementation of Binary Heaps using arrays and accessing the parent and child. Now, try to code the above stated operations at least 3 times…! You have to try. There is no better way of knowing a Data Structure than trying to code it. After you have given your everything, if you succeed, you are brilliant…! If not, look at my code below and see for yourselves how close you were…!

/* ==========  ========== ========== ========= */
//        Binary Min Heap Data Structure       //
//           Array Implementation              //
//                                             //
//         Functions follow Pascal Case        //
//           Convention and Variables      	   //
//         follow Camel Case Convention        //
//                                             //
//            Author - Vamsi Sangam            //
//            Theory of Programming            //
/* ========== ========== ========== ========== */
 
#include <stdio.h>
#include <stdlib.h>

struct Vertex
{
	int label, value;
};

// Adds an element to the heap and returns the size - O(log N)
int EnqueueVertex(struct Vertex minHeap[], int size, struct Vertex newValue)
{
    ++size;
    minHeap[size] = newValue;
 
    int i = size;
	struct Vertex temp;
 
    while (i >= 1) {
        if (minHeap[i / 2].value > minHeap[i].value) {
            // Parent node is greater than Child Node
            // Heap Property violated
            // So, swap
            temp = minHeap[i / 2];
            minHeap[i / 2] = minHeap[i];
            minHeap[i] = temp;
 
            i = i / 2;
        } else {
            // Parent is less or equal to the child
            // Heap property not violated
            // So, Insertion is done, break
            break;
        }
    }
 
    return size;
}

// Applies the heapify procedure - O(log N)
void Heapify(struct Vertex minHeap[], int size, int index)
{
    int i = index;
	struct Vertex temp;
 
    while ((2 * i) <= size) {
        // Left Child exists
 
        if ((2 * i) + 1 > size) {
            // Right child does not exist, but left does
            
			if (minHeap[i].value > minHeap[2 * i].value) {
                // Left child is smaller than parent
                temp = minHeap[i];
                minHeap[i] = minHeap[2 * i];
                minHeap[2 * i] = temp;
            }
            
            break;
            // Once you reach this level where it does not
        	// have a right child, the heap ends here
            // taking it to the next iteration is pointless
        }
 
        //Both Children exist
        if (minHeap[i].value > minHeap[2 * i].value || minHeap[i].value > minHeap[2 * i + 1].value) {
            // One of the other child is lesser than parent
            // Now find the least amoung 2 children
 
            if (minHeap[2 * i].value <= minHeap[(2 * i) + 1].value) {
                // Left Child is lesser, so, swap
                temp = minHeap[2 * i];
                minHeap[2 * i] = minHeap[i];
                minHeap[i] = temp;
 
                // And go down the heap
                i = 2 * i;
            } else if (minHeap[2 * i].value > minHeap[(2 * i) + 1].value) {
                // Left Child is lesser, so, swap
                temp = minHeap[(2 * i) + 1];
                minHeap[(2 * i) + 1] = minHeap[i];
                minHeap[i] = temp;
 
                // And go down the heap
                i = (2 * i) + 1;
            }
        } else {
            // Parent is lesser than its children
            break;
        }
    }
}

// Deletes the vertex and returns the size - O(log N)
int DeleteVertex(struct Vertex minHeap[], int size, int index)
{
    // Swap the indexed element with the last
    struct Vertex temp = minHeap[index];
    minHeap[index] = minHeap[size];
    minHeap[size] = temp;
 
    int i = index;
    --size;
 
    Heapify(minHeap, size - 1, i);
 
    return size;
}

// Build Heap Procedure - O(N)
void BuildHeap(struct Vertex minHeap[], int size)
{
    int i;
 
    // Simply call heapify() for all nodes
    // except the last one...!
    for (i = size / 2; i >= 1; --i) {
        Heapify(minHeap, size, i);
    }
}
 
int main()
{
    int n, i;
 
    printf("Enter the Initial size of the Min Heap -\n");
 
    scanf("%d", &n);
 
    struct Vertex minHeap[n + 2];
    // Extra size just to demonstrate Insertion
    // and use the array as 1-indexed
    
    printf("Enter %d elements -\n", n);
 
    for (i = 1; i <= n; ++i) {
        scanf("%d%d", &minHeap[i].label, &minHeap[i].value);
    }
 
    BuildHeap(minHeap, n);
 
    printf("\nHeap -\n");
    for (i = 1; i <= n; ++i) {
        printf("%d, %d\n", minHeap[i].label, minHeap[i].value);
    }
 
	struct Vertex node;
	
    printf("\n\nInsert an element -\n");
    scanf("%d%d", &node.label, &node.value);
    n = EnqueueVertex(minHeap, n, node);
 
    printf("\nHeap After Insert -\n");
    for (i = 1; i <= n; ++i) {
        printf("%d, %d\n", minHeap[i].label, minHeap[i].value);
    }
 
    printf("\n\nDelete an Element at index-\n");
    scanf("%d", &i);
    n = DeleteVertex(minHeap, n, i);
 
    printf("\nHeap After Delete -\n");
    for (i = 1; i <= n; ++i) {
        printf("%d, %d\n", minHeap[i].label, minHeap[i].value);
    }
 
    return 0;
}

One thing I would like to point out is that, my heap is an array of structures. Just an integer would suffice, but I made it an array of structures so that it becomes a little generic. So, tomorrow, if you want to you this code as a Priority Queue, you needn’t change anything (or at most the names) in the code. Or, if you wanted to store more information about each node of the Binary Heap, you can simply add attributes to the struct.
Next, for my deletion operation, I swapped the last element and the element to be deleted and called the heapify procedure. This is because, the heapify procedure, checks if the Heap Property is violated and makes the necessary corrections. As the first element of the heap is the smallest, it can be accessed directly by heap[1]. Remember, the arrays in my code are 1-indexed, and all the loops and conditions go with it.
The code is pretty well tested. You can download the input file here.

There is one last topic to discuss before I conclude.

Max Heap – Whatever heaps we have been seeing till now are all Min Heaps. This is because, the Heap Property is such that the minimum element is put on top. What if the Heap Property would just be the reverse of it…? Something like –

Heap Property – Every parent node must have a value greater than or equal to both of its child node.

In this case, by the Heap property, the maximum element would be put on the top and so will be for all the sub-trees or sub-heaps in the structure. This is a Max Heap. Actually, the very first heap that we used in our Build Heap sketch is a Max Heap. Have a close look at it and you will understand the difference. Max Heap differ from Min Heaps only by the Heap Property, due to which the other differences arise.

Whoo…! This has become a really long post…! Thanks a lot to stay with me till the end. I hope this helps you to get started with heaps. It may not be easy to code the Binary Heap, but that’s how you grow your skills in programming. You won’t learn anything if you sit for hours and write the Hello World program over-and-over again. You need to challenge yourself…! Feel free to comment your doubts..! Commenting is super easy if you are a Facebook, Twitter or a Google+ user…! So, don’t hesitate..! 😉 … Keep practicing…! Happy Coding…! 🙂

4 thoughts on “Binary Heaps

    • Lol…! 😛 … Thanks a lot..! 😀 … But, I can never be better than any professor… I wrote this post leisurely… On the other hand, I don’t think professors have the time to explain anything too descriptively… 😉

      Like

  1. Pingback: binary heaps | programmingalgos

  2. Pingback: Graph Theory – Dijkstra’s Algorithm | Theory of Programming

Leave a comment