More Binary Heaps and Heap Sort Algorithm


Hello people…! In this post I will talk about the Binary Heap implemented using structures and not arrays. This is post is the successor to my previous post on Binary Heaps. Using arrays to code Binary Heaps is very comfortable for the programmer. But it is not fit for the day-to-day applications. This is because the Binary Heap is used to implement the Priority Queue and it is meant to grow, not stay of fixed size. But given that it is meant to grow, we cannot use STL Vector for this. This is because the Vector keeps reallocating memory as it grows. So, it is slow. Here we will learn the proper structure based implementation of the Binary Heaps.

Now, currently we have the knowledge of the Binary Heap, its structure, working and other properties. But we implemented this using arrays. Now if we were to implement using a structure, there is one challenge that we will face. That is to ensure the Structural Property of the heap. Why? For insertion into an array, we created a node at the end which itself ensured the Structural Property is not violated and then we bubbled upwards to ensure that the Heap Property is not violated at any point. In the array, we knew exactly where we are supposed to insert the node. But in a tree coded using structures how will you find where to insert the new element such that the tree remains a complete binary tree..? The same goes when we want to delete the root. How do we know where the last node is. For this, we will use the binary representation of the size of the heap to traverse to the location we want.

Size does matter !

While inserting itself we could keep a track of the heap size. So, this takes constant time. While inserting, we must search for the appropriate location. For this, we will convert the size into its binary representation and begin the traversal from the root. How? Keep reading the bits of the binary representation you just generated, if you encounter ‘0’ go to the left child of the parent, if it is a ‘1’ go to the right child of the parent. I know it sounds really weird, but that’s the best thing we can do. Let’s take up an example to understand this better. Consider the Binary Heap below –

BinaryHeaps1

Now consider the binary representation of the number (HeapSize + 1) = 8, which is “1000”. We omit the first bit (left-to-right) because that bit says “come to the root”.

  • The Binary String left is “000”. The leftmost bit is ‘0’. So, go to the Left Child. You come to Node 4.
  • The Binary String left is “00”. The leftmost bit is ‘0’. So, go to the Left Child. You come to Node 1.
  • The Binary String left is “0”. There is only one bit left. This bit doesn’t tell you to “go left” or “go right”, it tells you to “insert it to left” or “insert it to right”. As this bit is “0”, we place the new node as the left child of Node 1.

This is how you must use the binary representation of the (HeapSize + 1) to get to the location you want to insert the node. This is process is depicted in the sketch below –

BinaryHeaps2

Every time you insert a node, you will go all the way down the tree, you will not perform any insertions in the middle, so the traversal always takes O(log n), once inserted, you travel up, ensuring that the heap property holds good, so you have to travel all the way up again, so it is another O(log n) operation. So, the over all process takes O(log n), which is same as in the case of an array implementation.

But for deletion, we are supposed to find something else. We are supposed to find out the last node. In the sketch above, before the insertion of 8, we are supposed to somehow find out that the last node is Node 5. There are a couple of methods by which we can accomplish this –

  1. BFS or DFS – By using the Breadth First Search or the Depth First Search we can find out which is the last node in the Binary Heap by searching the whole tree for a node at the maximum depth which is encountered as last as possible. The time complexity of BFS and DFS is O(|E| + |V|). We would have N vertices and what about the number the edges? We would have N – 1 edges, because every node is pointed by a child pointer, except the root. So, the overall complexity becomes O(N + N), which is O(N).
  2. Binary Traversal – This is the same traversal as we did in the case of Insertion. I just called it Binary Traversal. But the method which we discussed about gave us the Node to be inserted but not the last node. Can you think of a way of modifying the procedure such that it would give us the last node directly…? Think about this for a couple of minutes and you will figure out the simple logic behind it. For those who can’t, don’t worry I will discuss it, when I do, try to figure how silly a thing you missed…! The logic is simple, you used the binary representation of (HeapSize + 1) to get to the location where you were supposed to insert a node, if you use the binary representation of (HeapSize) itself for traversal, you will end up at the last node…! Try working out this on paper and you will realise the logic behind it. As, we would traverse the whole tree from root to leaf, this takes O(log N), which is way faster than BFS and DFS which was an O(N) operation. This kind of Binary Traversal is depicted in the sketch below –

BinaryHeaps3

The next thing that we must know about the structure implementation is that unlike the array implementation, where we could construct the heap using build heap procedure, we cannot do that here. The only way to construct the heap from a given collection of numbers is to keep inserting them one-by-one. As we discussed that in insertion, we would have to go all the way down and come up, inserting a node in the structure implementation will be slower than inserting into a heap implemented using arrays.

So, now let us plan what we would be writing as our code. First things first..! The structure of our nodes in the heap would be –

struct node
{
	int value;
	struct node * parent;
	struct node * leftChild;
	struct node * rightChild;
};

Apart from the standard variables that we see in many structure related codes, we would be needing the parent pointer here. This is because, without that, we cannot travel up the tree, which is needed in insertion and deletion.

Heap Sort – If you have understood the concepts of constructing the Binary Heap, finding the Last Node and Extracting maximum element, then Heap Sort becomes obvious. You just have to keep extracting the maximum element and store it in an array or a list. Then, you will have a sorted array. This essentially destroys the entire heap.

Take a break and come back to code the binary heap. If you have already coded the binary search tree or a binary tree before this should be relatively easy for you. If you have not, then this will be challenging, and I’m sure you’ll be proud of your efforts once you have successfully coded this data structure. You will spend 80% of the time debugging your code, and trust me that’s the way programming is done. I can’t discuss the issues that you might encounter while coding but you can always comment them if you need help. Once you are able to code Insertion and Deletion, Heap Sort should be easy. If you get stuck, you can always refer to my code below –

/*
 * Binary Heap using Structures
 * and Heap Sort Algorithm
 *
 * Authored by,
 * Vamsi Sangam
 */

#include <cstdio>
#include <cstdlib>
#include <list>

using namespace std;

struct node
{
	int value;
	struct node * parent;
	struct node * leftChild;
	struct node * rightChild;
};

// Traverses the Heap top-to-bottom once and returns the location
// of the last node. Works on the priciple of traversing a
// Binary Tree by the binary representation of its size.
struct node * findLastNode(struct node * BinaryHeap, int size)
{
	list<int> Stack;

	// Getting the binary
	// representation of the size
	while (size != 0) {
		if (size & 1) {
			Stack.push_front(true);
		} else {
			Stack.push_front(false);
		}

		size = size >> 1;
	}

	// Omit the first bit
	Stack.pop_front();

	struct node * traverse = BinaryHeap;
	bool temp;

	while (Stack.size() != 0) {
		temp = Stack.front();
		Stack.pop_front();

		if (temp) {
			// encountered 1
			// go to rightChild
			traverse = traverse->rightChild;
		} else {
			// encountered 0
			// go to leftChild
			traverse = traverse->leftChild;
		}
	}

	return traverse;
}

// This makes the neccessary adjustments
// after insertion to ensure the validity
// of Heap Property in the Binary Heap
void bubbleUp(struct node * heap)
{
	if (heap->parent == NULL) {
		return;
	}

	if (heap->parent->value < heap->value) {
		// Swap child and parent
		int temp = heap->value;
		heap->value = heap->parent->value;
		heap->parent->value = temp;

		bubbleUp(heap->parent);
	}
}

// The recursive procedure which goes top-to-bottom from the root to the place of
// insertion and inserts the new node with a value 'Value'. Uses binary traversal.
struct node * insertUtil(struct node * BinaryHeap, int value, list<bool> * Stack)
{
	bool temp;

	if ((*Stack).size() == 1) {
		temp = (*Stack).front();
		(*Stack).pop_front();

		if (temp) {
			// create new node as the right child
			BinaryHeap->rightChild = (struct node *) calloc(1, sizeof(struct node));
			BinaryHeap->rightChild->parent = BinaryHeap;
			BinaryHeap = BinaryHeap->rightChild;
			BinaryHeap->value = value;
		} else {
			// create new node as the left child
			BinaryHeap->leftChild = (struct node *) calloc(1, sizeof(struct node));
			BinaryHeap->leftChild->parent = BinaryHeap;
			BinaryHeap = BinaryHeap->leftChild;
			BinaryHeap->value = value;
		}
	} else {
		temp = (*Stack).front();
		(*Stack).pop_front();

		if (temp) {
			// encountered 1
			// go to rightChild
			BinaryHeap = insertUtil(BinaryHeap->rightChild, value, Stack);
		} else {
			// encountered 0
			// go to leftChild
			BinaryHeap = insertUtil(BinaryHeap->leftChild, value, Stack);
		}
	}

	return BinaryHeap;
}

// Inserts a new node into the binary heap and also makes
// the neccessary swaps to ensure Heap Property isn't violated
void insertIntoHeap(struct node * BinaryHeap, int value, list<bool> * Stack)
{
	(*Stack).pop_front(); // Omit the first bit

	// Inserting into Heap and keeping a
	// track of the newly inserted node
	struct node * newNode = insertUtil(BinaryHeap, value, Stack);

	// Bubble the new node up the
	// Heap to ensure validity of
	// the Heap Property
	bubbleUp(newNode);
}

// Allocates memory for the Binary
// Heap thus creates the root node
struct node * getBinaryHeap(int value)
{
	struct node * BinaryHeap = (struct node *) calloc(1, sizeof(struct node));

	BinaryHeap->value = value;

	return BinaryHeap;
}

// A simple utility function
// to aid In-Order Walk
void print(struct node * BinaryHeap)
{
	printf("%d ", BinaryHeap->value);
}

// Prints the nodes of the Binary
// Heap in the In-Order fashion
void inOrderTraversal(struct node * BinaryHeap)
{
	if (BinaryHeap == NULL) {
		return;
	}

	inOrderTraversal(BinaryHeap->leftChild);
	print(BinaryHeap);
	inOrderTraversal(BinaryHeap->rightChild);
}

// The standard recursive Heapify procedure to
// convert a partial heap to a complete heap
// Mainly used by Extract-Max procedure
void heapify(struct node * BinaryHeap)
{
	if (BinaryHeap->leftChild != NULL && BinaryHeap->rightChild != NULL) {
		// both children exist

		if (BinaryHeap->value < BinaryHeap->leftChild->value && BinaryHeap->value < BinaryHeap->rightChild->value) {
			if (BinaryHeap->leftChild->value > BinaryHeap->rightChild->value) {
				// leftChild is greater

				int temp = BinaryHeap->value;
				BinaryHeap->value = BinaryHeap->leftChild->value;
				BinaryHeap->leftChild->value = temp;

				heapify(BinaryHeap->leftChild);
			} else {
				// rightChild is greater

				int temp = BinaryHeap->value;
				BinaryHeap->value = BinaryHeap->rightChild->value;
				BinaryHeap->rightChild->value = temp;

				heapify(BinaryHeap->rightChild);
			}
		} else if (BinaryHeap->value < BinaryHeap->leftChild->value && BinaryHeap->value > BinaryHeap->rightChild->value) {
			// leftChild is greater & rightChild is smaller than parent

			int temp = BinaryHeap->value;
			BinaryHeap->value = BinaryHeap->leftChild->value;
			BinaryHeap->leftChild->value = temp;

			heapify(BinaryHeap->leftChild);
		} else if (BinaryHeap->value > BinaryHeap->leftChild->value && BinaryHeap->value < BinaryHeap->rightChild->value) {
			// rightChild is greater & leftChild is smaller than parent

			int temp = BinaryHeap->value;
			BinaryHeap->value = BinaryHeap->rightChild->value;
			BinaryHeap->rightChild->value = temp;

			heapify(BinaryHeap->rightChild);
		}
	} else if (BinaryHeap->rightChild == NULL && BinaryHeap->leftChild != NULL) {
		// Only the leftChild exists

		if (BinaryHeap->leftChild->value > BinaryHeap->value) {
			// The existing leftChild is greater than parent
			// So swap

			int temp = BinaryHeap->leftChild->value;
			BinaryHeap->leftChild->value = BinaryHeap->value;
			BinaryHeap->value = temp;
		}

		// When we reach here, logically speaking, we have come
		// to the last node in the Binary Heap, so after this
		// we simply return from the function
	}
}

// Returns the root value and deletes it by swapping the root
// node and the last node and then calling heapify on root node
int extractMax(struct node * BinaryHeap, struct node * lastNode)
{
	if (BinaryHeap == NULL) {
		return -1;
	}

	if (BinaryHeap == lastNode) {
		// A corner case where only
		// one node exists

		int max = BinaryHeap->value;

		free(BinaryHeap);

		return max;
	}

	int max = BinaryHeap->value;

	// Swap root node and last node
	int temp = BinaryHeap->value;
	BinaryHeap->value = lastNode->value;
	lastNode->value = temp;

	// Making its parent point to NULL appropriately
	struct node * parent = lastNode->parent;

	if (parent->leftChild == lastNode) {
		parent->leftChild = NULL;
	} else {
		parent->rightChild = NULL;
	}

	// Deleting the lastNode
	free(lastNode);
	heapify(BinaryHeap);	// making adjustments

	return max;
}

// Heap Sort procedure which takes and empyt list 'Sorted List'
// and fills it with the elements in sorted order. Keeps extracting
// the maximum element from the Max Heap until it is empty
void heapSort(struct node * BinaryHeap, list<int> * SortedList, int HeapSize)
{
	struct node * lastNode;
	list<int>::iterator itr;

	while (HeapSize != 0) {
		//getting the lastNode to assist extractMax procedure
		lastNode = findLastNode(BinaryHeap, HeapSize);
		(*SortedList).push_front(extractMax(BinaryHeap, lastNode));
		--HeapSize;
	}
}

int main()
{
	int n;

	printf("Enter the size of input -\n");
	scanf("%d", &n);

	int temp, i, j, HeapSize = 0;
	struct node * BinaryHeap;
	list<bool> Stack;

	printf("\nEnter %d integers -\n", n);

	for (i = 0; i < n; ++i) {
		scanf("%d", &temp);
		++HeapSize;

		if (i == 0) {
			BinaryHeap = getBinaryHeap(temp);
			continue;
		}

		// Clears the Stack before filling it with
		// the Binary Representation of HeapSize
		Stack.clear();
		j = HeapSize;

		while (j != 0) {
			if (j & 1) {
				Stack.push_front(true);
			} else {
				Stack.push_front(false);
			}

			j = j >> 1;
		}

		insertIntoHeap(BinaryHeap, temp, &Stack);
	}

	printf("\nIn Order Traversal - ");
	inOrderTraversal(BinaryHeap);
	printf("\n");

	list<int> SortedList;

	heapSort(BinaryHeap, &SortedList, HeapSize);
	BinaryHeap = NULL;

	list<int>::iterator itr = SortedList.begin();

	printf("\nElements in Sorted order - ");
	while (itr != SortedList.end()) {
		printf("%d ", *itr);
		++itr;
	}
	printf("\n");

	return 0;
}

The code is big and that is how it should be and moreover, it is well commented with information. If you still have doubts regarding the code or any concept, feel free to comment them. I hope my post has helped you to get a complete idea about the Binary Heap. If it did, let me know by commenting…! Keep practising… Happy Coding…! 🙂

You may also like –

More in the Navigation Page ! →

Advertisements

Segment Trees


Hello people…! In this post I will talk about one of the most versatile data structure in competitive programming, the Segment Tree. Almost in every contest that you do in the major coding websites, at least one question is asked which uses a Segment Tree in its solution. This is because it is easy to code, and does some useful operations quickly, in logarithmic time. Segment Trees are used where the problem demands these two queries on a given array arr[1 … n] –

  1. Sum of elements in the range arr[L … R]. Where, 1 <= L <= R <= n.
  2. Updating an element arr[i].

You can think of doing this in the naive approach where you directly operate on the input array itself. You could do the first operation in O(n) time by running a loop calculate the sum. The second operation could be done in O(1) time, because, it’s only simply updating. This is good if the number of sum queries asked are very less and the update queries are more.

The next best thing you could think of is, Prefix Sum Array. The Prefix Sum Array is an array where the element arr[i] holds the sum of values from arr[1] to arr[i]. You can construct the Prefix Sum Array in O(n) time and solve the first query in O(1) time. Prefix Sum Array is one of my favourite techniques to find out the sum in a given range. But the update query takes O(n) time. If you want to update the ith element, you’d have to update all the elements from i to n, which takes O(n) in the worst case.

How about doing both of them in O(log n) time…? That’s really quick..! This is the speciality of the Segment Tree. We can get the sum query and the update query done in logarithmic time. The Segment Tree works on the principle of partial sums. If you have read my post on Fenwick Trees (or) Binary Indexed Trees, or if you already know BIT, Segment Tree should be easy for you. For those who don’t know, don’t worry, you’ll have an easier time learning the Binary Indexed Trees than compared to the others. So, let’s get started..!

As I mentioned earlier, the Segment Tree works on the principle of partial sums. What this principle is that we somewhere hold the sum of a few elements and keep building on it. In a Segment Tree, this “partial” is actually “half”. The Segment Tree structurally is a Binary Tree. Every internal node is associated with a range of numbers, whose sum is stored in that node. Now, what we do is partition the array into two halves. The left child will hold the sum of elements in the left partition and the right child will hold the sum of elements in the right partition. You may not get the idea of Segment Tree in the first reading. That’s okay. Give another reading and ponder upon the sketch below for a better understanding.

segment1

I hope it is clear from the picture how the Segment Tree works on the partial sums, which here is the left partition sum and the right partition sum. The leaf nodes represent the individual elements of the input array.

Note that the array we will not store the array in the nodes of our Segment Tree as was depicted in the sketch. This was done in the sketch only to give you the idea. The array will not be stored in the nodes. However, we will store the range limits on which the sum is defined, i.e, we will store the start index and the end index of the span of elements in the input array whose sum is there in the node. We will use a struct for the node which looks like this –

struct node
{
	int value, low, high;
	struct node * leftChild;
	struct node * rightChild;
};

The variable ‘value’ was to store the sum, ‘low’ and ‘high’ is to store the range. The variables ‘leftChild’ and the ‘rightChild’ are to point to the children.

Building the Segment Tree –

The Segment Tree is built in a recursive manner. We could write it in a step-by-step procedure –

  • Allocate memory.
  • Start with the root.
  • Partition the array you are given into two halves.
  • Create nodes for each half.
  • The value of the parent node is the sum of values on its children.
  • This is calculated by recursively calling the procedure on the children.
  • When the array given reduces to one element. The procedure must create a node and return the value.
  • When the parent node has acquired the sum of its children, it must return its current value to carry on the recursion.

The step-by-step procedure may not be clear in the first reading. You will know what I was talking about when you understand the pseudo-code below –

int buildSegmentTree(Array, low, high, SegmentTree)
	if low == high
		SegmentTree->value = Array[low]
		SegmentTree->low = SegmentTree->high = low

		return SegmentTree->value

	mid = (low + high) / 2

	SegmentTree->low = low
	SegmentTree->high = high
	SegmentTree->leftChild = allocate memory using calloc()
	SegmentTree->rightChild = allocate memory using calloc()
	SegmentTree->value = buildSegmentTree(Array, low, mid, SegmentTree)
					   + buildSegmentTree(Array, mid + 1, high, SegmentTree)

struct node * getSegmentTree(Array, ArraySize)
	if ArraySize <= 0
		return NULL

	struct node * SegmentTree = allocate memory using calloc()

	buildSegmentTree(Array, 1, ArraySize, SegmentTree)

	return SegmentTree

I made the pseudo-code a little extra-descriptive so that you can understand the step-by-step procedure and the code simultaneously. If you have any doubts, feel free to comment them.

The time complexity of the code is O(n). This is because in the worst case we would end up building a complete binary tree which will have 2 ✗ n – 1 nodes, and we preform calculations that take O(1) time for every node.

Sum Query –

Now, it is time to discuss the two important queries that we started our discussion with. We will begin with the sum query. We are supposed to tell the sum of all the elements in a given range. Now we will use our ‘low’ and ‘high’ variables in each node of the Segment Tree. The way we do this query mimics the Binary Search. We would be given a well defined range and we will keep comparing that with the range in each node of the Tree. If it fits perfectly, then we simply return the value stored in that node. If it doesn’t we have a few cases to handle. To understand the cases, consider the following notations –

  • NodeLeft – This is the left bound of the range whose sum is stored in the node. We actually named this as ‘low’ inn our struct.
  • NodeRight – This is the right bound of the range whose sum is stored in the node. We actually named this as ‘high’ inn our struct.
  • QueryLeft – This is the left bound of the range whose sum we are supposed to calculate.
  • QueryRight – This is the right bound of the range whose sum we are supposed to calculate.
  • Mid – This is the index upon which the partition occurs to assign values to the children nodes. It is essentially the mean of NodeLeft and NodeRight.
  • Value – This is the value that the current node is holding. Same as the ‘value’ variable in our struct.

Now, let’s talk about the cases we would encounter –

  1. (NodeLeft == QueryLeft && NodeRight == QueryRight) – This is the best thing that can happen. As we discussed a moment ago, we simply return the Value.
  2. (NodeLeft >= QueryLeft && NodeRight >= QueryRight) – This is the situation where among the elements that the current node holds the sum of, a few of the elements are not needed. The range that the current node has covers more than that’s needed. Now, to proceed with the recursion, we use Mid –
    • (QueryLeft <= Mid) → getSumQuery(SegmentTree->leftChild, QueryLeft, Mid) – This condition checks if the query demands the sum of elements which are in the left partition. So, we carry on the recursion with query range concerning only the left partition.
    • (QueryRight > Mid + 1) → getSumQuery(SegmentTree->leftChild, QueryLeft, Mid) – This checks if the query wants elements of the right partition. So, we carry on the recursion with query range concerning only the right partition.

    Note that both the conditions must be checked for the sum query to give accurate results. A special case arises when (QueryLeft == QueryRight), which means that we are looking for one element, which would obviously be either in the left or the right partition. In that case, we don’t have the necessity of  checking the second condition if the first condition is already met. This is because if we know the query asks for an element in the left partition, we have nothing to do with the right partition in this case.

This will be more clear once you put this idea to work by pen-and-paper. Coding this is simple too. Once you are able to code it, you will realise its resemblance to Binary Search. In the worst case, we would have to travel all the way down the tree, which takes O(log n).

Update Query –

What do we do if we were to update an element initially given in the input. How would we have to modify our Segment Tree so that it is still correct? We know that the initial input elements can be found at the leaves of the Segment Tree. So, we should be updating that. How do we traverse down to that leaf? We do this similarly as we did it in Sum Query, where we took a range and went down the tree based on it. If we put it in Sum Query terms, QueryLeft is equal to QueryRight. We search for it in a similar fashion. Then, we update the node which we came searching for. We also keep a note of the difference, and we add the difference to all those nodes which we came by. This is because, when you update a leaf of the Segment Tree, the internal nodes which would get affected are those which contain the partial sum in which the leaf is involved. We already met those nodes in our recursive traversal. When we are “coming out” of recursion, we update them by the difference that was caused by the new value. So, basically we go down the tree and come back. So, this operation takes O(2 log n) which is O(log n).

Take a break for a while and come back to start coding the Segment Tree. You will need to do a lot of pen-and-paper work while coding this data structure. And it is totally worth it. To print your Segment Tree, you could use the Tree Walks, In-order or Pre-order or whichever pleases you. If you are getting stuck, you can always refer to my code below –

/*
 * Segment Tree
 * Data Structure
 *
 * Authored by,
 * Vamsi Sangam
 */

#include <stdio.h>
#include <stdlib.h>

struct node
{
	int value, low, high;
	struct node * leftChild;
	struct node * rightChild;
};

// The recursive procedure which actually constructs the Segment Tre
int buildSegmentTree(int arr[], int low, int high, struct node * SegmentTree)
{
	if (low == high) {
		// we have only 1 element of the array
		SegmentTree->value = arr[low];
		SegmentTree->low = SegmentTree->high = low;

		return SegmentTree->value;
	}

	// Partition
	int mid = (low + high) / 2;

	SegmentTree->low = low;			// Setting bounds to the partial sum
	SegmentTree->high = high;		// that this node is associated with

	// Creating children
	SegmentTree->leftChild = (struct node *) calloc(1, sizeof(struct node));
	SegmentTree->rightChild = (struct node *) calloc(1, sizeof(struct node));
	SegmentTree->value = buildSegmentTree(arr, low, mid, SegmentTree->leftChild)
						+ buildSegmentTree(arr, mid + 1, high, SegmentTree->rightChild);

	// Must return the value to carry on recursion
	return SegmentTree->value;
}

// Allocates memory for the root and calls method
// to construct the Segment Tree
struct node * getSegmentTree(int arr[], int n)
{
	if (n <= 0) {
		return NULL;
	}

	struct node * SegmentTree = (struct node *) calloc(1, sizeof(struct node));

	buildSegmentTree(arr, 1, n, SegmentTree);

	return SegmentTree;
}

// A utility function used for Inorder Walk
void print(struct node * SegmentTree)
{
	printf("%d -> [%d, %d]\n", SegmentTree->value, SegmentTree->low, SegmentTree->high);
}

// In-Order Walk of Segment Tree
void inOrderWalk(struct node * SegmentTree)
{
	if (SegmentTree == NULL) {
		return;
	}

	inOrderWalk(SegmentTree->leftChild);
	print(SegmentTree);
	inOrderWalk(SegmentTree->rightChild);
}

// Recursive procedure to find out the sum of elements
// from arr[low] to arr[high] using the Segment Tree
int getSumQuery(struct node * SegmentTree, int low, int high, int sum)
{
	if (low == SegmentTree->low && high == SegmentTree->high) {
		sum += SegmentTree->value;

		return sum;
	}

	if (low == high) {
		int mid = (SegmentTree->low + SegmentTree->high) / 2;

		if (low <= mid) {
			sum = getSumQuery(SegmentTree->leftChild, low, low, sum);
		} else {
			sum = getSumQuery(SegmentTree->rightChild, high, high, sum);
		}
	} else {
		int mid = (SegmentTree->low + SegmentTree->high) / 2;

		if (low <= mid) {
			sum = getSumQuery(SegmentTree->leftChild, low, mid, sum);
		}

		if (high >= mid + 1) {
			sum = getSumQuery(SegmentTree->rightChild, mid + 1, high, sum);
		}
	}

	return sum;
}

// Updates an element in the Segment Tree and makes the neccessary adjustments
int updateQuery(struct node * SegmentTree, int index, int newValue, int difference)
{
	if (SegmentTree->low == SegmentTree->high) {
		difference += newValue - SegmentTree->value;
		SegmentTree->value = newValue;

		return difference;
	}

	int mid = (SegmentTree->low + SegmentTree->high) / 2;

	if (index <= mid) {
		difference = updateQuery(SegmentTree->leftChild, index, newValue, difference);
	} else {
		difference = updateQuery(SegmentTree->rightChild, index, newValue, difference);
	}

	SegmentTree->value += difference;

	return difference;
}

int main()
{
	int n;

	printf("Enter the size of input -\n");
	scanf("%d", &n);

	int arr[n + 1], i;

	printf("\nEnter %d Integers -\n", n);

	arr[0] = -1;
	for (i = 1; i <= n; ++i) {
		scanf("%d", &arr[i]);
	}

	struct node * SegmentTree = getSegmentTree(arr, n);

	// Printing the Tree after constructing
	printf("\nIn-Order Walk of the Segment Tree -\n");
	inOrderWalk(SegmentTree);

	int low, high;

	// Demonstrating Sum Query
	printf("\nEnter the range for sum query - [1, %d]\n", n);
	scanf("%d%d", &low, &high);
	printf("\nSum = %d\n", getSumQuery(SegmentTree, low, high, 0));

	// Demonstrating Update Query
	printf("\nEnter the index of number and new value to update -\n");
	scanf("%d%d", &i, &low);
	updateQuery(SegmentTree, i, low, 0);

	// Printing the Tree after Updating
	printf("\nIn-Order Walk of the Segment Tree after Update -\n");
	inOrderWalk(SegmentTree);

	return 0;
}

There are many varieties of questions that can be asked based on this Data Structure. I’m sure you’ll start liking and appreciating this data structure as you keep solving many questions on it. If you have any doubts, feel free to comment them. I hope my post has helped you in understanding the Segment Tree. If it did, let me know by commenting…! Keep practising… Happy Coding…! 🙂

You may also like –

More in the Navigation Page ! →

Bellman Ford Algorithm


Hello people…! In this post I will talk about another single source shortest path algorithm, the Bellman Ford Algorithm. Unlike Dijkstra’s Algorithm, which works only for a graph positive edge weights, the Bellman Ford Algorithm will give the shortest path from a given vertex for a graph with negative edge weights also. Due to this, the Bellman Ford Algorithm is more versatile, but, it’s speciality comes at a cost. The runtime complexity of Bellman Ford Algorithm is O(|V||E|), which is substantially more than that of Dijkstra’s Algorithm. Sometimes this algorithm is called Bellman Ford Moore Algorithm, as the same algorithm was published by another researcher.

Before we get started, there are a couple of things that we must understand. Firstly, why does Dijkstra’s Algorithm fail for negative edge weights and second, the concept of Negative Cycles.

Why does Dijkstra fail?

Consider, the graph below,

bellman1

The Dijkstra’s Algorithm is based on the principle that, if S → V1 → … → Vk is the shortest path from S → Vk then D(S, Vi) ≤ D(S, Vj). But in the above given graph, clearly that principle is violated. In the above graph, the shortest path from V1 to V3 is of length 3 units but the shortest path to V4 is of length 1 unit which means that V4 is actually closer to V1 than V3, which is contradicting Dijkstra’s principle.

Negative Cycles

A Negative Cycle is a path V1 → V 2 → V3 → … Vk → V1 where the total sum of the edge weights in the path is negative. Consider the graph below –

bellman2

The path B → C → D is a negative cycle as the path’s total weight would be -2. So, the distance from A → B is 2, but if we circle the cycle once, we would get the distance as 0, if we circle once more, we would get -2. Like this we could keep on circling as much as we want to reduce the shortest distance. Hence the shortest distance to the vertex B, E becomes indeterminate.

So, we want Bellman Ford Algorithm to solve these two issues. We want it to compute the shortest path even for a graph with negative edges and negative cycles. The Bellman Ford will accurately compute the shortest path for a graph with negative edges but the algorithm fails for a graph with negative cycles. But, the algorithm can tell you if a negative cycle exists or not. If it exists the solution it puts up is incorrect, otherwise, the solution given by Bellman Ford Algorithm is perfect. This sounds fine because logically there will be no shortest paths for a graph with negative cycles.

Unlike the Dijkstra’s Algorithm where we had to use a priority queue, we will require no additional data structure to code Bellman Ford Algorithm. This makes writing the code much easier. And the algorithm is pretty straight-forward too. Take a look at the pseudo-code of the Bellman Ford Algorithm given below –

bellmanFord(G, s)
	for all edges in G(V)
		D(V) = INT_MAX
		parent[V] = -1

	D(s) = 0

	for i = 1 to |G(V)| - 1
		for each edge (u, v) in G(E)
			if edge can be Relaxed
				D(v) = D(u) + weight of edge (u, v)
				parent[v] = u

	for each edge in G(E)
		if edge can be Relaxed
			return false

	return true

You may not understand the pseudo-code at the first look, here’s a step-by-step representation of it –

  • Initialise the array which contains the shortest distances to infinity (a high integer value in the pseudo-code).
  • Initialise the parent array which contains the parent vertices in the shortest path to NULL (or -1 if it is an integer array).
  • Set the shortest distance of starting vertex to 0.
  • Explore all the edges, and see if you can relax them. If you can, relax the edge and proceed with the exploration.
  • Do the above operation |V| – 1 times.
  • After that, do another exploration on the graph checking all the edges if they can be relaxed. If they can be relaxed, you can a negative cycle in the graph. Hence, return false.
  • If the exploration gets finished successfully, the graph has no negative cycles and the data that you compute dis correct, so return true.

Now, what does exploring all the edges mean? If you are implementing the graph using an Adjacency List, it means to iterate over all the linked lists associated with all vertices. Now, what will be the sum of all the nodes in all Linked Lists in a given Adjacency List? Number of edges off course! So, we check all the edges from, edges of vertex 1 to vertex |V| in a linear manner. This whole operation takes O(|E|) time, which is repeated |V| – 1, so this part of the code takes O(|E||V|) time. Now, analyse the pseudo-code for a few minutes. Ask yourself how would you code this-ans-that. Now, when your mind is filled with the pseudo-code, look at the sketch below. The sketch below is sort of, “dry run” of the pseudo-code stated above –

bellman3

The above sketch is self-explanatory. I hope you understand how the iterations go. In a way it looks like a very ordinary algorithm, without any greedy steps or partitions or so. The Bellman Ford Algorithm is pretty easy to code too. If you can work hard for an hour or two I’m sure you can code this algorithm. It does not require any priority queue or other tools. All you need to code Bellman Ford Algorithm is the pseudo-code. The pseudo-code is very important. Keep looking at the pseudo-code again-and-again whenever you get a doubt. I have put my code below for a reference, it is a C++ code –

/*
 * Bellman-Ford Algorithm
 *
 * Authored by,
 * Vamsi Sangam
 */

#include <cstdio>
#include <cstdlib>
#include <climits>

using namespace std;

// This is used to construct
// the Adjacency List
struct node {
	int vertex, weight;
	struct node * next;
};

// This is used to construct the Shortest Paths to all
// vertices, as we cannot return multiple values,
// we use a struct
struct pathInfo {
	int vertex, distance, parent;
};

// Adds a new edge into the Adjacency List
// Follows Head Insertion for O(1) Insertion
struct node * add(struct node * head, int vertex, int weight)
{
	struct node * p = (struct node *) malloc(sizeof(struct node));

	p->vertex = vertex;
	p->weight = weight;
	p->next = head;

	return p;
}

// Bellman-Ford Algorithm which takes the Graph (adjacencyList[]), starting vertex (startVertex),
// and an empty array shortestDistances[] as input. It applies the algorithm and keeps filling values
// into shortestDistances[]. It returns true if there are no negative edges, and vice-versa.
bool bellmanFord(struct node * adjacencyList[], int vertices, int startVertex, struct pathInfo shortestDistances[])
{
	struct node * traverse;
	int i, j, k;

	// Initialisation
	for (i = 0; i <= vertices; ++i) {
		shortestDistances[i].vertex = i;
		shortestDistances[i].distance = INT_MAX;
		shortestDistances[i].parent = -1;
	}

	// Setting distance to source = 0
	shortestDistances[startVertex].parent = 0;
	shortestDistances[startVertex].distance = 0;

	// The Algorithm that computes Shortest Distances
	for (i = 1; i <= vertices - 1; ++i) {		// Runs 'vertices - 1' times = O(|V|)
		for (j = 1; j <= vertices; ++j) {		// Runs as many times as the edges = O(|E|)
			// The code ahead basically explores the whole of Adjcency List,
			// covering one edge once, so the runtime of the code in this
			// block is O(|E|)

			traverse = adjacencyList[j];

			while (traverse != NULL) {
				if (shortestDistances[j].distance == INT_MAX) {
					// Important...!
					traverse = traverse->next;
					continue;
				}

				// Checking for Relaxation
				if (traverse->weight + shortestDistances[j].distance < shortestDistances[traverse->vertex].distance) {
					// Relaxation
					shortestDistances[traverse->vertex].distance = traverse->weight + shortestDistances[j].distance;
					shortestDistances[traverse->vertex].parent = j;
				}

				traverse = traverse->next;
			}
		}
	}

	// Checking for Negative Cycles
	for (j = 1; j <= vertices; ++j) {
		traverse = adjacencyList[j];

		while (traverse != NULL) {
			// Checking for further relaxation
			if (traverse->weight + shortestDistances[j].distance < shortestDistances[traverse->vertex].distance) {
				// Negative Cycle found as further realxation is possible
				return false;
			}

			traverse = traverse->next;
		}
	}

	return true;
}

int main()
{
	int vertices, edges, i, j, v1, v2, weight;

	printf("Enter the Number of Vertices -\n");
	scanf("%d", &vertices);

	printf("Enter the Number of Edges -\n");
	scanf("%d", &edges);

	struct node * adjacency_list[vertices + 1];
	//Size is made (vertices + 1) to use the
	//array as 1-indexed, for simplicity

	//Must initialize your array
	for (i = 0; i <= vertices; ++i) {
		adjacency_list[i] = NULL;
	}
	printf("Enter the edges -\n\n");

	for (i = 1; i <= edges; ++i) {
		scanf("%d%d%d", &v1, &v2, &weight);
		adjacency_list[v1] = add(adjacency_list[v1], v2, weight);
	}

 	//Printing Adjacency List
	printf("\nAdjacency List -\n\n");
	for (i = 1; i <= vertices; ++i) {
		printf("adjacency_list[%d] -> ", i);

		struct node * temp = adjacency_list[i];

		while (temp != NULL) {
			printf("(%d, %d) -> ", temp->vertex, temp->weight);
			temp = temp->next;
		}

		printf("NULL\n");
	}

	struct pathInfo shortestDistances[vertices + 1];
	int startVertex;

	printf("\nEnter a Start Vertex -\n");
	scanf("%d", &startVertex);

	if (bellmanFord(adjacency_list, vertices, startVertex, shortestDistances)) {
		printf("No Negative Cycles exist in the Graph -\n");
	} else {
		printf("Negative Cycles exists in the Graph -\n");
		// The Bellman-Ford Algorithm does not work with negative cycles,
		// all it can do is merely detect them, so when a negative cycle
		// is detected, the shortestDistances array has wrong values
		return 0;
	}

	printf("\n\nVertex    Shortest Distance to Vertex %d     Parent Vertex-\n", startVertex);
	for (i = 1; i <= vertices; ++i) {
		printf("%d         %d                                 %d\n", i, shortestDistances[i].distance, shortestDistances[i].parent);
	}

	return 0;
}

If you have any doubts regarding the algorithm feel free to drop a comment. I’ll surely reply to them. I hope my post helped you in learning the Bellman Ford Algorithm. If it did, let me know by commenting ! Commenting is super easy if you are a Facebook, Twitter or a Google+ user…! So, don’t hesitate..! 😉 Keep practicing… Happy Coding…! 🙂

You may also like –

More in the Navigation Page ! →

Trie Tree Implementation


Hello people…! In this post we will implement an amazing data structure, the Trie Tree. Trie Trees are are used to search for all occurrences of a word in a given text very quickly. To be precise, if the length of the word is “L“, the trie tree searches for all occurrences of this data structure in O(L) time, which is very very fast in comparison to many pattern matching algorithms.

But I must mention, this data structure is not exactly used for “pattern matching”, it is used to search for the occurrences of the word in the given text. How these both functionalities differ…? We’ll get to know that shortly. The Trie Tree has many applications. Your browser could be using a trie tree internally to search for words when you press Ctrl + F. So, let’s get started with this data structure…!

The Trie Tree is a very straight forward data structure. It is a simple tree where the nodes have an alphabet associated with them. And the way these nodes are arranged is the best part of the Trie Tree. To get an idea take a close look at the sketch below –

Structure of Trie Tree

Structure of Trie Tree

The arrows in the sketch above indicate how to traverse the trie tree in order to tell if a word exists or not. We travel through the root node, down the tree until we hit a leaf. Picking up a character at every edge, we construct our word. Now, you can easily tell why the time to search for a node in the text will be in the order of length of the word to be searched. It’s obvious…! One would have to go down till the leaf. In the worst case, one would have to travel the height of the tree which would be the length of the longest word in the whole text..! 😉

So, as we got a little idea about working with a Trie Tree, let us move on to the serious part. Let us talk about how this is implemented and then we will talk about three fundamental operations done on the Trie Tree –

  • Insert
  • Delete
  • Search

In C, the Trie Tree is implemented using structures. But the C implementation of this data structure gets a little dirty. So we will switch to C++ but we will keep it as C-ish as possible. As we keep discussing about the implementation, you will notice how many advantages we have when we use C++. This will be our structure of each node in the tree –

struct node
{
	struct node * parent;
	struct node * children[ALPHABETS];
	vector<int> occurrences;
};
  • parent – This points to the node which is the parent of the current node in the tree hierarchy. It may seem useless at the first, but we need this to travel up the tree when we want to delete any word. We’ll get to the deletion operation in a moment.
  • children – It points to the children nodes. It is made an array so that we can have O(1) accessing time. But why is the size 26…? Its simple. Consider a word, “th”, now, what could the third letter possibly be, if it had one…? One among the 26 english alphabets…! So that’s why the size is made 26. If you want to make a trie tree for another language, replace the number 26 with the number of letters in your language.
  • occurrences – This will store the starting indices of the word occurring in the given text. Now, why this is made a vector is that, vector is as good as a Linked List with random access. It is one of the most handy ready made data structures available in the C++ STL Library. If you are not familiar with vectors this is a good place to start.
    If this were C, we would have to give a fixed array size, and we would have a problem if the occurrences of a particular node are more. We could avoid this by putting a Linked List there. But we sacrifice random access and a whole lot of operations get time taking. Moreover, the code will get really really cumbersome to manage if you have a tree and a linked list.

Having got a picture of the implementation, let us look at how the operations are done in a Trie Tree.

Insert Operation

When we do an insert operation, there are a few cases –

  1. The word to be inserted does not exist.
  2. The word to be inserted already exists in the tree.
  3. The word to be inserted does not exists, but as the suffix of a word.

The first case is simple. One would have to traverse till the alphabets of the words have nodes in the trie tree or else create new nodes one-after-the-other. And at the end of the word, i.e., the node for the last alphabet, we will mark it as a leaf and push the starting index into the vector indicating the occurrence of the newly inserted word.

During this course of traversal, we will be cutting off the string of the word we have one-by-one as they are processed. This is done by putting using a vector of characters and popping off one character after the other. This is less code to handle and more efficient as we can use a vector as a queue. This is another advantage of using C++.

After having learnt what to do with the first case, you can guess what we would have to do in the second case. We simply have to add a new value to the occurrences vector at the node corresponding to the last alphabet of the word. We can also know the number of occurrences in constant time, we simply return the size of the vector. This is another advantage of using C++.

To understand the challenge in the third case, let’s take a simple example. What would you do with your trie tree if you wanted to insert the word “face” if the word “facebook” is already there in your tree…? This is the third case. The answer to this is the occurrence vector itself. We simply push the starting index of the word into the vector of that node which corresponds to the last alphabet of the word to be inserted, in the above example, this would be the node of “e”. So, what really tells if there’s a word ending with the alphabet corresponding to the current node is the size of the vector.

So I hope you understand how important our vector is. A lot depends on it…!

Delete Operation

The deletion of a word in the trie tree is similar to the insertion, we have a few cases –

  • Error 404 : Word not found…!
  • Word exists as a stand-alone word.
  • Word exists as a prefix of another word.

If the word is not there at all, we don’t have to do anything. We just have to make sure that we don’t mess up the existing data structure…!

The second case is a little tricky. We would have to delete the word bottom-up. That is, we will delete that part of the word which is not a part of any other word. For example, consider the sketch above. If we were to delete “this”, we would delete the letters ‘i’ and ‘s’ as, ‘h’ is a part of another word. This keeps us away from distorting the data structure. If the word were existing multiple number of times we will simply remove the occurrence from the vector of the concerned node.

In the third case too, we will simply delete the occurrence of the word from the vector. We needn’t write a lot of code as we can use the functions in algorithm header file. This is another advantage of using C++.

Note – When we delete the occurrence of a word, we are not concerned about the validity of the indices stored as occurrences of other words. What I mean to say is, suppose we have 10 words. If we delete the 3rd word, the 5th word or the 9th word is supposed to become the 4rth and the 8th word as far as the original text is concerned. But we will not consider this. The data stored in the trie tree is not meant to me deleted or inserted. The Trie Tree is meant for processing the given text not to manipulate the given text.

Search Operation

The search operation is simple and is as we discussed when we began our discussion about the Trie Tree. We go down the tree traversing the nodes and keep “picking up characters” as we go. And the occurrences vector tells us if a word exists that ends with the alphabet associated with the current node, and if so, it gives us the indices of occurrences and also the number of occurrences.

Besides these basic operations, there is another very interesting operation that is done with the Trie Tree –

  • Lexicographical Sort – If we want to print all the words processed into the trie tree lexicographically, all we have to do is do a Preorder Walk on the tree. This will automatically print all the words in the lexicographical order or the dictionary order. This is due to the very structure and arrangement of nodes in a Trie Tree. Now, I’d like to share another interesting thing about pre-order walk in trees… The Pre-order walk works exactly as a Depth First Search (DFS) in graphs, i.e., the sequence in which both the algorithms visit the nodes is the same. Think about this for a while and word out the two algorithms on an example (you could take mine in the sketch above), and you can see why it is so. You will also notice why the printed words would be lexicographically sorted.

Now, having learned a lot about the trie tree, try coding it in C++. If you are uneasy with C++, you can try it in C, but make sure you try at least 3 times. Trying is very important. I don’t know if you are new to reading my posts, but I insist a lot on trying in every post of mine…! If you have succeeded, you’re fabulous…! If not, check out my code below any try figuring out how just close you were…!!

/* ==========  ========== ========== ========= */
//          Trie Tree Data Structure           //
//                using C++ STL                //
//                                             //
//         Functions follow Pascal Case        //
//           Convention and Variables      	   //
//         follow Camel Case Convention        //
//                                             //
//            Author - Vamsi Sangam            //
//            Theory of Programming            //
/* ========== ========== ========== ========== */

#include <cstdio>
#include <cstdlib>
#include <vector>

#define ALPHABETS 26
#define CASE 'a'
#define MAX_WORD_SIZE 25

using namespace std;

struct Node
{
    struct Node * parent;
    struct Node * children[ALPHABETS];
    vector<int> occurrences;
};

// Inserts a word 'text' into the Trie Tree
// 'trieTree' and marks it's occurence as 'index'.
void InsertWord(struct Node * trieTree, char * word, int index)
{
    struct Node * traverse = trieTree;

    while (*word != '\0') {     // Until there is something to process
        if (traverse->children[*word - CASE] == NULL) {
            // There is no node in 'trieTree' corresponding to this alphabet

            // Allocate using calloc(), so that components are initialised
            traverse->children[*word - CASE] = (struct Node *) calloc(1, sizeof(struct Node));
            traverse->children[*word - CASE]->parent = traverse;  // Assigning parent
        }

        traverse = traverse->children[*word - CASE];
        ++word; // The next alphabet
    }

    traverse->occurrences.push_back(index);      // Mark the occurence of the word
}

// Searches for the occurence of a word in 'trieTree',
// if not found, returns NULL,
// if found, returns poniter pointing to the
// last node of the word in the 'trieTree'
// Complexity -> O(length_of_word_to_be_searched)
struct Node * SearchWord(struct Node * treeNode, char * word)
{
    // Function is very similar to insert() function
    while (*word != '\0') {
        if (treeNode->children[*word - CASE] != NULL) {
            treeNode = treeNode->children[*word - CASE];
            ++word;
        } else {
            break;
        }
    }

    if (*word == '\0' && treeNode->occurrences.size() != 0) {
        // Word found
        return treeNode;
    } else {
        // Word not found
        return NULL;
    }
}

// Searches the word first, if not found, does nothing
// if found, deletes the nodes corresponding to the word
void RemoveWord(struct Node * trieTree, char * word)
{
    struct Node * trieNode = SearchWord(trieTree, word);

    if (trieNode == NULL) {
        // Word not found
        return;
    }

    trieNode->occurrences.pop_back();    // Deleting the occurence

    // 'noChild' indicates if the node is a leaf node
    bool noChild = true;

    int childCount = 0;
    // 'childCount' has the number of children the current node
    // has which actually tells us if the node is associated with
    // another word .This will happen if 'childCount' != 0.
    int i;

    // Checking children of current node
    for (i = 0; i < ALPHABETS; ++i) {
        if (trieNode->children[i] != NULL) {
            noChild = false;
            ++childCount;
        }
    }

    if (!noChild) {
        // The node has children, which means that the word whose
        // occurrence was just removed is a Suffix-Word
        // So, logically no more nodes have to be deleted
        return;
    }

    struct Node * parentNode;     // variable to assist in traversal

    while (trieNode->occurrences.size() == 0 && trieNode->parent != NULL && childCount == 0) {
        // trieNode->occurrences.size() -> tells if the node is associated with another word
        //
        // trieNode->parent != NULL -> is the base case sort-of condition, we simply ran
        // out of nodes to be deleted, as we reached the root
        //
        // childCount -> does the same thing as explained in the beginning, to every node
        // we reach

        parentNode = trieNode->parent;

        for (i = 0; i < ALPHABETS; ++i) {
        	if (parentNode->children[i] != NULL) {
        		++childCount;

        		if (trieNode == parentNode->children[i]) {
	                parentNode->children[i] = NULL;
	                free(trieNode);
	        		trieNode = parentNode;
	            }
			}
        }
    }
}

// Prints the 'trieTree' in a Pre-Order or a DFS manner
// which automatically results in a Lexicographical Order
void LexicographicalPrint(struct Node * trieTree, vector<char> word)
{
    int i;
    bool noChild = true;

    if (trieTree->occurrences.size() != 0) {
        // Condition trie_tree->occurrences.size() != 0,
        // is a neccessary and sufficient condition to
        // tell if a node is associated with a word or not

        vector<char>::iterator charItr = word.begin();

        while (charItr != word.end()) {
            printf("%c", *charItr);
            ++charItr;
        }
        printf(" -> @ index -> ");

        vector<int>::iterator counter = trieTree->occurrences.begin();
        // This is to print the occurences of the word

        while (counter != trieTree->occurrences.end()) {
            printf("%d, ", *counter);
            ++counter;
        }

        printf("\n");
    }

    for (i = 0; i < ALPHABETS; ++i) {
        if (trieTree->children[i] != NULL) {
            noChild = false;
            word.push_back(CASE + i);   // Select a child

            // and explore everything associated with the cild
            LexicographicalPrint(trieTree->children[i], word);
            word.pop_back();
            // Remove the alphabet as we dealt
            // everything associated with it
        }
    }

    word.pop_back();
}

int main()
{
    int n, i;
	vector<char> printUtil;		// Utility variable to print tree

	// Creating the Trie Tree using calloc
	// so that the components are initialised
    struct Node * trieTree = (struct Node *) calloc(1, sizeof(struct Node));
    char word[MAX_WORD_SIZE];

	printf("Enter the number of words-\n");
    scanf("%d", &n);

    for (i = 1; i <= n; ++i) {
        scanf("%s", word);
        InsertWord(trieTree, word, i);
    }

    printf("\n");   // Just to make the output more readable
    LexicographicalPrint(trieTree, printUtil);

    printf("\nEnter the Word to be removed - ");
    scanf("%s", word);
    RemoveWord(trieTree, word);

    printf("\n");   // Just to make the output more readable
    LexicographicalPrint(trieTree, printUtil);

    return 0;
}

The code is highly commented with explanation. It is well described, but if you have any doubts regarding the data structure or the code, feel free to comment them. I have used a few macros. The macro CASE indicates for which case the Trie Tree works. If we mention ‘A’ as the macro, the Trie Tree will work for upper case words only.

Other Implementations

The code is well tested against fairly large input. You can download the test case file here – Trie Tree Input (PDF). You can just clock your code for the insert operations. My code took 1.236 seconds to execute that loop which reads a word and inserts it into the Trie Tree. There are 5000 words in total. The last word in the input file is the word to be deleted.

If you think you can give a suggestion to make my code better, please do comment them too. I appreciate your suggestions. For those there who are struggling to get their code right, “Keep the effort going guys”…! Remember, you won’t learn anything if you keep writing Hello World program again and again. So, keep practicing…! I hope my post has helped you in getting to know about the Trie Tree. If it did, let me know by commenting ! Happy Coding…! 😀

Dijkstra’s Algorithm


Hello people…! In this post I will talk about one of the fastest single source shortest path algorithms, which is, the Dijkstra’s Algorithm. The Dijkstra’s Algorithm works on a weighted graph with non-negative edge weights and ultimately gives a Shortest Path Tree. It is a Greedy Algorithm, which sort of… mimics the working of Breadth First Search and Depth First Search. It is used in a number of day-to-day scenarios. It is used in network routing, to calculate the path from a network device A and B in a network which would have the maximum bandwidth. It could also be used by the GPS in a car to calculate the shortest path between two locations. The Dijkstra’s Algorithm can be modified to solve a lot of real world problems. So let’s get started…!

The Dijkstra’s Algorithm starts with a source vertex ‘s‘ and explores the whole graph. We will use the following elements to compute the shortest paths –

  • Priority Queue Q, implemented by a Min Binary Heap using C++ STL Vector.
  • Another set D, which keeps the record of the shortest paths from starting vertex s. Implemented using C++ STL Vector.

Just like the other graph search algorithms, Dijkstra’s Algorithm is best understood by listing out the algorithm in a step-by-step process –

  • The Initialisation –
  1. D(s), which is the shortest distance to s is set to 0. It is obvious as distance between source to itself is 0.
  2. For all the other vertices V, D(V) is set to infinity as we do not have a path yet to them, so we simply say that the distance to them is infinity.
  3. The Priority Queue Q, is constructed which is initially holds all the vertices of the Graph. Each vertex V will have the priority D(V).
  • The Algorithm –
  1. Now, pick up the first (or the minimum) element from the Priority Queue Q (which removes it from Q). For the first time, this operation would obviously give s.
  2. For all the vertices adjacent to s, i.e., for all vertices in adjacencyMatrix[s], check if the edge from s → v gives a shorter path. This is done by checking the following condition –

    if, D(s) + (weight of edge s → v) < D(v), we found a new shorter route, so update D(v)
    D(v) = D(s) + (weight of edge s → v)

  3. Now pick the next element from Q, and repeat the process until there are elements left in Q.

It might look like a long and cumbersome process, but this is actually a very smart technique. It’s okay if you don’t understand it in the first reading. Give another 3-4 readings and try to picture what is happening to the graph when you implement the algorithm, in your head. After you feel you have got a hang of the algorithm, look at the sketch below for complete understanding.

Dijkstra's Algorithm Step-by-Step

Dijkstra’s Algorithm Step-by-Step

 

The Dijkstra’s Algorithm is a little tricky. Many don’t understand it in the first attempt. In reference to the diagram above, I will give a step-by-step explanation for each graph marked with the number on top in purple.

  1. Firstly, initialize your components, the shortest distances array D, the priority queue Q, and starting vertex s. The distance from source to itself is zero. So, D(s) = 0, and the rest of the array is ∞ . The set of vertices V are inserted into the priority queue Q, with a priority D(V). Now, we start our algorithm by extracting (hence removing it from the priority queue) the minimum element from the priority queue. The minimum element in the priority queue will definitely be s (which is A here). Look at all the adjacent vertices of A. Vertices B, C, D are adjacent to A. We can go to B travelling the edge of weight 2, to C travelling an edge of weight 1, to D travelling an edge of weight 5. The values of D(B), D(C), D(D) are ∞ . We have found a new way of reaching them in 2, 1, 5 units respectively, which is less than ∞ , hence a shorter path. This is what the if-condition mentioned above does. So, we update the values of D(B), D(C), D(D) and the priorities of B, C, D, in the priority queue. With this we have finished processing the Vertex A.
  2. Now, the process continues to its next iteration and we extract the minimum element from the priority queue. The minimum element would be Vertex C which would be having a priority of 1. Now, look at all the adjacent vertices to C. There’s Vertex D. From C, the it would take 1 unit of distance to reach D. But to reach C in prior, you need 1 more unit of distance. So, if you go to D, via C, the total distance would be 2 units, which is less than the current value of shortest distance discovered to D, D(D) = 5. So, we reduce the value of D(D) to 2. This reduction is also called as “Relaxation”. With that we’re done with Vertex C.
  3. Now, the process continues to its next iteration and we extract the minimum element from the priority queue. Now, there are two minimum elements, B and D. You can go for anyone, it doesn’t matter. For now, we will go for Vertex D. From Vertex D, you can go to Vertex E, and Vertex F, with a total distance of 2 + 2 {D(D) + (weight of D → E)}, and 2 + 3. Which is less than ∞ , so D(E) becomes 4 and D(F) becomes 5. We’re done with Vertex D.
  4. Now, the process continues to its next iteration and we extract the minimum element from the priority queue. The minimum element in the priority queue is vertex B. From vertex B, you can reach vertex F in 2 + 1 units of distance, which is less than the current value of D(F), 5. So, we relax D(F) to 3. From vertex B, you can reach vertex D in 2 + 2 units of distance, which is more than the current value of D(D), 2. This route is not considered as it is clearly proven to be a longer route. With that we’re done with vertex B.
  5. Now, the process continues to its next iteration and we extract the minimum element from the priority queue. The minimum element in the priority queue is vertex E. From vertex E, you can reach vertex C in 4 + 4 units of distance, which is more than the current value of D(C), 1. This route is not considered as it is clearly proven to be a longer route. With that we’re done with vertex E.
  6. Now, the process continues to its next iteration and we extract the minimum element from the priority queue. The minimum element in the priority queue is vertex F. You cannot go to any other vertex from vertex F, so, we’re done with vertex F.
  7. With the removal of vertex F, our priority queue becomes empty. So, our algorithm is done…! You can simply return the array D to output the shortest paths.

Having got an idea about the overall working of the Dijkstra’s Algorithm, it’s time to look at the pseudo-code –

dijsktra(G, S)
	D(S) = 0
	Q = G(V)

	while (Q != NULL)
		u = extractMin(Q)
		for all V in adjacencyList[u]
			if (D(u) + weight of edge < D(V))
				D(V) = D(u) + weight of edge
				decreasePriority(Q, V)

In the pseudo-code, G is the input graph and S is the starting vertex. I hope you understand the pseudo-code. If you don’t, feel free to comment your doubts. Now, before we code Dijkstra’s Algorithm, we must first prepare a tool, which is the Priority Queue.

The Priority Queue

The Priority Queue is implemented by a number of data structures such as the Binary Heap, Binomial Heap, Fibonacci Heap, etc. The priority queue in my code is implemented by a Binary Heap. If you are not aware about the Binary Heap, you can refer to my post on Binary Heaps. But the implementation that we will do here is different from that in my post regarding the Binary Heap. The difference is that, here we will implement the Binary Heap using C++ STL Vector, not an array. This is because a heap is meant to grow, not remain of a fixed size, so we are using a data structure that can grow, a vector. Also, when we use a vector we can apply the same traversing techniques that we used in the case of an array. And, every element of our vector will be a Pair of two integers from the utility header file. The two integers represent vertex and weight, which is actually the shortest distances and this weight property will function as the priority to the elements. So, the vertices are placed in the priority queue based on their weight property. We will have to code the operations based on this weight property. Now the functionalities that we need from our priority queue are –

  • Insert – We will insert |V| elements into the Priority Queue.
  • Extract Min – We return the top-most element from the Binary Heap and delete it. Finally we make the neccessary
  • Decrease Priority – We decrease the priority of an element in the priority queue when we find a shorter path, as known as Relaxation.

If you know the working of the Binary Heap and have a little knowledge about the STL Library, you can code the Priority Queue in about 2-5 hours. You can keep referring to the internet of various functions and the syntaxes and any other doubts you have. Typing the code doesn’t take long, but debugging it and making it work takes the most time. That’s how you learn coding. Try your best, work on it for hours, if you don’t get it, take a break for 10-20 minutes… Come back and check my code below and try to figure out how close you were to getting it perfect…!

/*
 * Priority Queue implemented
 * by a Binary Heap using
 * C++ STL Vector, for
 * Dijkstra's Algorithm
 *
 * Authored by,
 * Vamsi Sangam
 */

#include <cstdio>
#include <vector>
#include <utility>

using namespace std;

// Inserts an element into the Queue
void enqueue(vector< pair<int, int> > * priorityQueue, pair<int, int> * entry)
{
	(*priorityQueue).push_back(*entry);

	int i = (*priorityQueue).size() - 1;
	pair<int, int> temp;

	while (i > 0) {
		if ((*priorityQueue)[(i - 1) / 2].second > (*priorityQueue)[i].second) {
			temp = (*priorityQueue)[(i - 1) / 2];
			(*priorityQueue)[(i - 1) / 2] = (*priorityQueue)[i];
			(*priorityQueue)[i] = temp;

			i = (i - 1) / 2;
		} else {
			break;
		}
	}
}

// Iterates over the Queue to return the index of the element
int findByKey(vector< pair<int, int> > * priorityQueue, pair<int, int> entry)
{
	int i;

	for (i = 0; i < (*priorityQueue).size(); ++i) {
		if ((*priorityQueue)[i].first == entry.first) {
			break;
		}
	}

	if (i != (*priorityQueue).size()) {
		return i;
	} else {
		return -1;
	}
}

// Decreases the priority of the element and makes neccessary adjustments
void decreasePriority(vector< pair<int, int> > * priorityQueue, int index, int newWeight)
{
	(*priorityQueue)[index].second = newWeight;

	int i = index;
	pair<int, int> temp;

	while (i > 0) {
		if ((*priorityQueue)[(i - 1) / 2].second > (*priorityQueue)[i].second) {
			temp = (*priorityQueue)[(i - 1) / 2];
			(*priorityQueue)[(i - 1) / 2] = (*priorityQueue)[i];
			(*priorityQueue)[i] = temp;

			i = (i - 1) / 2;
		} else {
			break;
		}
	}
}

// Returns the minimum element, deletes it and makes neccessary changes
pair<int, int> extractMin(vector< pair<int, int> > * priorityQueue)
{
	pair<int, int> min = (*priorityQueue)[0];
	pair<int, int> temp;

	// Swap first and last
	temp = (*priorityQueue)[0];
	(*priorityQueue)[0] = (*priorityQueue)[(*priorityQueue).size() - 1];
	(*priorityQueue)[(*priorityQueue).size() - 1] = temp;

	(*priorityQueue).pop_back();

	int i = 0;
	pair<int, int> parent;
	pair<int, int> rightChild;
	pair<int, int> leftChild;

	while (i < (*priorityQueue).size()) {
		parent = (*priorityQueue)[i];
		printf("Currently at - (%d, %d)\n", parent.first, parent.second);

		if (2 * (i + 1) < (*priorityQueue).size()) {
			// both children exist
			rightChild = (*priorityQueue)[2 * (i + 1)];
			leftChild = (*priorityQueue)[2 * (i + 1) - 1];

			if (parent.second < leftChild.second && parent.second < rightChild.second) {
				break;
			} else {
				if (leftChild.second < rightChild.second) {
					temp = (*priorityQueue)[2 * (i + 1) - 1];
					(*priorityQueue)[2 * (i + 1) - 1] = (*priorityQueue)[i];
					(*priorityQueue)[i] = temp;

					i = 2 * (i + 1) - 1;
				} else {
					temp = (*priorityQueue)[2 * (i + 1) - 1];
					(*priorityQueue)[2 * (i + 1) - 1] = (*priorityQueue)[i];
					(*priorityQueue)[i] = temp;

					i = 2 * (i + 1);
				}
			}
		} else if ((2 * (i + 1)) >= (*priorityQueue).size() && (2 * (i + 1) - 1) < (*priorityQueue).size()) {
			// only left child exists
			leftChild = (*priorityQueue)[2 * (i + 1) - 1];

			if (leftChild.second < parent.second) {
				temp = (*priorityQueue)[2 * (i + 1) - 1];
				(*priorityQueue)[2 * (i + 1) - 1] = (*priorityQueue)[i];
				(*priorityQueue)[i] = temp;
			}

			break;
		} else {
			// no more children exist
			break;
		}
	}

	return min;
}

int main()
{
	int n;

	printf("Enter the size -\n");
	scanf("%d", &n);

	int vertex, weight, i;
	vector< pair<int, int> > priorityQueue;
	pair<int, int> entry;

	for (i = 0; i < n; ++i) {
		scanf("%d%d", &vertex, &weight);
		entry = make_pair(vertex, weight);
		enqueue(&priorityQueue, &entry);
	}

	printf("\n\nThe Priority Queue (Interpret as Binary Heap) -\n");

	vector< pair<int, int> >::iterator itr = priorityQueue.begin();

	while (itr != priorityQueue.end()) {
		printf("(%d, %d) ", (*itr).first, (*itr).second);
		++itr;
	}
	printf("\n");

	pair<int, int> min = extractMin(&priorityQueue);
	printf("\n\nExtract Min returned = (%d, %d), The Queue -\n", min.first, min.second);
	itr = priorityQueue.begin();

	while (itr != priorityQueue.end()) {
		printf("(%d, %d) ", (*itr).first, (*itr).second);
		++itr;
	}
	printf("\n");

	decreasePriority(&priorityQueue, priorityQueue.size() - 1, 1);
	printf("\n\ndecreasePriority() used, The Queue -\n");
	itr = priorityQueue.begin();

	while (itr != priorityQueue.end()) {
		printf("(%d, %d) ", (*itr).first, (*itr).second);
		++itr;
	}
	printf("\n");

	return 0;
}

There are many other functionalities that a Priority Queue can give. But for now, we’ll need only these.

Joining the pieces

After you have your tool ready, you are all good to code Dijkstra’s Algorithm. Coding the Dijkstra’s Algorithm is easy but can go really weird. This is because you need to handle and co-ordinate many data structures at once. You’ll have to manage the adjacency list, the priority queue, the shortest distance array, and most importantly your loops…! You will surely end up spending more time in debugging the code, which is perfectly the right way of doing it. All throughout the coding, keep revising the step-by-step process explained above. If you don’t get it you, don’t fret, I put my code below. Before you look at my code, I would like to mention a few things –

  • We cannot have infinity in programming, so the shortest distances are initialised to the highest possible value in integer range, present in a macro, INT_MAX, in the header file climits.
  • The header file utility must be included to use pairs.
/* 
 * Dijkstra's Algorithm in C++
 * using Binary Heap as Priority
 * Queue implemented using 
 * C++ STL Vector
 *
 * Authored by,
 * Vamsi Sangam
 */

#include <cstdio>
#include <cstdlib>
#include <climits>
#include <vector>
#include <utility>

using namespace std;

// Our Vertex for Graph
struct node {
	int vertex, weight;
	struct node * next;
};

// To construct our Adjacency List
// Follows Head Insertion to give O(1) insertion
struct node * addEdge(struct node * head, int vertex, int weight)
{	 
	struct node * p = (struct node *) calloc(1, sizeof(struct node));
	
	p->vertex = vertex;
	p->weight = weight;
	p->next = head;
	
	return p;
}

// Adds vertices to the Priority Queue, Vertices a represented
// as pairs of vertex number and its shortest distance.
// This is logically a Binary Heap Insertion
void enqueue(vector< pair<int, int> > * priorityQueue, pair<int, int> * entry)
{
	(*priorityQueue).push_back(*entry);
	
	int i = (*priorityQueue).size() - 1;
	pair<int, int> temp;
	
	while (i > 0) {
		// Checking the priority of the parent, if greater, swap, else, we are done
		if ((*priorityQueue)[(i - 1) / 2].second > (*priorityQueue)[i].second) {
			temp = (*priorityQueue)[(i - 1) / 2];
			(*priorityQueue)[(i - 1) / 2] = (*priorityQueue)[i];
			(*priorityQueue)[i] = temp;
			
			i = (i - 1) / 2;
		} else {
			break;
		}
	}
}

// Finds for a Vertex in the Priority Queue and 
// returns its index as in its vector implementation
int findByKey(vector< pair<int, int> > * priorityQueue, pair<int, int> entry)
{
	int i;
	
	// Linear Search
	for (i = 0; i < (*priorityQueue).size(); ++i) {
		if ((*priorityQueue)[i].first == entry.first) {
			break;
		}
	}
	
	if (i != (*priorityQueue).size()) {
		return i;
	} else {
		return -1;
	}
}

// Decreases the priority of a given entry in the
// Priority Queue who's location is given by 'index'
// to 'newWeight' and re-arranges the Binary Heap
void decreasePriority(vector< pair<int, int> > * priorityQueue, int index, int newWeight)
{
	// Decreasing Priority
	(*priorityQueue)[index].second = newWeight;
	
	int i = index;
	pair<int, int> temp;
	
	// Adjusting the Binary Heap, similar re-arrangement as done in enqueue()
	while (i > 0) {
		if ((*priorityQueue)[(i - 1) / 2].second > (*priorityQueue)[i].second) {
			temp = (*priorityQueue)[(i - 1) / 2];
			(*priorityQueue)[(i - 1) / 2] = (*priorityQueue)[i];
			(*priorityQueue)[i] = temp;
			
			i = (i - 1) / 2;
		} else {
			break;
		}
	}
}

// Picks up the minimum element of the Priority Queue and re-arranges
// the Binary Heap and finally returns the Minimum Element picked.
// Functionally resembles Delete operation in Binary Heap, but additionally
// returns the deleted element which is the minimum element.
pair<int, int> extractMin(vector< pair<int, int> > * priorityQueue)
{
	pair<int, int> min = (*priorityQueue)[0];		// Min Element Stored
	pair<int, int> temp;
	
	// Swap first and last elements
	temp = (*priorityQueue)[0];
	(*priorityQueue)[0] = (*priorityQueue)[(*priorityQueue).size() - 1];
	(*priorityQueue)[(*priorityQueue).size() - 1] = temp;
	
	(*priorityQueue).pop_back();					// Deleted Min Element
	
	int i = 0;
	pair<int, int> parent;			// These three variables
	pair<int, int> rightChild;		// are for readability of
	pair<int, int> leftChild;		// the if conditions ahead
	
	while (i < (*priorityQueue).size()) {
		parent = (*priorityQueue)[i];
		
		if (2 * (i + 1) < (*priorityQueue).size()) {
			// both children exist
			rightChild = (*priorityQueue)[2 * (i + 1)];
			leftChild = (*priorityQueue)[2 * (i + 1) - 1];
			
			if (parent.second < leftChild.second && parent.second < rightChild.second) {
				// Parent has lesser priority than its children, so wer're done
				break;
			} else {
				if (leftChild.second < rightChild.second) {
					// Left-child has a lesser priority, so, swap
					temp = (*priorityQueue)[2 * (i + 1) - 1];
					(*priorityQueue)[2 * (i + 1) - 1] = (*priorityQueue)[i];
					(*priorityQueue)[i] = temp;
				
					i = 2 * (i + 1) - 1;
				} else {
					// Right-child has a lesser priority, so, swap
					temp = (*priorityQueue)[2 * (i + 1) - 1];
					(*priorityQueue)[2 * (i + 1) - 1] = (*priorityQueue)[i];
					(*priorityQueue)[i] = temp;
				
					i = 2 * (i + 1);
				}
			}
		} else if ((2 * (i + 1)) >= (*priorityQueue).size() && (2 * (i + 1) - 1) < (*priorityQueue).size()) {
			// only left child exists
			leftChild = (*priorityQueue)[2 * (i + 1) - 1];
			
			if (leftChild.second < parent.second) {
				// Left-child has a lesser priority, so, swap
				temp = (*priorityQueue)[2 * (i + 1) - 1];
				(*priorityQueue)[2 * (i + 1) - 1] = (*priorityQueue)[i];
				(*priorityQueue)[i] = temp;
			}
			
			break;
		} else {
			// no more children exist
			break;
		}
	}
	
	return min;
}

// The Dijkstra's Algorithm sub-routine which takes the Graph as Adjcaency List,
// number of vertices, a starting vertex, and an empty shortestDistances array as
// input and computest the shortest paths and fills them up in the shortestDistances array
void dijkstra(struct node * adjacencyList[], int vertices, int startVertex, pair<int, int> shortestDistances[])
{
	int i;
	
	// Initially no routes to vertices are know, so all are infinity,
	// here, we initialize to a very high integer value
	for (i = 0; i <= vertices; ++i) {
		shortestDistances[i].first = INT_MAX;
		shortestDistances[i].second = -1;
	}
	
	// Setting distance to source to zero
	shortestDistances[startVertex].first = 0;
	shortestDistances[startVertex].second = 0;
		
	struct node * trav;
	vector< pair<int, int> > priorityQueue;
	pair<int, int> min;
	
	// Making a the vertex that corresponds to the
	// 'startVertex' which will have a priority of zero
	// and we begin to intialise the Priority Queue
	pair<int, int> entry = make_pair(startVertex, 0);
	enqueue(&priorityQueue, &entry);
	
	// Initialising Priority Queue
	for (i = 1; i <= vertices; ++i) {
		if (i == startVertex) {
			continue;
		} else {
			// Priorities are set to a high integer value
			entry = make_pair(i, INT_MAX);
			enqueue(&priorityQueue, &entry);
		}
	}

	// We have the tools ready..! Let's roll...!!
	while (priorityQueue.size() != 0) {		// Untill there are vertices to be processed in Queue
		min = extractMin(&priorityQueue);	// Greedily process the nearest vertex
		
		trav = adjacencyList[min.first];	// Checking all the vertices adjacent to 'min'
		while (trav != NULL) {
			if (shortestDistances[trav->vertex].first > shortestDistances[min.first].first + trav->weight) {
				// We have discovered a new shortest route
				// Make the neccesary adjustments in data
				entry = make_pair(trav->vertex, trav->weight);
				
				int index = findByKey(&priorityQueue, entry);
				
				decreasePriority(&priorityQueue, index, trav->weight);
				shortestDistances[trav->vertex].first = shortestDistances[min.first].first + trav->weight;
				shortestDistances[trav->vertex].second = min.first;
			}
			
			trav = trav->next;
		}
	}
}

void PrintShortestPath(pair<int, int> shortestDistances[], int vertex, int startVertex)
{
	if (vertex == startVertex) {
		printf("%d ", startVertex);
		return;
	} else {
		PrintShortestPath(shortestDistances, shortestDistances[vertex].second, startVertex);
		printf("%d ", vertex);
	}
}

int main()
{
	int vertices, edges, i, j, v1, v2, w;
	
	printf("Enter the Number of Vertices -\n");
	scanf("%d", &vertices);
	
	printf("Enter the Number of Edges -\n");
	scanf("%d", &edges);
	
	struct node * adjacencyList[vertices + 1];
	//Size is made (vertices + 1) to use the
	//array as 1-indexed, for simplicity
	
	//Must initialize your array
	for (i = 0; i <= vertices; ++i) {
		adjacencyList[i] = NULL;
	}
	
	printf("\n");
	
	for (i = 1; i <= edges; ++i) {
		scanf("%d%d%d", &v1, &v2, &w);
		adjacencyList[v1] = addEdge(adjacencyList[v1], v2, w);
	}
 
 	//Printing Adjacency List
	printf("\nAdjacency List -\n\n");
	for (i = 1; i <= vertices; ++i) {
		printf("adjacencyList[%d] -> ", i);
		
		struct node * temp = adjacencyList[i];
		
		while (temp != NULL) {
			printf("%d(%d) -> ", temp->vertex, temp->weight);
			temp = temp->next;
		}
		
		printf("NULL\n");
	}
	
	int startVertex;
	
	printf("Choose a Starting Vertex -\n");
	scanf("%d", &startVertex);
	
	pair<int, int> shortestDistances[vertices + 1];
	// shortestDistances.first = shortest distance calculated
	// shortestDistances.second = parent node
	
	dijkstra(adjacencyList, vertices, startVertex, shortestDistances);
	
	printf("\n\nDijkstra's Algorithm Used - Shortest Paths from Vertex %d -\n", startVertex);
	for (i = 1; i <= vertices; ++i) {
		printf("Vertex %d, Distance = %d, Parent = %d\n", i, shortestDistances[i].first, shortestDistances[i].second);
	}
	printf("\n");
	
	PrintShortestPath(shortestDistances, vertices /* The Last Vertex */, startVertex);
	
	return 0;
}

This is the Dijkstra’s Algorithm. The code is well commented with explanation. If you don’t understand anything or if you have any doubts. Feel free to comment them. Now talking about the complexity of Dijkstra’s Algorithm. We perform |V| enqueue operations into the priority queue, which take O(log N), here, N is |V|, so this takes O(|V| log |V|). And at most |E| decrease priority operations which will take O(|V|) time. The extract-min is also called |V| times which will take O(|V| log |V|) time. So, the overall complexity of Dijkstra’s Algorithm we wrote is O(|V| log |V| + |E||V|). Dijkstra’s Algorithm can be improved by using a Fibonacci Heap as a Priority Queue, where the complexity reduces to O(|V| log |V| + |E|). But the Fibonacci Heap is an incredibly advanced and difficult data structure to code. We’ll talk about that implementation later.

I really hope my post has helped you in understanding the Dijkstra’s Algorithm. If it did, let me know by commenting. I tried my best to keep it as simple as possible. If you have any doubts, you can comment them too and I will surely reply to them. This algorithm is a particularly tough one. So, good luck… Keep practicing and… Happy Coding…! 🙂

You may also like –

More in the Navigation Page ! →

Binary Heaps


Hello people…! In this post I will talk about Binary Heaps, or, more casually called as Heaps. Binary Heaps are used to implement Priority Queues which are associated with Scheduling. Binary Heaps is a starting step to learn many advanced heap related Data Structures that are used in many algorithms. So let’s understand the need of the situation by taking up a scenario.

Let’s say that the CPU has to execute ‘n’ programs that take their own pre-defined time to get completed. Then one strategy to schedule them is the Shortest Remaining Processing Time (SRPT) first method. As the name suggests, we look at all the available programs, then pick up that program which takes the minimum time to get executed. After completing that program, then the next minimum is supposed to be picked up. So if we observe what operations are we doing on the data we have, they would be –

  • Finding the minimum of given elements.
  • Deleting the minimum element (after a program gets executed).
  • Inserting a new element (if a new program enters the scheduling).

So, we must choose such a Data Structure which takes the minimum to these operations so that we can come up with an efficient solution to the scenario. Binary Heap is one such Data Structure that does these operations in pretty fast time. Now, what exactly is the Binary Heap…? What does it look like…? The Binary Heap is actually a Binary Tree has two very important properties –

  • Structural Property – The value of every node in the Binary Tree should be less than or equal to the values of its children nodes.
  • Heap Property – The Binary Tree should be fully-filled up to the 2nd last level and the last level must be “left-filled”.

To understand these two properties let’s take up an example –

heaps1 (2)

So, as we can see, the last level is what is called “left-filled”, i.e., all the nodes are to the left and all the empty spaces to the right. The heap property is also depicted. The parent is smaller than both of its children. Now, having introduced this structure to you. Can you think where the smallest element of the heap would be….? At the top of course…! This is due to the very definition of the Heap Property. If the parent has the lower number, then the ancestor of all the nodes should surely have the least number. But this does not say that the leaves at the last level have the greatest number. Think about it for a while looking at the picture and you will understand it.

Now, the Binary Heap is although depicted as a tree, it is implemented as an array…! Just like the Fenwick Tree. The Binary Tree structure of the allows this. Look at the sketch below –

heaps2

As, you can see, the nodes of the heap are put into an array in the left-to-right order at each level. Why is this so…? This is the way we can access all the nodes in a heap. How…? If we take any element arr[i] –

  • Parent of Element at i → arr[ i2 ]
  • Left Child of Element at → arr[(2 ✕ i)]
  • Right Child of Element at → arr[(2 ✕ i) + 1]

Looking at the picture, we can verify the above statements. For example, take the node 9 (arr[6]) which is at level 3. Its parent is arr[ 92 ] which is arr[3], i.e., node 6 at level 2, as it is in the structure. The left child of arr[6] is arr[(2 ✕ 6)], which is, arr[12], node 20 and like this even the right child can be verified. The way to find the parent is actually floor( i2 ), but as both ‘i’ and 2 are integers the result comes as it is desired. Now, we must discuss the five important operations in a binary heap –

  • Getting minimum element
  • Inserting a new element
  • Deleting a new Element
  • Heapify
  • Build Heap

Minimum Element – As you can clearly see, the minimum element would be the topmost element of the heap which is the first element of the array. So this operation takes O(1) time.

Inserting an Element – Recall that the last level of the heap must be left-filled. So, where ever you add a node, the resulting structure should be such that the last level will have an extra element. We do this, by first adding the new element to the end, then to ensure that the Structural Property is satisfied we keep swapping the element with its parent until the parent node to be swapped is less than the inserted element. So, we kind of insert the element at the bottom and we “move up” the heap. This process is illustrated in the sketch below.

heaps3

This is how we insert a node into a heap. In the worst case we would have to travel all the way up which costs us (log n) swaps, which is the height of the tree. So, the insertion operation has a complexity of O(log n).

Deleting an Element – Deleting an element is also similar to Inserting the element. Structurally, when an element is deleted, the last level will lose the right-most node. So what we do is, we swap the last element and the node to be deleted, then remove the last element. And carry the swapped element up the heap just as we did in Insertion so that the Heap Property is not violated. This is illustrated in the sketch below –

heaps4

This is how the Deletion Operation is done in a Binary Heap. As you can clearly analyse that the worst case would be if we have to swap elements all the way up the Binary Heap, where the Deletion Operation would take O(log n) time.

Heapify – The heapify operation is as the name suggests, converting some “non-heap” thing into a heap. Well what this non-heap thing is that, it is a parent node, where the left and the right children are heaps, but the structure as a whole isn’t a heap. How is that possible…? This parent node we are talking about, its value could be greater than one or both of its children. In such a case, the smaller valued child is swapped with the parent and this is carried on down the tree until the heap property holds true for the whole structure. To understand this, observe the sketch below.

heap5

The process of heapify is clearly shown in the sketch, how we keep swapping the child nodes till the heap property is satisfied for the whole structure. This too takes O(log n) time in the worst case, when we would have to travel the whole till the leaves. Where is this process used anyway…? It is used to build a heap, which is our next discussion.

Build Heap – Build heap is constructing the heap from a given set of randomly arranged nodes in a binary tree. The build heap process uses the heapify to construct the heap “down-to-up”. Now, if you remember the principle of the heapify process, we were able to do that only because the left-subtree and the right-subtree were heaps. Think of the second last level in a heap, the left and the right nodes are leaves, i.e., they have no children, so, they are heaps by themselves. Why is a leaf by itself a heap…? This is because the left child and the right child don’t exist, so we can claim that the parent node (the leaf), satisfies the heap property. And for a leaf, the structural property also holds good because the second-last level, (the level of the leaf node), is fully filled, and the last level (the level of the leaf’s children), is left-filled (actually there are no nodes to be left-filled). Well, most probably, you won’t understand this in the first reading, give it another couple of readings and observe the picture below.

heaps6

I hope the process of the Build Heap is clear. At every level we use the heapify process to gradually build the heap “down-up”. So what is the asymptotic time complexity of Build Heap…? We use heapify for all the nodes except the nodes in the last level,

Nodes till the 2nd Last level = 12 ✕ (Total number of Nodes ⇒ N)
Complexity of the Heapify procedure = O(log N)
Complexity of Build Heap = O(N log N)

But we can argue that the Build Heap process will only take O(N). Now, look at the above sketch once again. If you feel the text is too small, open the image by right-clicking, its bigger than you think…! Now,

The (n + 1)4 nodes at level 3 took 1 swap ⇒ (n + 1)4 ✕ 1
The (n + 1)8 nodes at level 2 took 2 swaps ⇒ (n + 1)8 ✕ 2
The (n + 1)16 nodes at level 1 took 3 swaps ⇒ (n + 1)16 ✕ 3

As, you can see, we have a sort-of progression here. If we were to sum them up, taking (n + 1) common,

⇒ (N + 1) ✕ (14 + 28 + 316 + …)
⇒ (N + 1) ✕ Σ i2i
⇒ (N + 1) ✕ 2       { Σ i2i = 2 }
⇒ O(N)

Therefore, the Build Heap takes O(N) time. The principle behind this is that for most of the time Heapify works on smaller values of ‘N’. These are the most important operations of a Binary Heap. We have discussed the implementation of Binary Heaps using arrays and accessing the parent and child. Now, try to code the above stated operations at least 3 times…! You have to try. There is no better way of knowing a Data Structure than trying to code it. After you have given your everything, if you succeed, you are brilliant…! If not, look at my code below and see for yourselves how close you were…!

/* ==========  ========== ========== ========= */
//        Binary Min Heap Data Structure       //
//           Array Implementation              //
//                                             //
//         Functions follow Pascal Case        //
//           Convention and Variables      	   //
//         follow Camel Case Convention        //
//                                             //
//            Author - Vamsi Sangam            //
//            Theory of Programming            //
/* ========== ========== ========== ========== */
 
#include <stdio.h>
#include <stdlib.h>

struct Vertex
{
	int label, value;
};

// Adds an element to the heap and returns the size - O(log N)
int EnqueueVertex(struct Vertex minHeap[], int size, struct Vertex newValue)
{
    ++size;
    minHeap[size] = newValue;
 
    int i = size;
	struct Vertex temp;
 
    while (i >= 1) {
        if (minHeap[i / 2].value > minHeap[i].value) {
            // Parent node is greater than Child Node
            // Heap Property violated
            // So, swap
            temp = minHeap[i / 2];
            minHeap[i / 2] = minHeap[i];
            minHeap[i] = temp;
 
            i = i / 2;
        } else {
            // Parent is less or equal to the child
            // Heap property not violated
            // So, Insertion is done, break
            break;
        }
    }
 
    return size;
}

// Applies the heapify procedure - O(log N)
void Heapify(struct Vertex minHeap[], int size, int index)
{
    int i = index;
	struct Vertex temp;
 
    while ((2 * i) <= size) {
        // Left Child exists
 
        if ((2 * i) + 1 > size) {
            // Right child does not exist, but left does
            
			if (minHeap[i].value > minHeap[2 * i].value) {
                // Left child is smaller than parent
                temp = minHeap[i];
                minHeap[i] = minHeap[2 * i];
                minHeap[2 * i] = temp;
            }
            
            break;
            // Once you reach this level where it does not
        	// have a right child, the heap ends here
            // taking it to the next iteration is pointless
        }
 
        //Both Children exist
        if (minHeap[i].value > minHeap[2 * i].value || minHeap[i].value > minHeap[2 * i + 1].value) {
            // One of the other child is lesser than parent
            // Now find the least amoung 2 children
 
            if (minHeap[2 * i].value <= minHeap[(2 * i) + 1].value) {
                // Left Child is lesser, so, swap
                temp = minHeap[2 * i];
                minHeap[2 * i] = minHeap[i];
                minHeap[i] = temp;
 
                // And go down the heap
                i = 2 * i;
            } else if (minHeap[2 * i].value > minHeap[(2 * i) + 1].value) {
                // Left Child is lesser, so, swap
                temp = minHeap[(2 * i) + 1];
                minHeap[(2 * i) + 1] = minHeap[i];
                minHeap[i] = temp;
 
                // And go down the heap
                i = (2 * i) + 1;
            }
        } else {
            // Parent is lesser than its children
            break;
        }
    }
}

// Deletes the vertex and returns the size - O(log N)
int DeleteVertex(struct Vertex minHeap[], int size, int index)
{
    // Swap the indexed element with the last
    struct Vertex temp = minHeap[index];
    minHeap[index] = minHeap[size];
    minHeap[size] = temp;
 
    int i = index;
    --size;
 
    Heapify(minHeap, size - 1, i);
 
    return size;
}

// Build Heap Procedure - O(N)
void BuildHeap(struct Vertex minHeap[], int size)
{
    int i;
 
    // Simply call heapify() for all nodes
    // except the last one...!
    for (i = size / 2; i >= 1; --i) {
        Heapify(minHeap, size, i);
    }
}
 
int main()
{
    int n, i;
 
    printf("Enter the Initial size of the Min Heap -\n");
 
    scanf("%d", &n);
 
    struct Vertex minHeap[n + 2];
    // Extra size just to demonstrate Insertion
    // and use the array as 1-indexed
    
    printf("Enter %d elements -\n", n);
 
    for (i = 1; i <= n; ++i) {
        scanf("%d%d", &minHeap[i].label, &minHeap[i].value);
    }
 
    BuildHeap(minHeap, n);
 
    printf("\nHeap -\n");
    for (i = 1; i <= n; ++i) {
        printf("%d, %d\n", minHeap[i].label, minHeap[i].value);
    }
 
	struct Vertex node;
	
    printf("\n\nInsert an element -\n");
    scanf("%d%d", &node.label, &node.value);
    n = EnqueueVertex(minHeap, n, node);
 
    printf("\nHeap After Insert -\n");
    for (i = 1; i <= n; ++i) {
        printf("%d, %d\n", minHeap[i].label, minHeap[i].value);
    }
 
    printf("\n\nDelete an Element at index-\n");
    scanf("%d", &i);
    n = DeleteVertex(minHeap, n, i);
 
    printf("\nHeap After Delete -\n");
    for (i = 1; i <= n; ++i) {
        printf("%d, %d\n", minHeap[i].label, minHeap[i].value);
    }
 
    return 0;
}

One thing I would like to point out is that, my heap is an array of structures. Just an integer would suffice, but I made it an array of structures so that it becomes a little generic. So, tomorrow, if you want to you this code as a Priority Queue, you needn’t change anything (or at most the names) in the code. Or, if you wanted to store more information about each node of the Binary Heap, you can simply add attributes to the struct.
Next, for my deletion operation, I swapped the last element and the element to be deleted and called the heapify procedure. This is because, the heapify procedure, checks if the Heap Property is violated and makes the necessary corrections. As the first element of the heap is the smallest, it can be accessed directly by heap[1]. Remember, the arrays in my code are 1-indexed, and all the loops and conditions go with it.
The code is pretty well tested. You can download the input file here.

There is one last topic to discuss before I conclude.

Max Heap – Whatever heaps we have been seeing till now are all Min Heaps. This is because, the Heap Property is such that the minimum element is put on top. What if the Heap Property would just be the reverse of it…? Something like –

Heap Property – Every parent node must have a value greater than or equal to both of its child node.

In this case, by the Heap property, the maximum element would be put on the top and so will be for all the sub-trees or sub-heaps in the structure. This is a Max Heap. Actually, the very first heap that we used in our Build Heap sketch is a Max Heap. Have a close look at it and you will understand the difference. Max Heap differ from Min Heaps only by the Heap Property, due to which the other differences arise.

Whoo…! This has become a really long post…! Thanks a lot to stay with me till the end. I hope this helps you to get started with heaps. It may not be easy to code the Binary Heap, but that’s how you grow your skills in programming. You won’t learn anything if you sit for hours and write the Hello World program over-and-over again. You need to challenge yourself…! Feel free to comment your doubts..! Commenting is super easy if you are a Facebook, Twitter or a Google+ user…! So, don’t hesitate..! 😉 … Keep practicing…! Happy Coding…! 🙂

Depth First Search (DFS) Algorithm


Hello people…! In this post I will talk about the other Graph Search Algorithm, the Depth First Search (DFS). Depth First Search is different by nature from Breadth First Search. As the name suggests, “Depth”, we pick up a vertex S and see all the other vertices that can possibly reached by that vertex and we do that to all vertices in V. Depth First Search can be used to search over all the vertices, even for a disconnected graph. Breadth First Search can also do this, but DFS is more often used to do that. Depth First Search is used to solve puzzles! You can solve a given maze or even create your own maze by DFS. DFS is also used in Topological Sorting, which is the sorting of things according to a hierarchy. It is also used to tell if a cycle exists in a given graph. There are many other applications of DFS and you can do a whole lot of cool things with it. So, lets get started…!

The way the Depth First Search goes is really like solving a maze. When you see a maze in a newspaper or a magazine or anywhere else, the way you solve it is you take a path and go through it. If you find any junction or a crossroad, where you have a choice of paths to choose, you mark that junction, take up a path and traverse the whole route in your brain. If you see a dead end, you realize that this leads you now where so you come back to the junction you marked before and take up another path of the junction. See if that is also a dead end, else you continue to explore the, puzzle as this route contributes in reaching your destination. Well, at least that’s how I solve a maze. But… I hope you get the idea. You pick up a path and you explore it as much as possible. When you can’t travel any further and you haven’t yet reached your destination, you come back to the starting point and pick another path.

In code, you do this by recursion. Because of the very nature of recursion. Your recursion stack grows-grows and eventually becomes an empty stack. If you think for a while you can notice that the way of traversing which I told you above is logically covering only the vertices accessible from a given vertex S. Such traversal can be implemented using a single function that is recursive. But we want to explore the whole graph. So, we will use another function to do this for us. So, DFS is a two-functions thing. We will come back to the coding part later. Now, DFS too can be listed out in a step-by-step process –

  • For a given set of vertices V = {V1, V2,V3…}, start with V1, and explore all the vertices that can possibly be reached from V1.
  • Then go to the next vertex V2, if it hasn’t been visited, explore all the nodes reachable from V2, if not, go for V3.
  • Similarly, go on picking up all the vertices one-by-one and explore as much as possible if it wasn’t visited at all.

Now, how do you tell if a vertex wasn’t visited earlier…? If it has no parent vertex assigned. So what you have to do when you visit a node is –

  • Set the parent vertex of the current vertex to be the vertex from where you reached that vertex. We will use an array to assign parent vertices, which is initialised to -1, telling that the vertices were never visited.
  • When you are starting your exploration from a vertex, say, V1, we assign parent[V1] = 0, because it has no parent. If there is an edge from V1 to V2, we say, parent[V2] = V1.

Let’s look at an example and see how DFS really works –

dfs1

The working of DFS is pretty clear from the picture. Notice how we would assign the parent vertices to each vertex. Once we have visited all the vertices from a given initial vertex V1, we backtrack to V1. What do we really mean by this “backtrack” here is that the recursion control will gradually come back to the function that started explopring from V1. We will understand this once we put DFS in code. And one more thing, whenever we got a choice of going to two vertices from one vertex, we preferred going to the vertex with the greater number. Why is this…? This is because we will be following Head Insertion in our Adjacency Lists to have O(1) Insertion operation. Now, assuming that we insert the vertices from Vertex 1 to Vertex 10, the greater number vertices will end up being in front of the Linked Lists. Take a moment to understand this. Even if you don’t understand, it’s ok…! You will get the hang of it later. Why I really did that was to explain you the concept of Edge Classification.

dfs2

As you can see there are three types of edges, in fact, there are 4 actually. They are –

  • Tree Edge – These are the edges through which we have traversed all the vertices of the graph by DFS. More clearly, these are the edges that represent the parent-child relationship. That is, the tree edge Vertex 1 → Vertex 3 says that, Vertex 1 is the parent of Vertex 3. Just like the parent-child relationship in a tree. Why this is called a “tree” edge is that it happens so that these edges together form a “tree”, or rather a “forest”.
  • Forward Edge – This is an edge which points from one vertex which is higher in the hierarchy of parent-child relationship to a vertex which is a descendant. Observe that Vertex 2 is a descendant of Vertex 1, so the edge Vertex 1 → Vertex 3, is a forward edge.
  • Backward Edge – This is the opposite of forward edge. It points from a descendant Vertex to an ancestor Vertex. In the above diagram, the edge, Vertex 4 → Vertex 3 is a backward edge.
  • Cross Edge – Every other edge is a cross edge. We don’t have a cross edge in the above diagram, but, a cross edge can arise when there is a edge between siblings, two vertices that have the same parent.

Cycle Detection – Using DFS, we can detect if there are any cycles in the given graph. How…? This is very simple. If there exists at least one backward edge, then the given graph will have cycles. Well, telling how many cycles would be there for given number of backward edges is a challenge. But the point is if there is a backward edge, there is a cycle. This is by the very definition of the Backward Edge. It is an edge from a descendant to an ancestor. If we have a backward edge, then there will surely be another path of tree edges from the ancestor to descendant, forming a cycle. The picture below should make things clear.

dfs3

But, how do we compute backward edges in code…? This is a little tricky. We could have a boolean array of size |V| which would hold the status of the vertex, whether it is in the recursion stack or not. If the vertex is in the recursion stack, it means that the vertex is indeed an ancestor. So, the edge will be a backward edge.

Now try to code the DFS, it is basically a recursion process. If you are good with recursion, I’m sure you can get this. Nonetheless, I have put my code below –

/* Depth First Search Algorithm
 * using Adjacency Lists
 *
 * Authored by,
 * Vamsi Sangam.
 */

#include <stdio.h>
#include <stdlib.h>

struct node {
	int val;
	struct node * next;
};

struct node * add(struct node * head, int num)
{
	/* We use Head Insertion for inserting vertices
	 * into Linked List for O(1) Insertion.
	 */

	struct node * p = (struct node *) malloc(sizeof(struct node));

	p->val = num;
	p->next = head;

	return p;
}

void depth_first_search_explore(struct node * list[], int parent[], int vertex, int status[])
{
	if (parent[vertex] != -1) {
		//un-visited vertex
		struct node * temp = list[vertex];
		status[vertex] = 1;

		//recursively visit all vertices accessible from this Vertex
		while (temp != NULL) {

			if (parent[temp->val] == -1) {
				parent[temp->val] = vertex;
				//We started exploring from Vertex -'vertex',
				//so the Vertex - temp->val, it's
				//parent should be our initial vertex

				depth_first_search_explore(list, parent, temp->val, status);
				//Then we recursively visit everything from the child vertex
			} else {
				//Checking if the edge is a Backward Edge
				if (status[temp->val] == 1) {
					printf("\n%d ---> %d is a Backward Edge\n", vertex, temp->val);
				}
			}

			temp = temp->next;
			//After finishing, move on to next Vertex adjacent to
			//Vertex - 'vertex'
		}

		status[vertex] = 0;
	}
}

void depth_first_search(struct node * list[], int length, int parent[], int status[])
{
	int i;

	for (i = 1; i <= length; ++i) {

		if (parent[i] == -1) {
			parent[i] = 0;
			//It is a completely un-visited vertex and we start
			//our DFS from here, so it has no parent, but just
			//to mark it that we visited this node, we assign 0

			depth_first_search_explore(list, parent, i, status);
			//By this we begin to explore all the vertices reachable
			//from Vertex i
		}
	}
}

int main()
{
	int vertices, edges, i, j, v1, v2;

	printf("Enter the Number of Vertices -\n");
	scanf("%d", &vertices);

	printf("Enter the Number of Edges -\n");
	scanf("%d", &edges);

	struct node * adjacency_list[vertices + 1];
	int parent[vertices + 1];
	int status[vertices + 1];
	//Size is made (vertices + 1) to use the
	//array as 1-indexed, for simplicity

	//Must initialize your array
	for (i = 0; i <= vertices; ++i) {
		adjacency_list[i] = NULL;
		parent[i] = -1;
		status[i] = 0;
	}

	printf("\n");

	for (i = 1; i <= edges; ++i) {
		scanf("%d%d", &v1, &v2);
		adjacency_list[v1] = add(adjacency_list[v1], v2);		//Adding edge v1 ------> v2
	}

 	//Printing Adjacency List
	printf("\nAdjacency List -\n\n");
	for (i = 1; i <= vertices; ++i) {
		printf("adjacency_list[%d] -> ", i);

		struct node * temp = adjacency_list[i];

		while (temp != NULL) {
			printf("%d -> ", temp->val);
			temp = temp->next;
		}

		printf("NULL\n");
	}

	depth_first_search(adjacency_list, vertices, parent, status);

	printf("\nParent Array -\n");
	for (i = 1; i <= vertices; ++i) {
		printf("Parent of Vertex %d = %d\n", i, parent[i]);
	}

	return 0;
}

The code above has an additional feature. It prints out the Backward Edges. This is done by the technique explained above, implemented using an integer array. I know it is a dirty way to use an integer array for this, it consumes way too much space. It really needs to be a Boolean array, because we use on two values, 0 and 1. But I wanted to keep my code strictly in C, so I used an integer array. This is the Depth First Search Algorithm. It has a time complexity of O(|V| + |E|), just like the Breadth First Search. This is because, we visit every vertex once, or you could say, twice, and we cover all the edges that AdjacencyList[Vi] has, for all ViV which takes O(|E|) time, which is actually the for loop in our depth_first_search_explore() function. DFS is not very difficult, you just need to have experienced hands in recursion. You might end up getting stuck with some bug. But it is worth spending time with the bugs because they make you think in the perfect direction. So if you are not getting the code, you just have to try harder. But if you are having any doubts, feel free to comment them…! Keep practicing… Happy Coding…! 🙂

You may also like –

More in the Navigation Page ! →

Snakes and Ladders Game Code


Hello people…! In this post I will explain you how to find the shortest path to win the Snakes and Ladder game by using the Breadth First Search (BFS) Algorithm. If you don’t know the algorithm, I suggest you read my post about it. Now, Graph Theory has many applications and I love working with things that have real-world applications, well, off course the other data structures too have their uses but the speciality of Graph Theory is its applications have the closest association with our day-to-day activities. And to show you this and to set an example on how the BFS is actually put into action, I am taking up the example of the very popular game, Snakes and Ladder. Now, this game needs no introduction. We all must have played it in our childhood. I will explain you… How by using Graphs and BFS we can find the Shortest Path to win the game, and, we will state that path and the number of moves of dice it takes too. Have a good look at the Snakes and Ladder board below, we will be using it as an example throughout this post.

snakes-and-ladders

You can see we can reach the finish block by an number of ways, but how do you find out the best…? More importantly, how do you put it as code…? That’s what we are going to do. Now, think of how you can represent the game board in terms of a Graph, by Graph I mean in terms of Vertices and Edges. Don’t be too hard on yourself… Just take the first 7 blocks and try working out on paper what would be the Edge and what would be the Vertices. If you ponder upon this, it is very easy to tell that the numbered blocks on the game board will be our Vertices, then, what will be the Edges…? This depends on how you can go from one block to another on rolling the dice. For now, forget about the ladders and the snakes, just draw the graph for a small portion of the board. It should look somewhat similar to what is in the picture below –

snake1

As you see we would have six ways to leave block 1, depending on the dice roll. Same would be the case for block 2, or, block 3, and so on. Now, there is a very important point to be noted here… Remember, this is Directed Graph…! Because, once you roll 5 and go to block 6, there’s no way to come back. So, the directions are important. Now, let us push up the complexity a bit by adding a ladder to our above sketch in block 6, think what would happen to the edges…?

snake2

If you roll 5 from block 1 you will jump directly to block 27. So is for block 2 when you roll out 4, or, block 3 when you roll out 3 and so on. Now, “logically” speaking, the block 6 does not exists in our graph…! Think about the statement for a while. Whenever you reach block, you are directly jumping to block 27, you don’t stay there. Now, if you were constructing an Adjacency List for this graph…. In the list of adjacent vertices for Vertex 1, would you have Vertex 6 in the list, or Vertex 27…? Vertex 27 of course…! Being at Vertex 6 means being at Vertex 27…! That is why, our edge arrow did not end at Vertex 6… See it…? One more thing, in your Adjacency List, in the list of adjacent vertices for Vertex 6, what would you have…? Nothing…! Because you cannot come to a situation where you would have to stay on Vertex 6 and roll the dice. So the adjacent nodes list corresponding to Vertex 6 should be empty. These two things are very important, when you implement the Adjacency List for the Snake and Ladder board.

Same would be the case, if a snake was there at a block. Logically that block will not exist as a vertex in our adjacency list, and the corresponding edges must be removed. The only difference being that, the edge created due to a snake can lead you to a block of lower value.

snake3

We assume that getting caught by a snake is always unfavorable, and will not add to the progress of a short path. Just to get a better idea of the scenario of a ladder and snake, I have depicted what the Adjacency List would look like for the above two examples shown in the pictures.

snake4

That pic should really clarify all the confusion about the graph. If you still don’t get it, feel free to comment your query…! Now, what do we do once we have the Adjacency List ready…? We just call the Breadth First Search method on that list…! Wait… Really…?! That’s it…? Yes…! You see the hardest part here in solving the Snakes and Ladder by graphs is correctly determining what your Vertices and Edges are. Once you get that, all you have to do is the Breadth First Search in the resultant graph. Then you can get the shortest path from Vertex 1 to Vertex 100. Now, try getting the Adjacency Lit correct and simply call the BFS method. I’m sure you will succeed if you put in a little dedication. But for those who tried and failed. I have put my code below. Before you look at my code, there are a few logics I have used –

  • In the beginning of the program, I added all theSorry for the bad snip…! edges as though the Game Board had no snakes or ladders at all (these number of edges is what I printed), then, I removed the
    respective edges concerning the snakes and ladders one-by-one.
  • When a Vertex ‘n’ has a ladder or a snake, we are supposed to
    eplace the corresponding edges as I depicted in the pictures above, for that, I replaced the Vertex n‘s edge with the new value, in Vertices (n – 1), (n – 2), (n – 3), (n – 4), (n – 5), (n – 6). Because it is only in these vertices that you can find an edge of vertex n.
  • I put all this replacing stuff in a function replace() which takes the Linked List and searches for a value ‘num’ and replaces it with the value ‘replaced_num’ when it finds it.
  • I used an extra element in my array to make them 1 – index based.
  • The number of moves to complete the shortest path would be the level of the last vertex, Vertex 100. Why…?! Think about it for a minute and you’ll get it…!
  • I backtracked from Vertex 100, finding who it’s parent was, then, who the parent’s parent was and so on to find the path and finally printed it in reverse order.
/*
 * Snake and Ladder Shortest Path in C
 * using Adjacency List and
 * Breadth First Search.
 *
 * Authored by,
 * Vamsi Sangam
 */

#include <stdio.h>
#include <stdlib.h>

struct node {
	int val;
	struct node * next;
};

struct node * add(struct node * head, int num)
{
	struct node * p = (struct node *) malloc(sizeof(struct node));

	p->val = num;
	p->next = head;

	return p;
}

void breadth_first_search(struct node * list[], int vertices, int parent[], int level[])
{
	struct node * temp;
	int i, par, lev, flag = 1;

	lev = 0;
	level[1] = lev;

	while (flag) {
		flag = 0;
		for (i = 1; i <= vertices; ++i) {
			if (level[i] == lev) {
				flag = 1;
				temp = list[i];
				par = i;

				while (temp != NULL) {
					if (level[temp->val] != -1) {
						temp = temp->next;
						continue;
					}

					level[temp->val] = lev + 1;
					parent[temp->val] = par;
					temp = temp->next;
				}
			}
		}

		++lev;
	}
}

void replace(struct node * head, int num, int replaced_num)
{
	struct node * p = head;

	while (p->next != NULL) {
		if (p->val == num) {
			break;
		}

		p = p->next;
	}

	p->val = replaced_num;
}

int main()
{
	int vertices, edges, i, j, v1, v2;

	vertices = 100;		//For a 10X10 board
	edges = 0;			//Just to count the edges

	struct node * list[vertices + 1];

	int parent[vertices + 1];
	int level[vertices + 1];	

	for (i = 0; i <= vertices; ++i) {
		//Initialising our arrays
		list[i] = NULL;
		parent[i] = 0;
		level[i] = -1;
	}

	for (i = 1; i <= vertices; ++i) {
		for (j = 1; j <= 6 && j + i <= 100; ++j) {
			list[i] = add(list[i], i + j);
			++edges;
		}
	}

	printf("Edges added so far = %d\n", edges);

	int num_ladders, num_snakes;

	printf("Enter the Number of Ladders and Snakes -\n");
	scanf("%d%d", &num_ladders, &num_snakes);

	printf("Enter the Ladder Edges -\n");
	//Dealing with Ladder Edges
	for (i = 0; i < num_ladders; ++i) {
		scanf("%d%d", &v1, &v2);

		j = v1 - 6;

		if (j < 1) {
			j = 1;
		}

		for (; j < v1; ++j) {
			replace(list[j], v1, v2);
		}
	}

	printf("Enter the Snake Edges -\n");
	//Dealing with Snakes Edges
	for (i = 0; i < num_snakes; ++i) {
		scanf("%d%d", &v1, &v2);

		j = v1 - 6;

		if (j < 0) {
			j = 0;
		}

		for (; j < v1; ++j) {
			//Replacing Vertex value v1 by v2
			replace(list[j], v1, v2);
		}
	}

	//Printing Adjacency List
	printf("\nAdjacency List -\n");
	for (i = 1; i <= vertices; ++i) {
		printf("%d -> ", i);

		struct node * temp = list[i];

		while (temp != NULL) {
			printf("%d -> ", temp->val);
			temp = temp->next;
		}

		printf("NULL\n");
	}

	breadth_first_search(list, vertices, parent, level);
	printf("\nNumber of Moves required = %d\n", level[vertices]);

	/*Just a little mechanism to print the shortest path*/
	int path[level[vertices] + 1];

	j = 100;
	path[0] = j;

	for (i = 1; i <= level[vertices]; ++i) {
		path[i] = parent[j];
		j = parent[j];
	}

	printf("\nShortest Path -\n");
	for (i = level[vertices]; i >= 0; --i) {
		printf("%d -> ", path[i]);
	}
	printf("Finish...!\n");

	return 0;
}

You can think of extending this to a 20 ✗ 20 Snakes and Ladder board or an nn board. You just have to add n2 edges to the graph. Solving puzzles by Graph Theory is real fun. I hope this post helped you. If you have any doubts, feel free to comment them. Keep practicing… and… Happy Coding…! 🙂

You may also like –

More in the Navigation Page ! →

Breadth First Search (BFS) Algorithm


Hello people…! In this post I will explain one of the most widely used Graph Search Algorithms, the Breadth First Search (BFS) Algorithm. Once you have learned this, you would have gained a new weapon in your arsenal, and you can start solving good number of Graph Theory related competitive programming questions.

What we do in a BFS is a simple step-by-step process, which is –

  1. Start from a vertex S. Let this vertex be at, what is called…. “Level 0”.
  2. Find all the other vertices that are immediately accessible from this starting vertex S, i.e., they are only a single edge away (the adjacent vertices).
  3. Mark these adjacent vertices to be at “Level 1”.
  4. There will be a challenge that you might be coming back to the same vertex due to a loop or a ring in the graph. If this happens your BFS will take time. So, you will go only to those vertices who do not have their “Level” set to some value.
  5. Mark which is the parent vertex of the current vertex you’re at, i.e., the vertex from which you accessed the current vertex. Do this for all the vertices at Level 1.
  6. Now, find all those vertices that are a single edge away from all the vertices which are at “Level 1”. These new set of vertices will be at “Level 2”.
  7. Repeat this process until you run out of graph.

This might sound heavy… Well atleast it would sound heavy to me if I heard it for the first time…! Many questions may pop-up in your mind, like, “How are we gonna do all that…?!”. Well, for now, focus on the concept, we will talk about the code later. And remember, we are talking about an Undirected Graph here. We will talk about Directed Graphs later. To understand the above stated steps, examine the picture below –

Breadth First Search Algorithm - Step-by-Step

Breadth First Search Algorithm – Step-by-Step

The sketch clearly shows you how we explore the vertices adjacent to a vertex and mark their levels. If you have noticed, whenever there were two ways of accessing the same vertex from multiple vertices of the same Level, i.e., in the diagram, Vertex 3 was accessible from Vertex 2 and Vertex 8, we preferred its parent to be Vertex 8, the larger number. Why is that so…? We will learn that in a short moment.

The concepts that I wish to emphasize from the above picture are, how BFS can tell you the shortest path from a given vertex (the start vertex) to all the other vertices and the number of edges or, the “length of the path”, would be the Level of that Vertex itself. This is a very important feature of the BFS, you will understand this more clearly when I explain it with the help of an example in a different post.

Now, having got some knowledge about the BFS, it is a good thing to do an exercise on this topic to really get the flow going. Try applying BFS on the Graph given. All you have to do is to implement the step-by-step process and get that final figure which I got above. And make sure you label the Levels and Parents for each vertex in the end.

bfs2

Now, we come to the code part of the Breadth First Search, in C. We use the same Adjacency List that we used in our discussion of Graph Theory Basics. Coming back to our BFS discussion, the level of each vertex is stored in a separate array and so is the case for parent of each vertex. The three arrays are initialized to appropriate values. Now recall our step-by-step process that was stated earlier. Try to put that in terms of pseudocode and then proper code. Take a while to think how you would do that. If you could code it, you are marvelous…! 😀 … If not, don’t fret, I put my code below…!

/* ==========  ========== ========== ========= */
//         Breadth First Search (BFS)          //
//               Algorithm in C                //
//                                             //
//         Functions follow Pascal Case        //
//           Convention and Variables      	   //
//         follow Camel Case Convention        //
//                                             //
//            Author - Vamsi Sangam            //
//            Theory of Programming            //
/* ========== ========== ========== ========== */

#include <stdio.h>
#include <stdlib.h>

struct Edge {
    int vertex;
    struct Edge * next;
};

// Inserts Node to the Linked List by Head Insertion - O(1)
// Returns address of head which is the newly created node.
struct Edge * AddEdge(struct Edge * currentHead, int newVertex)
{
    struct Edge * newHead
				 = (struct Edge *) malloc(sizeof(struct Edge));

    newHead->vertex = newVertex;
    newHead->next = currentHead;

    return newHead;
}

void BreadthFirstSearch(
						struct Edge * adjacencyList[],
						int vertices,
						int parent[],
						int level[],
						int startVertex
					   )
{
    struct Edge * traverse;
    int i, par, lev, flag = 1;
    // 'lev' represents the level to be assigned
    // 'par' represents the parent to be assigned
    // 'flag' used to indicate if graph is exhausted

    lev = 0;
    level[startVertex] = lev;
    // We start at startVertex

    while (flag) {
        flag = 0;
        for (i = 1; i <= vertices; ++i) {
            if (level[i] == lev) {
                flag = 1;
                traverse = adjacencyList[i];
                par = i;

                while (traverse != NULL) {
                    if (level[traverse->vertex] != -1) {
                        traverse = traverse->next;
                        continue;
                    }

                    level[traverse->vertex] = lev + 1;
                    parent[traverse->vertex] = par;
                    traverse = traverse->next;
                }
            }
        }

        ++lev;
    }
}

int main()
{
    int vertices, edges, i, v1, v2;

    printf("Enter the Number of Vertices -\n");
    scanf("%d", &vertices);

    printf("\nEnter the Number of Edges -\n");
    scanf("%d", &edges);

    struct Edge * adjacencyList[vertices + 1];
    // Size is made (vertices + 1) to use the
    // array as 1-indexed, for simplicity

    int parent[vertices + 1];
    // Each element holds the Node value of its parent
    int level[vertices + 1];
    // Each element holds the Level value of that node

    // Must initialize your array
    for (i = 0; i <= vertices; ++i) {
        adjacencyList[i] = NULL;
        parent[i] = 0;
        level[i] = -1;
    }

    for (i = 1; i <= edges; ++i) {
        scanf("%d%d", &v1, &v2);

        // Adding edge v1 --> v2
        adjacencyList[v1] = AddEdge(adjacencyList[v1], v2);

		// Adding edge v2 --> v1
		// Remove this if you want a Directed Graph
        adjacencyList[v2] = AddEdge(adjacencyList[v2], v1);
    }

    // Printing Adjacency List
    printf("\nAdjacency List -\n\n");
    for (i = 1; i <= vertices; ++i) {
        printf("adjacencyList[%d] -> ", i);

        struct Edge * traverse = adjacencyList[i];

        while (traverse != NULL) {
            printf("%d -> ", traverse->vertex);
            traverse = traverse->next;
        }

        printf("NULL\n");
    }

    printf("\nEnter a Start Vertex - ");
    scanf("%d", &v1);

    BreadthFirstSearch(adjacencyList, vertices, parent, level, v1);

    // Printing Level and Parent Arrays
    printf("\nLevel and Parent Arrays -\n");
    for (i = 1; i <= vertices; ++i) {
        printf("Level of Vertex %d is %d, Parent is %d\n",
								  i, level[i], parent[i]);
    }

    return 0;
}

Try using the above example given as practice as the sample input to your code, so that you can easily compare the results. Now, the important point here is when we insert a vertex into the Adjacency List, we follow Head Insertion to get O(1) Insertion. What happens due to this is that, the Linked List will have numbers in the descending order.

So, when we are applying BFS for adjacent vertices of a given node, the Vertex with the higher number is met first. And then we explore starting by that higher numbered Vertex. This is why whenever we had a choice in approaching a vertex, we preferred approaching from the vertex which had the bigger number, why…? Due to Head Insertion…! If you had used Tail Insertion, this would be reversed.

Other Implementations of BFS

This is the overview of Breadth First Search Algorithm. It has a computational complexity of O(|V| + |E|), which is pretty fast. But why O(|V| + |E|)…? If you think about the algorithm a little bit, we cover all the |V| vertices level-by-level, checking all the |E| edges twice (for an Undirected Graph). Once the level is set for a subset of vertices we don’t visit them again. Put an itty-bitty thought into it and I’m sure you’ll get the idea…! 😉 … But, the code above runs slower than O(|V| + |E|)… To achieve the proper complexity we must use a Queue.

We can use BFS in the following scenarios –

  • Shortest Path or Quickest Path (if all edges have equal weight).
  • Finding the Length of Shortest Path between two Vertices.

With the knowledge of BFS you can start solving Graph Theory related problems. It really depends on your logic how you will apply the BFS to the given problem. And there is only one way to develop it… Practice…! So, keep practicing… Feel free to comment your doubts..! Commenting is super easy if you are a Facebook, Twitter or a Google+ user…! So, don’t hesitate..! 😉 … Happy Coding…! 🙂

Go to the Homepage →

Graph Theory Basics


Hello people…! In this post, I will talk about the basics of the Graph Data Structure. Its terminologies, types and implementations in C. Graphs are difficult to code, but they have the most interesting real-life applications. When you want to talk about the real-life applications of graphs, you just cannot resist talking about the Facebook’s Graph Search! Now, I don’t really know that algorithm, but it uses graphs to find out your closest friends, or any other associations you have with the other users.

Other applications include finding the shortest paths, and by shortest path, I mean in the “universal sense”, it could be the shortest path of a Traveling Salesman, or the shortest path to win Snake and Ladder or even the shortest way to solve the Rubik’s Cube…! Amazing, right…?! So, let’s get started…!

Graphs are much clear when defined in mathematical terms. A Graph G (V, E), consists of two sets,

  • Set of Vertices, V
  • Set of Edges, E

As the name suggest, V, is the set of vertices or, the set of all nodes in a given graph and E, is the set of all the edges between these nodes, or the associations between them. So, |V| denotes the number of nodes in a graph and |E| denotes the number of edges. Let us take up a small example to understand these terms –

Undirected and Directed Graph

Undirected and Directed Graph

As you can see, we have two types of graphs, Directed and Undirected Graphs. In Undirected Graphs,the direction for an edge is not defined. This is generally used to indicate that the edge is actually bi-directional in nature, i.e, one can go from V1 → V2, as well as from V2 → V1. This is used to represent the graph where the states (nodes) are re-doable, such as, in a Rubik’s Cube, you can go from one configuration of the cube to the other as well as the vice-versa.

The other type, the Directed Graph restricts the… “traversal”, if you say… to only one direction. For example, in the Snakes and Ladders game, you can play dice and go from Position 5 → Position 10, but you can’t roll the dice such that it gets you from Position 10 ← Position 5… That’s obvious…! So, it really depends on the requirement of the situation, which graph you choose. Generally, we only have to deal with graphs which are purely Directed or purely Undirected. We do not talk about the hybrid type. Some of the other type of graphs are shown below –

Types of Graphs

Types of Graphs

All the graphs which we have discussed till now are Simple Graphs, they do not contain any loops. The ones which do contain loops are Non-Simple. A given graph G can be drawn in any way as long as the sets V and E remain the same. Such a graph is shown in the figure. The same graph is just drawn differently, they both have the same set of vertices and edges. Such graphs are called as Isomorphic Graphs, as the name suggests “iso” means “same”, “morphic” means “shape”, the graphs which have the same shape.

All these Graphs are Connected Graphs, i.e., for any given pair of vertices V1 and V2 ∈ V, there always exists a path between these two vertices. The last graph is the Weighted Graph. Here, the edges are given “weights”. To understand a Weighted Graph, you can think of the vertices as cities and the edges as the distance between them (so they will have some value). You could be asked the shortest path between two cities. So in the context of a Weighted graph, the shortest path may not be the one with least edges. One must keep that in mind.

Now, we will look at the way the graphs are implemented. There are mainly two ways to implement graphs, they are –

  • Adjacency Matrix
  • Adjacency List

Adjacency Matrix

It is a very simple representation where we take a two-dimensional matrix of the size |V| × |V|, i.e., the declaration in C would look like,

int AdjacencyMatrix[V + 1][V + 1];

As the arrays are zero-indexed, we generally prefer to keep the sizes V + 1.Now, initially all the values would be zeroes. We put ‘1’ in an element of the two-dimensional array, AdjacencyMatrix[i][j], if there exists an edge between Vi → Vj. Remember, arrays are zero-indexed. Let us take an example to understand this,

Adjacency Matrix

Adjacency Matrix

I hope it is clear from the example, how we can represent the graph using an Adjacency Matrix. It is very easy to code. All you have to do is create a two-dimensional matrix and assign the values, so, I won’t post the code, but if you have any doubts regarding the code, feel free to comment them. The Adjacency Matrix has its pros and cons which we will discuss shortly.

Adjacency List

It is an array of pointers of size |V| + 1, where each pointer points to a linked list. The linked list holds the nodes which are adjacent to the ith vertex. By adjacent, we mean those vertices that can be accessed from ith node by making a single move. It is a little complex implementation if you are uncomfortable with linked lists. But it is this implementation that we will use for Graph Search Algorithms. This is because we can easily and efficiently know which of the vertices of V are neighbours of a given vertex. And that is what you want, if your are to do a Graph Search. Now, let us take an example to understand the concept of Adjacency Lists.

Adjacency List

Adjacency List

I hope the sketch has made it clear as to how we use the Adjacency List to implement a Graph. As I mentioned before, it is one of the most versatile implementations of the Graph Data Structure. Now, try to code the implementation in C, or any language you like. Please try this at least once by yourself so that you can get brain deep into the Graph Data Structure. It is only a linked list. And one more thing, don’t get confused between pointer-to-an-array and array-of-pointers…! So, if we put the value of |V| + 1 in a variable ‘vertices‘, now the declaration of our Adjacency List would be,

struct node * adjacency_list[vertices];

Remember this is an “array of pointers”, the declaration for “pointer to an array” would look like,

struct node (* adjcency_list)[vertices];

This is due to operator precedence. Some people make mistakes here, so, watch out! If you want to code in C++, we can have a vector where each element of the vector can hold another vector, so, the declaration would be, like,

list< vector<int> > adjacency_list;

Try to code for an hour or two… If you don’t get the code, it’s ok…! I’ve put my C code below. Those who got the code right… You are fabulous…! 😉 … But please take a minute to go through my code, I might have written a slightly better code than yours.

/* ==========  ========== ========== ========= */
//          Adjacency List Graph Data          //
//          Structure Implementation           //
//           in C using Linked List            //
//                                             //
//         Functions follow Pascal Case        //
//           Convention and Variables      	   //
//         follow Camel Case Convention        //
//                                             //
//            Author - Vamsi Sangam            //
//            Theory of Programming            //
/* ========== ========== ========== ========== */
 
#include <stdio.h>
#include <stdlib.h>
 
struct Edge {
    int vertex;
    struct Edge * next;
};

// Inserts Node to the Linked List by Head Insertion - O(1)
// Returns address of head which is the newly created node.
struct Edge * AddEdge(struct Edge * currentHead, int newVertex)
{
    struct Edge * newHead 
				 = (struct Edge *) malloc(sizeof(struct Edge));
 
    newHead->vertex = newVertex;
    newHead->next = currentHead;
 
    return newHead;
}

int main()
{
    int vertices, edges, i, v1, v2;
 
    printf("Enter the Number of Vertices -\n");
    scanf("%d", &vertices);
 
    printf("\nEnter the Number of Edges -\n");
    scanf("%d", &edges);
 
    struct Edge * adjacencyList[vertices + 1];
    // Size is made (vertices + 1) to use the
    // array as 1-indexed, for simplicity
 
    // Must initialize your array
    for (i = 0; i <= vertices; ++i) {
        adjacencyList[i] = NULL;
    }
 
    for (i = 1; i <= edges; ++i) {
        scanf("%d%d", &v1, &v2);
        
        // Adding edge v1 --> v2
        adjacencyList[v1] = AddEdge(adjacencyList[v1], v2);
		
		// Adding edge v2 --> v1
		// Remove this if you want a Directed Graph
        adjacencyList[v2] = AddEdge(adjacencyList[v2], v1);
    }
 
    // Printing Adjacency List
    printf("\nAdjacency List -\n\n");
    for (i = 1; i <= vertices; ++i) {
        printf("adjacencyList[%d] -> ", i);
 
        struct Edge * traverse = adjacencyList[i];
 
        while (traverse != NULL) {
            printf("%d -> ", traverse->vertex);
            traverse = traverse->next;
        }
 
        printf("NULL\n");
    }
 
    return 0;
}

Other implementations of Adjacency List

Comparison between Adjacency Matrix and Adjacency List

Adjacency Matrix Adjacency List
For a Directed Graph, it consumes O(|V|2) space which is often under-utilized in the implementation. For an Undirected Graph also, it consumes O(|V|2) space which is also under-utilized as the generated matrix is symmetric about diagonal and values just repeat. For a Directed Graph it consumes O(|V| + |E|) space which is less, and is utilized optimally. For an Undirected Graph it consumes O(|V| + 2 ✗ |E|) space, which is O(|V| + |E|), which is less.
As the memory allocation is contiguous, data access is slightly faster. It is basically a Linked List, so memory allocation is not contiguous, hence traversal is slow as as compared to the traversal in an array. However, this can be eliminated if we use a Vector of C++ STL Library in the place of the Linked List. But C++ STL Vector is not recommended for graphs that keep growing.
Inserting an edge takes O(1) time, i.e., constant time. Therefore, inserting |E| edges take O(|E|) time, as direct access is available. Inserting an edge can take O(|E|) in the worst case, which is, there is an edge between every pair of vertices, one must traverse the whole Linked List over-and-over causing insertion of |E| nodes to take O(|E|2) time. However, if one follows Head Insertion, inserting a node will take O(1) time, and inserting |E| nodes will take O(|E|) time.
Deleting an edge is very simple, we just set the corresponding element value to 0, which takes O(1) time. Deleting an edge is time-taking as, one would have to traverse the Linked List, which is slow (non-contiguous). In the worst case, it takes O(|E|) time.

 

With the knowledge of Adjacency Matrix and Adjacency List alone you won’t be able to do the competitive programming questions, because you do not know how to explore your graph to get the desired result. For this, you need to know the Breadth First Search (BFS) Algorithm and the Depth First Search (DFS) Algorithm. This post is only to give an introduction. Feel free to comment your doubts..! Commenting is super easy if you are a Facebook, Twitter or a Google+ user…! So, don’t hesitate..! 😉 … Keep practising…. Happy Coding…! 🙂