Depth First Search (DFS) Algorithm


Hello people…! In this post I will talk about the other Graph Search Algorithm, the Depth First Search (DFS). Depth First Search is different by nature from Breadth First Search. As the name suggests, “Depth”, we pick up a vertex S and see all the other vertices that can possibly reached by that vertex and we do that to all vertices in V. Depth First Search can be used to search over all the vertices, even for a disconnected graph. Breadth First Search can also do this, but DFS is more often used to do that. Depth First Search is used to solve puzzles! You can solve a given maze or even create your own maze by DFS. DFS is also used in Topological Sorting, which is the sorting of things according to a hierarchy. It is also used to tell if a cycle exists in a given graph. There are many other applications of DFS and you can do a whole lot of cool things with it. So, lets get started…!

The way the Depth First Search goes is really like solving a maze. When you see a maze in a newspaper or a magazine or anywhere else, the way you solve it is you take a path and go through it. If you find any junction or a crossroad, where you have a choice of paths to choose, you mark that junction, take up a path and traverse the whole route in your brain. If you see a dead end, you realize that this leads you now where so you come back to the junction you marked before and take up another path of the junction. See if that is also a dead end, else you continue to explore the, puzzle as this route contributes in reaching your destination. Well, at least that’s how I solve a maze. But… I hope you get the idea. You pick up a path and you explore it as much as possible. When you can’t travel any further and you haven’t yet reached your destination, you come back to the starting point and pick another path.

In code, you do this by recursion. Because of the very nature of recursion. Your recursion stack grows-grows and eventually becomes an empty stack. If you think for a while you can notice that the way of traversing which I told you above is logically covering only the vertices accessible from a given vertex S. Such traversal can be implemented using a single function that is recursive. But we want to explore the whole graph. So, we will use another function to do this for us. So, DFS is a two-functions thing. We will come back to the coding part later. Now, DFS too can be listed out in a step-by-step process –

  • For a given set of vertices V = {V1, V2,V3…}, start with V1, and explore all the vertices that can possibly be reached from V1.
  • Then go to the next vertex V2, if it hasn’t been visited, explore all the nodes reachable from V2, if not, go for V3.
  • Similarly, go on picking up all the vertices one-by-one and explore as much as possible if it wasn’t visited at all.

Now, how do you tell if a vertex wasn’t visited earlier…? If it has no parent vertex assigned. So what you have to do when you visit a node is –

  • Set the parent vertex of the current vertex to be the vertex from where you reached that vertex. We will use an array to assign parent vertices, which is initialised to -1, telling that the vertices were never visited.
  • When you are starting your exploration from a vertex, say, V1, we assign parent[V1] = 0, because it has no parent. If there is an edge from V1 to V2, we say, parent[V2] = V1.

Let’s look at an example and see how DFS really works –

dfs1

The working of DFS is pretty clear from the picture. Notice how we would assign the parent vertices to each vertex. Once we have visited all the vertices from a given initial vertex V1, we backtrack to V1. What do we really mean by this “backtrack” here is that the recursion control will gradually come back to the function that started explopring from V1. We will understand this once we put DFS in code. And one more thing, whenever we got a choice of going to two vertices from one vertex, we preferred going to the vertex with the greater number. Why is this…? This is because we will be following Head Insertion in our Adjacency Lists to have O(1) Insertion operation. Now, assuming that we insert the vertices from Vertex 1 to Vertex 10, the greater number vertices will end up being in front of the Linked Lists. Take a moment to understand this. Even if you don’t understand, it’s ok…! You will get the hang of it later. Why I really did that was to explain you the concept of Edge Classification.

dfs2

As you can see there are three types of edges, in fact, there are 4 actually. They are –

  • Tree Edge – These are the edges through which we have traversed all the vertices of the graph by DFS. More clearly, these are the edges that represent the parent-child relationship. That is, the tree edge Vertex 1 → Vertex 3 says that, Vertex 1 is the parent of Vertex 3. Just like the parent-child relationship in a tree. Why this is called a “tree” edge is that it happens so that these edges together form a “tree”, or rather a “forest”.
  • Forward Edge – This is an edge which points from one vertex which is higher in the hierarchy of parent-child relationship to a vertex which is a descendant. Observe that Vertex 2 is a descendant of Vertex 1, so the edge Vertex 1 → Vertex 3, is a forward edge.
  • Backward Edge – This is the opposite of forward edge. It points from a descendant Vertex to an ancestor Vertex. In the above diagram, the edge, Vertex 4 → Vertex 3 is a backward edge.
  • Cross Edge – Every other edge is a cross edge. We don’t have a cross edge in the above diagram, but, a cross edge can arise when there is a edge between siblings, two vertices that have the same parent.

Cycle Detection – Using DFS, we can detect if there are any cycles in the given graph. How…? This is very simple. If there exists at least one backward edge, then the given graph will have cycles. Well, telling how many cycles would be there for given number of backward edges is a challenge. But the point is if there is a backward edge, there is a cycle. This is by the very definition of the Backward Edge. It is an edge from a descendant to an ancestor. If we have a backward edge, then there will surely be another path of tree edges from the ancestor to descendant, forming a cycle. The picture below should make things clear.

dfs3

But, how do we compute backward edges in code…? This is a little tricky. We could have a boolean array of size |V| which would hold the status of the vertex, whether it is in the recursion stack or not. If the vertex is in the recursion stack, it means that the vertex is indeed an ancestor. So, the edge will be a backward edge.

Now try to code the DFS, it is basically a recursion process. If you are good with recursion, I’m sure you can get this. Nonetheless, I have put my code below –

/* Depth First Search Algorithm
 * using Adjacency Lists
 *
 * Authored by,
 * Vamsi Sangam.
 */

#include <stdio.h>
#include <stdlib.h>

struct node {
	int val;
	struct node * next;
};

struct node * add(struct node * head, int num)
{
	/* We use Head Insertion for inserting vertices
	 * into Linked List for O(1) Insertion.
	 */

	struct node * p = (struct node *) malloc(sizeof(struct node));

	p->val = num;
	p->next = head;

	return p;
}

void depth_first_search_explore(struct node * list[], int parent[], int vertex, int status[])
{
	if (parent[vertex] != -1) {
		//un-visited vertex
		struct node * temp = list[vertex];
		status[vertex] = 1;

		//recursively visit all vertices accessible from this Vertex
		while (temp != NULL) {

			if (parent[temp->val] == -1) {
				parent[temp->val] = vertex;
				//We started exploring from Vertex -'vertex',
				//so the Vertex - temp->val, it's
				//parent should be our initial vertex

				depth_first_search_explore(list, parent, temp->val, status);
				//Then we recursively visit everything from the child vertex
			} else {
				//Checking if the edge is a Backward Edge
				if (status[temp->val] == 1) {
					printf("\n%d ---> %d is a Backward Edge\n", vertex, temp->val);
				}
			}

			temp = temp->next;
			//After finishing, move on to next Vertex adjacent to
			//Vertex - 'vertex'
		}

		status[vertex] = 0;
	}
}

void depth_first_search(struct node * list[], int length, int parent[], int status[])
{
	int i;

	for (i = 1; i <= length; ++i) {

		if (parent[i] == -1) {
			parent[i] = 0;
			//It is a completely un-visited vertex and we start
			//our DFS from here, so it has no parent, but just
			//to mark it that we visited this node, we assign 0

			depth_first_search_explore(list, parent, i, status);
			//By this we begin to explore all the vertices reachable
			//from Vertex i
		}
	}
}

int main()
{
	int vertices, edges, i, j, v1, v2;

	printf("Enter the Number of Vertices -\n");
	scanf("%d", &vertices);

	printf("Enter the Number of Edges -\n");
	scanf("%d", &edges);

	struct node * adjacency_list[vertices + 1];
	int parent[vertices + 1];
	int status[vertices + 1];
	//Size is made (vertices + 1) to use the
	//array as 1-indexed, for simplicity

	//Must initialize your array
	for (i = 0; i <= vertices; ++i) {
		adjacency_list[i] = NULL;
		parent[i] = -1;
		status[i] = 0;
	}

	printf("\n");

	for (i = 1; i <= edges; ++i) {
		scanf("%d%d", &v1, &v2);
		adjacency_list[v1] = add(adjacency_list[v1], v2);		//Adding edge v1 ------> v2
	}

 	//Printing Adjacency List
	printf("\nAdjacency List -\n\n");
	for (i = 1; i <= vertices; ++i) {
		printf("adjacency_list[%d] -> ", i);

		struct node * temp = adjacency_list[i];

		while (temp != NULL) {
			printf("%d -> ", temp->val);
			temp = temp->next;
		}

		printf("NULL\n");
	}

	depth_first_search(adjacency_list, vertices, parent, status);

	printf("\nParent Array -\n");
	for (i = 1; i <= vertices; ++i) {
		printf("Parent of Vertex %d = %d\n", i, parent[i]);
	}

	return 0;
}

The code above has an additional feature. It prints out the Backward Edges. This is done by the technique explained above, implemented using an integer array. I know it is a dirty way to use an integer array for this, it consumes way too much space. It really needs to be a Boolean array, because we use on two values, 0 and 1. But I wanted to keep my code strictly in C, so I used an integer array. This is the Depth First Search Algorithm. It has a time complexity of O(|V| + |E|), just like the Breadth First Search. This is because, we visit every vertex once, or you could say, twice, and we cover all the edges that AdjacencyList[Vi] has, for all ViV which takes O(|E|) time, which is actually the for loop in our depth_first_search_explore() function. DFS is not very difficult, you just need to have experienced hands in recursion. You might end up getting stuck with some bug. But it is worth spending time with the bugs because they make you think in the perfect direction. So if you are not getting the code, you just have to try harder. But if you are having any doubts, feel free to comment them…! Keep practicing… Happy Coding…! 🙂

You may also like –

More in the Navigation Page ! →

Advertisements

5 thoughts on “Depth First Search (DFS) Algorithm

  1. Just correct me if i am wrong,
    parent[] is used to check whether the vertex is visited or not.
    status[] is used to check whether the vertex corresponds to backward edge or visited ?

    Can you just add to line to clear the use of parent and status array ?

    Like

  2. Pingback: graph theory | programmingalgos

  3. Hi Vamsi,
    The blog is wonderful. I tried to print cross edges and forward edges along with the back edges in the following code.
    http://ideone.com/18xfGW
    Please have a look and tell me where it can fail.

    The logic I used is:
    If we visit a node already visited and we have not yet departed from that node, than it makes a back edge.
    Else, there are two possibilities – cross edge or forward edge. In both these cases we visit a node from which we have alredy departed. So here, we can distinguish between them using the arrival time. If it is a forward edge (going from ancestor to descendent, u -> v), arrival time of u will be less v and if it is cross edge, arrival time of u will be greater than v.

    Like

    • Hi Vikas..! 🙂 … I’m sorry for the late reply..! I got caught up in exams..! 😛 …. Coming to your code… Your logic for the back edge is perfect… Your logic indirectly implies my logic for checking the backward edge… Your logic for the forward edge also seems flawless… However, your logic for the cross edge tickles my brain… As far as I know, if two vertices do not have an ancestor-descendant relationship (one is not the ancestor of the other), then an edge between them is a cross edge… I’m unable to reduce your logic to this… I thought about your cross edge technique for an hour… It is giving the expected output… However, I have a small hunch, that given good time, one MAY be able to find a counter example to your logic… So, though your code seems correct, I suggest you manually check the parents of the nodes for printing the cross edges… Feel free if you have anything else to say…! 🙂 … Have a nice day..! 😉

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s