Graph Theory Basics


Hello people…! In this post, I will talk about the basics of the Graph Data Structure. Its terminologies, types and implementations in C. Graphs are difficult to code, but they have the most interesting real-life applications. When you want to talk about the real-life applications of graphs, you just cannot resist talking about the Facebook’s Graph Search! Now, I don’t really know that algorithm, but it uses graphs to find out your closest friends, or any other associations you have with the other users.

Other applications include finding the shortest paths, and by shortest path, I mean in the “universal sense”, it could be the shortest path of a Traveling Salesman, or the shortest path to win Snake and Ladder or even the shortest way to solve the Rubik’s Cube…! Amazing, right…?! So, let’s get started…!

Graphs are much clear when defined in mathematical terms. A Graph G (V, E), consists of two sets,

  • Set of Vertices, V
  • Set of Edges, E

As the name suggest, V, is the set of vertices or, the set of all nodes in a given graph and E, is the set of all the edges between these nodes, or the associations between them. So, |V| denotes the number of nodes in a graph and |E| denotes the number of edges. Let us take up a small example to understand these terms –

Undirected and Directed Graph

Undirected and Directed Graph

As you can see, we have two types of graphs, Directed and Undirected Graphs. In Undirected Graphs,the direction for an edge is not defined. This is generally used to indicate that the edge is actually bi-directional in nature, i.e, one can go from V1 → V2, as well as from V2 → V1. This is used to represent the graph where the states (nodes) are re-doable, such as, in a Rubik’s Cube, you can go from one configuration of the cube to the other as well as the vice-versa.

The other type, the Directed Graph restricts the… “traversal”, if you say… to only one direction. For example, in the Snakes and Ladders game, you can play dice and go from Position 5 → Position 10, but you can’t roll the dice such that it gets you from Position 10 ← Position 5… That’s obvious…! So, it really depends on the requirement of the situation, which graph you choose. Generally, we only have to deal with graphs which are purely Directed or purely Undirected. We do not talk about the hybrid type. Some of the other type of graphs are shown below –

Types of Graphs

Types of Graphs

All the graphs which we have discussed till now are Simple Graphs, they do not contain any loops. The ones which do contain loops are Non-Simple. A given graph G can be drawn in any way as long as the sets V and E remain the same. Such a graph is shown in the figure. The same graph is just drawn differently, they both have the same set of vertices and edges. Such graphs are called as Isomorphic Graphs, as the name suggests “iso” means “same”, “morphic” means “shape”, the graphs which have the same shape.

All these Graphs are Connected Graphs, i.e., for any given pair of vertices V1 and V2 ∈ V, there always exists a path between these two vertices. The last graph is the Weighted Graph. Here, the edges are given “weights”. To understand a Weighted Graph, you can think of the vertices as cities and the edges as the distance between them (so they will have some value). You could be asked the shortest path between two cities. So in the context of a Weighted graph, the shortest path may not be the one with least edges. One must keep that in mind.

Now, we will look at the way the graphs are implemented. There are mainly two ways to implement graphs, they are –

  • Adjacency Matrix
  • Adjacency List

Adjacency Matrix

It is a very simple representation where we take a two-dimensional matrix of the size |V| × |V|, i.e., the declaration in C would look like,

int AdjacencyMatrix[V + 1][V + 1];

As the arrays are zero-indexed, we generally prefer to keep the sizes V + 1.Now, initially all the values would be zeroes. We put ‘1’ in an element of the two-dimensional array, AdjacencyMatrix[i][j], if there exists an edge between Vi → Vj. Remember, arrays are zero-indexed. Let us take an example to understand this,

Adjacency Matrix

Adjacency Matrix

I hope it is clear from the example, how we can represent the graph using an Adjacency Matrix. It is very easy to code. All you have to do is create a two-dimensional matrix and assign the values, so, I won’t post the code, but if you have any doubts regarding the code, feel free to comment them. The Adjacency Matrix has its pros and cons which we will discuss shortly.

Adjacency List

It is an array of pointers of size |V| + 1, where each pointer points to a linked list. The linked list holds the nodes which are adjacent to the ith vertex. By adjacent, we mean those vertices that can be accessed from ith node by making a single move. It is a little complex implementation if you are uncomfortable with linked lists. But it is this implementation that we will use for Graph Search Algorithms. This is because we can easily and efficiently know which of the vertices of V are neighbours of a given vertex. And that is what you want, if your are to do a Graph Search. Now, let us take an example to understand the concept of Adjacency Lists.

Adjacency List

Adjacency List

I hope the sketch has made it clear as to how we use the Adjacency List to implement a Graph. As I mentioned before, it is one of the most versatile implementations of the Graph Data Structure. Now, try to code the implementation in C, or any language you like. Please try this at least once by yourself so that you can get brain deep into the Graph Data Structure. It is only a linked list. And one more thing, don’t get confused between pointer-to-an-array and array-of-pointers…! So, if we put the value of |V| + 1 in a variable ‘vertices‘, now the declaration of our Adjacency List would be,

struct node * adjacency_list[vertices];

Remember this is an “array of pointers”, the declaration for “pointer to an array” would look like,

struct node (* adjcency_list)[vertices];

This is due to operator precedence. Some people make mistakes here, so, watch out! If you want to code in C++, we can have a vector where each element of the vector can hold another vector, so, the declaration would be, like,

list< vector<int> > adjacency_list;

Try to code for an hour or two… If you don’t get the code, it’s ok…! I’ve put my C code below. Those who got the code right… You are fabulous…! 😉 … But please take a minute to go through my code, I might have written a slightly better code than yours.

/* ==========  ========== ========== ========= */
//          Adjacency List Graph Data          //
//          Structure Implementation           //
//           in C using Linked List            //
//                                             //
//         Functions follow Pascal Case        //
//           Convention and Variables      	   //
//         follow Camel Case Convention        //
//                                             //
//            Author - Vamsi Sangam            //
//            Theory of Programming            //
/* ========== ========== ========== ========== */
 
#include <stdio.h>
#include <stdlib.h>
 
struct Edge {
    int vertex;
    struct Edge * next;
};

// Inserts Node to the Linked List by Head Insertion - O(1)
// Returns address of head which is the newly created node.
struct Edge * AddEdge(struct Edge * currentHead, int newVertex)
{
    struct Edge * newHead 
				 = (struct Edge *) malloc(sizeof(struct Edge));
 
    newHead->vertex = newVertex;
    newHead->next = currentHead;
 
    return newHead;
}

int main()
{
    int vertices, edges, i, v1, v2;
 
    printf("Enter the Number of Vertices -\n");
    scanf("%d", &vertices);
 
    printf("\nEnter the Number of Edges -\n");
    scanf("%d", &edges);
 
    struct Edge * adjacencyList[vertices + 1];
    // Size is made (vertices + 1) to use the
    // array as 1-indexed, for simplicity
 
    // Must initialize your array
    for (i = 0; i <= vertices; ++i) {
        adjacencyList[i] = NULL;
    }
 
    for (i = 1; i <= edges; ++i) {
        scanf("%d%d", &v1, &v2);
        
        // Adding edge v1 --> v2
        adjacencyList[v1] = AddEdge(adjacencyList[v1], v2);
		
		// Adding edge v2 --> v1
		// Remove this if you want a Directed Graph
        adjacencyList[v2] = AddEdge(adjacencyList[v2], v1);
    }
 
    // Printing Adjacency List
    printf("\nAdjacency List -\n\n");
    for (i = 1; i <= vertices; ++i) {
        printf("adjacencyList[%d] -> ", i);
 
        struct Edge * traverse = adjacencyList[i];
 
        while (traverse != NULL) {
            printf("%d -> ", traverse->vertex);
            traverse = traverse->next;
        }
 
        printf("NULL\n");
    }
 
    return 0;
}

Other implementations of Adjacency List

Comparison between Adjacency Matrix and Adjacency List

Adjacency Matrix Adjacency List
For a Directed Graph, it consumes O(|V|2) space which is often under-utilized in the implementation. For an Undirected Graph also, it consumes O(|V|2) space which is also under-utilized as the generated matrix is symmetric about diagonal and values just repeat. For a Directed Graph it consumes O(|V| + |E|) space which is less, and is utilized optimally. For an Undirected Graph it consumes O(|V| + 2 ✗ |E|) space, which is O(|V| + |E|), which is less.
As the memory allocation is contiguous, data access is slightly faster. It is basically a Linked List, so memory allocation is not contiguous, hence traversal is slow as as compared to the traversal in an array. However, this can be eliminated if we use a Vector of C++ STL Library in the place of the Linked List. But C++ STL Vector is not recommended for graphs that keep growing.
Inserting an edge takes O(1) time, i.e., constant time. Therefore, inserting |E| edges take O(|E|) time, as direct access is available. Inserting an edge can take O(|E|) in the worst case, which is, there is an edge between every pair of vertices, one must traverse the whole Linked List over-and-over causing insertion of |E| nodes to take O(|E|2) time. However, if one follows Head Insertion, inserting a node will take O(1) time, and inserting |E| nodes will take O(|E|) time.
Deleting an edge is very simple, we just set the corresponding element value to 0, which takes O(1) time. Deleting an edge is time-taking as, one would have to traverse the Linked List, which is slow (non-contiguous). In the worst case, it takes O(|E|) time.

 

With the knowledge of Adjacency Matrix and Adjacency List alone you won’t be able to do the competitive programming questions, because you do not know how to explore your graph to get the desired result. For this, you need to know the Breadth First Search (BFS) Algorithm and the Depth First Search (DFS) Algorithm. This post is only to give an introduction. Feel free to comment your doubts..! Commenting is super easy if you are a Facebook, Twitter or a Google+ user…! So, don’t hesitate..! 😉 … Keep practising…. Happy Coding…! 🙂

5 thoughts on “Graph Theory Basics

  1. Pingback: graph theory | programmingalgos

  2. Can you tell more about what are you doing in the ‘add’ function? And how the returned value is treated by ‘adjacency_list[v]’?

    Like

    • Sure..! Now, if you look at the Adjacency List diagram, you will see that adjacency_list[v] stores the address of the first node. That is, it acts as a head pointer to the linked list. Now, when we call add() function, we pass this value, the address of the first node to the head variable. So, now, head points to the first node.
      null
      So inside the add() function, the situation looks like –
      null
      Now, we create a new node and assign put some values into it –
      null
      The blue part of the node indicates the next variable whereas the white part indicates the val variable. Now, we assign head to p->next. Then, p is returned. So, the situation becomes –
      null
      As you can see, p->next is pointing to head and the value of p is assigned to adjacency_list[v], which is the address of the newly allocated node. So, essentially, the new node is at the starting of the previous linked list. This is head insertion. This is what we do in add() function. Take your time and think properly for a while and I’m sure you’ll understand what’s happening…! Feel free to ask if you have any more doubts… 🙂

      Like

  3. In the below statement there is a small mistake.it should be “In UnDirected Graphs the direction……..”
    “In Directed Graphs the direction for and edge is not defined. This is generally used to indicate that the edge is actually bi-directional in nature, i.e, one can go from Node 1 to Node 2 as well as Node 2 to Node 1.”

    Like

Leave a comment