Segment Trees


Hello people…! In this post I will talk about one of the most versatile data structure in competitive programming, the Segment Tree. Almost in every contest that you do in the major coding websites, at least one question is asked which uses a Segment Tree in its solution. This is because it is easy to code, and does some useful operations quickly, in logarithmic time. Segment Trees are used where the problem demands these two queries on a given array arr[1 … n] –

  1. Sum of elements in the range arr[L … R]. Where, 1 <= L <= R <= n.
  2. Updating an element arr[i].

You can think of doing this in the naive approach where you directly operate on the input array itself. You could do the first operation in O(n) time by running a loop calculate the sum. The second operation could be done in O(1) time, because, it’s only simply updating. This is good if the number of sum queries asked are very less and the update queries are more.

The next best thing you could think of is, Prefix Sum Array. The Prefix Sum Array is an array where the element arr[i] holds the sum of values from arr[1] to arr[i]. You can construct the Prefix Sum Array in O(n) time and solve the first query in O(1) time. Prefix Sum Array is one of my favourite techniques to find out the sum in a given range. But the update query takes O(n) time. If you want to update the ith element, you’d have to update all the elements from i to n, which takes O(n) in the worst case.

How about doing both of them in O(log n) time…? That’s really quick..! This is the speciality of the Segment Tree. We can get the sum query and the update query done in logarithmic time. The Segment Tree works on the principle of partial sums. If you have read my post on Fenwick Trees (or) Binary Indexed Trees, or if you already know BIT, Segment Tree should be easy for you. For those who don’t know, don’t worry, you’ll have an easier time learning the Binary Indexed Trees than compared to the others. So, let’s get started..!

As I mentioned earlier, the Segment Tree works on the principle of partial sums. What this principle is that we somewhere hold the sum of a few elements and keep building on it. In a Segment Tree, this “partial” is actually “half”. The Segment Tree structurally is a Binary Tree. Every internal node is associated with a range of numbers, whose sum is stored in that node. Now, what we do is partition the array into two halves. The left child will hold the sum of elements in the left partition and the right child will hold the sum of elements in the right partition. You may not get the idea of Segment Tree in the first reading. That’s okay. Give another reading and ponder upon the sketch below for a better understanding.

segment1

I hope it is clear from the picture how the Segment Tree works on the partial sums, which here is the left partition sum and the right partition sum. The leaf nodes represent the individual elements of the input array.

Note that the array we will not store the array in the nodes of our Segment Tree as was depicted in the sketch. This was done in the sketch only to give you the idea. The array will not be stored in the nodes. However, we will store the range limits on which the sum is defined, i.e, we will store the start index and the end index of the span of elements in the input array whose sum is there in the node. We will use a struct for the node which looks like this –

struct node
{
	int value, low, high;
	struct node * leftChild;
	struct node * rightChild;
};

The variable ‘value’ was to store the sum, ‘low’ and ‘high’ is to store the range. The variables ‘leftChild’ and the ‘rightChild’ are to point to the children.

Building the Segment Tree –

The Segment Tree is built in a recursive manner. We could write it in a step-by-step procedure –

  • Allocate memory.
  • Start with the root.
  • Partition the array you are given into two halves.
  • Create nodes for each half.
  • The value of the parent node is the sum of values on its children.
  • This is calculated by recursively calling the procedure on the children.
  • When the array given reduces to one element. The procedure must create a node and return the value.
  • When the parent node has acquired the sum of its children, it must return its current value to carry on the recursion.

The step-by-step procedure may not be clear in the first reading. You will know what I was talking about when you understand the pseudo-code below –

int buildSegmentTree(Array, low, high, SegmentTree)
	if low == high
		SegmentTree->value = Array[low]
		SegmentTree->low = SegmentTree->high = low

		return SegmentTree->value

	mid = (low + high) / 2

	SegmentTree->low = low
	SegmentTree->high = high
	SegmentTree->leftChild = allocate memory using calloc()
	SegmentTree->rightChild = allocate memory using calloc()
	SegmentTree->value = buildSegmentTree(Array, low, mid, SegmentTree)
					   + buildSegmentTree(Array, mid + 1, high, SegmentTree)

struct node * getSegmentTree(Array, ArraySize)
	if ArraySize <= 0
		return NULL

	struct node * SegmentTree = allocate memory using calloc()

	buildSegmentTree(Array, 1, ArraySize, SegmentTree)

	return SegmentTree

I made the pseudo-code a little extra-descriptive so that you can understand the step-by-step procedure and the code simultaneously. If you have any doubts, feel free to comment them.

The time complexity of the code is O(n). This is because in the worst case we would end up building a complete binary tree which will have 2 ✗ n – 1 nodes, and we preform calculations that take O(1) time for every node.

Sum Query –

Now, it is time to discuss the two important queries that we started our discussion with. We will begin with the sum query. We are supposed to tell the sum of all the elements in a given range. Now we will use our ‘low’ and ‘high’ variables in each node of the Segment Tree. The way we do this query mimics the Binary Search. We would be given a well defined range and we will keep comparing that with the range in each node of the Tree. If it fits perfectly, then we simply return the value stored in that node. If it doesn’t we have a few cases to handle. To understand the cases, consider the following notations –

  • NodeLeft – This is the left bound of the range whose sum is stored in the node. We actually named this as ‘low’ inn our struct.
  • NodeRight – This is the right bound of the range whose sum is stored in the node. We actually named this as ‘high’ inn our struct.
  • QueryLeft – This is the left bound of the range whose sum we are supposed to calculate.
  • QueryRight – This is the right bound of the range whose sum we are supposed to calculate.
  • Mid – This is the index upon which the partition occurs to assign values to the children nodes. It is essentially the mean of NodeLeft and NodeRight.
  • Value – This is the value that the current node is holding. Same as the ‘value’ variable in our struct.

Now, let’s talk about the cases we would encounter –

  1. (NodeLeft == QueryLeft && NodeRight == QueryRight) – This is the best thing that can happen. As we discussed a moment ago, we simply return the Value.
  2. (NodeLeft >= QueryLeft && NodeRight >= QueryRight) – This is the situation where among the elements that the current node holds the sum of, a few of the elements are not needed. The range that the current node has covers more than that’s needed. Now, to proceed with the recursion, we use Mid –
    • (QueryLeft <= Mid) → getSumQuery(SegmentTree->leftChild, QueryLeft, Mid) – This condition checks if the query demands the sum of elements which are in the left partition. So, we carry on the recursion with query range concerning only the left partition.
    • (QueryRight > Mid + 1) → getSumQuery(SegmentTree->leftChild, QueryLeft, Mid) – This checks if the query wants elements of the right partition. So, we carry on the recursion with query range concerning only the right partition.

    Note that both the conditions must be checked for the sum query to give accurate results. A special case arises when (QueryLeft == QueryRight), which means that we are looking for one element, which would obviously be either in the left or the right partition. In that case, we don’t have the necessity of  checking the second condition if the first condition is already met. This is because if we know the query asks for an element in the left partition, we have nothing to do with the right partition in this case.

This will be more clear once you put this idea to work by pen-and-paper. Coding this is simple too. Once you are able to code it, you will realise its resemblance to Binary Search. In the worst case, we would have to travel all the way down the tree, which takes O(log n).

Update Query –

What do we do if we were to update an element initially given in the input. How would we have to modify our Segment Tree so that it is still correct? We know that the initial input elements can be found at the leaves of the Segment Tree. So, we should be updating that. How do we traverse down to that leaf? We do this similarly as we did it in Sum Query, where we took a range and went down the tree based on it. If we put it in Sum Query terms, QueryLeft is equal to QueryRight. We search for it in a similar fashion. Then, we update the node which we came searching for. We also keep a note of the difference, and we add the difference to all those nodes which we came by. This is because, when you update a leaf of the Segment Tree, the internal nodes which would get affected are those which contain the partial sum in which the leaf is involved. We already met those nodes in our recursive traversal. When we are “coming out” of recursion, we update them by the difference that was caused by the new value. So, basically we go down the tree and come back. So, this operation takes O(2 log n) which is O(log n).

Take a break for a while and come back to start coding the Segment Tree. You will need to do a lot of pen-and-paper work while coding this data structure. And it is totally worth it. To print your Segment Tree, you could use the Tree Walks, In-order or Pre-order or whichever pleases you. If you are getting stuck, you can always refer to my code below –

/*
 * Segment Tree
 * Data Structure
 *
 * Authored by,
 * Vamsi Sangam
 */

#include <stdio.h>
#include <stdlib.h>

struct node
{
	int value, low, high;
	struct node * leftChild;
	struct node * rightChild;
};

// The recursive procedure which actually constructs the Segment Tre
int buildSegmentTree(int arr[], int low, int high, struct node * SegmentTree)
{
	if (low == high) {
		// we have only 1 element of the array
		SegmentTree->value = arr[low];
		SegmentTree->low = SegmentTree->high = low;

		return SegmentTree->value;
	}

	// Partition
	int mid = (low + high) / 2;

	SegmentTree->low = low;			// Setting bounds to the partial sum
	SegmentTree->high = high;		// that this node is associated with

	// Creating children
	SegmentTree->leftChild = (struct node *) calloc(1, sizeof(struct node));
	SegmentTree->rightChild = (struct node *) calloc(1, sizeof(struct node));
	SegmentTree->value = buildSegmentTree(arr, low, mid, SegmentTree->leftChild)
						+ buildSegmentTree(arr, mid + 1, high, SegmentTree->rightChild);

	// Must return the value to carry on recursion
	return SegmentTree->value;
}

// Allocates memory for the root and calls method
// to construct the Segment Tree
struct node * getSegmentTree(int arr[], int n)
{
	if (n <= 0) {
		return NULL;
	}

	struct node * SegmentTree = (struct node *) calloc(1, sizeof(struct node));

	buildSegmentTree(arr, 1, n, SegmentTree);

	return SegmentTree;
}

// A utility function used for Inorder Walk
void print(struct node * SegmentTree)
{
	printf("%d -> [%d, %d]\n", SegmentTree->value, SegmentTree->low, SegmentTree->high);
}

// In-Order Walk of Segment Tree
void inOrderWalk(struct node * SegmentTree)
{
	if (SegmentTree == NULL) {
		return;
	}

	inOrderWalk(SegmentTree->leftChild);
	print(SegmentTree);
	inOrderWalk(SegmentTree->rightChild);
}

// Recursive procedure to find out the sum of elements
// from arr[low] to arr[high] using the Segment Tree
int getSumQuery(struct node * SegmentTree, int low, int high, int sum)
{
	if (low == SegmentTree->low && high == SegmentTree->high) {
		sum += SegmentTree->value;

		return sum;
	}

	if (low == high) {
		int mid = (SegmentTree->low + SegmentTree->high) / 2;

		if (low <= mid) {
			sum = getSumQuery(SegmentTree->leftChild, low, low, sum);
		} else {
			sum = getSumQuery(SegmentTree->rightChild, high, high, sum);
		}
	} else {
		int mid = (SegmentTree->low + SegmentTree->high) / 2;

		if (low <= mid) {
			sum = getSumQuery(SegmentTree->leftChild, low, mid, sum);
		}

		if (high >= mid + 1) {
			sum = getSumQuery(SegmentTree->rightChild, mid + 1, high, sum);
		}
	}

	return sum;
}

// Updates an element in the Segment Tree and makes the neccessary adjustments
int updateQuery(struct node * SegmentTree, int index, int newValue, int difference)
{
	if (SegmentTree->low == SegmentTree->high) {
		difference += newValue - SegmentTree->value;
		SegmentTree->value = newValue;

		return difference;
	}

	int mid = (SegmentTree->low + SegmentTree->high) / 2;

	if (index <= mid) {
		difference = updateQuery(SegmentTree->leftChild, index, newValue, difference);
	} else {
		difference = updateQuery(SegmentTree->rightChild, index, newValue, difference);
	}

	SegmentTree->value += difference;

	return difference;
}

int main()
{
	int n;

	printf("Enter the size of input -\n");
	scanf("%d", &n);

	int arr[n + 1], i;

	printf("\nEnter %d Integers -\n", n);

	arr[0] = -1;
	for (i = 1; i <= n; ++i) {
		scanf("%d", &arr[i]);
	}

	struct node * SegmentTree = getSegmentTree(arr, n);

	// Printing the Tree after constructing
	printf("\nIn-Order Walk of the Segment Tree -\n");
	inOrderWalk(SegmentTree);

	int low, high;

	// Demonstrating Sum Query
	printf("\nEnter the range for sum query - [1, %d]\n", n);
	scanf("%d%d", &low, &high);
	printf("\nSum = %d\n", getSumQuery(SegmentTree, low, high, 0));

	// Demonstrating Update Query
	printf("\nEnter the index of number and new value to update -\n");
	scanf("%d%d", &i, &low);
	updateQuery(SegmentTree, i, low, 0);

	// Printing the Tree after Updating
	printf("\nIn-Order Walk of the Segment Tree after Update -\n");
	inOrderWalk(SegmentTree);

	return 0;
}

There are many varieties of questions that can be asked based on this Data Structure. I’m sure you’ll start liking and appreciating this data structure as you keep solving many questions on it. If you have any doubts, feel free to comment them. I hope my post has helped you in understanding the Segment Tree. If it did, let me know by commenting…! Keep practising… Happy Coding…! 🙂

You may also like –

More in the Navigation Page ! →

Advertisements

2 thoughts on “Segment Trees

  1. I know I am asking for so much but it’s my personal request if you could explain some of segment tree question from spoj…..because as far i have seen that there is different kind of application of segment tree(using structure and all…) in most of the question…more precisely please teach us how to maintain 2-3 data fields for a node….how to build tree and how to combine children in those scenarios…with the help of question..Thank you bro 🙂

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s