Logic used behind Array Manipulation of HackerRank

These two places helped me to understand this algorithm more clearly. Prefix_sum_array_and_difference_array
Stack Overflow

If you want a simple and direct explanation: Initial, the array is 0 0 0 0 0 cpp after the first operation, 1 2 100 it will become seq1: 100 100 0 0 0 and after second 2 5 100 seq2: 0 100 100 100 100 and after 3 4 100 seq2: 0 0 100 100 0 but when we apply difference array at every step, we will get

cpp diff seq of seq1: 100 0 -100 0 0 diff seq of seq2: 0 100 0 0 0 -100 diff seq of seq3: 0 0 100 0 -100

One important property is the difference sequence of the sum of the sequences is the sum of the difference sequences.

it will give us, cpp 100 100 0 0 -100 -100(for clarity purpose only) or you can add all the sequences as cpp seq1+seq2+seq3 = 100 200 200 200 100 and then find the difference seq or the difference array which is 100 100 0 0 -100 and then find the prefix array.

Why we ignore the first 100??? Read the first article about difference array and prefix sum array!!!!

and after this, do prefix sum cpp 100 200 200 200 100 0 Ignore last 0 as the last index we considered is for clarity purpose only.

so,both these steps find the difference array for us:) cpp a[p]+=sum; if((q+1)<=N) a[q+1]-=sum;


We are basically storing the increment in the starting position and one past the last index in the range. For a b k we will increase +k for all elements in index [a,b] but then the next elements will not be increased. So we are subtracting it, because w.r.t the previous increment all elements to the right of the range will be lesser by -k. We are basically storing all the final values via this increment/decrement.

At last we are calculating the elements on the fly from left to right. If you think more deeply, it is just storing how much one element is bigger than the previous element.

Initially the array will be 0 0 0 0 0.

After the first operation 1 3 3 originally the array elements should be 3 3 3 0 0 but we are storing it like this

3 0 0 -3 0

Meaning

First element is 3 greater than 0.
Second  ->       0 greater than index 1 element.
Third   ->       0 greater than index 2 element
Fourth  ->      -3 greater than index 3 element.
fifth   ->       0 greater than index 4 element.

After the second operation 2 4 4 originally the array will be 3 7 7 4 0 but we store it like this 3 4 0 -3 -4. Just like I described in detail keep in mind that and think that way, you will see that we are not losing information. We just store it in a different way.

Final values will be

0+(3) 0+3+(4) 0+3+4+(0) 0+3+4+0+(-3) 0+3+4+0-3+(-4)

3  7    7       4           0  matches with what we got earlier.

Note how we calculate each element. Just adding previous element with the value by which current element is greater.


Note that this solution works because it is being queried only once. If it is queried m times, then this solution doesn't work because it will time out. Then you will have to dig deeper using advanced data structures like segment trees and/or binary indexed trees.


I'll try to explain my understanding of this:
Each line of input basically describes a sequence, and you are asked to find the maximum value of the sum of these sequences.
For example, if N is given as 5:
the line 2 4 13 describes the sequence [0, 13, 13, 13, 0]
the line 3 5 11 describes the sequence [0, 0, 11, 11, 11].
If those are the only lines we get the result sequence from the pointwise sum of the two, which is [0, 13, 24, 24, 11].

Now another way we can describe the sequences above are by the difference sequences, that is, at index i we will keep the difference between the element at index i and the element at index i-1, and we can get the original sequence by a running sum of the difference sequence.

In the case of the above sequences, the difference sequences are:
[0, 13, 0, 0, -13] for the sequence described by 2 3 13
[0, 0, 11, 0, 0] for the sequence described by 3 5 11
[0, 13, 11, 0, -13 for the sum of the sequences.

One important property is the difference sequence of the sum of the sequences is the sum of the difference sequences.

So what the solution does, for every line, is to sum the difference sequences (which only requires up to 2 operations due to the nature of the sequences), then to find the maximum it takes the running total of the difference sequence, thus getting the elements of the sequence, and holds the maximum value of that running total.

While the example I gave has only 2 lines, this same idea works for any number of lines.

I hope this gives good intuition as to the idea behind the solution.

Tags:

C++

Arrays