Find the Running Median, Hackerrank

We will discuss Find the Running Median problem from Hackerrank here. It is a typical problem if you don’t know about the data structure you can use here.

Click here to get your copy of Coding interview pocket in java having questions asked in Amazon, Microsoft, Facebook and Google.

Problem Statement

The median of a set of integers is the midpoint value of the data set for which an equal number of integers are less than and greater than the value. To find the median, you must first sort your set of integers in non-decreasing order, then:

  • If your set contains an odd number of elements, the median is the middle element of the sorted sample. In the sorted set {1, 2, 3}, 2  is the median.
  • If your set contains an even number of elements, the median is the average of the two middle elements of the sorted sample. In the sorted set {1, 2, 3, 4} , (2+3)/4 = 2.5 is the median.

Given an input stream of  integers, you must perform the following task for each i integer:

  1. Add the i integer to a running list of integers.
  2. Find the median of the updated list (i.e., for the first element through the i element).
  3. Print the list’s updated median on a new line. The printed value must be a double-precision number scaled to 1 decimal place (i.e., 12.3 format).

Input Format
The first line contains a single integer, n , denoting the number of integers in the data stream.
Each line i of the n subsequent lines contains an integer, a[i], to be added to your list.

Constraints

  • 1 <= n <= 10^5
  • 0 <= a[i] <= 10^5

Output Format
After each new integer is added to the list, print the list’s updated median on a new line as a single double-precision number scaled to 1 decimal place (i.e., 12.3 format).

Sample Input
6
12
4
5
3
8
7

Sample Output
12.0
8.0
5.0
4.5
5.0
6.0

Explanation
There are n=6 integers, so we must print the new median on a new line as each integer is added to the list:


Solution : Find the Running Median

In this problem, imagine we have a stream of numbers coming in and we need to find the median each time a new number is added to the current list.

We need to observe following things –

  • The numbers coming in are not sorted, which means we need to sort the array first to find out the median. But this approach can increase the time complexity, we need to think through that also.
  • The numbers coming in are distinct.
  • We are given a fixed count of numbers that we need to deal with in this problem.
  • We need to return an array of double type numbers. So, store the output in an array rather than printing it.

Hint : We can use the concept of heaps here.

Logic

If you can remember the property of heaps that, min-heap and max-heap have minimum and maximum element at the top respectively, than you can start approaching this problem in an efficient way.

Try it now, before reading further.

Algorithm

  1. We need to create two heaps : min-heap and max-heap
    1. min-heap will contain all the elements greater than the median.
    2. max-heap will contain all the elements lower than the median.
  2. Initialize the median to 0
  3. Start comparing each element with median and add them to the specific heaps.
  4. If min-heap.size() == max-heap.size()
    1. If new element(x) is less than current median
      1. Add x to max-heap
      2. New median would be top of max-heap.
    2. If new element(x) is greater than current median
      1. Add x to min-heap
      2. New median would be top of min-heap.
  5. If min-heap.size() > max-heap.size()
    1. If new element(x) is less than current median
      1. Add x to max-heap.
    2. If new element(x) is greater than current median.
      1. Remove top element from top of the min-heap and add it to max-heap.
      2. Add x to min-heap.
    3. Calculate median by taking out top of both min-heap and max-heap and divide the sum by 2.
    4. Update the median.
  6. If min-heap.size() < max-heap.size()
    1. If new element(x) is less than current median
      1. Remove one element from the top of max-heap and add it to min-heap.
      2. Add x to max-heap.
    2. If new element(x) is greater than current median.
      1. Add x to min-heap.
    3. Calculate median by taking out top of both min-heap and max-heap and divide the sum by 2.
    4. Update the median.
  7. Repeat steps 4-6 for all elements.

Visualization of Algorithm: Find the Running Median

Let’s try to visualize the algorithm with the sample input.

dry-run

Code Implementation

As we can see here how a complex problem becomes easy by using a right data structure.

Other heap operations such as peek() and size() take O(1) time and add() takes O(log n). So the overall time complexity is O(n).


Please subscribe and let me know your feedback.

Click here to get your copy of Coding interview pocket in java having questions asked in Amazon, Microsoft, Facebook and Google.

Happy Learning. Happy Coding 🙂

Leave a Reply