Data structures and algorithms

Find Median from Data Stream

Find Median from Data Stream

Find Median from Data Stream: The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.

  • For example, for arr = [2,3,4], the median is 3.
  • For example, for arr = [2,3], the median is (2 + 3) / 2 = 2.5.

Implement the MedianFinder class:

  • MedianFinder() initializes the MedianFinder object.
  • void addNum(int num) adds the integer num from the data stream to the data structure.
  • double findMedian() returns the median of all elements so far. Answers within 10^-5 of the actual answer will be accepted.
Example 1:
Input
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
Output
[null, null, null, 1.5, null, 2.0]

Explanation
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1);    // arr = [1]
medianFinder.addNum(2);    // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3);    // arr[1, 2, 3]
medianFinder.findMedian(); // return 2.0
Constraints:
  • -10^5 <= num <= 10^5
  • There will be at least one element in the data structure before calling findMedian.
  • At most 5 * 10^4 calls will be made to addNum and findMedian.

Try this Problem on your own or check similar problems:

  1. Sliding Window Median
  2. Finding MK Average
  3. Sequentially Ordinal Rank Tracker
Solution:
class MedianFinder {
    PriorityQueue<Integer> min, max;
    public MedianFinder() {
        min = new PriorityQueue<>();
        max = new PriorityQueue<>(Collections.reverseOrder());
    }

    public void addNum(int num) {
        max.add(num);
        if(max.size() > min.size() + 1){
            min.add(max.poll());
        }
        if(!min.isEmpty() && min.peek() < max.peek()){
            int temp = min.poll();
            min.add(max.poll());
            max.add(temp);
        }
    }

    public double findMedian() {
        if(max.size() > min.size()) return (double) max.peek();
        return (max.peek() + min.peek()) / 2.0;
    }
}

/**
 * Your MedianFinder object will be instantiated and called as such:
 * MedianFinder obj = new MedianFinder();
 * obj.addNum(num);
 * double param_2 = obj.findMedian();
 */
Time/Space Complexity:
  • Time Complexity: O(Wlogn + R) where W is the number of insertion/write operations, while R is the number of times we call findMedian operation
  • Space Complexity: O(n)
Explanation:

We maintain two heaps min and max while respecting the following rules:

  • min will always contain half with larger elements coming from the input array/stream with and the minimum of those will be at the top.
  • max will contain half with smaller elements coming from the array/stream, the maximum of which will be at the top.
  • min and max should be of roughly the same size, with max always being the larger heap if there is odd number of elements in the stream. max.size() > min.size() + 1
  • The element at top of max should always be smaller (or equal) to the element at top of min heap (this way we can break the stream into larger and smaller elements groups) !min.isEmpty() && min.peek() < max.peek()).

Doing it this way we can get median in O(1) since we can just pick the largest element of smaller candidates at the top of max heap if there is odd number of elements in the stream, or average of top elements of both heaps if the number of elements is even. We still need to do heap operations leading to O(logn) time complexity for each insertion query and constant time for median query.

comments powered by Disqus

Join Our Newsletter

Don’t settle for anything less than the crown. Join our newsletter and become the King of Interviews! Click here to join now and get the latest updates.