295. Find Median from Data Stream

The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.

  • For example, for arr = [2,3,4], the median is 3.
  • For example, for arr = [2,3], the median is (2 + 3) / 2 = 2.5.

Implement the MedianFinder class:

  • MedianFinder() initializes the MedianFinder object.
  • void addNum(int num) adds the integer num from the data stream to the data structure.
  • double findMedian() returns the median of all elements so far. Answers within 10-5 of the actual answer will be accepted.
     
Example 1:
Constraints:
  • − 1 0 5 < = n u m < = 1 0 5 -10^5 <= num <= 10^5 105<=num<=105
  • There will be at least one element in the data structure before calling findMedian.
  • At most 5 ∗ 1 0 4 5 * 10^4 5104 calls will be made to addNum and findMedian.

From: LeetCode
Link: 295. Find Median from Data Stream


Solution:

Ideas:
  1. Adding an element to our data structure.
  2. Finding the median of the elements in our data structure.

To efficiently support these operations, we use two heaps:

  • Max heap for the lower half of the numbers
  • Min heap for the upper half of the numbers

Here’s how the data structure works:

  • The max heap allows us to quickly access the largest element in the lower half.
  • The min heap allows us to quickly access the smallest element in the upper half.

By maintaining the sizes of the heaps (such that the max heap is either equal in size to the min heap or has one more element), we can ensure that:

  • If the total number of elements is odd, the median is the top of the max heap.
  • If the total number of elements is even, the median is the average of the tops of both heaps.

When a new element is added:

  1. We first decide which heap to add it to. If the element is less than the current median (top of the max heap), it goes to the max heap; otherwise, it goes to the min heap.
  2. After adding the element, we may need to rebalance the heaps to ensure that the max heap either has the same number of elements as the min heap or just one more.

When we need to find the median:

  1. We look at the sizes of the heaps. If they are the same, the median is the average of the tops of both heaps.
  2. If the max heap has one more element, its top is the median.
Code:
typedef struct {
    int* maxHeap;
    int* minHeap;
    int maxHeapSize;
    int minHeapSize;
} MedianFinder;

MedianFinder* medianFinderCreate() {
    MedianFinder* obj = (MedianFinder*) malloc(sizeof(MedianFinder));
    obj->maxHeap = (int*) malloc(sizeof(int) * 50000); // Allocate memory for max heap
    obj->minHeap = (int*) malloc(sizeof(int) * 50000); // Allocate memory for min heap
    obj->maxHeapSize = 0;
    obj->minHeapSize = 0;
    return obj;
}

void swap(int *a, int *b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}

void maxHeapify(MedianFinder* obj, int i) {
    int left = 2 * i + 1;
    int right = 2 * i + 2;
    int largest = i;
    if (left < obj->maxHeapSize && obj->maxHeap[left] > obj->maxHeap[largest])
        largest = left;
    if (right < obj->maxHeapSize && obj->maxHeap[right] > obj->maxHeap[largest])
        largest = right;
    if (largest != i) {
        swap(&obj->maxHeap[i], &obj->maxHeap[largest]);
        maxHeapify(obj, largest);
    }
}

void minHeapify(MedianFinder* obj, int i) {
    int left = 2 * i + 1;
    int right = 2 * i + 2;
    int smallest = i;
    if (left < obj->minHeapSize && obj->minHeap[left] < obj->minHeap[smallest])
        smallest = left;
    if (right < obj->minHeapSize && obj->minHeap[right] < obj->minHeap[smallest])
        smallest = right;
    if (smallest != i) {
        swap(&obj->minHeap[i], &obj->minHeap[smallest]);
        minHeapify(obj, smallest);
    }
}

void addNumToMaxHeap(MedianFinder* obj, int num) {
    obj->maxHeap[obj->maxHeapSize] = num;
    int i = obj->maxHeapSize;
    obj->maxHeapSize++;
    while (i != 0 && obj->maxHeap[(i - 1) / 2] < obj->maxHeap[i]) {
        swap(&obj->maxHeap[i], &obj->maxHeap[(i - 1) / 2]);
        i = (i - 1) / 2;
    }
}

void addNumToMinHeap(MedianFinder* obj, int num) {
    obj->minHeap[obj->minHeapSize] = num;
    int i = obj->minHeapSize;
    obj->minHeapSize++;
    while (i != 0 && obj->minHeap[(i - 1) / 2] > obj->minHeap[i]) {
        swap(&obj->minHeap[i], &obj->minHeap[(i - 1) / 2]);
        i = (i - 1) / 2;
    }
}

int extractMax(MedianFinder* obj) {
    int max = obj->maxHeap[0];
    obj->maxHeap[0] = obj->maxHeap[obj->maxHeapSize - 1];
    obj->maxHeapSize--;
    maxHeapify(obj, 0);
    return max;
}

int extractMin(MedianFinder* obj) {
    int min = obj->minHeap[0];
    obj->minHeap[0] = obj->minHeap[obj->minHeapSize - 1];
    obj->minHeapSize--;
    minHeapify(obj, 0);
    return min;
}

void medianFinderAddNum(MedianFinder* obj, int num) {
    if (obj->maxHeapSize == 0 || num < obj->maxHeap[0]) {
        addNumToMaxHeap(obj, num);
    } else {
        addNumToMinHeap(obj, num);
    }
    
    if (obj->maxHeapSize > obj->minHeapSize + 1) {
        addNumToMinHeap(obj, extractMax(obj));
    } else if (obj->minHeapSize > obj->maxHeapSize) {
        addNumToMaxHeap(obj, extractMin(obj));
    }
}

double medianFinderFindMedian(MedianFinder* obj) {
    if (obj->maxHeapSize > obj->minHeapSize) {
        return obj->maxHeap[0];
    } else if (obj->maxHeapSize < obj->minHeapSize) {
        return obj->minHeap[0];
    } else {
        return ((double)obj->maxHeap[0] + (double)obj->minHeap[0]) / 2;
    }
}

void medianFinderFree(MedianFinder* obj) {
    free(obj->maxHeap);
    free(obj->minHeap);
    free(obj);
}
11-08 08:46