logologo
  • AI Tools

    DB Query GeneratorMock InterviewResume BuilderLearning Path GeneratorCheatsheet GeneratorAgentic Prompt GeneratorCompany ResearchCover Letter Generator
  • XpertoAI
  • MVP Ready
  • Resources

    CertificationsTopicsExpertsCollectionsArticlesQuestionsVideosJobs
logologo

Elevate Your Coding with our comprehensive articles and niche collections.

Useful Links

  • Contact Us
  • Privacy Policy
  • Terms & Conditions
  • Refund & Cancellation
  • About Us

Resources

  • Xperto-AI
  • Certifications
  • Python
  • GenAI
  • Machine Learning

Interviews

  • DSA
  • System Design
  • Design Patterns
  • Frontend System Design
  • ReactJS

Procodebase © 2024. All rights reserved.

Level Up Your Skills with Xperto-AI

A multi-AI agent platform that helps you level up your development skills and ace your interview preparation to secure your dream job.

Launch Xperto-AI

Finding the Median of a Stream Using Heaps

author
Generated by
Parveen Sharma

16/11/2024

Java

Sign in to read full article

When it comes to handling a dynamic set of numbers, finding the median efficiently is often a challenge that programmers face. The median is defined as the middle value when the numbers are sorted. If there's an even number of observations, the median is usually considered as the average of the two middle numbers. In our context, we'll explore how to maintain this median while processing a continuous stream of numbers.

The Algorithm Overview

We will use two heaps (priority queues) to allow us to efficiently find the median as we insert numbers into our dataset:

  1. Max Heap (left heap): This will store the smaller half of the numbers.
  2. Min Heap (right heap): This will store the larger half of the numbers.

This dual-heap approach allows us to keep track of the median efficiently:

  • The max-heap can access the largest number in the smaller half quickly.
  • The min-heap can access the smallest number in the larger half quickly.

How It Works

  1. Insert: When inserting a new number, we must decide whether it belongs to the max-heap or the min-heap.

    • If the number is less than or equal to the maximum of the max-heap, it goes into the max-heap.
    • Otherwise, it goes into the min-heap.
  2. Balance: After each insertion, we may need to balance the heaps. The sizes of the heaps should not differ by more than one.

    • If the max-heap exceeds the size of the min-heap by more than one, move the maximum from the max-heap to the min-heap.
    • If the min-heap exceeds the size of the max-heap, move the minimum from the min-heap to the max-heap.
  3. Find Median:

    • If the max-heap has more elements, the median is the root of the max-heap.
    • If both heaps are of equal size, the median is the average of the roots of both heaps.

Implementation in Java

Here's a simple yet effective Java solution to manage the insertion of numbers and median calculation:

import java.util.Collections; import java.util.PriorityQueue; public class MedianFinder { private PriorityQueue<Integer> maxHeap; // Max-heap for the lower half private PriorityQueue<Integer> minHeap; // Min-heap for the upper half public MedianFinder() { maxHeap = new PriorityQueue<>(Collections.reverseOrder()); // Store smaller half minHeap = new PriorityQueue<>(); // Store larger half } public void addNum(int num) { // Always add to maxHeap first maxHeap.offer(num); // Ensure the largest of maxHeap is less than or equal to the smallest of minHeap if (!minHeap.isEmpty() && maxHeap.peek() > minHeap.peek()) { minHeap.offer(maxHeap.poll()); } // Balance the sizes of the heaps if (maxHeap.size() > minHeap.size() + 1) { minHeap.offer(maxHeap.poll()); } else if (minHeap.size() > maxHeap.size()) { maxHeap.offer(minHeap.poll()); } } public double findMedian() { if (maxHeap.size() > minHeap.size()) { return maxHeap.peek(); } else { return (maxHeap.peek() + minHeap.peek()) / 2.0; } } public static void main(String[] args) { MedianFinder medianFinder = new MedianFinder(); medianFinder.addNum(1); System.out.println(medianFinder.findMedian()); // Outputs 1.0 medianFinder.addNum(2); System.out.println(medianFinder.findMedian()); // Outputs 1.5 medianFinder.addNum(3); System.out.println(medianFinder.findMedian()); // Outputs 2.0 } }

Explanation of Code

  • Initialization: Two heaps are defined, one as a max-heap and the other as a min-heap.
  • addNum Method: This method handles the addition of new numbers:
    • It first adds the number to the max-heap.
    • Then, it checks if it needs to move elements between heaps to keep them balanced.
  • findMedian Method: Depending on the sizes of the heaps, it calculates the median.

Why Heaps?

Utilizing heaps allows us to achieve efficient O(log n) time complexity for insertions while keeping the retrieval of the median at O(1). This feature becomes especially useful when processing large streams of data, where performance matters.

Example Walkthrough

Let’s consider we have a stream of numbers: 5, 10, 15, 20, 25.

  • Add 5: Max-Heap: [5], Min-Heap: []
  • Find Median: 5
  • Add 10: Max-Heap: [5], Min-Heap: [10]
  • Find Median: (5 + 10) / 2 = 7.5
  • Add 15: Max-Heap: [5], Min-Heap: [10, 15]
  • Balance: Max-Heap: [10], Min-Heap: [15]
  • Find Median: 10
  • And so forth...

Conclusion

Utilizing bi-heap data structures to compute the median on a streaming dataset is an efficient and elegant solution, and understanding it can distinguish you as a strong candidate during developer interviews.

Popular Tags

JavaData StructuresAlgorithms

Share now!

Like & Bookmark!

Related Collections

  • Top 20 DSA Interview Questions Mastery

    23/09/2024 | DSA

  • Mastering Bit Manipulation: Unlocking Binary Power

    08/12/2024 | DSA

  • Advanced Graph Interview Questions in Java

    16/11/2024 | DSA

  • Advanced String-based Interview Techniques

    15/11/2024 | DSA

  • Advanced Priority Queue and Heap Interview Questions in Java

    16/11/2024 | DSA

Related Articles

  • Applications of Left Shift Operator in Data Structures and Algorithms

    08/12/2024 | DSA

  • Sort a Nearly Sorted Array Using Heap

    16/11/2024 | DSA

  • Minimum Window Substring

    15/11/2024 | DSA

  • Flattening a Binary Tree to a Linked List

    13/10/2024 | DSA

  • Understanding the Subset Sum Problem

    13/10/2024 | DSA

  • Swapping Numbers Using XOR

    08/12/2024 | DSA

  • Heap Operations

    16/11/2024 | DSA

Popular Category

  • Python
  • Generative AI
  • Machine Learning
  • ReactJS
  • System Design