Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MinHeap.push() vs MaxHeap.push() performance #206

Open
RickWong opened this issue Jan 4, 2023 · 3 comments
Open

MinHeap.push() vs MaxHeap.push() performance #206

RickWong opened this issue Jan 4, 2023 · 3 comments
Labels

Comments

@RickWong
Copy link

RickWong commented Jan 4, 2023

Currently MaxHeap is implemented as a MinHeap just with reverse comparator. When pushing the same collection (for example an incremental range between 1 and 65535) though, the comparator is called more frequently for MaxHeap than for MinHeap.

let comp = 0;
let heap = MinHeap.from([0], (a, b) => (comp++, a - b));
range(1, 65535).forEach((i) => heap.push(i));
console.log(comp); // 65535

comp = 0;
heap = MaxHeap.from([0], (a, b) => (comp++, a - b));
range(1, 65535).forEach((i) => heap.push(i));
console.log(comp); // 917522

Curiously, when using from() the performance characteristics are flipped:

let comp = 0;
let heap = MinHeap.from(range(1, 65535), (a, b) => (comp++, a - b));
console.log(comp); // 131038

comp = 0;
heap = MaxHeap.from(range(1, 65535), (a, b) => (comp++, a - b));
console.log(comp); // 98286

Why would that be?

@Yomguithereal
Copy link
Owner

Hello @RickWong,

Why would that be?

Your numbers are well in the range of O(log2(n)) so it might just be that you are hitting a normal worst/best case scenario here since you are adding the numbers in order, which is specific case. Can you try adding them in reverse and check if you obtain some sort of symmetry regarding the number of comparisons?

@RickWong
Copy link
Author

RickWong commented Jan 11, 2023

Hello @Yomguithereal. Here's the reverse range push:

let comp = 0;
let heap = MinHeap.from([0], (a, b) => (comp++, a - b));
range(65535, 1).forEach((i) => heap.push(i));
console.log(comp); // 917522

comp = 0;
heap = MaxHeap.from([0], (a, b) => (comp++, a - b));
range(65535, 1).forEach((i) => heap.push(i));
console.log(comp); // 65550

(Also, there seems to be a slight negligible difference when initializing the above heaps with [] instead of [0].)

And the reverse range from:

let comp = 0;
let heap = MinHeap.from(range(65535, 1), (a, b) => (comp++, a - b));
console.log(comp); // 98286

comp = 0;
heap = MaxHeap.from(range(65535, 1), (a, b) => (comp++, a - b));
console.log(comp); // 131038

So back to the first benchmark, given that it's specifically the ascending number range that favors MinHeap.push, I guess the performance question is why does MaxHeap.from perform better than MinHeap.from?

@Yomguithereal
Copy link
Owner

I think this is just a side effect of the ordering and the fact that the underlying binary tree representation must be balanced more often in one case vs. the other. But the asymptotic logarithmic holds correctly there, so from an analytical standpoint, the performance is equivalent at least. You would have to check the code to understand why a case needs more rebalancing than the other but please note inserting elements in order is a known worst case for a heap algorithm. This said, there remain a possibility that my code has a subtle bad branch somewhere, so if you have time to investigate, what could be nice would be to replicate your benchmark using python and its heapq module (on which my code is actually based) to assess whether you see the same kind of discrepancies between min/max number. If not, maybe I did something wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants