Fibonacci Heaps or 'How to invent an extremely clever data structure'

I want to tell you about a daunting, but truly fascinating data structure. At first sight, Fibonacci Heaps can seem intimidating. In this video, I'm going to show you all the necessary steps to invent a really clever data structure.

00:00 Introduction
00:50 Priority Queues and Binary Heaps
05:44 Fibonacci Heaps
08:24 Amortized Analysis
10:28 ExtractMin
16:54 DecreaseKey
22:02 3 Questions
28:16 Final Words

Animations created with Manim
Music: Goldberg Variations, J.S. Bach, Kimiko Ishizaka

Sources:
Cormen, Leiserson, Rivest, Stein: Introduction to Algorithms ‒ Third Edition (the chapter on Fibonacci Heaps is available for free on their website under "Material Removed from 3e")
Fredman, Tarjan: Fibonacci heaps and their uses in improved network optimization algorithms.
Vuillemin: A Data Structure for Manipulating Priority Queues

#SoME2
Comments

There are a couple of things I wanted to mention but eventually decided against because I felt they would break the flow of the video. For those who are interested:
03:22 Throughout the video I'm assuming you can find any given heap element in constant time using some sort of lookup table. This is easier said than done. Depending on your data, you might need to rely on randomization (e.g. universal hashing).
06:08 The textbook Fibonacci Heap implementation uses circular linked lists, while I use regular linked lists. Circular linked lists reduce the space requirements, but they make the operations a bit harder to explain. Just be aware that my operations are slightly different to the ones found in most textbooks.
13:06 This analysis is subtly inaccurate. The size of the root list at this point is O(#trees + max degree), not #trees, because in phase 1, we moved the children of the minimum into the root list. In Big O notation, however, this makes no difference.
14:12 It's actually the maximum degree plus 1 because we need to include degree 0. Again, it doesn't matter in Big O notation.
20:03 One detail I included in the visualization but didn't actually mention: Tree roots are never cut out or marked. They are never cut out because they're already part of the root list. And they are never marked because it's simply not necessary for bounding the node degrees (cf. 25:50).
27:30 You don't actually need to involve the golden ratio to see that the Fibonacci numbers grow exponentially. It's easy to see that the i-th number in the sequence is at least twice as big as the i-minus-2-th (due to the construction of the sequence and the fact that it grows monotonically).
28:16 There are more Fibonacci Heap operations that I didn't mention: You can union two heaps in constant time, simply by concatenating their root lists. You can also implement the operations "Remove" and "IncreaseKey" using a combination of the operations from the video (see the sketch at the end of this comment).
28:31 That ExtractMin can't be improved only holds for the general case where your priority queue stores arbitrary values and you find the minimum using comparisons. In special cases (like if all values are integers within a certain range) this running time can be improved.
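
To make the 28:16 note concrete, here is a rough sketch of what Union, Remove and IncreaseKey could look like. This is not the video's (or the textbook's) implementation: it assumes a hypothetical FibonacciHeap class with a non-circular root list, a min_node pointer, and the Insert, DecreaseKey and ExtractMin operations from the video already in place; all names are illustrative.

# Illustrative sketch only; FibonacciHeap, root_list.concatenate, insert,
# decrease_key and extract_min are assumed to exist as described in the video.

NEG_INF = float("-inf")  # smaller than any real key

def union(h1, h2):
    # O(1): splice the second root list onto the first and keep the smaller minimum.
    h1.root_list.concatenate(h2.root_list)
    if h2.min_node is not None and (h1.min_node is None
                                    or h2.min_node.key < h1.min_node.key):
        h1.min_node = h2.min_node
    return h1

def remove(heap, node):
    # Remove = DecreaseKey to -infinity (the node becomes the minimum), then ExtractMin.
    heap.decrease_key(node, NEG_INF)
    heap.extract_min()

def increase_key(heap, node, new_key):
    # IncreaseKey = Remove the node, then insert its value again with the larger key.
    remove(heap, node)
    heap.insert(node.value, new_key)

Note that Remove costs one O(1) DecreaseKey plus one O(log n) amortized ExtractMin, so it is as cheap as ExtractMin itself.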

sithdev

28:45 There are a couple of extremely important use cases that I know of. In the interest of full disclosure, I've worked on both of these.

The first use case is in bioinformatics. Machines that read strands of DNA are not perfect; they make errors. The way we typically fix this is to read lots of copies of the same DNA and then correct the reads with low counts using the ones with high counts. For de novo sequence assembly, the algorithm of choice is known as the Tour Bus algorithm.


The second use case is in certain kinds of physical simulation. Physical simulation often involves a time step, which has to be chosen carefully: too small, and the simulation takes ages to run; too large, and you lose accuracy.

There are physical systems where the appropriate time step varies widely over the simulation domain. An example is carbon sequestration, where you take CO2 off a power plant (say) and inject it into an underground reservoir, such as an old oil well. These reservoirs are essentially rocks with pores and fractures, so fluid flows at geologic speeds. Even though you might inject CO2 at a speed measured in metres per second, the plume through the reservoir might advance at a speed measured in centimetres per year.

DeGuerre

This channel needs way more views and subscribers. I was shocked by the production quality relative to view count.

vokuheila

Really good! I like your thesis that sometimes we overvalue elegance of implementation - often things are just hard to get right. There's also a lesson here that sometimes a data structure will maintain an elegant invariant (e.g., Fibonacci heaps, Red-Black trees) that makes analysis easy, but actually implementing a data structure that maintains that invariant is kinda grungy.

hjfreyer

The only thing I remember from school: merging two Fibonacci heaps was fast.

PhilipSmolen

Great video! Another thing I'd like to add, though, is that in a lot of cases you don't even need heaps for priority queues.
For example, in the example problem shown in the video, how many different priorities can the router have? 3? 10? 256? Either way, it is still a really small number in most cases, and it is known in advance. You can then just have a list (a bucket) for every possible priority, and put a message in the bucket with the corresponding priority once it arrives. That's O(1) Insert. You can also store a pointer to the lowest non-empty bucket; that's O(1) GetMin. DecreaseKey is also easy: just move a message from one bucket to another and maybe update the pointer. And ExtractMin just takes one message from the bucket and, if it becomes empty, moves the pointer to the next bucket until it hits a non-empty one. That operation is O(#buckets), which is supposedly a small constant that doesn't grow with the number of messages, so that's effectively O(1). This way you get a priority queue with _all_ operations being O(1), as long as your priorities are small integers. Even if you don't know their maximum value in advance, you can just add new buckets later, and the queue will stay fast as long as that maximum priority is still small.
This data structure is called a Bucket Queue, and it is honestly the best choice for most applications, given how simple and fast it is. You can even optimize it if priorities slowly grow over time: all that matters is the number of priority values in the queue at any given time.
Of course, you would still need binary or Fibonacci Heaps if your priorities are floating-point numbers or more complicated data, as is the case in some algorithms like Dijkstra's.
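
A minimal sketch of the bucket queue described above, in Python. The class and method names are mine (illustrative, not from the comment or the video), and it assumes priorities are small non-negative integers.

# Minimal bucket queue sketch: priorities are small non-negative integers.
class BucketQueue:
    def __init__(self):
        self.buckets = []       # buckets[p] holds all items with priority p
        self.min_priority = 0   # lowest priority that might be non-empty

    def insert(self, item, priority):
        # Grow the bucket list lazily if a new, larger priority shows up.
        while len(self.buckets) <= priority:
            self.buckets.append([])
        self.buckets[priority].append(item)
        self.min_priority = min(self.min_priority, priority)

    def extract_min(self):
        # Advance the pointer past empty buckets, then take one item.
        while self.min_priority < len(self.buckets) and not self.buckets[self.min_priority]:
            self.min_priority += 1
        if self.min_priority >= len(self.buckets):
            raise IndexError("extract_min from an empty bucket queue")
        return self.buckets[self.min_priority].pop()

    def decrease_key(self, item, old_priority, new_priority):
        # Move the item to a lower bucket. With a direct reference to the
        # item's slot this is O(1); list.remove here costs O(bucket size).
        self.buckets[old_priority].remove(item)
        self.insert(item, new_priority)

# Example: a router with three priority levels.
q = BucketQueue()
q.insert("bulk transfer", 2)
q.insert("voice packet", 0)
q.insert("web request", 1)
print(q.extract_min())  # -> "voice packet"

If priorities only ever increase over time (as in Dijkstra's algorithm with small integer weights), the pointer never moves backwards, so all the pointer advances over the whole run together cost no more than the number of buckets.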

silentobserver

Really interesting and smart - great presentation
Just a few things:
- You defined the runtime of DecreaseKey in the heap as O(1) by using a hash table at 4:50. But lookup in a hash table doesn't come for free in many cases and is often better treated as O(log n) (if you're NOT looking at the amortized complexity of a perfect implementation tailored to exactly the data you're using).
- You ARE looking at amortized complexity in your heap, but that is something only experienced programmers (or, at the other extreme, the sufficiently ignorant) should be doing, and it would have been great if you had explained why this can be a trap in some situations. Depending on the use case, you'd rather have a slower but smoother algorithm than one whose amortized speed occasionally freezes your program for a second or so.
Nothing wrong with amortized complexity per se, but the user needs to be aware of its constraints.

Animaniac-vdst

Incredible quality, I hope you'll keep producing great content :)

nicowe

Criminally underrated channel. You explained everything so well and the animations are incredibly well-made!

roelyoon

Fantastic choice of music! The Goldbergs, with their perfect ratios, are a very clever nod to the material you're covering. Well explained, too.

minerscale

This is so damn interesting! I wish we had covered stuff like this when I studied computer science. Unfortunately we spent way too much time drawing absolutely useless and boring UML diagrams :/

dasten

This is the kind of thing a standard library developer might write and just call "heap" without 99% of users ever knowing about it. Crazy complicated but very clever. I guess having at least one operation be O(log n) is unavoidable, but the impressive part is the DecreaseKey operation being constant time while still maintaining the rest. In practice though, a regular binary heap would probably do OK at it, because the priority usually doesn't change that much in a single iteration; the node would likely only need to move up 0 or 1 levels. It's one of those cases where big O notation can be deceptive: if you compare rearranging 0-2 trees with swapping 0 or 1 nodes (assuming certain constraints), yes, both are constant, but one constant is still larger. The Fibonacci heap, however, guarantees it no matter what. Definitely an interesting structure.
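
To illustrate the point above about plain binary heaps: in an array-based binary heap, DecreaseKey is just a sift-up, and each loop iteration moves the node up exactly one level. The following is only a sketch with hypothetical names, not code from the video.

# Array-based min-heap: heap[i] = (key, item); the parent of i is (i - 1) // 2.
def decrease_key(heap, pos, new_key):
    assert new_key <= heap[pos][0], "new key must not be larger than the old one"
    heap[pos] = (new_key, heap[pos][1])
    # Sift up: one level per iteration. If the key only shrinks slightly,
    # this usually stops after 0 or 1 swaps, even though the worst case
    # is O(log n) when the new key has to bubble all the way to the root.
    while pos > 0:
        parent = (pos - 1) // 2
        if heap[parent][0] <= heap[pos][0]:
            break
        heap[parent], heap[pos] = heap[pos], heap[parent]
        pos = parent

# Example: lowering a key by a little moves the node up just one level.
heap = [(1, "a"), (5, "b"), (3, "c")]   # already a valid min-heap
decrease_key(heap, 1, 0)                # "b" bubbles up one level to the root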

DFPercush

Dude, this is amazing. Incredible quality, rigorous, well explained. By far the best CS video I've seen on YouTube

theandy

Brilliant visuals, great use of Manim, and crystal-clear presentation! I hope you post more Comp Sci content!

alanyang

Very nice video! Good explanation and no over-the-top animations.
But...
Why the background music?
I like Bach and have the Goldberg Variations on CD (Glenn Gould)
Why not start the video with a recommendation of background music and leave it to the viewer to start a CD or Spotify?
My mind keeps switching between listening to the music and the narration.
And it's not that the background music makes it difficult to follow the narration.

olafzijnbuis

29:20 "Fibonacci Heaps have shown that more often than not a solution to a specific problem is messy."
Love that insight.

chennebicken

I was struggling to understand this concept and you saved me! Tons of thanks

sanketdatta

This is THE DEFINITIVE video on Fibonacci heaps and priority queues. Well done!

ianmoses

I don't often use videos for learning about computer science, but I randomly got this recommended, and the animations were really helpful! Great work!

rfs

This is a lesson about the importance of accounting for constant factors. If you only have a few elements (say, 100k) in your structure to worry about, it really doesn't matter that this super cool algorithm you found outperforms the others... starting at a count of 1B and more.

Rezenbekk