Why Linked Lists vs Arrays isn’t a real choice

preview_player
Показать описание

Disclaimer: Commission is earned from qualifying purchases on Amazon links.

Follow me on:

In this video we talk a bit more about data structures and optimizations, specifically we'll get into linked lists vs arrays, how to do common operations on them, and what happens to the underlying memory. These all have impacts on how they perform, it's not solely about big-O, cache locality effects come into play and we can understand in what situations an array or a linked list is expected to perform better. We'll work through some real world examples to bring the point home and get a solid understanding of these data structures.

What's covered:
* What are linked lists
* The importance of contiguous memory, and CPU caches
* Linked list vs arrays, what each operation does and roughly which one is faster
* Memory implications vs arrays
* Closing thoughts, and when I've personally found linked lists useful in my career
Рекомендации по теме
Комментарии
Автор

On Modern x86, the most important thing for code optimization is cache. A cache miss is more expensive than a floating point divide. Arrays offer significantly better cache locality than linked lists. This means that they are almost always the better option on modern x86

chair
Автор

I think in many ways linked lists are of pedagogical interest in computer science, partly because implementing them is often simple. It's a decent example for teaching recursion, it helps you understand other "node"-based structures like trees and graphs. In OS class or something with assembly, you might implement a free list, for which remembering linked lists comes in handy.

But again, that's all part of the conceptual background. Sometimes it helps to know some other option exists, if only to be confident that the option you're using is the better choice.

stapler
Автор

From what I understand, memory is often pretty slow, especially compared with the processor. So rather than just fetching the data needed for the current operation, it instead gets a whole truckload of data at once including both the request and the data around it. Arrays exploit this to its full potential making data insertion and deletion much faster than it could be otherwise since it's moving entire blocks of the array rather than single elements. This alone often makes up for the theoretical savings of using a linked list.

angeldude
Автор

The restaurant tables analogy has got to be the best analogy for arrays in memory I've ever heard.

nightfox
Автор

The most common usage for lined lists are hashtables. If you implement a hashtable and you get a collision, the easiest, yet still efficient way for handling it is to replace the value with a linked list of values. Of course, that will get slow as your linked list grows but if that happens, you have too many collisions anyway and then either your hashing is bad or your hashtable is too small ("too loaded"). The alternatives to a linked list would be using an array but that will buy you little if you plan to have 2 at most 3 collisions before you decide to resize and rehash the table or use an alternative storage location but that leads to a high probability for collisions in the future and slows down the entire lookup for values not in the table as to tell a value is not in the table, you must look for it at all possible alternative storage locations, instead of usually just one.

xcoder
Автор

The most common use for linked lists is when used as an intrusive linked list instead of as a data structure imo, where you have a procedure that recurses on the next item and does quite a bit of work at each level. Or when each node is large compared to cache anyway so that the cache misses is never an issue, but allowing inserts is.

Inheritance hierarchies in dynamically typed languages are basically always a linked list of classes, but you rarely think of it as one. Same for chain maps, leaf nodes of B+ trees, and plenty of other things.

BosonCollider
Автор

Wow, that table placement metaphor is excellent. I will be using that one forever now when explaining vector size vs capacity and why it's so much faster to allocate up front when size is known!

Thank you

Mankepanke
Автор

Overall, great overview. 99% of the time, dynamic arrays are better than linked lists. Even for queues, we can use circular arrays to implement deques which would usuall be better. However, it’s worth noting that when storing very large elements, it could be more performant to use linkedlists as they never have to reallocate and copy a bunch of elements over to the new buffer. The reallocation is obviously quite cheap for small types but for large structs with lots of data, it’s worth considering this cost. Additionally, some types disallow copying entirely, and so it would not be possible to store them in a vector. Finally, it’s worth mentioning that it’s often not too difficult to “magically” get to the correct linked list node because we usually return iterators/pointers to the new nodes as we insert them, which do

jfmhunter
Автор

the main use case i have for single-linked lists is for immutable data structures. with a linked list you can immutably change the head of the list without having to copy the rest of the list.

for example, in Haskell, linked lists is the default data structure because of these reasons. because of laziness in haskell a list is not always actually instantiated memory and can instead be an abstraction for loops, which eliminates many of the downsides. that said, if you want high performance in haskell it’s often worthwhile to replace lists with a more specialized and less flexible data structure, like immutable arrays (or sometimes even mutable arrays, but that’s more dangerous)

asdfghyter
Автор

You absolutely should make a video with that interview question you can't ask anymore!

Metruzanca
Автор

I cannot remember when was the last time I used a linked list, I think C# contains a LinkedList<T> but I am not sure and I have been using C# for 15+ years. I usually use arrays if known size and List if variable size or HashSet/Dictionary for indexed access.
Linked lists are mostly used internally in other structures like the Dictionary.

casperhansen
Автор

It would be very interesting to hear your backstory. You seem like someone with a lot of experience and knowledge in different areas of software.

devsauce
Автор

Perfect timing! I have exam on this in 3 days. Good repetition.

EeizeMode
Автор

7:06
When you need a datastructure with insertion/removal on both ends circular buffers work pretty good.
They also half the copied data on random insertion/deletion.

quantumdeveloper
Автор

at the beginning I was mostly curious about who crabman is, but at the end I am just so grateful for real-life examples and real deep dive in the topic. I am constantly rewatching your videos, and putting the pieces together. thank you! and the "far away from the bathroom" joke was brilliant :D

felleg
Автор

Linked lists are nice for large structures that may take up an entire cache line with one element. There's no point preserving locality in that scenario, and it gives the allocator a bit more breathing room (although idk what magic modern allocators do so maybe not anymore). I find them helpful for embedded programming where cache isn't much of a thing. They also make pretty good message queues, although a circular buffer is arguably better for that.

ttamttam
Автор

For all the schooling and interview questions I have had about linked lists, I have NEVER in my 15 year career every had to use one. Array/List/Vector 100% of the time.

tinytownsoftware
Автор

I think the most value that a linked list has to the vast majority of programmers is as a stepping stone for understanding trees and graphs, which are often represented the same way as linked lists, but with multiple "next" nodes.

alxjones
Автор

I wish I had started my career with you as a mentor. Your explanations are crystal clear and your presentation is entertaining. Thanks for taking the time to teach.

aliphian
Автор

I tend to think of Linked Lists as the fundamental unit of graph data structures. The usefulness isn't always apparent but you can build on the fundamentals they introduce and hybridize with other data structures to make very useful tailor-made graphs.

mimimalloc