Multi-Dimensional Data (as used in Tensors) - Computerphile

How do computers represent multi-dimensional data? Dr Mike Pound explains the mapping.


This video was filmed and edited by Sean Riley.


Comments
Author

Glad you mentioned doing the clever things another time.


Makes me really excited about possibly seeing a video on cache optimization, like interlacing the memory and such.

Akronymus_
Author

Someone get him some brown Numberphile paper

elwizo
Author

"We need bigger paper". The whiteboard behind him: "am i a joke to you"

jaden
Author

Golden rule is to use powers of 2 as sizes, then you get away with bitwise shifts instead of multiplying. So for a 100x100 dataset, you want to set up a 128x128 grid, then you can address each row by R SHL 7, an instruction done in a single clock cycle, instead of a MUL, which afaik takes longer on every platform. You waste quite a lot of storage, but you gain precious speed.

matwyder
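
The power-of-two padding matwyder describes can be sketched as below. This is a minimal illustration, not a benchmark: the names `ROW_SHIFT`, `PITCH`, `offset_shift` and `offset_mul` are made up for the example, and whether the shift is measurably faster than a multiply depends on the hardware and compiler. The sketch also assumes a row-major layout, in which only the row length has to be padded to 128; the number of rows can stay at 100.

```python
# Hypothetical sketch: a 100x100 dataset in a flat buffer whose rows are
# padded to a power-of-two pitch, so the row term is a shift, not a multiply.
WIDTH, HEIGHT = 100, 100       # logical size of the dataset
ROW_SHIFT = 7                  # 2**7 == 128, the padded row pitch
PITCH = 1 << ROW_SHIFT

buffer = [0] * (PITCH * HEIGHT)  # 128 slots per row, 100 rows


def offset_shift(row, col):
    """Flat offset computed with a shift for the row term."""
    return (row << ROW_SHIFT) + col


def offset_mul(row, col):
    """The same offset computed with an ordinary multiply."""
    return row * PITCH + col


buffer[offset_shift(3, 5)] = 42
print(offset_shift(3, 5), offset_mul(3, 5))  # both are 3 * 128 + 5 = 389

# Storage cost of the padding: 28 unused slots in each of the 100 rows.
wasted = PITCH * HEIGHT - WIDTH * HEIGHT
print(wasted)  # 2800
```

Both functions return the same offset; the shift form is only available because the pitch is a power of two.
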
Author

Dr £ is an excellent explainer of pretty much anything.

olamarvin
Author

It’s a shame I can only give this video a one dimensional “like”

WoodyWk
Author

APL was a multi-dimensional array language invented by Ken Iverson at IBM research in the early '60s. It was an early interpreted language, and was extremely useful in doing anything involving arrays. I loved it! Of course, I had a math degree and was comfortable with linear algebra etc. I used it to simulate computer architectures, compute project timelines and risks, model animal behavior, and otherwise have a good time! It had its own obscure set of symbols representing array operations, and was affectionately known as "a write-only language", which is probably why it did not catch on.

sleepy
Author

I was already subscribed to the channel. I am a programming student, and I had just realized by myself that this was possible, and here I am finding confirmation of my hypothesis.

petroniosilva
Author

A major advantage arising from keeping everything contiguous in memory is of course cache locality. This is more significant than the "no need to copy data" explanation given, because if we were to construct multi-dimensional data from pointers to pointers we would still have that property and would also be able to have differing sizes per dimension (jagged arrays). We would, however, lose any cache locality guarantees.

ulteriormotif
Author

You should do a show on the methods used when storing large image sets, like those for sky surveys and particle physics. Because the files are so large and one only wants to see a portion of the image, efficient retrieval is needed. Spoiler: the images are stored so that pixels near each other in the image are near each other in memory. I am trying to remember who gave the talk I got this from. I think it might have been Dr. Mark SubbaRao from the Adler Planetarium or someone from Fermilab. The solution, if I remember correctly, is to have the offset be the element's position along a space-filling curve.

mentatphilosopher
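
One way to picture the space-filling-curve idea is the Z-order (Morton) curve, which interleaves the bits of x and y to form the offset. The comment does not say which curve the talk used, so the choice of Morton order here is an assumption, and the helper names `part1by1` and `morton_offset` are made up for this sketch.

```python
def part1by1(n):
    """Spread the low 16 bits of n apart, leaving a zero bit between each."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def morton_offset(x, y):
    """Interleave the bits of x (even positions) and y (odd positions)."""
    return part1by1(x) | (part1by1(y) << 1)


# Offsets for a 4x4 tile: each 2x2 block of pixels gets 4 consecutive offsets.
for y in range(4):
    print([morton_offset(x, y) for x in range(4)])
# [0, 1, 4, 5]
# [2, 3, 6, 7]
# [8, 9, 12, 13]
# [10, 11, 14, 15]
```

With plain row-major storage, the pixel directly below (0, 0) would sit a whole row away in memory; here it lands at offset 2, which is why small square regions can be read from a short contiguous range.
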
Author

Fascinating, as always. But when you said storing that bitmap as anything but a 1D array didn't make sense ... now I have an urge to implement 2D storage on a disk drive using track as x and sector as y.

It's funny that we use row/column addressed hardware which does a transform to make it seem linear for programs which then transform those addresses back into row/column.

strayling
Author

How many people tried to wipe the two dots on the whiteboard (in the medium shot) off their screen only to realize they were part of the image?

eatbolt
Author

I had a sense of how this might work but never looked under the hood, and it didn't really click for me until Mike said he visualizes higher-dimensional data structures as "groups of groups...", which totally makes sense.

ACTlVISION
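
The "groups of groups" picture maps directly onto strides: each dimension's stride is the size of one group at that level, counted in elements. Below is a minimal sketch assuming a row-major layout; the shape (4 images, 3 channels, 2 rows, 5 columns) and the function names are hypothetical, chosen only for illustration.

```python
def row_major_strides(shape):
    """Stride of each axis, in elements, for a contiguous row-major layout."""
    strides = [1] * len(shape)
    # Work from the innermost axis outwards: a group at one level contains
    # shape[axis + 1] groups of the next level down.
    for axis in range(len(shape) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]
    return strides


def flat_offset(index, strides):
    """Position of a multi-dimensional index in the flat 1D buffer."""
    return sum(i * s for i, s in zip(index, strides))


shape = (4, 3, 2, 5)                  # images, channels, rows, columns
strides = row_major_strides(shape)
print(strides)                        # [30, 10, 5, 1]

# Image 2, channel 1, row 1, column 3: 2*30 + 1*10 + 1*5 + 3*1
print(flat_offset((2, 1, 1, 3), strides))  # 78
```

Reading the strides right to left gives the grouping: 5 elements make a row, 2 rows (10 elements) make a channel, and 3 channels (30 elements) make an image.
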
Author

Peter Parker of image processing! Thank you, Spidey :)

ProfSoft
Author

Intelligent explanation. Where are you from?

BriteRoy
Author

When I was doing stuff like this back in 2010, I too ended up thinking about the dimensions as sets, or groups, of other sets. I never used strides though; instead I just iterated through the needed parts of the multi-dimensional array. In my mind it was the same as spinning through a Rubik's cube with many more sides, only checking the appropriate corners for information.
I guess I was using "tensors" all this time then, since I would say "look at the top left pixel, in the blue layer, for the first set of images" by fixing the "set", "layer" and "pixel" and then looping through "images", which is the same as feeding in a bunch of stride commands formed as tensors.

johanlarsson
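
Fixing "set", "layer" and "pixel" and looping over "images" comes down to computing one starting offset and then stepping by the image stride. The sketch below assumes a row-major layout ordered as (set, image, layer, y, x) with small made-up sizes; that ordering and all names are assumptions for the example, not something stated in the comment or the video.

```python
# Hypothetical sizes: 2 sets of 3 images, each with 3 layers of 2x2 pixels.
SETS, IMAGES, LAYERS, HEIGHT, WIDTH = 2, 3, 3, 2, 2

# Row-major strides, in elements, from the innermost axis outwards.
S_X = 1
S_Y = WIDTH * S_X            # 2
S_LAYER = HEIGHT * S_Y       # 4
S_IMAGE = LAYERS * S_LAYER   # 12
S_SET = IMAGES * S_IMAGE     # 36

# Flat buffer in which every value equals its own offset, to make reads visible.
data = list(range(SETS * S_SET))


def pixel_across_images(set_i, layer, y, x):
    """Read one pixel of one layer from every image in a set."""
    start = set_i * S_SET + layer * S_LAYER + y * S_Y + x * S_X
    stop = start + IMAGES * S_IMAGE
    return [data[off] for off in range(start, stop, S_IMAGE)]


BLUE = 2  # assuming layers are stored in R, G, B order
print(pixel_across_images(0, BLUE, 0, 0))  # [8, 20, 32]
```

The loop never visits the elements in between; it jumps 12 elements at a time, which is the stride of the image axis.
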
Author

How does this get optimized in relation to CPU cache?

squwk
Author

The graphic demonstrating Y stride is wrong. Brady, what number do we start with? And remember, we're not doing a Numberphile video.

Mr.Beauregarde
Author

Intuitively it sort of makes sense, but in terms of details I don't understand the efficiency gains of one contiguous array vs multiple nested arrays. Using a 2D-array as an example, wouldn't multiplying the y-stride by the x-stride followed by the lookup be just as fast as using constant values to each perform a lookup? Not familiar with the overhead involved in array lookups, I suppose.

DeusGladiorum
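
The two access patterns in DeusGladiorum's question can be put side by side. Note that the flat offset is `y * y_stride + x * x_stride` (a multiply-add per axis), not the product of the two strides. This is only a sketch of the access pattern: Python lists store references, so it does not reproduce the memory behaviour of C or C++ arrays, and all names here are made up for the example.

```python
WIDTH, HEIGHT = 4, 3

# Nested layout: an outer list holding references to separate row lists.
nested = [[y * 10 + x for x in range(WIDTH)] for y in range(HEIGHT)]

# Contiguous layout: the same values in one flat list, row after row.
flat = [value for row in nested for value in row]

Y_STRIDE, X_STRIDE = WIDTH, 1


def get_nested(y, x):
    row = nested[y]   # first lookup: follow the reference to the row
    return row[x]     # second lookup: depends on the result of the first


def get_flat(y, x):
    # Arithmetic on values already at hand, then a single lookup.
    return flat[y * Y_STRIDE + x * X_STRIDE]


print(get_nested(2, 1), get_flat(2, 1))  # 21 21
```

In a language with raw pointers, the nested version has to load the row pointer from memory before it can load the element, and the rows may sit far apart in memory. The flat version computes the address with arithmetic and touches memory once, and neighbouring elements share cache lines.
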
Author

This video was so relevant to me. I'm just learning C++ and this is, like, definitely the best way to make multi-dimensional arrays.

tristunalekzander