AI’s Hardware Problem

Comments

And just like that, we're back to cutting-edge computers the size of houses.

beyondobscure

Your ability to break down information into bite-sized pieces that the average person can understand is remarkable. I worked in the electronic test equipment market. Great job.

softballm

You're on another level bro. I love it. Beautifully presented, in-depth and voiced perfectly. This channel rips.

TrevorsMailbox

Hello, DRAM design engineer here. Really informative video, and explained in such an easy to understand way! Love it!

Just a quick comment, DRAM memory is generally pronounced "DEE-ram" and Via in TSVs is pronounced "VEE-ah". It's confusing and not intuitive, but hopefully this helps for your future videos!

wheatstone

When did people start pronouncing DRAM as a whole word instead of just saying D-Ram?

OccultDemonCassette

I have memories of a project on "modified DRAM chips with internal logic units" from around Y2K; I saw a paper, probably from MIT, but I don't remember whether it was ever implemented. It looked promising for certain kinds of massively parallel operations, such as cellular automata simulations 🙂

leyasep

As an aside, the book "Content Addressable Parallel Processors" by Caxton C. Foster (1976) discussed the ability to have a mass memory do computation. It is bit-serial, but parallel across all memory locations, meaning that you can do things like multiply all memory cells by the same number, and similar operations. It's a good read.
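Foster's bit-serial, word-parallel idea is easy to sketch: one bit-level step per cycle, applied to every memory word at once, so multiplying the whole memory by a constant takes time proportional to the word width rather than the number of words. A toy Python model (the function name and list-based "memory" are illustrative, not Foster's notation):

```python
def multiply_all(memory, k, width=16):
    """Shift-and-add multiply applied to ALL memory words at once.
    The loop runs `width` times (bit-serial in k); each iteration
    updates every word's accumulator (word-parallel), with a list
    comprehension standing in for the hardware broadcast."""
    mask = (1 << width) - 1
    acc = [0] * len(memory)              # one accumulator per word
    for bit in range(width):             # bit-serial over k
        if (k >> bit) & 1:
            acc = [(a + (w << bit)) & mask for a, w in zip(acc, memory)]
    return acc
```

The step count depends only on `width`, never on `len(memory)` — the essence of the word-parallel approach.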

scottfranco

Informative and concise; thank you.
I notice you pronounce SRAM as "ess-ram" (which has always made sense to me, because the acronym originated as a *"Static"* extension of the much older acronym/technology RAM, for Random Access Memory), but you also pronounce DRAM as "dram." (I say "dee-ram" because, again, it's a variation on ol' trusty rusty RAM.)
Unfortunately, "dram" is already a word in use outside computing, in at least two other fields, as a noun for:
1) A unit of weight in the US Customary System equal to 1/16 of an ounce or 27.34 grains (1.77 grams).
2) A unit of apothecary weight equal to 1/8 of an ounce or 60 grains (3.89 grams).
3) A small draft.
Not bad or wrong, but maybe worth noting for future usage.
Again, excellent work; enlightening. Keep it up.

stevejordan

This reminds me of the Connection Machine (CM), made by Thinking Machines Corporation back in the 1980s. The CM had a large number of single-bit processors with a few kilobits of memory each. They were interconnected in a high-dimensional hypercube.

Lower dimensional connections were on-chip and higher dimensions went off-chip. It was programmed in a language called *Lisp. I remember that it seemed way ahead of its time.
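The hypercube topology described above is simple to state: with 2^d processors, node addresses are d-bit numbers, and two nodes are linked exactly when their addresses differ in one bit. A small illustrative sketch (not TMC's actual routing code):

```python
def hypercube_neighbors(node, dims):
    """Neighbors of `node` in a `dims`-dimensional hypercube:
    flip each address bit in turn (Hamming distance 1)."""
    return [node ^ (1 << i) for i in range(dims)]
```

In a machine of this design, the low-order bit flips could correspond to the on-chip links the comment mentions and the high-order ones to off-chip links.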

gideonsiete

One thing that always fascinated me was the use of content addressable memory. As I recall, we were using it for decoding microcode back in the bit-slice minicomputer days. It seems that approach of combining logic and memory would be interesting for today's AI problem.

makerspace

I was hoping you were going to get into some of the software solutions, not just the hardware ones. Today's neural networks have managed 1000x increases in deep-learning model size while VRAM has only increased 3x in the same timeframe. For example, there have been great advancements in how efficiently multiple GPUs communicate with each other to perform backpropagation, which has allowed neural networks to be trained on many GPUs at a time. At first, a neural network could only be trained entirely on one GPU; then the networks got too big to fit onto a single GPU, so we figured out how to put a single layer on each GPU; then they got too big for that, so we had to figure out how to put parts of each layer on each GPU. Each step along the way required innovation from machine learning engineers to build exponentially larger neural networks while GPU VRAM just isn't keeping up.
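The last stage described, putting parts of each layer on each GPU, is tensor parallelism. A minimal sketch, using plain Python lists as stand-ins for GPUs and tensors (all names hypothetical): split a linear layer's weight matrix column-wise, compute a partial output per "device", and concatenate.

```python
def linear(x, W):
    """y = x @ W on a single 'device' (W stored row-major)."""
    return [sum(xi * wij for xi, wij in zip(x, col)) for col in zip(*W)]

def tensor_parallel_linear(x, W, n_devices):
    """Column-parallel split of one layer: each device holds a
    vertical slice of W, computes its slice of the output, and the
    slices are concatenated (the all-gather step in real systems)."""
    cols = list(zip(*W))                       # columns of W
    per = len(cols) // n_devices               # assume an even split
    out = []
    for d in range(n_devices):                 # one weight shard per device
        shard_cols = cols[d * per:(d + 1) * per]
        W_shard = [list(row) for row in zip(*shard_cols)]  # row-major again
        out.extend(linear(x, W_shard))
    return out
```

Each device stores only `1/n_devices` of the layer's weights, which is exactly what lets a model outgrow any single GPU's VRAM.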

ChocolateMilk

I can explain the resistive memory outlined at 12:05. In electronics, there are parallel and series circuits. Resistances in series add together, meaning that if you connect the resistances from two memory banks, the resulting resistance can be written to a third memory bank. No logic required. I mean of course there’s logic required, but the memory itself is the input and the output. I have no clue how the memory chips work, but the idea is that you can use the properties of resistors to do addition for you.
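The series-addition trick is basic circuit law: resistances in series sum, while conductances (1/R) sum in parallel. A two-function sketch with illustrative values:

```python
def series_resistance(*rs):
    """Resistors in series: total resistance is the plain sum, which
    is what lets two resistive cells 'add' their stored values."""
    return sum(rs)

def parallel_resistance(*rs):
    """For contrast, resistors in parallel: conductances add instead."""
    return 1.0 / sum(1.0 / r for r in rs)
```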

timwildauer

It really sucks that the '08 crash killed a lot of the proposed solutions the large vendors were looking at to address these issues. Look into HP's solutions (optical computing, rack-scale memory) and Sun's solutions; they were putting R&D money into them before the '08 crash caused all these companies to essentially abandon their labs.

JessSimpson

Some researchers have tried to understand what a neural network does to an image when trained for recognition and classification without a pre-set algorithm.
The results are startling; the network gets fixated on tiny differences and their patterns at the boundaries of the image, and other odd parameters that a programmer would never consider viable.
The fact is that the system works, but relies on preconditions that can fail all of a sudden.
There is a long way to go in designing a reliable neural network, but there is also something to learn about how numerous the intrinsic, unknown preconditions in human perception are...

rayoflight

Circuit-level CIM has one major limitation that I wish you had discussed: its susceptibility to PVT (process, voltage, temperature) variation. When storing weights in SRAM cells and applying a 1 or 0 to the word line (WL) to perform the MAC operation (the WL input multiplies with the weight, and the current on the bit line (BL) is then the sum over all of the cells in the column), we are performing an analog operation. The BL current will depend on process variation, supply voltage, and ambient temperature. That is, at two different temperatures or supply voltages (battery voltage changes), we will get different results, even with the same die. This makes it unsuitable for "Edge AI" applications.

Between two chips, or two different columns, we will also get different results because of process variation. The accuracy is significantly limited by this. With an analog WL driven by a DAC, the problem is exaggerated even further. Granted, I do not know what sort of accuracy AI models really require, but I imagine it is much greater than what can be offered by CIM in current CMOS processes. Of course, larger processes decrease variation, but then density suffers.

The nice thing about conventional computing is that our accuracy does not depend on PV, only our speed. I think integrating DRAM dies with conventional CMOS dies is likely the way forward.
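The PVT point can be made concrete with a toy model (all parameters arbitrary, not from any real process): each cell's contribution to the bit-line current is scaled by a fixed per-cell mismatch and a global voltage/temperature drift, so the analog sum only approximates the exact dot product.

```python
import random

def digital_mac(inputs, weights):
    """Exact multiply-accumulate: the result never depends on PVT."""
    return sum(i * w for i, w in zip(inputs, weights))

def analog_cim_mac(inputs, weights, sigma_process=0.05, drift=0.0, seed=0):
    """Toy SRAM-CIM column: per-cell Gaussian gain mismatch (process
    variation, fixed per die via `seed`) plus a global multiplicative
    drift (supply voltage / temperature)."""
    rng = random.Random(seed)                # fixed-pattern mismatch per die
    bitline = 0.0
    for i, w in zip(inputs, weights):
        cell_gain = 1.0 + rng.gauss(0.0, sigma_process)
        bitline += i * w * cell_gain * (1.0 + drift)
    return bitline
```

With `sigma_process=0` and `drift=0` the analog column matches the digital result exactly; with nonzero values, two dies (two seeds) disagree on identical inputs, which is the accuracy limit the comment describes.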

arturo

I'm currently working on a project using NTC and PTC thermistors to store analog values, with the idea that they will respond differently to frequency of access and will also affect neighbouring cells, much like a neural net.

RooMan

ASIANOMETRY IS THE BEST EDUCATIONAL CHANNEL ON YOUTUBE, NO CONTEST!!!

Your channel truly stands out like a diamond in the rough. There is plenty of stuff I like and watch on YT, but your channel is on an entirely different level. You dive deep into complicated subjects over and over, and always do it in a way that is easy to understand. Other channels go deep too, but I frequently find large chunks of their videos go over my head because I don't have a PhD. Every single time I watch your vids, I not only learn new things, but by the end of the video I UNDERSTAND the subject you bring up and feel smarter. Can't sing your praises enough! Take care!!!

KomradZX

Language models are not the only workload requiring huge memory. Another example is genome scaffold assembly (which takes millions of DNA sequence snippets to produce a complete genome of an organism).

RalfStephan

A DRAM (dynamic RAM) cell is made of a FET and a small capacitor, with the FET acting as the access switch. Since the capacitance is kept small for reasons of speed, the capacitor loses its charge within milliseconds, and the DRAM has to be constantly "refreshed" (usually with a read cycle) so it can keep the bit properly memorised.
An SRAM (static RAM) cell works on a completely different principle. It doesn't use a capacitor to memorise the value of the bit (0 or 1) and doesn't need a refresh cycle.
An SRAM memory cell is basically an RS (Set-Reset) flip-flop, which keeps the set level until it is reset. Therefore, instead of a single transistor, each SRAM cell is made of four to six transistors. So the SRAM cell takes more chip space, but it can run at the same clock speed as the logic circuitry; moreover, SRAM is much less susceptible to data errors and interference. The Mars probes use exclusively SRAM memory in their onboard computers.
SRAM represents the ideal operating computer memory, but it takes six transistors instead of one for each memory cell...
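The refresh requirement follows from simple RC decay: the cell voltage falls exponentially, and the bit is lost once it sags below the sense amplifier's threshold. A sketch with illustrative constants (real retention times and thresholds vary by part):

```python
import math

def dram_cell_voltage(v0, t, rc=0.005):
    """Leaky DRAM cell: capacitor voltage after t seconds, with an
    RC time constant of a few milliseconds (illustrative value)."""
    return v0 * math.exp(-t / rc)

def needs_refresh(v0, t, threshold=0.5, rc=0.005):
    """True once a stored '1' has decayed below the sense threshold,
    i.e. the refresh deadline has passed."""
    return dram_cell_voltage(v0, t, rc) < threshold * v0
```

An SRAM cell's cross-coupled inverters actively hold their state, so no such deadline exists.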

rayoflight

So ReRAM is what Veritasium talks about in his analog computing video (specifically what Mythic AI are doing)? Seemed really promising.

ChaosTheory