GradIEEEnt half decent: The hidden power of imprecise lines

Before the invention of YouTube comments, most people could make remarks that were slightly technically incorrect without fear of immediate public rebuke. The one exception was professors, especially if the classroom included an annoying student such as “Tom 7.” The invention of YouTube was doubly revolutionary: Now anyone can experience being a professor being corrected by an annoying student, but also, corrections can be made years after the fact, and at significant length, as the student cannot be told to “take this offline.” This video is such a lengthy correction and digression, and perhaps itself a fount of mistakes.

Due to the esoteric detail and prodigious whiteboarding, it is almost a “Tom Academy” video. But since I grew a moustache to film it, it qualifies for the Main Sequence.

Keywords: Gradient descent, half-precision floating point, linear operations, F=MA, rounding, exotic transfer functions, machine learning, MNIST, CIFAR-10, chess, fractals, Frobenius, cryptography, fluint8, Motorola 6502.

For SIGBOVIK 2023.
Comments

he can't keep getting away with it

HBMmaster

I saw someone describe this video as a "Terrifying perversion of computation" and I thought you'd be proud to hear that.

ripvanwinkle

"For all practical purposes it doesn't matter."
"What about impractical purposes?"

That's the spirit I love.

smort

This is so genius it looped back around to being stupid, then back to smart, then back to stupid.

syncrossus

"One programmer picked 6 million on purpose, that wasn't a typo". These are the kind of engineers whose sheer aura makes me afraid.

papakamirneron

Sorry to hear your professor strangled you to death, glad you got better

Zebra_M

the dopamine rush of tom 7 releasing a 55-minute video is unparalleled

y.og.i

I was just waiting in utter joy for "but we can't decide which instruction to execute, so we'll just do every single one and multiply 255 of them by 0" to drop. My favorite part of shader code is 'why branch when you could multiply by 0' and this is a delightful, horrifying project.

I'm SHOCKED it can do an entire frame in 10 seconds.

Kaiasky
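The "multiply by 0" idea quoted in the comment above can be illustrated with a minimal sketch. This is not the emulator from the video: the four-entry instruction table and the name `execute_branchless` are hypothetical, and the mask is built here with an ordinary comparison purely for readability, whereas a truly branch-free version would derive it arithmetically.

```python
def execute_branchless(opcode: int, x: int) -> int:
    """Evaluate every candidate instruction, then keep only the selected one.

    The instruction table below is a made-up example, not a real CPU's.
    """
    # Every "instruction" is computed unconditionally.
    candidates = [
        x + 1,   # opcode 0: increment
        x - 1,   # opcode 1: decrement
        x * 2,   # opcode 2: double
        -x,      # opcode 3: negate
    ]
    total = 0
    for index, value in enumerate(candidates):
        # keep is 1 for the selected opcode and 0 for all the others,
        # so the unselected results are multiplied away.
        keep = int(index == opcode)
        total += keep * value
    return total


print(execute_branchless(2, 5))  # prints 10
```

The cost is that every instruction is evaluated on every step, which is why the comment is surprised that a whole frame finishes at all.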

The hardest I ever worked in college was when my professor told me that something wasn't possible. The pure spite powered me through. It's an amazing force for sure

isaaclinn

The question was: "Okay, thanks professor, so I understand the mathematical point that the transfer function must be linear but when we implement this on a computer we’ll use some approximation of the real numbers like IEEE-754 floating point, which doesn’t have all the mathematical properties that we’re assuming here like distributivity and associativity, so doesn’t this mean that the doesn’t necessarily hold? like you could in principle have a transfer function that was “linear” but nonetheless exhibited irreducible complexity because of rounding error or things like that, or in principle transfer functions based on values outside of the reals, like NaN and Inf? Plus what about -0? Actually are linear operations even differentiable because when you think about it, they all take discrete steps so it’s really …"

(for those of you who don't want to go frame by frame through it, as I have)

willoww
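The point raised in the question quoted above, that floating-point arithmetic lacks associativity and that rounding can make a nominally linear function behave nonlinearly, can be checked with a small sketch. It uses only Python's standard `struct` module, whose `e` format rounds to IEEE-754 half precision. The helper names `to_half`, `half_add`, and `shift_and_unshift`, and the constant 1024, are illustrative choices and are not taken from the video.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def half_add(a: float, b: float) -> float:
    """Add two numbers, rounding the inputs and the sum to half precision."""
    return to_half(to_half(a) + to_half(b))


def shift_and_unshift(x: float, c: float = 1024.0) -> float:
    """Compute (x + c) - c using only half-precision additions.

    Over the reals this is the identity. In half precision, values in
    [1024, 2048) are spaced 1 apart, so the fraction of x is rounded away.
    """
    return half_add(half_add(x, c), -c)


# Associativity already fails in ordinary double precision.
assoc_left = (0.1 + 0.2) + 0.3
assoc_right = 0.1 + (0.2 + 0.3)
print(assoc_left == assoc_right)  # prints False

# The "identity" built from two additions behaves like a rounding step.
for x in (0.3, 0.6, 0.7, 1.2):
    print(x, shift_and_unshift(x))
```

Under these assumptions `shift_and_unshift(0.3)` is 0.0 while `shift_and_unshift(0.6)` is 1.0, so the function is not additive even though it is composed entirely of additions.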

literal spit-take at "maybe we should do this on the computer." - although I started to follow it a bit less at the computer implementation, I enjoyed every second

AlphaPhoenixChannel

Someday, you're going to say "This is ill-advised, but nobody is stopping me," and you're going to crash the simulation. I can feel it

OrangeC

I really love being a bit lost on a new concept for a minute or so, then touching base with a concept I am familiar with, only to see Tom totally abusing and doing evil things to the 5% of the greater whole I understand.

Really keeps me reassured that despite the calm demeanour and pretty diagrams, even the parts I don't understand are pure evil and crimes against numbers.

empty

When Apr 1 came and went, I went through the five stages of grief over the lack of a Tom7 upload.

Then the madlad uploads on May 1.

PC_YouTube_Channel

"The Mandelbrot Set is the Radiohead of Fractals" I did a real spit-take all over my computer and am typing this as the water seeps in. Wish my computer a swell retreat into the afterlife.

lysikasaito

I've had a beef with the way epsilon is used in programming for a while, so I was already laughing when you started complaining about it, then lost it when you revealed that you wrote a whole paper on the topic. The paper was delightful, as usual, and now I have something to send to people who I have petty internet arguments with -- oh sweet vindication!

adsilcott

This video reminded me of something I learned, but have since forgotten, during my time in the CS dept:
Never stop listening when the PhD is talking. If you do, even for a minute, you will be completely lost.

phrygianphreak

This feels like a victory lap of many prior Tom7 videos. Congrats!
Now it's time to train an AI model to approximate the NES CPU, no?

Wyatt_James

Tom, you are my hero. I am inspired by your tenacity in pursuing problems simply for the heck of it. Watching your videos makes me want to write math papers about equally ridiculous topics.

alexmueller

Because I know you appreciate pedantry (and maybe this was deliberate), the phrase "in cold blood" means the crime was done in a calm/systematic/premeditated way. The murderer was cool and/or chill at the time. Getting so frustrated that you snap and strangle a student is definitely a murder in hot blood, not cold

RobertMilesAI