Finding the BEST sine function for Nintendo 64


This video was sponsored by Brilliant.

0:00 Introduction
0:35 Why is sine Important?
3:54 Mario 64's sine
7:04 Ocarina of Time's sine
9:31 Interwoven co/sine
16:58 3rd order sines
18:38 Numerics
20:21 5th order sines
22:27 4th order cosine
23:30 conclusion
Comments

Using ancient Indian mathematical formulas to make the funny red man go bing bing wahoo even faster, even after going through like five massive optimizations that give wild speedups already

Classic Kaze.

prismavoid

I've been optimising sine tables since the ZX Spectrum. I never realised how many tricks I've missed! ... BRILLIANT ... Thanks for sharing

csbluechip

it's so cool to see situations where it's actually worth it taking up more CPU cycles than RAM, when a lot of programming problems are the opposite

Temulgeh

This man knows more about the N64 than Nintendo themselves

gabrieltorres

14:57
The reason he has to make the value negative here is that Bhaskara is only accurate in the range -pi/2 to pi/2.
The flip-sign calculation evaluates to one if the value is outside of this range. In that case the value needs to be shifted and flipped, which is what the if(flipsign) and the ternary operator after the Bhaskara algorithm proper do.
Here is a graph showing the process, and also how he gets sine out of this calculation.
BUT YouTube won't let me post it because they are dicks
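
The shift-and-flip step described in this comment can be sketched in C as follows. This is a minimal illustration, not the code shown in the video: the names `bhaskara_cos_core`, `approx_cos`, and `approx_sin` are hypothetical, and the formula used is the commonly cited cosine form of Bhaskara I's approximation, assumed here to be the one the comment refers to.

```c
#define PI_F 3.14159265358979f

/* Bhaskara I's rational approximation of cosine.
 * Only meaningful for |x| <= pi/2. */
static float bhaskara_cos_core(float x)
{
    float x2 = x * x;
    return (PI_F * PI_F - 4.0f * x2) / (PI_F * PI_F + x2);
}

/* Hypothetical wrapper: accepts |x| up to 3*pi/2. */
float approx_cos(float x)
{
    /* 1 when x lies outside the range where the core formula is accurate */
    int flip_sign = (x > PI_F / 2.0f) || (x < -PI_F / 2.0f);

    if (flip_sign) {
        /* Shift by pi toward zero; cos(x - pi) = cos(x + pi) = -cos(x). */
        x -= (x > 0.0f) ? PI_F : -PI_F;
    }

    float c = bhaskara_cos_core(x);
    return flip_sign ? -c : c;   /* undo the shift by flipping the sign */
}

/* Sine from the same routine, using sin(x) = cos(x - pi/2).
 * Valid for x in [-pi, pi] because approx_cos accepts |x| up to 3*pi/2. */
float approx_sin(float x)
{
    return approx_cos(x - PI_F / 2.0f);
}
```

For example, `approx_cos(PI_F)` takes the flipped path: the argument is shifted to 0, the core returns 1, and the sign flip yields -1.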

Zaurthur

Maybe the 562.5 MegaBytes/s at 5:37 should actually be Megabits/s. 562.5 Mbps = 70.3 MB/s.

SeanCMonahan

Nintendo has used a lookup table for trigonometry since Super Mario World, where it was used for the rotating sprites (ball and chain, chained platforms, etc)

FabulousJejmaze

SM64DS also uses an interlaced sine/cosine table, but without mirroring. I never really thought about why that is but given how cache works it makes perfect sense.
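
A rough sketch of what an interlaced (interleaved) table without mirroring might look like in C. Everything here is an assumption for illustration: the `SinCos` type, the tiny 8-entry size, and the 16-bit binary angle format are not taken from SM64DS or from the video.

```c
#include <stdint.h>

/* One entry holds both values for the same angle, so a single lookup
 * reads adjacent memory, which will usually sit in one cache line. */
typedef struct {
    float sin_val;
    float cos_val;
} SinCos;

#define TABLE_BITS 3                      /* 8 entries, kept tiny for illustration */
#define TABLE_SIZE (1u << TABLE_BITS)
#define HALF_SQRT2 0.70710678f            /* sin(45 degrees) */

/* Full period stored explicitly: no mirroring, no quadrant logic. */
static const SinCos sincos_table[TABLE_SIZE] = {
    {        0.0f,        1.0f }, {  HALF_SQRT2,  HALF_SQRT2 },
    {        1.0f,        0.0f }, {  HALF_SQRT2, -HALF_SQRT2 },
    {        0.0f,       -1.0f }, { -HALF_SQRT2, -HALF_SQRT2 },
    {       -1.0f,        0.0f }, { -HALF_SQRT2,  HALF_SQRT2 },
};

/* angle is an assumed 16-bit binary angle: 0x10000 units per full turn. */
SinCos lookup_sincos(uint16_t angle)
{
    return sincos_table[angle >> (16 - TABLE_BITS)];
}
```

With separate sine and cosine arrays, the two reads for one angle land far apart in memory and can miss the cache twice; with this layout the cosine is the neighbouring float of the sine.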

pants

Man it’s always crazy how even a game that’s considered “poorly optimized” like Mario 64 is still optimized incredibly well for its time.
Nowadays developers say “oh most people’s computers these days should have about 16 gigs of RAM? Why waste time making it run on 8 then?” And then the game still comes out rushed and bad.

totalphantasm

4:20 Small correction: The table uses 16KB (20KB with cos) of memory, because floats are 4 bytes wide.
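
The arithmetic behind those two figures, as a small C sketch. The entry counts are inferred from the sizes in the comment (16 KB / 4 bytes = 4096 floats, 20 KB / 4 bytes = 5120 floats); the idea that the extra 1024 entries are a quarter period that lets cosine read ahead in the same table is an assumption, and all names are hypothetical.

```c
#include <stddef.h>

_Static_assert(sizeof(float) == 4, "this sketch assumes 4-byte floats");

enum {
    SINE_ENTRIES   = 4096,              /* inferred: 16 KB / 4 bytes per float */
    COSINE_OVERLAP = SINE_ENTRIES / 4   /* assumed: extra quarter period for cos */
};

/* Size of the sine table alone. */
size_t sine_table_bytes(void)
{
    return SINE_ENTRIES * sizeof(float);                      /* 4096 * 4 */
}

/* Size once the extra entries for cosine are appended. */
size_t combined_table_bytes(void)
{
    return (SINE_ENTRIES + COSINE_OVERLAP) * sizeof(float);   /* 5120 * 4 */
}
```

These come out to 16384 and 20480 bytes, which matches the 16 KB and 20 KB in the comment.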

cerulityk

Me (Person who has a literal degree in computer science, built multiple games, and over 5 years industry level experience): This guy is insane and so much is going in one ear and out the other but this is so interesting to watch

darkwalker

When I was coding 3D DOS games 25 years ago, I couldn't understand why my sin table was not that fast, it is just one line of code with no CPU computation... Cache/memory bottlenecks make a huge difference, I confirm.

youdj_app

I have a hunch about the angle flipping necessity: tangent.

The angles in the 1st and 3rd quadrants (going ccw around a circle on the Cartesian grid) have the same tangent values, the same with the 2nd and 4th quadrants.
So you have to distinguish when you want the angle from the 1st or 3rd quadrant (and the 2nd or 4th quadrant).
MS Excel (and Google Sheets iirc) have the two-argument arctan function to treat the angle calculation as a vector problem, but since Mario 64 doesn't have this, you have to use a from-scratch angle discrimination setup, much like what Kaze ended up using.
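
The quadrant ambiguity this comment describes can be shown with a short C sketch. Both function names are hypothetical, angles are in degrees for readability, and this only illustrates the general tangent problem, not anything taken from the video's code.

```c
/* The ratio y/x is all a one-argument arctangent ever sees. */
float slope(float x, float y)
{
    return y / x;
}

/* Hypothetical fix-up: base_deg is what a one-argument arctangent of y/x
 * would return, in degrees, always within (-90, 90). The sign of x says
 * whether the vector really points into the left half-plane. */
float resolve_quadrant_deg(float base_deg, float x)
{
    if (x >= 0.0f) {
        return base_deg;    /* quadrants 1 and 4: already correct */
    }
    /* quadrants 2 and 3: rotate by half a turn, staying in (-180, 180] */
    return base_deg > 0.0f ? base_deg - 180.0f : base_deg + 180.0f;
}
```

The vectors (1, 1) and (-1, -1) both have slope 1, so a plain arctangent reports 45 degrees for each. Passing the sign of x through `resolve_quadrant_deg` turns the second one into -135 degrees, which is the answer a two-argument arctangent gives.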

Creatively_Bored

This channel is slowly becoming the best showcase of the general outline of optimizing code on YouTube. I love it.

aldendwyer

In GCC for ARM, when you want to return two 32 bit integers in registers, you return a 64 bit number. To convert between two 32 bit numbers and one 64 bit number, you use a union.
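
A minimal C sketch of the union trick this comment describes. The type and function names are hypothetical. Reading a union member other than the one last written is permitted in C but not in C++, and which half of the 64-bit value holds which integer depends on the target's endianness.

```c
#include <stdint.h>

/* Two 32-bit integers viewed as one 64-bit value. */
typedef union {
    uint64_t packed;
    struct {
        int32_t first;
        int32_t second;
    } parts;
} Pair64;

/* Pack two results into the single 64-bit return value. */
uint64_t make_pair(int32_t first, int32_t second)
{
    Pair64 p;
    p.parts.first = first;
    p.parts.second = second;
    return p.packed;
}

/* Unpack on the caller's side. */
int32_t pair_first(uint64_t value)
{
    Pair64 p;
    p.packed = value;
    return p.parts.first;
}

int32_t pair_second(uint64_t value)
{
    Pair64 p;
    p.packed = value;
    return p.parts.second;
}
```

A function computing two related results at once, such as a sine and a cosine in fixed point, could return `make_pair(s, c)` and let the caller split it with `pair_first` and `pair_second`.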

Dwedit

The video's great! I love seeing how things like this are done on old hardware. It seems to me like it would be hard to understand how anything would be best optimized for modern hardware with preemptive execution, and weird hardware tricks and strange security flaws - more like magic than science. Even though I don't really program much, let alone optimize anything, optimizing code for old hardware seems like it's something that can actually be learned in a concrete and non-mystical way by a human being, even if it takes effort.

porterleete

A reminder of the actual MIPS CPU cache size (16 KB instruction cache and an 8 KB data cache) brings a whole lot of perspective on that LUT size.

brunogm

as an SNES homebrew dev, it supremely fucks me up that it's faster to *calculate* a sine than to just use a simple LUT...what the hell even is the N64

CommodoreKulor

This is a coding channel that eventually uses mario to show how programming works.

Rihcterwilker

I remember from a book called "linear algebra done right" on page 115 it says the best approximation for sin(x), of fifth degree, on the interval [-pi, pi] is given by 0.987862x − 0.155271x^3 + 0.00564312x^5.
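
The polynomial quoted in this comment can be evaluated in C roughly like this. The coefficients are copied verbatim from the comment; the function name `sin_poly5` and the choice of Horner form are assumptions made for illustration.

```c
/* Fifth-degree approximation of sin(x) on [-pi, pi],
 * using the coefficients quoted in the comment above. */
float sin_poly5(float x)
{
    float x2 = x * x;

    /* Horner form of 0.987862x - 0.155271x^3 + 0.00564312x^5:
     * only odd powers appear, so factor out x and work in x^2. */
    return x * (0.987862f + x2 * (-0.155271f + x2 * 0.00564312f));
}
```

Written this way the evaluation costs four multiplications and two additions, with no table reads at all. The result is close to the true sine but not exact: at x = 1 it is off by roughly 0.003.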

tumm