Fast Inverse Square Root — A Quake III Algorithm

In this video we take an in-depth look at the fast inverse square root and see where the mysterious number 0x5f3759df comes from. This algorithm became famous after id Software open-sourced the engine for Quake III. Along the way we will also learn about floating-point numbers and Newton's method.

0:00 Introduction
1:23 Why Care?
3:21 The Code
4:18 IEEE 754
9:38 Bits and Numbers
12:09 1st Step: Evil Bit Hack
14:46 2nd Step: WTF
17:34 3rd Step: Newton
19:46 Summary
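For reference, the function under discussion, essentially as it appears in the GPL'd Quake III Arena source (q_math.c), with two small modernizations so it is well-defined on today's compilers: int32_t instead of long (long is 8 bytes on 64-bit Linux), and memcpy instead of the original pointer cast:

```c
#include <stdint.h>
#include <string.h>

// Fast inverse square root: approximates 1/sqrt(number).
float Q_rsqrt(float number)
{
    const float threehalfs = 1.5F;
    float x2 = number * 0.5F;
    float y  = number;

    int32_t i;
    memcpy(&i, &y, sizeof i);             // evil floating point bit level hack
    i = 0x5f3759df - (i >> 1);            // what the fuck?
    memcpy(&y, &i, sizeof y);
    y = y * (threehalfs - (x2 * y * y));  // one iteration of Newton's method

    return y;
}
```

For example, Q_rsqrt(4.0f) returns roughly 0.4992 instead of the exact 0.5, an error under 0.2%.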

Picture of John Carmack is licensed under CC BY 2.0 from author Drew "Prognar" Campbell.
Comments

I love the fact that he felt the need to define a constant for "threehalfs", but just threw that 0x5F3759DF in there no problem LOL

Eidolon

10:23 "for no obvious reason at all, let's take the logarithm of that expression" you basically just summed up the Electrical Engineering experience

owfan

I thought it was implemented by John Carmack, but according to Wikipedia: "The algorithm was often misattributed to John Carmack, but in fact the code is based on an unpublished paper by William Kahan and K.C. Ng circulated in May 1986. The original constant was produced from a collaboration between Cleve Moler and Gregory Walsh, while they worked for Ardent Computing in the late 1980s." Thanks for educating and for your great work.

diegodonofrio

I don't think I understood this well enough to explain to my friends how cool it is... but I definitely understood enough to have my mind blown, and to tell them they need to see it

JackRackam

You know an algorithm is good when one of the steps includes "C doesn't think we should ever do this, so we have to trick it into letting us"

Owen-bkfc

"I don't know what else to say, that's just how C works"
An excellent explanation of all of C

thepuzzlemaker

Performing a division AND an inverse square root using only multiplications is nothing short of real-world black magic. I am sincerely impressed.

the_hiroman

I remember when I started programming professionally in the 90's. Id's performance tuning was something that us programmers would nerd out about, and I'd often write my code to similar standards. Curious coworkers would sometimes encounter some of my "performance-tuned code" and be like "uhh, why did you do this like that?" I'd explain all about efficiencies and they'd just kind of roll their eyes and be like "okaaayy." We were, after all, just writing client-server apps for small businesses, for which performance tuning was totally pointless. But I wanted to be a game developer, damnit!

LarsSveen

The "what the fuck" comment was probably added by another engineer reading the code. The creator was a chad.

kuroexmachina

It's even scarier than "only 1% error": using the default magic number 0x5F3759DF actually gives at most about 0.175% error, dropping thereafter. It is now believed the original author of the function (not Carmack, who never claimed he was) found the constant through trial and error rather than derivation.

If you use Chris Lomont's updated magic number of 0x5F375A86 (which he derived), the error drops to a staggering 0.09% after just 1 iteration.

BTW, the original author is unknown, but speculated to be the Canadian William "Velvel" Kahan, Turing Award winner and mathematician, circa 1985.

johnfaustus

"you don't need to be good at maths to do programming"

The Code:

neofox

If you're wondering why μ ≈ 0.0430:

We know that log_2(x+1) ≈ x. If we want to see the relative error between them, we can use the function f(x)=log_2(1+x)-x. We want to find the maximum of this function, and we know that it only has one maximum.

Since the function is differentiable, we can find its maximum by finding where its derivative equals zero. Using a bunch of rules, we have f'(x) = 1/((1+x) ln(2)) − 1. Setting this derivative equal to zero, we see that the maximum of the function occurs at x = (1/ln(2)) − 1 ≈ 0.443.

Substituting this number back into our first function, the maximum value is log_2(1/ln(2)) − (1/ln(2)) + 1 ≈ 0.0860. To balance the error between this maximum and our minimum of 0, we want μ to be halfway between them. Taking the average of 0 and 0.0860 yields our constant, 0.0430.

jgcodes

This is how legends are born: *man appears out of nowhere, disintegrates the fuck out of ancient code and vanishes into silence*

kennyotsu

I found out about that inverse square root around 2010, and only now, a decade later, has it been explained well. Thank you!

sgtblu

Some of my computer science classmates wonder what good learning the IEEE 754 standard and Newton's approximation could possibly do, and the answer is that it helps us to understand a cool video about a Quake III algorithm.

rijaja

The breakdown of IEEE 754 is... mind-blowing. That's so supremely clever, both the standard and the explanation.

jansenart

Oh man, I remember taking a CS class that taught floating point representation with the exponents and mantissas, and I thought to myself "ok this is cool but I'll be programming higher-level than this, so I probably won't use this knowledge in practice." Well, this is a great example of how you might use it!

carykh

I'm reminded of the single greatest code comment I've ever read: "I do not understand the following line of code, but I've verified it is required for this function to work correctly." It was in a bit of Perl code, a language with several peculiar syntactic quirks.

jamesmelton

Your 20 minute video explained things better than most of my college courses. The explanation and the way you showed the bit values was really nice. C is the language that keeps on giving. I’ve dabbled in C for many years and I still feel like I haven’t mastered the basics.

meiowalot

I worked between 2016 and 2019 on the latest revision of the standard, IEEE 754-2019. You should probably note that the "evil bit hack", which I've used myself many times over the years, is no longer valid C: you should instead use either a union or memcpy() to move the bit pattern between the types. In practice, modern compilers will convert all three into the exact same code, i.e. either nothing (if integer and floating point use the same registers) or a move from one register type to the other.
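The three idioms that comment contrasts look like this (a sketch; the function names are hypothetical). Only the union and memcpy forms are sanctioned by the current C standard; the original pointer cast violates strict aliasing, even though it usually still "works":

```c
#include <stdint.h>
#include <string.h>

// Three ways to reinterpret a float's bits as a 32-bit integer.

// 1) The Quake III pointer cast: undefined behavior under C's
//    strict aliasing rules, though compilers typically accept it.
static uint32_t bits_cast(float f)   { return *(uint32_t *)&f; }

// 2) A union: type punning through unions is explicitly allowed in C.
static uint32_t bits_union(float f)
{
    union { float f; uint32_t u; } pun = { .f = f };
    return pun.u;
}

// 3) memcpy: always well-defined; the copy is optimized away.
static uint32_t bits_memcpy(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

All three typically compile to the same instructions, as the comment says, but new code should prefer the union or memcpy form.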

TerjeMathisen