x86 Assembly Crash Course

Written and Edited by: kablaa

Music:

Licensed under Creative Commons: By Attribution 3.0


Comments
Author

This is thousands of Google searches simplified and compressed into 11 minutes. Thank you!

ntsystem
Author

If you can read assembly everything is Open Source !

piiumlkj
Author

Holy crap! This was more useful than a whole semester at my university. Please keep doing these videos. Thank you very much.

cepi
Author

the only video I ever watched at 0.75x. Absolutely fantastic.
(80% of explainer videos have to be watched at 1.5x)

chris
Author

For those who might find it useful, this is the transcription of the video (I wrote it to memorize better and to check it whenever I want):

Compilers read the .c file and convert the code into a sequence of instructions. Each instruction is encoded as a sequence of bytes called an operation code, or opcode. Reading a program as raw opcodes is practically impossible for a human, so assembly represents these instructions in a human-readable language.
ELEMENTS OF AN EXECUTABLE:
Every C program has 4 main components: heap, stack, registers and instructions. The two main architectures that dictate how a program is compiled and executed are 32-bit and 64-bit architectures.
The heap:
The heap is an area in memory designed for the purpose of manual memory allocation. The inner workings of the heap are incredibly complicated. Memory is allocated on the heap whenever functions such as malloc and calloc are called. Global and static variables, by contrast, are not placed on the heap; they live in the program's data sections.
Registers:
Registers are small storage areas in the processor, used to store memory addresses, values or anything that can be represented with eight bytes or less (four bytes or less for the 32-bit registers named here). In the x86 architecture there are six general-purpose registers: EAX, EBX, ECX, EDX, ESI and EDI, generally used on an as-needed basis. There are also three registers reserved for specific purposes: EBP, ESP and EIP.
The stack:
The stack is a data structure whose elements are added and removed with two operations, push and pop: push adds an element to the top of the stack and pop removes the top element. Each element on the stack is assigned a stack address. Elements higher on the stack have lower addresses than those below them, so the stack grows towards lower memory addresses. When a function is called, it is set up with what is called a stack frame, and all local variables for that function are stored in that frame. The EBP register, also known as the base pointer, contains the address of the base of the current stack frame, and the ESP register (the stack pointer) contains the address of the top element of the current stack frame. The space between these two registers makes up the stack frame of the function currently executing. All stack addresses outside that frame are treated as junk by the compiler.

Let's take for example a function that takes one variable as a parameter and declares two local integers, like this:
void func(int x){
    int a = 0;
    int b = x;
}

First the value of the argument is pushed onto the stack, then the return address of the function is pushed onto the stack. The return address is the four-byte address of the instruction that will be executed once the function returns. Next the base pointer is pushed onto the stack, the base pointer is given the value of the stack pointer, and finally the stack pointer is decremented to make room for the local variables. The number of bytes that the stack pointer is decremented by may vary depending on the compiler. All the space in memory between the stack pointer and the base pointer is the function's stack frame. This sequence of instructions is the function prologue, performed whenever a function is called.

Since "a" is initialized to 0, the value 0 is moved into the memory address 4 bytes below the base pointer, because an integer is 4 bytes, so the local variable now lives at EBP - 4. The value of the function argument is stored 8 bytes above the base pointer, which is outside the function's stack frame. A value cannot be moved directly from one location on the stack to another, so this is where general-purpose registers come in: the argument is first copied into one of them, and then moved from the register into the memory address 4 bytes below the first variable, which is 8 bytes below the base pointer. Both local variables have now been initialized and can be used later.

There are two syntaxes that Assembly is normally written in: AT&T and Intel. The instructions are the same.
We use the Intel syntax.
Every instruction has two parts, an operation and its arguments. An operation can take either one or two arguments; when there are two, they are separated by a comma.
The mov instruction takes two arguments and copies the value of the second into the location referred to by the first. Suppose, for example, that we want to move a local variable stored on the stack at EBP - 8 into the EAX register. If the instruction read "mov eax, ebp-0x8", it would not copy the value of the variable into the register, because EBP - 8 is the address on the stack where the variable is located; the instruction would copy the address of the variable instead. To copy the actual value, that is, what EBP - 8 points to, we use square brackets, as in "mov eax, [ebp-0x8]". They work like the dereference operator in C: when they are used, the value being pointed to is referenced.

The add instruction takes two arguments: it adds their values and stores the result in the first one. For example, if eax has the value of 10, "add eax, 0x5" updates the value of eax to 15.

The sub instruction works the same way, but the value of the second argument is subtracted from the first.

The push instruction places its operand on the top of the stack: it first decrements the stack pointer and then places its operand in the location that it points to.

The pop instruction takes a register as an argument, moves the top element of the stack into that register and then increments the stack pointer.

The lea instruction (load effective address) places the address specified by its second operand into the register specified by the first one. It's usually used for obtaining a pointer into a memory region.

The control flow of an executable is where all of the if statements and loops in the code come together to determine the order in which instructions are executed. Every instruction has an address, which is the location in memory where the instruction is stored. The EIP register (the instruction pointer) contains the address of the instruction to be executed, so the processor executes whatever the instruction pointer is pointing to and the instruction pointer then moves on to the next instruction.

The cmp (compare) instruction is equivalent to sub, but instead of storing the result in the first argument it sets flags in the processor that record whether the result was zero, greater than zero or less than zero. For example, if eax holds 1, "cmp eax, 3" subtracts 3 from 1, and since -2 is less than 0 the flags are set accordingly.

A compare instruction is usually followed by a jump instruction, which takes an instruction address as its argument, checks the current state of the flags and, depending on that state, sets the instruction pointer to its argument. There are many types of jump instructions, including jump if equal (je), jump if not equal (jne) and jump if greater than (jg).

The call instruction calls a function, whether it is a user-defined function or a PLT function (like printf or scanf). It takes one argument, the function to call: it pushes the return address, that is, the address of the instruction following the call, onto the stack and then moves eip to the first instruction of the function.

The leave instruction is executed at the end of a function and destroys the current stack frame by setting the stack pointer to the base pointer and then popping the saved base pointer off the top of the stack. The return (ret) instruction follows the leave instruction and, since the base pointer has already been popped off the stack, the return address of the function is now on the top of the stack. The return instruction pops the return address off the top of the stack and sets the instruction pointer to that address.

DL_GLCH
Author

I started the video thinking I was going to be let down by the length, but since it was so short, it wouldn't hurt to watch. Then I saw how good the video was. Damn.

..
Author

Did that dude just check back at his notes in the first 3 seconds to remember his name? I fucking love this dude already.

dedballoons
Author

Damn, that CMP/JMP stuff was so well explained. Realizing it's essentially just a SUB instruction makes it and the following JMPs make so much more sense! Thank you!

xbitbybit
Author

I love you so much for this I wanna cry

Blowyourspot
Author

It is so funny watching this description of x86 Assembler as if it is one of the universe's major mysteries :)
35 years ago I was writing 8086 / 80186 / 80286 code in PLM and manually checking the intermediate (Assembler) output to make sure that the compiler was being efficient and that no multiplies had sneaked into the code. Now apparently the ability to write ASM286 would make me an international how things have changed between my generation (Computer Science Degree) and new graduates (Software Development Degree) with their 'religious' views on OOP and encapsulation, method and properties - as if they were some form of alternative to knowing HOW a CPU fetches and executes a command and what the different fields in the binary representation of those commands actually do.
Wish I hadn't moved into management 20 years ago - all those C++ programmers who could have been taught assembler, finite state machines, cyclic buffers, interrupt handlers, real-time control, cost of structures in clock cycles - instead of all this bloatware relying on Moore's law to keep it viable.
As we used to say in the olden days....'Flame off'

occamraiser
Author

Fantastic and coherent explanation. You earned yourself a sub. This might be the quickest/clearest explanation of the basics of how x86 works.

_nit
Author

There's such a lack of instructional videos on assembly language and this was so helpful

TheLionheartCenter
Author

Oh man! You can't leave someone hanging like that! You barely scratched the surface, and I want more! Go ahead and make that second part of this; and I'm sure we'll all watch it.

WarrenGarabrandt
Author

That was quite a lot of information in a very short time for someone who didn't know anything about x86 architecture before, but I really appreciate you for not bullshitting around. I'll come back to this video once I have read a bit up on those 20 new terms 😄
Thank you for your Video 🙏

tielessin
Author

I've been coding for 20 years. Recently I was considering adding conditional breakpoints to an app I use. I knew it was going to get messy so I figure a bit of information about how Assembly works might motivate me. I tell you what, this video damn sure made me motivated. Excellent work.

tehfn
Author

This is literally one of the best instructional videos I have ever watched in my life period. Fantastic job

stephen
Author

Just about everything I was confused about was clarified in 10 min. Thank you! I was disappointed to not see any more uploads. The video made it seem like there was a series. :[

hehhehdummy
Author

Normally I watch at 2x speed because videos go too slow. Here I needed to rewatch parts to try and keep up. Nicely done.

zokm
Author

I'm so glad I found this video... you just explained most of my college Assembly Language course in 10m... now I can do that project I procrastinated till the last possible moment ( it's due today ).

jochedev
Author

Finally! One video at the perfect pace with the perfect amount of information!

RachayitaGiri