What's inside a .EXE File?

What is inside a Windows executable, or the executables of other operating systems? I take a look at the past, from the days of DOS until the present, and crack open a few EXE files to get a look inside. I also explore programming without a compiler, linker, or any other processing of the code before execution. Is programming in raw machine code possible?

Altair 8800 Demonstration:

Learning x64 Assembly:

kernel32 lib:
There is no separate kernel64: 64-bit Windows still uses the name kernel32.

Windows 1.0 online:
Comments
Author

Actually, Windows hasn't been DOS at its core since Windows Me. Windows XP and later are based on Windows NT, which doesn't use DOS.

Sparkette
Author

I did actually handcraft an EXE file. I did that as part of writing a simple compiler for a stack-based language.
The hardest part was making the correct header. That took 6 hours - mostly because Windows never tells you what went wrong, it just refuses to load the file. But once I got that figured out, it was pretty smooth sailing.

rsa
Author

This wasn't bad, but it missed or simplified a lot about the actual EXE content. EXE files (or PE files) are organised in sections. There are different sections, and usually only one contains your code. Other sections may contain resources, text or, much more importantly, the import and export tables. While an EXE file usually does not have an export section, it usually has an import section. Its content is essentially a special "contract" between the OS and your application. When the OS starts your program, it creates a new process and takes care of loading your file into that process's memory. The OS scans through the import table, looks up the shared libraries and imported function names, dynamically loads those DLLs into your application and resolves the requested functions. That way your program has access to certain functions that are either part of the OS or of some other utility library. The export table usually only exists when compiling a DLL file, which internally is also a PE file. Of course, the export section serves the opposite purpose of the import section: the OS can look up a function or other symbol that the library exports when loading the DLL for an application.
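The header-and-sections layout described above can be sketched in a few lines of Python. This is only an illustration: `build_tiny_pe` and `list_sections` are hypothetical helper names, and the synthetic image holds headers only (no optional header, no section data), so Windows would not load it. The field offsets used (the `MZ` signature, the PE header offset at 0x3C, the 20-byte COFF header, the 40-byte section headers) follow the published PE layout.

```python
import struct

def build_tiny_pe(section_names):
    """Build a synthetic, header-only PE image for illustration (not loadable)."""
    e_lfanew = 0x40                              # where this toy image puts its PE header
    dos = bytearray(e_lfanew)
    dos[0:2] = b"MZ"                             # DOS signature
    struct.pack_into("<I", dos, 0x3C, e_lfanew)  # offset of the PE header
    opt_size = 0                                 # assumption: no optional header here
    coff = struct.pack("<HHIIIHH", 0x8664, len(section_names), 0, 0, 0, opt_size, 0)
    sections = b"".join(
        name.encode("ascii").ljust(8, b"\0") + bytes(32) for name in section_names
    )
    return bytes(dos) + b"PE\0\0" + coff + sections

def list_sections(image):
    """Return the section names found in a PE image held in memory."""
    if image[0:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    fields = struct.unpack_from("<HHIIIHH", image, e_lfanew + 4)
    count, opt_size = fields[1], fields[5]
    table = e_lfanew + 4 + 20 + opt_size         # section table follows the optional header
    names = []
    for i in range(count):
        raw = image[table + 40 * i : table + 40 * i + 8]
        names.append(raw.rstrip(b"\0").decode("ascii"))
    return names

print(list_sections(build_tiny_pe([".text", ".rdata", ".idata"])))
```

The same `list_sections` function should also work on the bytes of a real executable read from disk, since it only relies on the header fields named above.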

You can actually trim a lot of the unnecessary code out of a PE file. In the past I used a very small assembler called Flat Assembler (FASM). There you could even create your own MZ stub without all that message stuff that nobody needs anymore. In the demoscene it was even common to have the MZ and PE headers overlap. The MZ header contains a special value that determines the position of the PE header. By cleverly offsetting the PE header you could (re)use otherwise unused or irrelevant bytes. I created key hook DLLs that were only 2 KB in size. Unfortunately, Windows expects sections to be placed at a certain alignment, so you cannot shrink the file too much: some empty space is left between sections. The demoscene usually makes use of almost every byte, though. Since Windows does not care about the content of that alignment padding, you can fill it with your own data.
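The alignment rule mentioned above comes down to rounding each section's size up to a multiple of the file alignment. A minimal sketch, assuming a file alignment of 0x200 bytes (a common default; the real value is stored in the PE optional header) and using the hypothetical helper names `align_up` and `padding_after`:

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)

def padding_after(raw_size, file_alignment=0x200):
    """Number of filler bytes that follow a section of raw_size bytes."""
    return align_up(raw_size, file_alignment) - raw_size

# A section holding only 10 bytes of code still occupies a full 0x200-byte slot.
print(hex(align_up(10, 0x200)), padding_after(10))
```

Those filler bytes are the space the comment describes as available for your own data.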

The fun thing about FASM is that its source code is written in its own assembly dialect, so it can assemble its own source code to produce itself. Of course it's open source. Just google for flat assembler or FASM.

Bunnys
Author

Rarely these days do you hear people refer to C as high level, but I'm always glad when it is.

steamrangercomputing
Author

I don't think it's true that Windows is still running on DOS nowadays, though. That's my only critique. It's running on the NT kernel now and has for a long time. I think that message about not running in DOS mode was made for the time before home versions of Windows used the NT kernel, so pre-Windows XP.

SilasonLinux
Author

7:06 A common misconception about Python is that each line is read and executed in real time. What actually happens is that the interpreter compiles the source to bytecode, which is kept in memory and then executed by the Python Virtual Machine.
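This can be observed from within Python itself using the built-in `compile` function and the standard `dis` module. The sketch below is only an illustration (the helper name `compile_module` and the sample sources are made up): it compiles a small module to a code object, disassembles it, and then shows that a syntax error on the second line prevents the first line from ever running.

```python
import dis

def compile_module(source):
    """Compile a whole source string to a code object without running it."""
    return compile(source, "<example>", "exec")

good = compile_module("x = 2 + 3\ny = x * 2\n")
dis.dis(good)                      # show the bytecode the virtual machine will execute

namespace = {}
exec(good, namespace)              # only now does the code actually run

executed = []
bad_source = "executed.append('line 1')\nthis is not valid python\n"
try:
    exec(compile_module(bad_source), {"executed": executed})
except SyntaxError:
    pass                           # compilation failed, so line 1 never ran

print(namespace["y"], executed)
```

If Python really executed the file line by line, `executed` would contain `'line 1'` before the error on line 2 was noticed.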

apoocat
Author

Okay, I just want to say that one of the reasons for the large size of the .exe file is the compilation mode: Debug. You can basically see three int 3 (breakpoint interrupt) instructions right after the end of the "main" function. They are inserted by the compiler to stop execution from running past the end of the function (if, for example, you forgot to write the "ret" instruction). Debug mode generates a huge amount of auxiliary code to help you with debugging. All your actions, even in assembly, are checked by debugging instrumentation at run time to help you find mistakes. So for pure research you had better disable all of the debug utilities (some of them are still used even in "Release" mode) in the project settings. But even with that, this video was interesting, thank you for your work.

fedotttbv
Author

EXE files also contain icons, bitmaps, cursors, and dialog definitions. The function LoadBitmapA, for example, loads a bitmap stored inside the current EXE file. Many of these resources can be viewed (and sometimes edited) with PE Explorer or similar programs.

Gwarks
Author

That "Gesundheit" killed me🤣🤣

GS
Author

I remember when MS-DOS had a debugger. It was fun to start the debugger and tell it to just "go". Debug would dutifully attempt to execute whatever the IP register was pointing to. The machine would jump off a cliff if it could and you told it to.

shackamaxon
Author

The MZ at the beginning of DOS executables stands for "Mark Zbikowski", who was one of the main developers responsible for the file format.

mkd
Author

For those interested, I find the Dave's Garage video "The World's Smallest Windows App" a fantastic explanation of how you can take out everything but the bare minimum from a PE.

Very interesting video by the way, although I feel like the viewer is left with more questions than answers. Anyway, keep up the good work and I hope to see your channel grow.

glitchy_weasel
Author

I'm sure several people have pointed it out by now, but the extra code you were seeing is from the CRT (C Runtime), since despite being written in assembly, your program was being built as a C program.

Before main is called, there has to be code to do things like, take in the command line and split it up into argc/argv, set up thread local storage, set up floating point numbers, etc. On Windows, this stuff is done by the executable itself, not the system. The code to do it is inserted by the compiler in a way that's transparent to programmers. You can turn it off, but then you'll have to implement those features yourself if you want to use them.

tomysshadow
Author

CS student, so I have a few notes on this:
6:29 When I first learned about Assembly I thought the same. This is NOT true, however. Assembly is extremely hard to grasp on a physical scale. It is only when you get into the meat and nitty-gritty details of how a processor _actually_ functions that you realize just how close Assembly code actually is to pure machine code. All Assembly effectively does is take a command in (like 'mov') and translate it into 1s and 0s. There is a 5060-page "Intel 64 and IA-32 Architectures Software Developer’s Manual" for x86 Assembly detailing what exactly each instruction means, but basically "mov eax, 0x5" gets translated _directly_ into "0xb8 0x05 0x00 0x00 0x00" in hexadecimal, with b8 being the opcode meaning 'move the following 32-bit value into the eax register' and the remaining four bytes being the value 5 in little-endian order. The instructions that are read are, roughly speaking, sent directly to something like the processor's ALU and fed into the connected multiplexer. So the "add" instruction you put in in effect controls that specific multiplexer for that specific register.
Now, while this is not punching bits into a machine by hand, you are really not going to come any closer to controlling the pure bare-bones hardware than this.
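To make the "mov eax, 0x5" example concrete, here is a minimal Python sketch of an encoder for that single instruction form (move a 32-bit immediate into a 32-bit register). The function name `encode_mov_r32_imm32` is made up for illustration; the encoding itself (opcode 0xB8 plus the register number, followed by the immediate as four little-endian bytes) is the one documented in the Intel manual, and it covers only this one form in 32-bit operand mode.

```python
import struct

# Register numbers used by the x86 instruction encoding.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def encode_mov_r32_imm32(reg, value):
    """Encode 'mov reg, value' as opcode B8+reg followed by a 4-byte immediate."""
    opcode = 0xB8 + REG32[reg]
    return bytes([opcode]) + struct.pack("<I", value & 0xFFFFFFFF)

print(encode_mov_r32_imm32("eax", 0x5).hex(" "))
```

A real assembler does the same kind of table lookup for every instruction form, which is why the mapping from assembly text to machine code is so direct.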

7:36 I presume you are referring to Python in this case, because believe me when I say that every single one of us sucks at Assembly compared to the magic a compiler performs. A compiler is capable of spitting out insanely optimized Assembly code, to the point where the only people on this planet capable of writing faster Assembly code than it are the people who actually program the damn things. Compilers do things like higher polynomial functions and division by invariant integers using multiplication to make your code _way_ faster than you could ever do by hand. And those are just some of the incredibly clever ways your code can be improved upon. To _really_ understand the full math a compiler uses to fold and optimize your code you basically need a PhD in Math and Computer Science.
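The "division by multiplication" trick can be checked in a few lines of Python. This sketch covers just one case, unsigned 32-bit division by the constant 3; the magic number 0xAAAAAAAB is ceil(2^33 / 3), and the function name `div3` is made up for illustration. A compiler emits the equivalent multiply and shift instead of a much slower divide instruction.

```python
MAGIC = 0xAAAAAAAB   # ceil(2**33 / 3)
SHIFT = 33

def div3(x):
    """Divide an unsigned 32-bit integer by 3 using only a multiply and a shift."""
    if not 0 <= x < 2**32:
        raise ValueError("x must fit in an unsigned 32-bit integer")
    return (x * MAGIC) >> SHIFT

samples = [0, 1, 2, 3, 4, 5, 1000, 2**31, 2**32 - 2, 2**32 - 1]
print(all(div3(x) == x // 3 for x in samples))
```

Each divisor needs its own magic constant and shift, and signed division needs an extra correction step, which is part of why working these out by hand is tedious.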

All in all that topic is a thing you can really sink time into. :)

Finkelfunk
Author

This is one time I wish I could double-like a video.
It's a bit oversimplified for more advanced computer users, but for the layman just wanting to learn more this is fantastic.

snippykeegan
Author

Amazing video! I never really thought about exe files that way before - you explain it so well! I always learn something cool from your channel - thank you!

mattgio
Author

If I remember correctly, in DOS you could also create COM files as well as EXE. I think these were basically executables for small programs like command line utilities.
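A COM file has no header at all: DOS copies the raw bytes to offset 0x100 of a segment and jumps to the first one, which makes it a convenient format for writing machine code by hand. The Python sketch below assembles such a file byte by byte. It is only an illustration under stated assumptions: the helper name `build_com` and the message are made up, the program relies on DOS function 09h (print a `$`-terminated string at DS:DX), and the result has not been tested here on a real DOS system.

```python
import struct

def build_com(message):
    """Return the bytes of a tiny DOS .COM program that prints message."""
    if "$" in message:
        raise ValueError("'$' terminates the string for DOS function 09h")
    load_offset = 0x100            # DOS loads a COM image at this offset
    code_len = 8                   # total size of the four instructions below
    text = message.encode("ascii") + b"$"
    code = (
        b"\xB4\x09"                                              # mov ah, 09h
        + b"\xBA" + struct.pack("<H", load_offset + code_len)    # mov dx, address of text
        + b"\xCD\x21"                                            # int 21h
        + b"\xC3"                                                # ret
    )
    return code + text

image = build_com("Hello from raw machine code!")
print(len(image), image[:8].hex(" "))
```

Writing `image` to a file with a `.COM` extension is all the "linking" this format needs, since the file on disk is exactly what ends up in memory.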

UKGeezer
Author

I really like the accompanying visuals you included at the end!

Matojeje
Author

Stellar analysis! I learned a lot. 💜 Thanks.

Slurkz
Author

And this is why we need the community to release more things like dev tools instead of production apps: to better understand how things work internally, and to improve them.

carloslecina