What NOT to do: Self Modifying Code - Computerphile

How 'not to code' with our "real" programmer - who, as Julian explains, is demoing what NOT to do. Dr Julian Onions tells us more about Mel.


This video was filmed by Julian Onions and edited by Sean Riley.


Comments

I used self-modifying code to win a code optimization challenge back in university.
We were writing a matrix multiplication program and competing over whose program could multiply two 10 by 10 matrices in the fewest clock cycles. However, you were not allowed to optimize your program specifically for 10 by 10 matrices. I got around that rule by writing a program which started off by optimizing itself for whatever matrix size it was fed.

hellterminator

Great video, but the cake recipe was missing the 12-page backstory of the writer telling us about how their father made this when they were home sick from school.

thender

Ah, so _this_ is what I *should* be doing, I see.

Supertimegamingify

Program: *runs*


Also program after 2 minutes: I have no idea who I am and where I came from

ywanhk

I think this academic might be surprised by what you can get away with writing in a corporate environment.

MrGeekGamer

The recipe isn't really self-modifying code. It just modifies data depending on the time. It's a fine example of bad code though.

dibblethwaite

The algorithm that doesn’t fly:
1) Switch steps 1 & 2.
2) Flap wings.
3) Goto step 1.

qubex

A better example of self-modifying code using the recipe would be to have instructions like:
1. Copy this recipe's ingredients and instructions to a blank sheet of paper. Once complete, start following the copied instructions from step 2.
2. Find all numbers in the ingredients on this page which are bigger than 5. For each of those numbers, do the following: add 100 to it, cross out the original number, and write the new number in its place.
3. Find all instructions in this recipe which contain the words "add 100". For each instruction, cross it out.
4. Set oven to 150 degrees C.

5. ... (and the rest)

Notice that this would change the actual instructions you should run.

1. Copy the recipe to a blank sheet of paper == load the program into memory.
2. This instruction modifies the code (if it said to use 6 eggs, it will now say you need 106 eggs). Notice it only modifies the "data" here.
3. This line finds the previous instruction and removes it. It also finds the current instruction and removes that! This is actual self-modifying code. The purpose of this is in case you want to use this program/recipe again: if you don't remove this code, the next time you use it you will need 206 eggs, then 306 eggs, and eventually you'll notice something weird is happening. ;]
4. And now we start the actual recipe.
5. .. and we run out of eggs.
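To make the mapping concrete, here is a minimal C sketch of the same copy-then-patch idea. It assumes Linux on x86-64, an unoptimized GCC build (so the constant appears literally in the machine code and the two marker functions are laid out in order), and a kernel that permits a writable+executable mapping; the names recipe, recipe_end, and EGGS are invented for illustration.

#define _DEFAULT_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define EGGS 0x11223344  /* distinctive immediate we can locate in the bytes */

int recipe(int x)    { return x + EGGS; }  /* "use EGGS eggs" */
int recipe_end(void) { return 0; }         /* rough end-of-recipe marker */

int main(void)
{
    /* Step 1: copy the recipe onto a blank page (a fresh RWX buffer). */
    size_t len = (size_t)((char *)recipe_end - (char *)recipe);
    unsigned char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return 1;
    memcpy(page, (void *)recipe, len);

    /* Step 2: find the number and write the new number in its place. */
    uint32_t old = EGGS, new_val = 100;
    for (size_t i = 0; i + sizeof old <= len; i++)
        if (memcmp(page + i, &old, sizeof old) == 0) {
            memcpy(page + i, &new_val, sizeof new_val);
            break;
        }

    /* Step 4 onwards: follow the copied, modified instructions. */
    int (*patched)(int) = (int (*)(int))page;
    printf("%d\n", patched(5));  /* prints 105, not 5 + EGGS */
    return 0;
}

On architectures other than x86 you would also need to flush the instruction cache after patching before running the copy.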

zenithparsec

How did you manage not to mention metamorphic malware?
Decompile -> rewrite itself -> compile -> infect new host


It's the coolest thing ever

RoskGamer

I learned more from the comments than from the actual video.

unvergebeneid

When I was a teenager in the 90s I thought this was the best thing. One thing I quickly learned was that you couldn't modify the next instruction to be executed because at that point it would already be moving through the CPU's pipeline (80486).

After a couple decades of professional software development, yeah, never do this 😅

phasm

I think the ingredients resemble the data, and the process resembles the instructions. The example given modifies the data rather than the instructions, which is what all programs do. I don't think this example actually works for self-modifying code.

ArabGamesGeeks

Reminds me of »Core Wars« in its various incarnations, where you have a virtual machine and have specially written programs battle it out with each other. In a block of memory, one program can copy parts of itself, modify any location in memory, overwrite parts of itself and of other programs etc. Whatever program ends up running (or, by any definition, surviving) is declared the winner. Very entertaining if visualized appropriately.

dipi

There actually WERE valid reasons to use self-modifying code in the past. I once used a piece of 8086 code that modified code bytes that were about to be executed a few instructions later. Those bytes could not be disassembled: the sequence 0xd4, 0x09. The sequence 0xd4, 0x0a is what disassembles to "AAM", i.e. "ASCII Adjust AX After Multiply"; this instruction adjusts a multiplication result to base 10 (0x0a). As one can imagine, the sequence 0xd4, 0x09 also works, but in base 9. Once you figure that out and set appropriate values, you can execute that code to get a specific result which you can test. By changing the second code byte to another value, say 0x0b, you change the result. However, if you patch the second code byte only a few instructions before executing it, the byte sequence has already been prefetched and decoded by the CPU, so the change has no effect at all.
What this boils down to is a piece of code that cannot be disassembled and behaves differently when executed normally versus single-stepped in a debugger (where the code modification actually does take effect), making the code anywhere from hard to almost impossible to analyse.
Nowadays, this is effectively useless since executable code segments are usually flagged read-only in memory.
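For anyone curious to see the undocumented encoding in action, here is a tiny sketch using GCC inline assembly. It assumes a 32-bit x86 build (compile with -m32), since the AAM encoding is invalid in 64-bit mode; the function name is invented.

#include <stdio.h>

/* AAM imm8 divides AL by the immediate byte: quotient goes to AH,
   remainder to AL. The documented form is 0xd4, 0x0a (base 10);
   0xd4, 0x09 performs the same operation in base 9. */
static unsigned aam_base9(unsigned ax)
{
    __asm__ volatile (".byte 0xd4, 0x09" : "+a"(ax));
    return ax;
}

int main(void)
{
    unsigned r = aam_base9(20);  /* 20 = 2 * 9 + 2 */
    printf("AH=%u AL=%u\n", (r >> 8) & 0xffu, r & 0xffu);  /* AH=2 AL=2 */
    return 0;
}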

congenio

Self-modifying code is usually not the way to go about things, but THE _ABILITY_ to create self-modifying code is POWERFULLY FLEXIBLE, providing something that not many languages can do. It should not just be dismissed stereotypically as something "bad" to do (even if it often is)

DaveWhoa

While all modern operating systems make code read-only by default, they also give you ways to get around it, be it a flag in the binary or an API you call at runtime.
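On Linux that runtime API is mprotect(2) (Windows has VirtualProtect). A minimal sketch of flipping your own code page writable and patching an immediate in place, assuming x86-64 Linux with GCC, no W^X hardening that forbids writable code pages, and that the compiler embeds the constant as a literal imm32; answer is an invented example function.

#define _DEFAULT_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int answer(void) { return 41; }  /* contains the literal imm32 41 */

int main(void)
{
    /* mprotect works on whole pages, so round the address down. */
    long pagesz = sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)answer & ~(uintptr_t)(pagesz - 1);
    if (mprotect((void *)start, (size_t)pagesz,
                 PROT_READ | PROT_WRITE | PROT_EXEC) != 0) {
        perror("mprotect");
        return 1;
    }

    /* Find the 32-bit immediate in the function body and bump it. */
    unsigned char *p = (unsigned char *)answer;
    uint32_t old = 41, new_val = 42;
    for (size_t i = 0; i < 32; i++)
        if (memcmp(p + i, &old, sizeof old) == 0) {
            memcpy(p + i, &new_val, sizeof new_val);
            break;
        }

    printf("%d\n", answer());  /* prints 42 */
    return 0;
}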

hellterminator

One potentially legitimate use case for self-modifying code is to intercept library calls and substitute your own code (hooking). This can be very useful for purposes of debugging, reverse engineering, etc.
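A sketch of the classic x86-64 "inline hook": make the target's code page writable, then overwrite its first five bytes with a relative jmp to the replacement. It assumes the two functions sit within ±2 GiB of each other; greet and hooked_greet are invented stand-ins for a real library call.

#define _DEFAULT_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int greet(void)        { printf("hello\n");  return 0; }
int hooked_greet(void) { printf("hooked!\n"); return 1; }

/* Overwrite target's first 5 bytes with "jmp rel32" to replacement. */
static void install_hook(void *target, void *replacement)
{
    unsigned char *p = target;
    int32_t rel = (int32_t)((unsigned char *)replacement - (p + 5));
    p[0] = 0xE9;                        /* jmp rel32 opcode */
    memcpy(p + 1, &rel, sizeof rel);
}

int main(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)greet & ~(uintptr_t)(pagesz - 1);
    mprotect((void *)start, (size_t)pagesz,
             PROT_READ | PROT_WRITE | PROT_EXEC);

    install_hook((void *)greet, (void *)hooked_greet);
    greet();                            /* now prints "hooked!" */
    return 0;
}

Real hooking frameworks also save the overwritten bytes into a trampoline so the original function can still be called.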

jay_sensz

Me: <reads the thumbnail text>
Me: YES!! Show me
Me: <reads the title>
Me: what why not?

ltocvfy

Saw a self modifying coder once. He had a piercing and a tat.

banderfargoyl

Self-modifying code is interesting.
It lets you do things that might otherwise be impossible.

But as expected of something so powerful, it can very easily cause massive problems if used poorly.

In a modern context the downsides vastly outweigh the upsides.

However, it's worth remembering how people worked around historical limitations.

For instance, imagine having a CPU that doesn't contain any stack-based instructions.
You don't HAVE to imagine, because these DID exist.

This might not immediately seem like a problem, but without a stack you can't do function calls, because you have nowhere to store a return address. (It doesn't matter whether your CPU has explicit 'function call' instructions; if it doesn't, you can mimic them using jump instructions combined with manually pushing and popping things on a stack. But if you don't have a stack... life gets really complicated.)

So... How do you implement function calls (to be clear on what a function is at this extremely low level - it's where you change the flow of program execution to a different location in memory, then execute some instructions before returning to approximately the same location in memory that you started from) without a stack?

As it turns out, the answer is self-modifying code: you put a jump instruction at the end of your 'function', and before calling the 'function' you write the return address into the target of that jump instruction.
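To see the shape of this in miniature without writing assembly, the patched return jump can be modelled as a writable "return slot" using GCC's computed-goto extension (a sketch of the idea, not the historical machine code):

#include <stdio.h>

int main(void)
{
    void *return_slot;   /* stands in for the patched jump at the end
                            of the subroutine */
    int x = 1;

    /* "Call" the subroutine: plant the return address, then jump. */
    return_slot = &&back_here_first;
    goto subroutine;
back_here_first:

    /* Call it again with a different return address. */
    return_slot = &&back_here_second;
    goto subroutine;
back_here_second:

    printf("x = %d\n", x);   /* prints 4: the subroutine ran twice */
    return 0;

subroutine:
    x *= 2;
    goto *return_slot;       /* the "self-modified" return jump */
}

One historical limitation falls straight out of this model: each subroutine has exactly one return slot, so it cannot call itself, and recursion is impossible.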

It's an interesting example of a workaround, but also of features of even late-'70s CPU designs that we now take for granted but which weren't always present.

You might not immediately know why a CPU needs a stack (as opposed to it just being a nice convenience feature, as a lot of more complex, later instructions are; nobody absolutely needs a dedicated instruction to store zero to memory, for instance, nor a matrix multiply, nor pretty much anything in the SIMD category), but as it turns out, a CPU design with no stack has some very awkward limitations...

KuraIthys