Emulating a CPU in C++ (6502)

This isn't a full implementation of the 6502; it's more of a from-scratch intro to learning how a CPU works by writing an emulator for one (in this case the 8-bit 6502).

Links:

Timestamps:
0:00 - Intro
0:29 - The 6502
4:24 - Creating CPU Internals
9:23 - Resetting the CPU
12:48 - Creating the Memory
15:10 - Creating the Execute function
23:32 - Emulating "LDA Immediate" instruction
28:00 - Hardcoding a test program
31:50 - Emulating "LDA Zero Page" instruction
37:20 - Emulating "LDA Zero Page,X" instruction
38:42 - Emulating "JSR" instruction
48:30 - Closing comments
Comments

BTW, the SP (stack pointer) should only be a Byte (8 bits), not a Word (16 bits).
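
As an illustration of the point above (not code from the video or from the commenter), here is a minimal sketch of an 8-bit stack pointer. On the 6502 the stack lives in page 1 (0x0100 to 0x01FF), so the 8-bit SP is only the low byte of the address. The names `Cpu6502Stack`, `StackAddress`, `Push` and `Pop` are hypothetical.

```cpp
#include <array>
#include <cstdint>

// Hypothetical struct for illustration; the video's own CPU/Mem types differ.
struct Cpu6502Stack
{
    std::array<std::uint8_t, 65536> mem{}; // full 64 KiB address space
    std::uint8_t sp = 0xFF;                // 8-bit stack pointer

    // The stack is fixed to page 1, so the full address is 0x0100 | SP.
    std::uint16_t StackAddress() const
    {
        return static_cast<std::uint16_t>( 0x0100u | sp );
    }

    void Push( std::uint8_t value )
    {
        mem[StackAddress()] = value;
        --sp; // wraps 0x00 -> 0xFF because sp is only 8 bits wide
    }

    std::uint8_t Pop()
    {
        ++sp; // wraps 0xFF -> 0x00
        return mem[StackAddress()];
    }
};
```

Because `sp` is a `std::uint8_t`, the wrap-around inside page 1 comes for free; a 16-bit SP would need explicit masking to stay in that page.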

aaronjamt

The CPU is happily executing code and admiring the amazing world around it, when it suddenly thinks to itself, "What if I'm living in a simulation?"

darkstatehk

How astonishing to find this YT suggestion! I wrote a 6502/6503 emulator in 1987 in C on a PC-XT (8086). The clocks of both the 6502 and the 8086 were at 4 MHz. The emulation was 400 times slower than the real processor, but it was embedded in a debugger (MS C4-like) and it was possible to set breakpoints, inspect memory values, execute step by step, and so on... Ahh! Nostalgia...

philippelepilote

To anyone thinking about coding their own...

Most processors, internally, use predictable bits of the instruction opcode to identify the addressing modes - because, the processor really needs to be able to decode opcodes fast, without having to 'think' about it! Understanding this strategic bit pattern can make writing a CPU emulator SO much easier!

It's been a long time since I coded in 6502 ASM ... but, if you were to plot each instruction in a table, you'd likely notice that the addressing modes fall into very neat predictable columns. This means that you can identify the 'instruction' and 'mode' separately, which then lets you decouple the Instruction logic from its Addressing logic.

This 'decoupling of concerns' can really help shorten your code and reduce errors _(less code, as every Instruction-Type is "addressing agnostic" ... and less repetition, as each "Addressing logic" is only written once and is shared across all instructions)_
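
To make the bit pattern concrete, here is a rough sketch (an illustration, not the commenter's code). 6502 opcodes can be read as `aaabbbcc`; for the group where `cc == 01`, the `aaa` bits select the operation and the `bbb` bits select the addressing mode. The enum and function names below are made up for this example.

```cpp
#include <cstdint>

// Values follow the bbb field for the cc == 01 group.
enum class Mode
{
    IndexedIndirectX, // 000: (zp,X)
    ZeroPage,         // 001
    Immediate,        // 010
    Absolute,         // 011
    IndirectIndexedY, // 100: (zp),Y
    ZeroPageX,        // 101
    AbsoluteY,        // 110
    AbsoluteX         // 111
};

// Values follow the aaa field for the cc == 01 group.
enum class Op { ORA, AND, EOR, ADC, STA, LDA, CMP, SBC };

struct Decoded
{
    Op op;
    Mode mode;
};

// Returns false for opcodes outside the cc == 01 group.
// Exceptions still need handling by the caller: for example 0x89 would
// decode as "STA immediate", which is not a documented NMOS 6502 instruction.
bool DecodeGroupOne( std::uint8_t opcode, Decoded& out )
{
    if ( ( opcode & 0x03 ) != 0x01 )
    {
        return false;
    }
    out.op = static_cast<Op>( ( opcode >> 5 ) & 0x07 );
    out.mode = static_cast<Mode>( ( opcode >> 2 ) & 0x07 );
    return true;
}
```

With this split, 0xA9, 0xA5 and 0xB5 all decode to the same `Op::LDA` and differ only in `Mode`, which is the decoupling described above.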

Just an idea for future exploration : )

Unfortunately, sometimes this bit-masking strategy isn't perfect, so you might have to handle some exceptions to the rule.


*My experiences, for what it's worth...*

Last time I emulated an 8-bit fixed-instruction-length processor... I wrote each instruction handler as a function, then mapped them into a function-pointer array of 256 entries. That way (due to ignoring mode differences) several opcodes in an instruction group all called the same basic handler function. I then did the same thing with the modes, in a separate array ... also of 256 entries.

So, every Instruction was invariably a call to : fn_Opcode[memory[PC]] ... using the mode handler : fn_Mode[memory[PC]]

That got rid of any conditionals or longwinded case statements... just one neat line of code, that always called the appropriate Opcode/Mode combination... because the two tables encoded all the combinations.

Hope that makes sense ; )

Obviously, to ensure that this lookup always worked - I first initialised all entries of those tables to point at the 'Bad_Opcode' or 'Bad_Mode' handler, rather than starting life as NULLPTRs. This was useful for debugging ... and for spotting "undocumented" opcodes ; )

It also meant I knew I could ALWAYS call the function pointers ... I didn't have to check they were valid first ; ) It also meant that unimplemented opcodes were self-identifying and didn't crash the emu ; ) As I coded each new Instruction or Mode, I'd just fill out the appropriate entries in the lookup arrays.
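
The two-table idea could be sketched roughly as follows. This is only one possible reading of the description above: the `Cpu` struct, the handler signatures and the `Tables`/`Step` names are assumptions for illustration, and flag updates are left out to keep it short.

```cpp
#include <array>
#include <cstdint>

// Hypothetical minimal CPU state; status flags are omitted for brevity.
struct Cpu
{
    std::array<std::uint8_t, 65536> mem{};
    std::uint16_t pc = 0;
    std::uint8_t a = 0;
    std::uint8_t x = 0;
    bool badOpcodeSeen = false;
};

using ModeFn = std::uint16_t ( * )( Cpu& );   // returns the operand's address
using OpFn = void ( * )( Cpu&, std::uint16_t ); // acts on that address

// Default handlers: every table entry starts here, never as a null pointer.
std::uint16_t Bad_Mode( Cpu& cpu ) { cpu.badOpcodeSeen = true; return 0; }
void Bad_Opcode( Cpu& cpu, std::uint16_t ) { cpu.badOpcodeSeen = true; }

// Each addressing mode is written once and shared by all instructions.
std::uint16_t Mode_Immediate( Cpu& cpu ) { return cpu.pc++; }
std::uint16_t Mode_ZeroPage( Cpu& cpu ) { return cpu.mem[cpu.pc++]; }
std::uint16_t Mode_ZeroPageX( Cpu& cpu )
{
    // The 8-bit cast makes the sum wrap around inside the zero page.
    return static_cast<std::uint8_t>( cpu.mem[cpu.pc++] + cpu.x );
}

// Each operation is written once and does not care how the address was found.
void Op_LDA( Cpu& cpu, std::uint16_t addr ) { cpu.a = cpu.mem[addr]; }

struct Tables
{
    std::array<OpFn, 256> op;
    std::array<ModeFn, 256> mode;

    Tables()
    {
        op.fill( Bad_Opcode );
        mode.fill( Bad_Mode );
        op[0xA9] = Op_LDA; mode[0xA9] = Mode_Immediate;
        op[0xA5] = Op_LDA; mode[0xA5] = Mode_ZeroPage;
        op[0xB5] = Op_LDA; mode[0xB5] = Mode_ZeroPageX;
    }
};

// One line of dispatch: no switch, no validity checks on the pointers.
void Step( Cpu& cpu, const Tables& t )
{
    const std::uint8_t opcode = cpu.mem[cpu.pc++];
    t.op[opcode]( cpu, t.mode[opcode]( cpu ) );
}
```

Adding a new instruction then means writing one handler and filling in its table entries; any opcode not yet filled in simply sets `badOpcodeSeen` instead of crashing.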


But the real beauty of this approach was brevity!

If my Operation logic was wrong, I only had to change it in one place... and if my Addressing Mode code was wrong, I only had to change it in one place. A lot less typing and debugging... and a lot less chance for errors to creep in.

Not a criticism though... far from it!

I just thought I'd present just one more approach - from the millions of perfectly valid ways to code a virtual CPU : )

Understanding how the CPU, internally, separates 'Operation' from 'Addressing' quickly and seamlessly... is damned useful, and can help us emulate the instruction set more efficiently : ) But, ultimately, you might have to also handle various "ugly hacks" the CPU manufacturer used to cram more instructions into the gaps.

By using two simple lookup tables, one for Operation and another for Mode ... you can encode all of this OpCode weirdness in a simple efficient way... and avoid writing the mother of all crazy Switch statements XD

garychap

Are you kidding me? This is a "holy grail" across all of YouTube for people studying CS. Damn, you are an amazing person!

xakkep

I was 14 when I taught myself to program on a C64. It took me 1 week to figure out that BASIC was crap. So I basically learned programming using 6502 Assembler. Today I am a computer scientist, still having the C64 ROM listing from 1984 on my bookshelf. I learned so much from it.

keyem

If you use uint8_t and uint16_t (from <cstdint>) for your CPU types, your code becomes basically platform-agnostic; as written, it depends on shorts being 16 bits.
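
A minimal sketch of that suggestion, assuming the emulator's `Byte` and `Word` names are kept as type aliases (the `MakeWord` helper is hypothetical and only there to show the types in use):

```cpp
#include <climits>
#include <cstdint>

// Fixed-width aliases instead of unsigned char / unsigned short.
using Byte = std::uint8_t;
using Word = std::uint16_t;

// Compile-time checks that the widths are what the emulator assumes.
static_assert( sizeof( Byte ) * CHAR_BIT == 8, "Byte must be 8 bits" );
static_assert( sizeof( Word ) * CHAR_BIT == 16, "Word must be 16 bits" );

// Combine two bytes into a 16-bit value, low byte first (little-endian,
// which is how the 6502 stores addresses in memory).
constexpr Word MakeWord( Byte lo, Byte hi )
{
    return static_cast<Word>( lo | ( hi << 8 ) );
}
```

If a platform has no exact 8-bit or 16-bit type, `std::uint8_t` and `std::uint16_t` are simply not defined, so the build fails early rather than misbehaving at run time.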

ernestuz

The 6502 is one of the best processors to learn on. Nice and simple and covers most of the concepts.

GaryCameron

This is the kind of shit that keeps me up at 4AM. Thank you!

BlurryBit

In the early 80s I wrote several little programs for the 6502 in assembly language, just for fun, on my Apple II. It was always amazing how much faster this was than the same program in BASIC. The 6502 was really a simple design and easy to understand.

richardfeynman

My thesis has to do with writing an 8085 emulator and I find your videos really useful! You earned my subscription! Keep it up :)

nickfffgeo

I would hit the 'like' button a thousand times if I could.

emanoelbarreiros

You use the term "clock cycle" for what is actually a machine cycle. Early processors such as the 6502 required multiple clock cycles to execute one machine cycle. The 68HC11 for example needed 4 clock cycles for each machine cycle.

etmax

On a similar theme, back in the 1980s I wrote an assembler/disassembler pair for the Z80 microprocessor that ran on a Pyramid minicomputer. I used it to work out the full functionality of a UHF radio scanner that had a Z80 and associated IO chips as its central control. I dumped the radio's 16 KB EPROM into a file containing a long string of hex pairs, disassembled it and printed the result. Then I spent a few days looking at the printout and filling it with comments. I made my modifications, added all the comments to the disassembled program, and used my assembler to create a new hex file ready for EPROM programming. I started up the radio and all my mods were working as planned. They were fun days. I doubt I could do what I did back then with today's systems.

sbalogh

Finally someone who formats their code like a real man: opening brackets in newlines and spaces within parentheses! Upvoted just for that fact alone.

InsertNameHere

I actually wrote a 6502 emulator in C on my Atari-ST (68000 CPU) in 1987. I was quite proud of it. It used a kind of jump table for the actual instructions. I made an array of pointers to functions, and used the content of the instruction register as an offset into this array to call each Op-code function. For example at A9 was a pointer to the LDA immediate mode function. I started off writing a cross-assembler, and then wanted to test the resulting machine code and so wrote the emulator for it. Amazingly, after all these years I still have the source code!

martinstent

You might wanna take a look at the <cstdint> header. It defines portable integer types of fixed width.

laka

Thank you for making this video. But (*sigh*) please stop calling the 6502 old! Some of your audience is older than it is! :D

msthalamus

Surprisingly, it worked despite a bug in FetchWord, which does cycles++ when it should be cycles--.
Also, you should implement the instruction vs. mode table to simplify it dramatically. By masking the opcode bits, you can then use a switch statement for the addressing mode. That would reduce the combinations to a 23-instruction switch statement and 8 addressing functions.
Btw, the pc++ wrapping to 0x0000 is legal, so as long as mem is mem[16k] it's fine.
I hope this isn't taken as armchairing. The video was fun to see.
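
For reference, a corrected FetchWord along the lines described above might look like this. It is a sketch rather than the video's actual code: the `Machine` struct is hypothetical, and it assumes a full 64 KiB memory array so that every 16-bit PC value is a valid index.

```cpp
#include <array>
#include <cstdint>

using Byte = std::uint8_t;
using Word = std::uint16_t;

// Hypothetical container for memory and PC, for illustration only.
struct Machine
{
    std::array<Byte, 65536> mem{}; // assumed: one entry per 16-bit address
    Word pc = 0;

    Byte FetchByte( std::int32_t& cycles )
    {
        const Byte value = mem[pc];
        ++pc;     // a 16-bit PC wraps from 0xFFFF to 0x0000 on its own
        --cycles; // each memory read consumes one cycle
        return value;
    }

    // Reads a little-endian 16-bit value and consumes two cycles.
    Word FetchWord( std::int32_t& cycles )
    {
        const Byte lo = FetchByte( cycles );
        const Byte hi = FetchByte( cycles );
        return static_cast<Word>( lo | ( hi << 8 ) );
    }
};
```

Building FetchWord on top of FetchByte keeps the cycle bookkeeping in one place, so the increment/decrement mix-up cannot happen in only one of the two functions.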

rallymax

1 hour video. I can say I finally learned how a computer works. Thank you so much <3

eddeveloper