Assembler 6502 emulator
#define PC_LOW(stateptr)   ((stateptr)->pc & 255U)
#define PC_HIGH(stateptr)  ((stateptr)->pc / 256U)

For the different addressing modes, define macros similar to the above. The code is then not dependent on host byte order, and should be much easier to write. Since the mos_6502_t structure contains the entire state of the emulated machine, you can emulate more than one machine in a single process. Also, you only need to provide the one pointer to the state structure to your functions.

(Most embedded scripting languages have a very similar problem, Lua being the exception that comes to mind. Instead of using a separate state variable, most scripting languages just use global variables. That means you can't have more than one independent interpreter state in one process. Some have grown warts afterwards to support multiple interpreters in one process, but those tend to be fragile and prone to leakage.)

There are very few downsides to this approach. Your functions will pass the pointer to the state all over, and you may need to add fields to describe at least some of the emulator's internal state, but in my experience, those are not downsides. In fact, I think it tends to push your mindset towards modularity, and yields better organized, more readable code. (I don't know why, but whenever global variables are used a lot in C, the code also tends to become spaghetti. I suspect it is a psychological effect: the types of data structures you use also affect the patterns of code you create.)

"If using a global gives a performance advantage (which I'm not sure it does actually)"

I'm pretty sure it does not give a meaningful performance advantage. Global variables are addressed using %rip. Pointer-to-structure references use a base register, often %rdi if the pointer is specified as the first parameter to a function. Aside from using one register to hold the pointer, there is just no difference at all.

On x86, direct addressing uses the 32-bit immediate form, which requires five bytes, so that any relocation can be done correctly. The indirect form using a pointer requires only four bytes (for offsets -128..127 from the pointer). So, direct addressing produces longer code, and indirect addressing uses one register for the pointer; otherwise they use the same number of clock cycles.

Even on architectures where there is a difference, we're talking about one or two clock cycles per access. Typical Linux processors, even embedded ones, already tend to run at near or over a gigahertz, meaning the access difference we are talking about is on the order of one-thousandth of a typical MOS 6502 clock cycle. It just isn't relevant.

Furthermore, one can tell the compiler how each function accesses the pointer (whether the pointer is const, or the data is const, or neither, or both), and this tends to let the compiler generate better code. You cannot tell the compiler which functions modify which global variables, so when the compiler compiles a function that calls other functions, it cannot always tell whether a global variable might be modified, leading to sub-optimal code. Of course, if you do not sprinkle const in your function parameter definitions, this likely does not matter. But if you do, using well-marked pointers will likely lead to faster code than code that uses global variables.

"While 1-2 clock cycles IS nothing, you can't disregard a few cycles' difference in this application."

No, that's not what I meant.

"If you did take that design philosophy, apply it everywhere, you'd have a slow, crappy emulator."

I meant that compared to the other work the emulator has to do, the access delay is insignificant. You can make much bigger savings by optimizing the emulator on the algorithmic level, even if the access to the emulator innards happened to be a cycle or two slower than optimal. The majority of these bigger savings start with having a clear structure. Not for the computers' sake, but for the programmers'. Compilers can only do so much; it's up to us humans to design the structure of the emulator to be efficient and fast.
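To make the state-structure approach discussed in this thread a bit more concrete, here is a minimal sketch in C. Only the name mos_6502_t, its pc field, and the PC_LOW/PC_HIGH macros come from the thread; the other fields, the flat 64 KiB ram array, and the functions mos_6502_init, mos_6502_peek and mos_6502_fetch are assumptions made up for illustration, not anyone's actual emulator.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: the thread names only mos_6502_t and its pc field. */
typedef struct {
    uint16_t pc;          /* program counter */
    uint8_t  a, x, y;     /* accumulator and index registers */
    uint8_t  sp;          /* stack pointer (offset into page 1) */
    uint8_t  flags;       /* processor status */
    uint8_t  ram[65536];  /* assumed flat 64 KiB memory, no I/O mapping */
} mos_6502_t;

/* Byte access by arithmetic, so host byte order does not matter. */
#define PC_LOW(stateptr)   ((stateptr)->pc & 255U)
#define PC_HIGH(stateptr)  ((stateptr)->pc / 256U)

/* Hypothetical helper: clear one machine and set its start address. */
void mos_6502_init(mos_6502_t *state, uint16_t start)
{
    assert(state != NULL);
    memset(state, 0, sizeof *state);
    state->pc = start;
    state->sp = 0xFF;
}

/* Read-only access: the const tells the compiler (and the reader)
   that this function never modifies the machine state. */
uint8_t mos_6502_peek(const mos_6502_t *state, uint16_t addr)
{
    return state->ram[addr];
}

/* Fetch the byte at PC and advance PC, wrapping at 64 KiB. */
uint8_t mos_6502_fetch(mos_6502_t *state)
{
    uint8_t byte = state->ram[state->pc];
    state->pc = (uint16_t)(state->pc + 1U);
    return byte;
}
```

Because every function takes the state pointer, two mos_6502_t objects can be initialized and stepped side by side in one process without affecting each other. Note that each object is over 64 KiB in this sketch, so it is probably better to make them static or heap-allocated than to put them on the stack.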