I've ported snes9x to the Odroid GO. Although I think this is very cool, it is at a very early stage of development: it is not currently playable and might never be.
The ESP32 is, on paper, capable of running SNES at full speed. However, snes9x is written to be portable rather than highly optimized for any specific architecture, which limits how much optimization can be done short of rewriting it. In other words, my experiment will probably never run perfectly, but I will push it as far as my skills allow!
I've made some modifications and it is running a bit faster! Still not playable, unfortunately. The video and the zip have both been updated.
Super Mario World currently runs 3.5x too slow; in the previous build it was about 5x too slow.
This is a quick update to let you know that I will stop working on this project. The biggest reason is that I cannot figure out how to profile function calls on the ESP32 (or whether it is even possible). The snes9x execution flow is pretty complicated, so manually adding tracking code proved futile.
If anyone is willing to teach me how to profile function calls on the ESP32, I'd be very happy to have a second look.
But for the record, and in case someone wants to pick up where I left off, here's a list of things I've tried or considered so far:
Modify the emulation timings. I managed about a 5% speed gain before the games became too glitchy. Breaking games is not an ideal route, so I dismissed it.
Use both CPU cores. I moved some of the PPU rendering logic to the second core. While it does provide good performance, it is incredibly glitchy if not synchronized. Once synchronized, the performance gain was a lot more modest, 10% perhaps. I'm sure this is an avenue worth exploring again in the future. Sound can also run in its own thread once it is implemented. But the rest of the emulator cannot be broken down further.
Reassess the types used everywhere. I've gained a few percent by using smaller types where possible, but the code is large and it takes time to test that nothing breaks. Some structures benefited from being realigned as well.
Dynamic recompiler. I have not tried it, but the ESP32 is indeed able to load code into RAM and execute it.
Move key RAM blocks to internal RAM instead of SPIRAM. I've gained a few percent. I was hoping for more of a difference, but apparently SPIRAM and internal RAM are similar in speed..? Further experimentation on which blocks of RAM are put in internal RAM may be needed.
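To illustrate the dual-core item above, here is a minimal sketch of a per-scanline handshake between an emulation thread and a render thread. It is not the code from my port: it uses POSIX threads so it can run on a desktop, where on the ESP32 one would more likely use FreeRTOS tasks pinned to each core. All names (ppu_reg, rendered, run_frame) are hypothetical. The point is that the emulation side must wait for the previous line to be drawn before it changes PPU state, which is where most of the parallel gain goes.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define SCANLINES 224  /* visible lines, assumed for this sketch */

/* Hypothetical shared state: the emulation side writes ppu_reg,
 * the render side reads it while "drawing" a line. */
static uint8_t ppu_reg;
static uint8_t rendered[SCANLINES];

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int  pending_line = -1;  /* line waiting to be drawn, -1 if none */
static bool quit = false;

static void *render_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    for (;;) {
        /* Sleep until a line is handed over or we are asked to stop. */
        while (pending_line < 0 && !quit)
            pthread_cond_wait(&cond, &lock);
        if (pending_line < 0 && quit)
            break;
        rendered[pending_line] = ppu_reg;  /* stand-in for real rendering */
        pending_line = -1;
        pthread_cond_broadcast(&cond);     /* line consumed */
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Runs one frame and returns how many lines were drawn with the
 * register value that was current for that line (-1 on thread error). */
int run_frame(void)
{
    pthread_t tid;
    int correct = 0;

    pending_line = -1;
    quit = false;
    if (pthread_create(&tid, NULL, render_thread, NULL) != 0)
        return -1;

    for (int line = 0; line < SCANLINES; line++) {
        pthread_mutex_lock(&lock);
        /* Without this wait the renderer may see a later register
         * value, which is the kind of glitch described above. */
        while (pending_line >= 0)
            pthread_cond_wait(&cond, &lock);
        ppu_reg = (uint8_t)line;  /* game changes PPU state mid-frame */
        pending_line = line;
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&lock);
        /* CPU emulation for this line would run here, outside the
         * lock, overlapping with the renderer. */
    }

    /* Wait for the last line, then stop the render thread. */
    pthread_mutex_lock(&lock);
    while (pending_line >= 0)
        pthread_cond_wait(&cond, &lock);
    quit = true;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);
    pthread_join(tid, NULL);

    for (int i = 0; i < SCANLINES; i++)
        if (rendered[i] == (uint8_t)i)
            correct++;
    return correct;
}
```

Calling run_frame() should report every line as correct; removing the wait in the emulation loop lets the two threads race, which is a small-scale version of the glitches seen when the cores are not synchronized.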
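For the item about types and structure realignment, the sketch below shows the general idea on a made-up record. These structs are hypothetical and not taken from snes9x. Ordering fields from largest to smallest removes most of the padding the compiler inserts to keep each field naturally aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sprite record with fields in "as written" order.
 * The compiler pads after each small field so that the following
 * wider field stays aligned. */
struct sprite_loose {
    uint8_t  priority;
    uint32_t tile_addr;
    uint8_t  palette;
    uint16_t x;
    uint8_t  flags;
    uint16_t y;
};

/* Same fields, sorted from widest to narrowest. Only tail padding
 * remains. */
struct sprite_tight {
    uint32_t tile_addr;
    uint16_t x;
    uint16_t y;
    uint8_t  priority;
    uint8_t  palette;
    uint8_t  flags;
};

size_t loose_size(void) { return sizeof(struct sprite_loose); }
size_t tight_size(void) { return sizeof(struct sprite_tight); }
```

On a typical ABI with natural alignment the first layout takes 16 bytes and the second 12, so an array of such records shrinks by a quarter. Whether that translates into a measurable speedup depends on how the data is accessed, so it still needs testing case by case.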
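For the internal RAM item, one way to experiment is to route "hot" allocations through a small helper that prefers internal RAM and falls back to SPIRAM. This is a sketch under assumptions, not the code from my port: alloc_hot and free_hot are hypothetical names, and the ESP-IDF branch relies on heap_caps_malloc and heap_caps_free from esp_heap_caps.h. Off-target it falls back to plain malloc so the same file builds on a desktop.

```c
#include <stddef.h>
#include <stdlib.h>

#ifdef ESP_PLATFORM
#include "esp_heap_caps.h"
#endif

/* Allocate a buffer that is accessed very often (hypothetical helper).
 * On ESP-IDF: try internal RAM first, then fall back to SPIRAM.
 * Elsewhere: plain malloc, so the code can be built and tested on a PC. */
void *alloc_hot(size_t size)
{
#ifdef ESP_PLATFORM
    void *p = heap_caps_malloc(size, MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT);
    if (p == NULL)
        p = heap_caps_malloc(size, MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT);
    return p;
#else
    return malloc(size);
#endif
}

/* Release a buffer obtained from alloc_hot(). */
void free_hot(void *p)
{
#ifdef ESP_PLATFORM
    heap_caps_free(p);
#else
    free(p);
#endif
}
```

Switching individual buffers between alloc_hot and a SPIRAM-only allocation, one at a time, is a simple way to find out which blocks actually benefit from internal RAM.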