I decided to go ahead with the C/C++ port of avrcore.js.
https://github.com/blakewford/avrcore

I imagine I will still do occasional bug fixes / minor feature improvements to RetroMicro, but in the short term I am going to devote most of my time to building out this new component. Once the new piece is battle ready, I will reintegrate it with the RetroMicro source.
For those interested, here is my justification.
At the start of the project, I made a deliberate choice to build the simulated CPU in pure JavaScript. The bulk of my skill actually lies in C/C++, but I knew that various forms of JavaScript / native interop existed, and at the time my primary interest was extreme portability. To a large degree that goal has been accomplished: I regularly test the core, unmodified, on Chrome and Firefox across a decent set of disparate devices, as mentioned earlier in the forum. This has helped me define reasonably solid interfaces between the CPU and UI layers, which I will maintain through the transition to the new CPU build.
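For context, that boundary is small: the UI hands the core a compiled program, clocks it forward, and reads back I/O state to draw the screen. Here is a minimal sketch of that contract in C (the names are hypothetical, not the actual avrcore.js exports):

#include <stdint.h>

/* Hypothetical CPU/UI contract; the real interface lives in
   avrcore.js and differs in detail. */
typedef struct avr_core avr_core;

avr_core *core_create(void);                        /* allocate a simulated CPU   */
void      core_load(avr_core *c,
                    const uint8_t *image, int len); /* load a compiled .hex image */
void      core_step(avr_core *c, int cycles);       /* advance the simulation     */
uint8_t   core_read_port(avr_core *c, uint8_t io);  /* UI polls I/O to render     */
void      core_destroy(avr_core *c);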
Where this strategy sort of fell apart for me is performance. Historically, I started by creating an ATMega32u4 simulator which only ran code generated from avr-gcc, then I moved over to support code compiled through the Arduino IDE, and finally I added support for the ATMega328 / Gamebuino. The Arduino projects that run on my software stack demonstrate well, but the game engine for the Gamebuino taxes the real hardware in a way that my current stack just can't stand up to. Due to the history of the project and the lack of a performance-bound use case, it took me a while to find / accept this issue. For reference, most targets I run currently peak at about 1 MHz of simulated clock speed.
I have been looking into resolutions for a while now, and I cannot find any straightforward pure JavaScript solutions. Granted, I am no JavaScript expert, but I have been tracking and improving relative performance since the start of the project. For instance, I regularly run the Google Closure Compiler on my source and merge the changes. Additionally, I fix any performance-related warnings from the developer tools available in Chrome (like breaking up the large instruction switch statement into two switch statements). This has the downside of occasionally making the code less human-readable, but the performance has increased significantly over time. On rarer and riskier occasions, I look for instruction patterns in the compiled source of a .hex file, make assumptions about what that code intends to do (timers / interrupts), and then make performance-based choices not dictated to the engine by the coded instructions. I would sum all of these techniques up as hacks and tricks which I would prefer not to have to implement, but without them my code could not run fast enough to represent even the simplest Gamebuino title.
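To make the switch-splitting trick concrete: the idea is to test one bit of the opcode up front and dispatch through one of two smaller switches, instead of one enormous one that the engine handles poorly. Sketched here in C with a made-up split point (the real change was in the JavaScript decoder):

#include <stdint.h>

/* Before: one giant switch over every opcode.
   After: branch once on a high bit, then dispatch
   through one of two smaller switches. */
void execute(uint16_t opcode)
{
    if (opcode & 0x8000) {          /* made-up split point */
        switch (opcode) {
        /* ...upper half of the instruction set... */
        default: break;
        }
    } else {
        switch (opcode) {
        /* ...lower half of the instruction set... */
        default: break;
        }
    }
}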
I have been evaluating another JavaScript trick, asm.js, in an effort to keep my software stack sane. In short, this technology compiles C/C++ code into a highly optimizable subset of JavaScript, presumably with all of the best available tips and tricks baked into the final product. In order to do this evaluation, I had to rewrite a small piece of avrcore.js in C. From there I evaluated both the current implementation and the proposed one using this simple program:
#include <avr/io.h>

int main(void)
{
    // Set as output pin
    DDRB = _BV(3);
    // Write value
    PORTB = 0x1;
    return 0;
}
It takes roughly 10 instructions to run this entire program, and from it I generated a series of performance metrics. For the current software stack there is little difference between Chrome and Firefox: ~30 milliseconds on my test machines. With asm.js it is about the same on Chrome, but ~10 milliseconds on Firefox. Here's my takeaway. I am not sure how familiar the audience is with the various browser wars, but at times they can be pretty crazy. The moral of the story, for my purposes, is not to pursue anything too engine-specific: V8, Nitro, SpiderMonkey, etc. The asm.js project has some adoption in other browsers, but for my use case I am not seeing the benefits cross-platform. Also, even if all of the platforms saw a Firefox-level performance increase, that 3x gain would only put my primary test machines in the ~3 MHz range when I am aiming for 16 MHz. So where does this leave me?
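For anyone who wants to reproduce this kind of measurement, the harness does not need to be fancy. A minimal sketch, assuming a hypothetical core_run_test() standing in for running the ~10-instruction program above (this is not the actual test driver I used):

#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the rewritten core executing the
   ~10-instruction test program above. */
static void core_run_test(void)
{
    /* ...decode and execute the test image... */
}

int main(void)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    core_run_test();
    clock_gettime(CLOCK_MONOTONIC, &end);

    long us = (end.tv_sec - start.tv_sec) * 1000000L
            + (end.tv_nsec - start.tv_nsec) / 1000;
    printf("elapsed: %ld microseconds\n", us);
    return 0;
}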
I currently support three primary variants: Desktop, Android, and HTML5. This helps me fill in strategic gaps alongside the other emulators, Simbuino and gbsim. To keep my niche I would like to keep servicing all of these in about the same way that I am now: product-level Android support, good HTML5 support (my prototyping place; iOS, OSX, FFOS, Tizen, BBOS, others?), and automated-test-level support for Desktop. After running a series of tests, it is my belief that my best course of action is to do the work of building a native library for the Desktop and Android releases while deploying the asm.js solution to service the web. Here are some numbers for the test case above using the new stack:
JavaScript        ~30 milliseconds
Desktop (native)  ~140 microseconds
Android (native)  ~4 milliseconds (Snapdragon S3, circa 2011)
Not sure how these numbers will scale, but I was interested in documenting them. There is too much to gain from giving up on JS in the Desktop / Android use cases not to do it. I ran this test on an older Android device that I had rooted in order to easily run custom executables, so the performance on my Nexus 5 is likely much better.
This is why I am redoing a large subset of completed work in the CPU and taking a break from more obvious features. As a bonus, certain asm.js-ready browsers will also see performance gains, albeit less significant ones. For a period of time, I may make the Android build's CPU implementation switchable.
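One way to make the CPU implementation switchable is to route every call through a small dispatch table, so the app can pick a backend at startup. A minimal sketch under that assumption (all names hypothetical; this is not the RetroMicro source):

#include <stdint.h>
#include <stdio.h>

/* Hypothetical dispatch table; each backend fills one in. */
typedef struct {
    const char *name;
    void (*step)(int cycles);
} cpu_backend;

static void js_step(int cycles)     { (void)cycles; /* bridge into avrcore.js */ }
static void native_step(int cycles) { (void)cycles; /* new C/C++ core */ }

static const cpu_backend js_core     = { "js",     js_step     };
static const cpu_backend native_core = { "native", native_step };

static const cpu_backend *active = &native_core;

/* Flip at startup, e.g. from a debug setting in the Android app. */
void select_core(int use_native)
{
    active = use_native ? &native_core : &js_core;
}

int main(void)
{
    select_core(1);
    active->step(10);   /* run ten simulated instructions */
    printf("using %s core\n", active->name);
    return 0;
}

The nice property of this shape is that the UI layer never learns which core it is talking to, which also keeps the CPU/UI interfaces mentioned earlier intact.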
Added bonus: if the Simbuino and gbsim devs are interested in contributing to the new from-scratch CPU core, perhaps we could now all share a common source base?