A small benchmark for functional languages targeting web browsers

2 July 2022 (programming haskell idris javascript)

I had an idea for a retro-gaming project that will require a MOS 6502 emulator that runs smoothly in the browser and can be customized easily. Because I only need the most basic of functionality from the emulation (I don't need to support interrupts, timing accuracy, or even the notion of cycles), I thought I'd just quickly write one. This post is not about the actual retro-gaming project that prompted this, but instead, my experience with the performance of the generated code using various functional-for-web languages.

As I usually do in situations like this, I started with a Haskell implementation to serve as a kind of executable specification, to make sure my understanding of the details of various 6502 instructions is correct. This Haskell implementation is nothing fancy: the outside world is modelled as a class MonadIO m => MonadMachine m, and the CPU itself runs in MonadMachine m => ReaderT CPU m, using IORefs in the CPU record for registers.

The languages

Ironing out all the wrinkles took a whole day, but once it worked well enough, it was time for the next step: rewriting it in a language that can then target the browser. PureScript seemed like an obvious choice: it's used a lot in the real world so it should be mature enough, and with how simple my Haskell code is, PureScript's idiosyncracies compared to Haskell shouldn't really come into play beyond the syntax level. The one thing that annoyed me to no end was that numeric literals are not overloaded, so all Word8s in my code had to be manually fromIntegral'd; and, in an emulator of an eight-bit CPU, there's a ton of Word8 literals...

The second contender was Idris 2. I've had good experience with Idris 1 for the web when I wrote the ICFP Bingo web app, but that project was all about the DOM manipulation and no computation. I was curious what performance I can get from Idris 2's JavaScript backend.

And then I had to include Asterius, a GHC-based compiler emitting WebAssembly. Its GitHub page states it is "actively maintained by Tweag I/O", but it's actually in quite a rough shape: the documentation on how to build it is out of date, so the only way to try it is via a 20G Docker container...

Notably missing from this list is GHCJS. Unfortunately, I couldn't find an up-to-date version of it; it seems the project, or at least work on integrating with standard Haskell tools like Stack, has died off.

To compare performances, I load the same memory image into the various emulators, set the program counter to the same starting point, and run it for 4142 instructions until a certain target instruction is reached. To paper over the browser's JavaScript JIT engine etc., each test runs for 100 times first as a warm-up, then 100 times measured.

Beside the PureScript, Idris 2, and GHC/Asterius implementations, I have also added a fourth version to serve as the baseline: vanilla JavaScript. Of course, I tried to make it as close to the functional versions as possible; I hope what I wrote is close to what could reasonably be expected as the output of a compiler.

Performance results

The following numbers come from the collected implementations in this GitHub repo. The PureScript and Idris 2 versions have been improved based on ideas from the respective Discord channels. For PureScript, using the CPS-transformed version of Reader helped; and in the case of Idris 2, Stefan Höck's changes of arguments instead of ReaderT, and using PrimIO when looping over instructions, improved performance dramatically.

Implementation Generated code size (bytes) Average time of 4142 instructions (ms)
JavaScript 12,877 0.98
ReasonML/ReScript 27,252 1.77
Idris 2 60,379 6.38
Clean 225,283 39.41
PureScript 151,536 137.03
GHC/Asterius 1,448,826 346.73

So Idris 2 comes out way ahead of the pack here: unless you're willing to program in JavaScript, it's by far your best bet both for tiny deployment size and superb performance. All that remains to improve is to compile monad transformer stacks better so that the original ReaderT code works as well as the version using implicit parameters

To run the benchmark yourself, checkout the GitHub repo, run make in the top-level directory, and then use a web browser to open _build/index.html and use the JavaScript console to run await measureAll().

Update on 2022-07-08

I've added ReScript (ReasonML for the browser), which comes in as the new functional champion! I still wouldn't want to write this program in ReScript, though, because of the extra pain caused it lacks not only overloaded literals, but even type-driven operator resolution...

Also today, I have received a pull request from Camil Staps that adds a Clean implementation.

« Another year, another RetroChallenge 
All posts
 Cheap and cheerful microcode compression »