
why you should not use AudioWorklet

Note: The text below is rather technical and meant for Web developers/architects who have an interest in the respective WebAudio related infrastructure.

I have been successfully using the old (meanwhile “deprecated”) ScriptProcessorNode infrastructure since around 2012. Since 2018 there has been a “modern” replacement API which supposedly solves all the “problems” that Google’s browser developers (and their friends in the Mozilla subsidiary) had identified in the old API. The “modern” API has now been around for 5 years and one would think that any teething troubles of that “new” API should meanwhile be resolved. Therefore everybody should replace their old ScriptProcessorNode garbage with the cool “modern” stuff immediately, right?

One of the main flaws of the old ScriptProcessorNode infrastructure is that it “lives” in cohabitation with the browser’s main UI thread, i.e. it competes for the same CPU core that is used for “all” the remaining UI code. This means that if the CPU is “too slow”, then the UI code may interfere with the music playback and, vice versa, the music logic may negatively impact the responsiveness of the UI. How relevant that concern is will depend on the actual hardware of the end user (and the amount of work that the web page has to perform). I am the proud owner of a 10-year-old Intel i7-4790K desktop processor (which I used for all the measurements in the later parts of this post) but users of modern smartphones may obviously be less fortunate.

For context, here are some measurements with my respective webSID player: The computational needs of that player are somewhat proportional to the number of SID chips (see “MOS 8580”) that have to be emulated. Below is a visualization of the impact an increase from 1 to 2 and then to 3 SID chips has on the browser’s UI thread utilization (see the quota needed for “Scripting”). In the webSID example the player requires about +6% (on my machine) of the available processing capacity for each additional SID chip that is emulated.



On my machine and for the above example (which doesn’t use much UI logic) there still seems to be quite a hefty reserve of “idle” CPU time. But that picture can change if the UI also starts to use the CPU. Below is the same 3-SID emulation as above, but this time the UI uses the streamed data of the 12 SID channels (4 per SID) to draw some graphs on an HTML Canvas. It is obvious that with rising UI CPU consumption there is some point at which it would be nice to offload the music related CPU use to somewhere else, so that the UI has the complete capacity of the CPU core available to itself.



Let’s try out the AudioWorklet infrastructure and see what happens! For the sake of the experiment I am using my old ScriptProcessorNode based “V2” music player (the old version can be tested here: https://www.wothke.ch/webV2M/). You can try out yourself how well or how badly it fares on the device of your choice. (Let me know your experience/results in the comments!)

I’ve created a new AudioWorklet based version of that player that you can test here: https://www.wothke.ch/webV2M_worklet/



The page actually delivers on the “move CPU load elsewhere” goal: Below, on the left, is the old ScriptProcessorNode based implementation and, on the right, the new AudioWorklet based implementation (these are measurements from the two pages linked above, i.e. the UI needs some CPU to draw the 16 voice streams). On the hardware that I am using, the new implementation frees up 15.6% “Scripting” of the available CPU capacity for whatever else the UI might want to do with it (at the same time there seems to be a +0.7% increase of “System” CPU use – something that one might want to keep an eye on in specific use case scenarios).



So it’s “mission accomplished”, right? Let’s go for it! Or why not? Unfortunately there are some severe trade-offs… and once again history seems to be repeating itself.

You remember that incident when the Google “hotshots” (and their vassals) set out to protect us users from pages playing unwanted audio some years ago? Where they then came up with that brilliant “solution” that forced developers to rewrite their existing code to add “user clicking” facilities that must be used before any audio can ever be played (uselessly wasting endless amounts of developer time all over the world, just to make sure that the user now has to first click the “claim your free iPhone” button before the page screams “welcome to porn-emporium” from your office cubicle)? Well done, geniuses…

Working with the AudioWorklet infrastructure I often got the impression that once again it was the clueless intern that was tasked with designing the new API… 😦 Below are some additional observations I made in the brief period that I prototyped my respective V2 player.

1) The new API adds multiple levels of asynchronous behavior during the construction phase of the AudioWorkletNode. In scenarios where the new code (AudioWorkletNode) is supposed to replace existing old code (ScriptProcessorNode) in the context of existing surrounding code, this may be really annoying.

In the old ScriptProcessorNode model some async handling was already needed if you were using WebAssembly, i.e. the ScriptProcessorNode needed to use some kind of callback to report when the WebAssembly part was actually ready to be used. (If your existing code doesn’t already use that approach… then life will be even more interesting for you.)

In the new model a “processor” first has to be loaded asynchronously, and this is the 1st callback (Promise) that you’ll need to handle before you can even create the AudioWorkletNode. By default that load will report back before the WebAssembly code that it may be using is actually ready to be used. And the only way to get information about the state on the “processor” side is via asynchronous messaging (both ways). Alternatively you might use the SharedArrayBuffer construct – which comes with severe limitations of its own. All of this adds more levels of asynchronous callbacks to your code. In my use case the “processor” obviously isn’t ready to play anything before it has the actual music file that should be played, so that again has to be initialized via asynchronous messaging (two round trips before your UI side player code actually knows what the current state is). Other players may need to load chains of additional files (e.g. instruments) before they are ready to play – adding more respective roundtrips.
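To make that chain of callbacks concrete, here is a minimal sketch of the handshake. The processor name “player-processor” and the message shapes are made up for illustration; a real player would add error handling and likely more round trips:

```javascript
// Resolves once the processor side reports that it is actually ready
// (e.g. after its WebAssembly part has finished initializing).
function awaitProcessorReady(port) {
  return new Promise((resolve) => {
    port.onmessage = (e) => {
      if (e.data && e.data.type === 'ready') resolve(e.data);
    };
  });
}

async function startPlayer(context, moduleUrl, songData) {
  // async step 1: load the "processor" module; by default this resolves
  // before any WebAssembly it uses is ready
  await context.audioWorklet.addModule(moduleUrl);
  const node = new AudioWorkletNode(context, 'player-processor');
  // async step 2: send the song, then wait for the processor to report back
  const ready = awaitProcessorReady(node.port);
  node.port.postMessage({ type: 'load', song: songData });
  await ready;
  node.connect(context.destination);
  return node;
}
```

Only after both awaits has the UI side any reliable knowledge of the processor’s state.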

The added complexity obviously affects the whole software development process (from the design of the more complex solution to the debugging of potential problems).

2) To make the above “more interesting”, the designers of the API have added some “random” limitations into the mix. For example there actually are “Atomics” APIs that are designed for “concurrency” scenarios, and which allow one thread to wait for some shared memory to be updated from some other thread – which makes sense since it potentially avoids wasteful “busy waits”. But the designers blocked use of these APIs from the UI thread – since it “would potentially be bad for user experience” (e.g. if a baboon doesn’t know what he is doing while using it)… so now it is “busy waiting” for everybody instead. Or you can completely rewrite your existing code from the ground up to embrace the new model – which you probably had intended to do from the start, right? “welcome to porn-emporium”!
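As a hedged illustration of what the UI thread is left with: Atomics.wait() throws when called on the main browser thread, so code there falls back to Atomics.waitAsync() (where available) or to a plain polling loop like this sketch:

```javascript
// Polling fallback for the UI thread, where Atomics.wait() is not allowed.
// i32 is an Int32Array on a SharedArrayBuffer written to by the audio side.
function pollForValue(i32, index, expected, intervalMs = 4) {
  return new Promise((resolve) => {
    (function check() {
      if (Atomics.load(i32, index) === expected) return resolve();
      setTimeout(check, intervalMs); // "busy waiting" for everybody instead
    })();
  });
}

// Where supported, Atomics.waitAsync avoids the polling, e.g.:
// const { async, value } = Atomics.waitAsync(i32, index, currentValue);
```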

3) SharedArrayBuffer seems to be one of the nicer and more useful features in the above context. But after the turmoil surrounding the Spectre exploit a few years ago, commonly used browsers had already once temporarily blocked that feature completely in a panic. So if you invest your time to write code based on the “modern” infrastructure, be aware that this rug might be pulled from under your feet in the blink of an eye. Certain APIs apparently are deemed to be “dangerously powerful” and I am amazed to see how the geniuses at Google address these threats. In the case of SharedArrayBuffer you now have to make specific webserver configurations just to get Chrome into the mode which should actually be the default. “welcome to porn-emporium”!
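For reference, the webserver configuration in question is the pair of “cross-origin isolation” response headers that Chrome requires before it exposes SharedArrayBuffer at all (shown here as a sketch for nginx; the equivalent headers work with any server):

```nginx
# Chrome only exposes SharedArrayBuffer on "cross-origin isolated" pages,
# which requires these two response headers on the document:
add_header Cross-Origin-Opener-Policy "same-origin";
add_header Cross-Origin-Embedder-Policy "require-corp";
```

Note that COEP: require-corp also restricts which cross-origin resources the page may embed, which can break pages that load third-party assets.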

4) I found tool support for the AudioWorklet infrastructure (e.g. Chrome’s DevTools, etc.) ridiculously lacking – considering that it is a “feature” that was officially released 5 years ago. I find it quite strange that the respective flaws have apparently gone largely unnoticed – and one explanation that comes to mind is that the feature might not actually be used to the extent that its proponents want to make us believe. Examples: Chrome (v117) does not even show in the “Network” tab what “processor” files it has actually loaded, and worse, it keeps loading old cached versions of “processor” files instead of the updated versions of those files (not always) – even after shift-reload and browser restart (even though Chrome does load the correct version of the exact same file if it is directly opened via its URL). Imagine the fun that you’ll have when trying to figure out why some of your users see a different version of your code than what you have actually deployed! Even diagnosing simple syntax errors in the “processor” file becomes unnecessarily challenging when Chrome doesn’t give a proper error message and basically only tells you that it could not use the respective code due to “some problem in your ‘index’ file” (that behavior actually seems to depend on the specific type of syntax error, and a stray ‘.’ will break the debug info where a stray ‘a’ might not… LOL). “welcome to porn-emporium”!

5) Google uses AudioWorklet to push its https agenda – even though there is absolutely no rational explanation why people should not be allowed to load music playing pages by whatever means THEY choose to use. But Google always knows better what is best and will protect you from yourself if necessary! Therefore pages that use the “modern” AudioWorklet cannot play music when loaded via http. “welcome to porn-emporium”!

6) It is an obvious precondition for realtime music players that the logic producing the music must not take longer than it takes to play the produced audio data. Violations of this precondition will potentially interrupt the continuous playback and lead to clicking or stuttering noises. In the case of the old ScriptProcessorNode it is typically a 2048-sample output that is produced (the size may range from 256 to 16k), corresponding to 0.046 seconds of audio (at the typical sample rate of 44100Hz). For AudioWorklet that output is always 128 samples, i.e. the interval is typically 16x shorter (and may be up to 128x shorter as compared to the ScriptProcessorNode implementation). This means that in the ScriptProcessorNode case, the actual processing needs can have temporary peaks (e.g. GC kicking in, or waveforms that are more expensive to calculate than others) without hurting the playback – as long as the average processing time fits into the available longer time window. AudioWorklet, on the other hand, is much more sensitive to variations – which must average out within each of the much shorter 128-sample windows. If respective processing time variations are an issue in your existing audio generation logic then you might need yet additional workarounds (e.g. a separate Worker to produce larger buffers in the background) when migrating the code to AudioWorklet.
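The difference in the per-callback time budget is easy to quantify (a small sketch, assuming the 44100 Hz sample rate used above):

```javascript
const SAMPLE_RATE = 44100;

// time budget (in ms) that the audio callback has to fill one buffer
const budgetMs = (frames) => (frames / SAMPLE_RATE) * 1000;

// ScriptProcessorNode with a typical 2048-frame buffer:
console.log(budgetMs(2048).toFixed(1)); // "46.4" ms per callback
// AudioWorklet's fixed 128-frame render quantum:
console.log(budgetMs(128).toFixed(1));  // "2.9" ms per callback
```

A GC pause of a few milliseconds is invisible in a 46 ms window but can already blow the 2.9 ms budget.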

7) I don’t have a smartphone to use my pages on, so I did the “next best” thing to at least get an indication of what to expect, by running Chrome DevTools’ “Lighthouse” analyzer. (Based on past experience, respective Lighthouse reports have to be taken with a large pile of salt.) But I was interested to see how the two versions of my V2M player supposedly score for Desktop as compared to Mobile devices in the categories “Performance” and “Best practices”. The 3 year old Chrome version that I used first reported 100/100 for both versions and all scenarios, except for the old version (ScriptProcessorNode) on Mobile, which scored “only” 98/100.
I then repeated the same test on a recent version of Chrome (v117). Here the old version scored 99/100 on Desktop and 76/100 on Mobile, while the new (AudioWorklet) version scored 97/100 on Desktop and 76/100 on Mobile (i.e. performance supposedly worse on Desktop than the old version…). So these test results suggest to me that for this page the end user experience will be pretty much identical for both versions of the page.
I did reach a point where the new version actually performed better than the old version when I simulated a “6x CPU slowdown” via Chrome’s DevTools. However the prospect of having code still work on older/slower devices may be deceiving: Users of respective old devices may often choose not to update their OS and/or browser to the latest versions, and other features that you might want to use on your pages (e.g. ES6 classes, WEBGL2, etc.) would prevent them from even using your page anyway.

8) From an audio visualization perspective, AudioWorklet’s 128-sample output is an advantage. You no longer need the hacks required in the ScriptProcessorNode scenario in order to show the currently played data with a precision that exceeds the playback length of the used audio buffer.
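On the processor side this can be as simple as posting a copy of each 128-sample quantum to the UI. A sketch (the processor name and message shape are again made up, and a real implementation would probably batch or throttle the messages):

```javascript
// Runs inside the AudioWorkletGlobalScope; the fallback base class only
// exists so the file also parses/runs outside that scope (e.g. in tests).
const Base = globalThis.AudioWorkletProcessor ??
  class { constructor() { this.port = { postMessage() {} }; } };

class VizProcessor extends Base {
  process(inputs, outputs) {
    const input = inputs[0], output = outputs[0];
    for (let ch = 0; ch < output.length; ch++) {
      if (input[ch]) output[ch].set(input[ch]); // pass the audio through
    }
    // 128-sample precision for the UI visualization:
    if (input[0]) this.port.postMessage({ type: 'quantum', samples: input[0].slice(0) });
    return true; // keep the processor alive
  }
}

if (typeof registerProcessor === 'function') {
  registerProcessor('viz-processor', VizProcessor);
}
```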

9) Be careful not to prematurely jump to conclusions with regard to some audio playback issue that you might be having. At some point I found that one of my pages produced ugly clicks in Chrome during its audio playback for no good reason: When analyzing the CPU load in the DevTools, the respective overall load was well below 50%. But it got really interesting when I then tried to “Record” a “Performance” trace using the DevTools. As soon as the DevTools were in “Record” mode, the same song that glitched in “normal” mode suddenly played flawlessly (and it started glitching again as soon as I stopped recording). When the browser implementation is garbage you might end up getting the short end of the stick regardless of which particular feature you are using. “welcome to porn-emporium”!


So to return to the clickbait title… should you use AudioWorklet (as a replacement for existing legacy ScriptProcessorNode implementations, or in general)?

I’d say only if you have verified that you actually need its potential benefits. The extra price you pay for using the “modern” API is quite high (in terms of extra software development, testing, deployment and maintenance cost, the unnecessary lack of http support, extra technological risks, cross-browser issues) and often the “old” API is totally good enough and much easier to use. Only if AudioWorklet solves a problem that you actually have should you go for it.

“Google’s” approach of “deprecating” the ScriptProcessorNode and pushing the use of AudioWorklet resembles an imaginary “handymen’s guild” deciding to “deprecate” the use of nails and push the use of screws… “welcome to porn-emporium”!

playing with shadertoy

Shadertoy is a nice little project that lets users program simple WEBGL fragment shaders which are then directly published as pages on their site. The subset of WEBGL features available for use on respective pages severely limits what can and cannot be programmed on shadertoy.com. But as the name implies, the site is meant as a playground that allows you to quickly play with fragment shader code – without having to deal with the necessary glue code (vertex shaders, etc.).

When publishing their fragment shaders, users also publish their source code, something that other users can then tinker with (credits for derived work can usually be found in the comments, or code may directly be marked as “forked” from somewhere else).

I sometimes play on the site myself and below is a selection of the shaders that I have done over these last few years (if your device offers WEBGL2 support, you might want to directly look at the respective pages on shadertoy.com):

2019

twisted thingy: https://www.shadertoy.com/view/3dXSDB

isolines & capsules: https://www.shadertoy.com/view/wsSXzh

isolines: https://www.shadertoy.com/view/3sSSRR

2020

kleinian skulls: https://www.shadertoy.com/view/wtsyRB

RGB glitch: https://www.shadertoy.com/view/WlXcDr

deep-sea mandelbox: https://www.shadertoy.com/view/wllczj

fractional part patterns: https://www.shadertoy.com/view/wtXczf

colored moire: https://www.shadertoy.com/view/wtXczl

2021

warrior lissajous: https://www.shadertoy.com/view/NllGDn


2022

Fork of speakers: https://www.shadertoy.com/view/DtlGW8

2023

neon lights tunnel: https://www.shadertoy.com/view/cddSRM

Tequilla Rainbow: https://www.shadertoy.com/view/ddy3DD

audio reactive Discoteq: https://www.shadertoy.com/view/mlfBDX

metaballs v0.2: https://www.shadertoy.com/view/cldBzN

an old player revisited

Ten years after my first sc68 web port I thought it might be time to update the code base to the latest sc68 dev version. This new version supposedly has improved SNDH playback capabilities and I used the occasion to play with WEBGL and also add a visualization of the emulation’s internal voice streams. A live demo can be found here: http://www.wothke.ch/webSC68


PS: I updated PlayMOD to now also use this version.

m68k reminiscences

I had done the original Web port of UADE back in 2014 without bothering much about its existing implementation. Though I had owned an Amiga back in the nineties, I had never actually written much software on that machine. And though I had done some m68k assembly programming on the AtariST, I cannot say that I still remembered much about the respective Motorola 680x0 CPUs.

But it bothered me that UADE’s emulation still doesn’t support certain Eagleplayers – after all these years (i.e. it is incapable of playing certain Amiga music files). So I decided that it might be a fun exercise to fix at least some of those flaws myself… “If you want something done right, do it yourself”… right? 😉

The linked page shows my resulting webuade+ version: As compared to the original, this version has an added “audio.device” implementation, added multi-tasking support and added support for shared library loading (in addition to various small fixes). It supports (at least) these additional players: Andrew Parton, Ashley Hogg, PlayAY (except ZXAYEMUL), Digital Sound Creations, FaceTheMusic, G&T Game Systems, Janne Salmijarvi Optimizer, Kim Christensen, Mosh Packer, Music-X Driver, Nick Pelling Packer, TimeTracker Titanics Packer, UFO and some Custom modules.

While this little “project” was a fun “code reviewing” exercise (trying to make sense of UADE’s original m68k ASM and C code based emulator implementation), with some reverse engineering (disassembling portions of the Amiga’s Kickstart OS) thrown in, it was also a stark reminder of what it meant to program back in the day…

software archaeology/puzzle..

This was the first time that I tried my luck at reverse engineering a Windows *.exe with the goal of porting the respective functionality to the Web. As a matter of fact it was a rather pointless (who needs another old music player in 2022?) but fun undertaking that gave me a pretext to play with some new tools and refresh my “code reviewing” skills. The hobby project had one clear win condition: The result would either work perfectly or the project would end in a humiliating defeat…

To come to the point: the project was a success that can be tried out here: www.wothke.ch/webIXS

But let’s take a step back, shall we? “IXSPlayer Version 1.20” was originally created by the no longer existing “Shortcut Software Development BV” about 20 years ago.

The music player belongs to the “Impulse Tracker” family, but what sets it apart is the way in which it generates the used audio sample data. At the time it must have looked like a promising idea to save the limited Internet bandwidth by using the smallest music files possible… and via compression and audio synthesis this player uses ridiculously small song files that are only a few thousand bytes long (see the status information in the player widget). For comparison: mp3 files are typically several million bytes (megabytes) long.

As we now know, Internet bandwidth was about to evolve quite dramatically and only a few years later people could not care less how many megabytes some silly TikTok or Youtube video might be wasting. So ultimately the idea of the micro music-files did not get much traction.

Music files for this player can be found on the Internet (see modland.com). In the modland.com collection respective files are listed under “Ixalance”. I am therefore also using that name here.
The only *.ixs format player that Shortcut Software ever seems to have released to the public is a small Win32 demo-executable:

This demo player obviously only works on Windows. It only plays one song at a time (in an endless loop). And there is a flaw in its “generated cache-file naming” which may cause songs to load the wrong files and then not play correctly.

I had gotten in touch with the original developers to check if they might provide me with the source code of their player (so that I could adapt it for use on the Web – like I had already done for various other music formats in my playMOD hobby project). But unfortunately those program sources seem to have been lost over the years. The above Windows *.exe was indeed the only thing left. So that is what I used as a base for my reverse engineering.

Greetings go to Rogier, Maarten, Jurjen and Patrick, who had created the original player at “Shortcut Software Development BV”. Thank you for letting me use this reverse engineered version of your code.

Non-software engineers can safely stop reading here 🙂 All others might find useful information for their future reverse engineering projects below.

Stage 1

The original developers had told me that the player had been an “ImpulseTracker” based design and that it had been written mostly in C++. This sounded promising, since C/C++ can be cross-compiled quite easily to JavaScript/WebAssembly using Emscripten. I therefore set out to find some decompiler that might allow me to transform the x86 machine code from the *.exe back into its “original” C++ form. To make it short: If you want to do this kind of thing professionally you should probably buy IDAPro – or as a hobbyist like me you can try your luck with Ghidra and the free demo of IDAPro as a supplement (other tools like boomerang, cutter, exe2c, retdec, etc. seem to be a waste of time).

What to expect?

A tool like Ghidra has a sound understanding of the different calling conventions used by typical C/C++ compilers (__fastcall, __stdcall, __thiscall). Based on the stack manipulations performed in the machine code, this allows it to correctly identify the signatures of the used C/C++ functions 95% of the time (Ghidra struggles when FPU related functions are involved, but that can be fixed by manual intervention, i.e. by overriding the automatically detected function signatures). Obviously respective tools also know most of the instruction set of the used CPU/FPU, which in this case allowed Ghidra to translate most of the x86 gibberish back into a more human readable C form:

Without any knowledge about the used data structures, that code is still quite far from the C code that this will eventually turn out to be:

The decompiled code will often be a low-level representation of what the machine code does – rather than what the original C abstraction might have been, e.g. though technically correct the below code:

in the original C program would probably rather have read:

Similarly the “1:1” mapping of the optimized machine code:

must still be manually transformed to its “original” form:

Most importantly, in order to get meaningful decompiler output it is indispensable to find out what the used data structures are.

But before diving into that jigsaw puzzle it makes sense to narrow down the workspace: In my case I knew that I was looking for IXS music player logic – while most of the executable was probably made up of uninteresting/standard VisualStudio/MFC code (some of which Ghidra was able to identify automatically). String constants compiled into the code then allowed me to identify/sort out additional 3rd party APIs. (The relative position of stuff within an executable is useful to get an idea of what belongs together.)

After a tedious sorting/tagging process, the result was a set of functions that *probably* belong to the IXS library that I am looking for – and that I could now export as a (still incomplete) “C-program” at the click of a button (since Ghidra does not seem to be suitable for an iterative process there was no point in doing that just yet).

Time to identify data structures

I had been tempted to presume that virtual function tables should be one aspect of C++ code that can be easily identified in the machine code – but it seems I was mistaken. Even the additional “OOAnalyzer” tool – which seemed promising at first – mostly discovered useless CMenu (etc.) classes, but none of the stuff that I was looking for (its extremely slow running “expert system” approach seems to be incapable of reliably searching the memory for arrays that point to existing functions… something that I had to do manually as a fallback).

At this point IDAPro’s debugger also comes in handy: When dealing with virtual function calls, a simple breakpoint quickly eliminates any doubt regarding where that call might be going (this is quite crucial when dealing with Ghidra’s flawed stack calculations whenever virtual function calls are involved).

The fact that I knew that the code was probably “Impulse Tracker based” obviously helped: When seeing an “alloc” of 557 bytes, the chance of it not being an “ITInstrument” is just very slim (luckily there is a specification of the respective “Impulse Tracker” file format). Memory allocation/initialization code per se is a good place to look for the data structures used in a program.

At this point you may not know what the variables mean, but you already know what types they are – and offsets used to access data start to make sense.

Once the modelling of the data structures and the “list of the interesting functions” is reasonably complete, it is time to switch gears and enter the next development stage (it will be inevitable to come back to Ghidra from time to time to clarify open issues, and it helps if variable/function names preserve some kind of postfix that allows matching them to the original decompiler output during the later development stages). But not before I mention some Ghidra pitfalls:

Ghidra pitfalls

Most of the time Ghidra works pretty well and I would not have been able to do this project without it. But there are instances where the decompiler fails miserably (I am no Ghidra expert and there might be “power user” workarounds that I am just not aware of):

In some instances you can’t get around looking at the filthy x86 opcodes one by one (something I had hoped to avoid) to figure out manually what some piece of code should actually be doing:

The above shows the original x86 code on the left and Ghidra’s decompiler output on the right. The code seems to use LOGE2 and LOG2E constants and “f2xm1” and “fscale” operations – which are known to be used in “C math pow()” implementations. But what seems to be plausible code at first glance is just total garbage – since Ghidra completely ignores the “fyl2x” operation, which is actually quite important here.

Another weak point are arrays that are used as local variables in some function. Ghidra may turn what was originally a 100-byte array into a local “int” variable and then happily use those 4 bytes as an “arraybuffer”, poking array accesses into random memory locations. (As a workaround it helps to manually override the function’s stackframe definition. Even though this has problems of its own, like Ghidra introducing additional shadow vars for some of the data that is already explicitly defined in the stackframe.)

In general, Ghidra’s calculations with regard to pointers into a function’s stack frame (i.e. its “local vars”) leave a lot to be desired. (Let’s say function A, among other local variables, has an array and wants to pass a pointer to that array to a function B that it is calling: the array address calculated by Ghidra is often just wrong. Again IDAPro’s debugger comes in as a life saver to figure out what those offsets really should be. It seems safe to presume that IDAPro is the far superior tool in this regard. But I guess you get what you pay for…)

Ghidra seems to be out of its depth when stuff is “simultaneously” processed on the CPU and on the FPU – the decompiled code may then perform operations out of order, which obviously leads to unusable results.

Similarly, Ghidra seems to completely ignore the calling conventions declared in virtual function tables. Consequently all its stack position calculations may be completely incorrect after a virtual function call.

Finally Ghidra’s logic seems to go totally bananas when a function allocates aligned stack memory via __alloca_probe.


Stage 2

The “C program” exported in stage 1 is now ready to become an actually functioning program. At this point I obviously want to make as few changes as possible to the respective code, in order not to add additional bugs to the problems that undoubtedly are already present in the exported code. Also there isn’t any point in starting to clean up yet, since there is a high risk that additional “program exports” may still be needed, which would then require time consuming code merging.

So the first goal is to get the original multi-threaded MM-driver based player to work – like in the original player, just without the UI. That thing was originally built using Microsaft’s C compiler and libs? Then that’s exactly what I’ll be using. And this tactic actually worked well: still slightly flawed at first, but I got the exported code to actually produce audio for the first time.

Since I am aiming for a single-threaded Web environment, the multi-threading and Microsaft specific APIs have to go next: Standard POSIX APIs are available on Linux and they will be available in the final Emscripten environment as well. Therefore a simple single-threaded Linux command line player that just writes the audio to a file is the next logical step.

The code now works fine on Windows as well as in the Linux version. I am confident that the exported code has sufficiently stabilized and it is time for a cleanup (until now everything was still in one big *.h and one big *.c file – as exported by Ghidra). This is the moment where you want to have an IDE with decent refactoring support. Since I am an old fan of IntelliJ, I decided to try the trial version of CLion for the occasion. And though I found the “smart” auto-completion features of their editor rather annoying, the refactoring worked well (and the UI crap can be turned off somewhere in the settings).

Stage 3

But will it also work on the Web? Obviously it won’t! Intel processors are very lenient with regard to their memory alignment requirements, i.e. an Intel processor does not care what memory address it reads a 4-byte “int” from. And this is the platform that the exported program had originally been designed for. The processors of many other manufacturers are more restrictive and require a respective address to be “aligned”, i.e. the address of a 4-byte “int” must then be divisible by 4. The Emscripten environment that I am targeting here shares this requirement. All relevant memory access operations must consequently be cleaned up accordingly – and once that cleanup is done, the code actually runs correctly in a Web browser.
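The alignment constraint can even be demonstrated from JavaScript itself (a small sketch: typed-array views enforce the same rule that the Emscripten-compiled code has to respect, while a DataView permits the kind of unaligned access that the original x86 code got away with):

```javascript
const buf = new ArrayBuffer(8);

// aligned: byteOffset 4 is a multiple of Int32Array.BYTES_PER_ELEMENT (4)
const aligned = new Int32Array(buf, 4, 1);

// unaligned: byteOffset 2 is rejected outright with a RangeError
let threw = false;
try { new Int32Array(buf, 2, 1); } catch (e) { threw = e instanceof RangeError; }

// a DataView still performs the unaligned access byte by byte:
const dv = new DataView(buf);
dv.setInt32(2, 0x12345678, true); // little-endian, like x86
const v = dv.getInt32(2, true);
```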

Feeding the IXS player output to something that actually plays audio in a browser (see WebAudio) requires additional JavaScript glue code: I already have my respective infrastructure from earlier hobby projects and this part is therefore largely copy/paste, which I will not elaborate on here.

But one extra bit of work is still needed: The IXS player generates the data for its instruments whenever a song is first loaded. That operation is somewhat slow/expensive, and blocking a browser tab for several seconds just isn’t polite. One solution that “modern” browsers propose is the Worker API, which allows stuff to be done asynchronously in a separate background thread. This means that the original program must be split into two parts that run completely independently and only talk to each other via asynchronous messaging. Finally there is the browser’s “DB feature” (IndexedDB), which allows the results of the expensive calculation to be cached persistently, so that it doesn’t even need to be repeated the next time the same song is played. So that’s what I do: the Worker asynchronously fills the DB with the respective data if necessary, and the player pulls the data from the DB and triggers the Worker when needed. Bingo! (All that remains to be done now is for the Google Chrome clowns to fix their browser and make sure that it isn’t their DB that blocks the bloody browser.. it just isn’t polite!)
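The protocol behind that split is simple enough to sketch. All names below are made up for illustration, a Map stands in for the IndexedDB object store, and a plain async function stands in for the Worker round-trip – this only shows the caching logic, not the actual player code:

```javascript
// Sketch of the player/Worker/DB protocol: ask the cache first; on a miss,
// delegate the expensive instrument generation to the "worker", store the
// result so the next load of the same song is instant.
const db = new Map(); // stand-in for an IndexedDB object store
let workerCalls = 0;  // counts how often the expensive path actually runs

async function workerComputeInstruments(songId) {
  workerCalls++; // the slow instrument generation would happen here
  return `instruments-for-${songId}`;
}

async function loadInstruments(songId) {
  if (db.has(songId)) return db.get(songId);            // cache hit
  const data = await workerComputeInstruments(songId);  // cache miss
  db.set(songId, data);                                 // persist for next time
  return data;
}
```

In the real setup the two halves run in separate threads and only exchange messages, but the control flow is the same: compute once, cache, and serve every later request from the DB.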

digital waveguide based audio synthesis of a piano


https://www.wothke.ch/webPiano/

After my past retro “SID chip” audio synthesis experiments I thought it might be interesting to try out what more modern audio synthesis approaches have to offer. It isn’t a Steinway grand piano yet, but unlike a Steinway it can be turned into a bell tower with the turn of a knob 🙂

The implementation is based on Balázs Bank’s thesis, “Physics-Based Sound Synthesis of the Piano” (see http://home.mit.bme.hu/~bank/thesis/pianomod.pdf), and the various papers that are cross-referenced in that document. I highly recommend reading Bank’s thesis since it gives a much broader overview of the subject matter than the more specialized research papers usually do.
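Bank’s piano model is far more elaborate, but the underlying digital-waveguide idea is often introduced via the classic Karplus-Strong plucked string: a delay line whose output is fed back through a lowpass filter. Below is a minimal sketch of that textbook algorithm – it is NOT the webPiano implementation, just the basic waveguide principle:

```javascript
// Minimal Karplus-Strong "string": a delay line (one period long) fed back
// through a two-point average (a crude lowpass), so the tone decays
// naturally. This is the textbook idea, not Bank's full piano model.
function pluckString(sampleRate, frequency, seconds) {
  const N = Math.round(sampleRate / frequency); // delay length ~ one period
  const delay = new Float32Array(N);
  for (let i = 0; i < N; i++) delay[i] = Math.random() * 2 - 1; // noise burst

  const out = new Float32Array(Math.round(sampleRate * seconds));
  let pos = 0;
  for (let n = 0; n < out.length; n++) {
    const next = (pos + 1) % N;
    out[n] = delay[pos];
    // feedback: average two adjacent samples, slightly attenuated
    delay[pos] = 0.996 * 0.5 * (delay[pos] + delay[next]);
    pos = next;
  }
  return out;
}
```

Feeding the returned samples into a WebAudio buffer already yields a recognizably “plucked” tone; the piano-specific work (hammers, coupled strings, soundboard) is what Bank’s thesis adds on top.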

I have to admit that my math proficiency is somewhat rusty and I seem to have forgotten much of what I had once learned more than 20 years ago. In addition, much of the audio signal processing theory is simply new to me. Consequently some of the terminology used in the respective papers was totally alien to me (some of it still is) and it sometimes felt like reading a Chinese text automatically translated by Google. I recommend Julius O. Smith’s page here (the “PHYSICAL AUDIO SIGNAL PROCESSING” section in particular), which provides a ton of background information useful in this context: https://ccrma.stanford.edu/~jos/Welcome.html

My “webPiano” page should work in any browser that supports WebAudio. However the UI was done for desktop computers and the layout will probably not work well on small smartphone displays.

Joomla in hindsight

It’s been some years since I played around a bit with Joomla (at the time, version 2.5 was “the thing”). Since it was only kind of a learning experience and the resulting pages were of a hobby project nature, I did not spend any money on extensions but used what was available at the time in some “free to use” version. Still I ended up using about half a dozen 3rd party “extensions” (basic “booking” functionality, etc.), some of which I had to customize significantly to make them cover my requirements. But the result did what I wanted, and it probably took less time than if I had programmed everything from scratch myself.

One “doesn’t look a gift horse in the mouth”.. but some of the “Joomla/extensions” code was obviously subpar: their authors apparently had no idea how to properly override methods in sub-classes, nor did they know the difference between static and non-static methods, etc. .. but I guess you get what you pay for.

I cannot say that I was thrilled by Joomla’s approach to composing pages either.. writing some “article” and putting it into some “menu” structure is easy enough. But having to define separate “modules” elsewhere (e.g. for the JavaScript files that a specific page might need) and then attaching those via tedious/slow admin-GUI “checkbox clicking” to some “menu item” or some placeholder from the site’s template, etc.. Alas, whenever I came back after some months to make just some minor adjustment, it always took excessive amounts of time just to remember how those things were actually connected (since much of the stuff lives in the DB, it doesn’t help to do a quick text search on the file system to look for something).

Green banana software

Joomla is actively “improved” and there is what looks like a continuous stream of new releases. A look into the respective security centre (https://developer.joomla.org/security-centre.html) shows that these releases are not always an improvement: some severe security flaws were actually absent in older versions and then introduced in some “cool new update” – like the “Severity: High” bugs CVE-2019-10946 (affecting versions 3.2.0 through 3.9.4), CVE-2019-9713 (affecting versions 3.8.0 through 3.9.3), etc.

Other software flaws actually go unnoticed for years before they are eventually spotted by the Joomla developers, e.g. the “Severity: High” CVE-2017-9933 affects 1.7.3 – 3.7.2. (I can almost hear the 2016 sales pitch of the Joomla acolytes: “what, you are still using 2.5.9? you must upgrade to 3.5 immediately! older versions are such a security risk!”.. haha, very funny..)

The depressing thing though is that even for the officially supported/current Joomla versions there are no separate security patches. Instead the official advice is always to update to the next version – which supposedly fixes the problem. This means that you cannot get just the 3 files that fix a specific bug; instead you may get a 10 MB zip file that introduces a ton of other changes at the same time (the only exception seems to be the EOL fixes here: https://docs.joomla.org/Security_hotfixes_for_Joomla_EOL_versions/de).

Joomla’s versioning and updating policy is weird.. or rather disturbing. Some kind of automated updating support is available in the admin GUI – except that it is “somewhat limited”: My first Joomla instance had been using 2.5.4 and the second one 2.5.9. Interestingly, the admin GUI tells me that 2.5.4 should be upgraded to 2.5.5, while my 2.5.9 instance happily tells me that no automatic upgrades are possible. So even if I had ever wanted to update to the last 2.5 version (which I think would have been 2.5.29), even what should be a minor sub-version update seems to be too risky to perform automatically.. seriously?

Regarding security

As indicated above, the software quality of the Joomla core is not that great and there are tons of more or less severe issues that “pop up” in the various versions. I’d say it is prudent not to expect much with regard to Joomla security, and to select potential projects accordingly.

From the beginning it is probably a good idea to restrict access as far as possible, e.g. by activating the web server’s basic HTTP authentication for the “/administrator” GUI functionalities.
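For an Apache setup with .htaccess overrides enabled, such an extra authentication layer could look like the following (file paths and the realm name are placeholders; the password file is created with the `htpasswd` tool):

```apacheconf
# /administrator/.htaccess – an extra HTTP Basic auth layer in front of
# Joomla's own admin login (paths and names here are placeholders)
AuthType Basic
AuthName "Restricted area"
AuthUserFile /path/to/.htpasswd
Require valid-user
```

This does not fix any Joomla flaw, but it keeps random visitors (and most automated scanners) away from the admin entry point entirely.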

A Joomla instance isn’t suitable for a “never touch a running system” approach, and most sites will probably be trapped in the “update to the very latest version” hamster wheel, thereby volunteering as beta testers for whatever green bananas Joomla wants to field test. (You had better not use any 3rd party “extension modules” unless you are absolutely confident that the respective provider will still be there tomorrow to provide an updated version for the next Joomla release – or else you’ll end up rewriting those portions of your site.)

Personally I chose the different approach of just back-merging the code changes of the “Severity: High” Joomla fixes into my old code base, thus avoiding having to find replacements for the long-gone “extensions” that are only available for my old version. (This is of course an absolute no-go and I am most certainly a risk for the Internet and maybe for world peace as well…)

Green banana software meets planned obsolescence

It never fails to amaze me how PHP could ever grow such a large following: The poor design decisions taken in early “versions” are so obvious that even newer versions (thankfully) start to reverse them (see “backward incompatible changes”).

But hey, everybody has the right to design a crappy programming language and then learn from their mistakes. The problem with this crappy language is that it comes with an expiration date: “Each release branch of PHP is fully supported for two years from its initial stable release”.

Like a light bulb that wants to be replaced after 1000h of use. Only here it works even better.. no need for anything to be broken, let’s just replace it every 24 months. Add some “backward incompatible changes” and you have a printing machine for money/extra work.
So many wasted opportunities.. just imagine “ANSI C is end of life and all the old programs must be ported to Java 8 by the end of the month!”.. splendid, why did nobody think of that one earlier?

So it happened that my Web hoster informed me that he “will no longer be hosting PHP5 by the end of the month and would I please migrate everything to PHP7”. But of course! I had no plans for the weekend anyway, f*** you very much!

Obviously Joomla 2.5 could not know about PHP7 yet and the Joomla support doesn’t want anybody to use those old very dangerous legacy versions anyway. (Support in the Joomla universe means: getting help when migration to the new version went south.)

Spoiler: In spite of the “Joomla support” propaganda, old Joomla 2.5 (with the manually added security patches) can be “easily” ported to PHP7:

  1. Search for “->$” to find the indirect variable usage pattern: change “$a->$c[$b]” to “$a->{$c[$b]}” in order to preserve the original semantics in PHP7.
  2. Search for “$key = key($this->_observers)”, which no longer works since foreach loops no longer update the internal array pointer (increment “$key” within the foreach loops instead).
  3. Replace preg_replace calls that use the removed /e modifier with respective preg_replace_callback based implementations.

After this the Joomla instance will start again, and you can go after the deprecation warnings (etc.) if you want to clean up properly.

I do not recommend using an old version – or any Joomla version for that matter (you saw the Joomla security issue tracker)! But if you are desperate..

PlayMOD online chiptune music player

My voyage into the realm of legacy computer music had originally started with my webSID music player and later continued with my extended webUADE version of the uade Amiga music emulator.

I still have fond childhood memories of my respective C64 and Amiga home computers, since these devices ultimately triggered my career in software engineering. Whereas most of the capabilities of those 40-year-old home computers obviously look quite lacking from today’s perspective, their audio features have aged rather gracefully, and I feel that the audio side is much better suited to preserving the nostalgia – e.g. as compared to the blocky pixel graphics or the underwhelming computing power.

I later learned that even though the above devices were obviously the best that ever existed (cough), other people share similar nostalgia with regard to other devices. In many cases emulators for those devices already existed on some platform, and all that was missing were ports so that they could be used on a Web page. Since this is basically the same thing that I had already done for my webSID and webUADE players, I started to also port some of the existing 3rd party emulators (many of which I somewhat enhanced in the process).

Over the years the number of respective JavaScript/WebAssembly based music emulators in my toolbox has grown to around 30 and it was time to put them to good use: PlayMOD combines all of “my” Web emulators within one UI to provide “all-in-one” online browsing and music playback for some of the largest “legacy computer music” collections available on the Internet:

The modland.com collection contains about 455’000 music files from various legacy home computers and game consoles, and the vgmrips.net collection adds another 62’000 primarily arcade system songs. The PlayMOD project derives its name from “module files” (MOD) music – which signifies computer music that has been created using some kind of tracker software (the term “tracker” goes back to Karsten Obarski’s Ultimate SoundTracker on the Commodore Amiga from 1987). However, in addition to actual MOD-files the used collections also provide a large number of other music formats, e.g. much of the older stuff would usually be referred to as “chiptune music” today. You may use the Basics tab directly on the PlayMOD page for more background information.

When looking for a MOD or chiptune player, PlayMOD provides the best coverage available due to its combined use of different emulators/player engines. PlayMOD is probably the only comprehensive cross-platform online player in existence today.

There are hundreds of different legacy music file formats involved, and the emulators available in PlayMOD currently allow playing more than 99.9% of what is available in the two collections. This avoids having to manually find and install a suitable player for each exotic format (which can otherwise be a tedious task – a player may not even exist for the platform of your choice, e.g. see Ixalance).

The PlayMOD web page allows to browse the folder/files available in the respective collections but it does not host any of the music files. In order to play a song, the user’s browser will directly retrieve the selected file from the respective collections (see links above) and then play it. Consequently the page will only be able to play the music while the ‘modland’ and ‘vgmrips’ servers are available.

The respective modland and vgmrips collections document the evolution of computer music during the past 40+ years. Having everything consolidated in one place allows to easily compare the capabilities of respective legacy sound systems (e.g. by comparing how the compositions of the same composer sounded on different platforms) or to just indulge in reminiscences.

[screenshot: the PlayMOD user interface]

The PlayMOD user interface is based on the design/codebase originally created by JCH for his DeepSID. I wasn’t keen on creating a UI from scratch, so I am glad that I could reuse JCH’s already existing stuff – even though it had to be heavily modified. The PlayMOD UI is still in a prototype/proof-of-concept stage, and the quality of the used meta data (e.g. composer information) leaves a lot to be desired, due to it having been automatically generated from raw data of questionable quality.

Obviously, legacy computer music could also be preserved by just creating recordings from the original hardware, and as can be seen on YouTube, many people already use that approach. Indeed the emulator approach will not always be as accurate as using the original HW (on the other hand, recordings may suffer from lossy data compression – a problem that an emulation does not have). Compared to the original music files, recordings may use much more storage and consequently network bandwidth, but today that isn’t the issue that it might have been 10 years ago. However, emulation avoids the additional recording/publishing step, and new music files can immediately be listened to – without having to wait for somebody with the suitable infrastructure to provide such a recording. (There actually are still “scenes” where people create new music for legacy systems today.)

From a “legacy computer music preservation” perspective the emulation approach also has the benefit that it not only preserves the end result but also the steps taken to achieve it. It allows for interactions that would not be possible with a simple recording. (Mileage may vary depending on the “original” music file format.)

Example: The “Scope” tab in the below screenshot shows the output of the different channels that some “Farbrausch V2” song internally uses to create its stereo output, i.e. an emulation approach allows to look at the “magic” that is happening behind the scenes.

[screenshot: the “Scope” tab showing the individual channels of a “Farbrausch V2” song]

Similarly a respective emulation could still be tweaked during playback, e.g. by turning certain features on/off, or by using different chip models.

Playing with WebAssembly

I recently noticed that ’emscripten’ can meanwhile also generate WebAssembly output. WebAssembly is touted (see e.g. https://hacks.mozilla.org/2017/03/why-webassembly-is-faster-than-asm-js/) to use less space and to run more efficiently (i.e. faster) than previously existing web technologies.. sounds great!

With my portfolio of various ’emscripten’ compiled chiptune music players, this seemed like the perfect opportunity to just give it a try. (If you want to try this yourself, make sure to really get the latest ’emscripten’ version! Also be warned that the new ‘clang’ version that ’emscripten’ now uses is more strict with regard to existing C/C++ standards, and you may need to fix some of your old bugs in the process..)

Due to the fact that web browsers load the *.wasm WebAssembly files asynchronously, existing bootstrapping logic may need to be reworked (you’ll need to wait for the loaded ’emscripten’ module’s notification that it is actually ready – don’t even think about using the SINGLE_FILE hack, it won’t work in Chrome!).
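One way to deal with that asynchronous readiness is to wrap Emscripten’s onRuntimeInitialized callback in a Promise that the rest of the bootstrapping code can simply await. A sketch (onRuntimeInitialized is the callback Emscripten actually invokes; the surrounding function and variable names are made up):

```javascript
// Wrap the Emscripten "ready" notification in a Promise. The returned
// moduleConfig object is what you hand to the generated Emscripten code;
// `ready` resolves once the *.wasm file has been loaded and instantiated.
function makeModuleConfig() {
  let markReady;
  const ready = new Promise((resolve) => { markReady = resolve; });
  const moduleConfig = {
    onRuntimeInitialized() { markReady(moduleConfig); }, // called by Emscripten
  };
  return { moduleConfig, ready };
}
```

The player’s startup code then does `await ready` before calling any exported emulator function, instead of assuming (as the old synchronous asm.js builds could) that everything is available immediately after the script tag.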

In the case of my chiptune players, migration fortunately wasn’t a big deal (my player was already prepared to deal with asynchronously loaded stuff) and soon I had the first *.wasm results. From a size perspective, those output files already were good news: In their old asm.js incarnations some of my emulators are rather bulky, and in total the size of the nine emulators originally summed up to more than 11 MB. The better optimizer in the new ’emscripten’ already managed to bring those asm.js versions down to about 10 MB – but with *.wasm that now shrinks to 5 MB. Nice!

I then went about measuring the performance of the different versions (I tested using Chrome 64 and Firefox 57 on a somewhat older 64-bit Win10 machine). I used my all-in-one “Chiptune Blaster” page as a test-bed (see https://www.wothke.ch/blaster/ and https://www.wothke.ch/blasterWASM/). I patched the music player “driver” to measure the time actually spent within the various emulators while they are generating music sample output. I started measuring only after each emulator had already returned some sample data (i.e. its program code had already been used) and then measured the CPU-time that it took to generate 10 seconds worth of sample data (i.e. the numbers in the below table are “CPU ms / sec of music output data” – smaller is better):
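The gist of that patched measurement can be sketched as follows (a stand-in `generate` callback replaces the actual emulator, and the names are made up – this is not the real driver code):

```javascript
// Measure "CPU ms per second of generated audio": accumulate only the time
// spent inside the emulator's generate call, then normalize by the amount
// of audio actually produced.
function measureMsPerSecond(generate, sampleRate, totalSeconds) {
  const block = 1024;                      // samples generated per call
  const buffer = new Float32Array(block);
  let produced = 0, busyMs = 0;
  const target = sampleRate * totalSeconds;
  while (produced < target) {
    const t0 = performance.now();
    generate(buffer);                      // emulator fills the buffer here
    busyMs += performance.now() - t0;
    produced += block;
  }
  return busyMs / (produced / sampleRate); // CPU ms per 1s of audio output
}
```

Timing only the generate calls (rather than wall-clock time for the whole playback) keeps the browser’s scheduling, UI work and audio buffering out of the numbers.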

[table: measured CPU ms per second of music output, per emulator, for the asm.js and WASM versions in each browser]

I repeated my measurements multiple times (6x), and even though the results were – for the most part – reproducible, they fluctuated considerably (e.g. +/-10%). Any single-digit percentage difference is therefore to be taken with a pinch of salt. In Chrome there were even some massive hiccups (maybe some background garbage collection? see the “(*) worst times” in parentheses). The above table shows the “best” result that I ever observed for the respective scenarios.

Interestingly, with regard to the “better performance” claim, the results are not really conclusive yet. There are some findings though:

  • Chrome users may typically experience a massive performance improvement from WASM.
  • Firefox’s asm.js implementation already performs much better than Chrome’s. For Chrome users, WASM is actually only the 2nd best choice – for most scenarios the performance benefit of switching to Firefox is even bigger.
  • For Firefox users the situation is more complicated. It really depends on the specific program: some may run massively faster, but some may actually run slower than their asm.js equivalent!

PS: I had only briefly looked at Edge, but its asm.js performance is slightly worse than Chrome’s, and its WASM is almost 2x slower than Chrome’s.

An important thing that I have not mentioned yet is startup time: WebAssembly is designed to be parsed more easily than the respective JS code, and the asynchronous loading may then also speed things up (in case your browser really puts those multiple CPU cores to good use..).

And indeed this is where Chrome (and even Edge) actually shines: the old asm.js version of my page takes about 3 seconds to load/display locally in Chrome (4 seconds in Edge) on my PC. The new WASM version takes barely more than 1 second (also in Edge)! Firefox somewhat disappoints here: it also improves on the 4 seconds for the old asm.js page, but the new WASM version still takes 2 seconds to load/display (Chrome/WASM may not be too bad after all).

  • So WebAssembly may not always improve execution speed, but combined with the greatly improved startup time it is really nice!

269 Life

In addition to showing some cool ray-marching based realtime graphics, my latest web page is dedicated to “269 Life” and the matching title is meant to attract some extra attention to that meaningful movement (see http://www.269life.com).

The realtime WebGL web page can be found here: https://www.wothke.ch/269life/. You’ll need some sort of 3D graphics accelerator and a Chrome browser to use it. Or you can have a look at a YouTube recording here.

I will not repeat the information that can already be found in the comments of the YouTube video here. Instead I’ve added some background information regarding the techniques used in the page.


All the fractal graphics are created using Knighty’s “pseudo kleinian” algorithm (see the example code in “Fragmentarium”) as a base, parameterized with various “distance estimate” functions. An “orbit trap” based implementation is used to color the result. Depending on the specific “scene”, a number of reflections are calculated (up to three). The “Blinn-Phong” shading model is finally used in combination with a standard “ambient occlusion” implementation to render the fractal background (basically the same implementations that I had previously used in “modum panem”).
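The pseudo-kleinian distance estimator itself is too involved for a short example, but the sphere-tracing loop that drives all of these scenes can be sketched. Here a trivial unit-sphere distance estimator stands in for the fractal; the marching loop is the same either way (function names are illustrative):

```javascript
// Sphere tracing: step along the ray by the distance estimate (DE) until we
// are close enough to the surface. The DE guarantees the step never
// overshoots, which is what makes ray-marching fractals feasible.
function sphereDE(p) {
  return Math.hypot(p[0], p[1], p[2]) - 1; // signed distance to a unit sphere
}

function rayMarch(origin, dir, de, maxSteps = 128, epsilon = 1e-4) {
  let t = 0;
  for (let i = 0; i < maxSteps; i++) {
    const p = [origin[0] + t * dir[0],
               origin[1] + t * dir[1],
               origin[2] + t * dir[2]];
    const d = de(p);
    if (d < epsilon) return t; // hit: return distance along the ray
    t += d;                    // safe step: DE bounds the distance to surface
    if (t > 100) break;        // ray escaped the scene
  }
  return -1;                   // miss
}
```

In the actual page this loop runs per pixel in a fragment shader, with the pseudo-kleinian DE plugged in where `sphereDE` sits here, and the orbit-trap values collected along the way are used for coloring.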

Three different approaches are used to display text elements:

  1. “Flat” texts are created using a font texture that is displayed via simple triangles (two per character).
  2. Texts like the title or the ones in the “greetings” section are based on extruded 3D fonts (see the standard THREE.js examples).
  3. Finally there are the “particle based” texts that explode in the “greetings” section – these are created using regular canvas text rendering.
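Approach 1 essentially boils down to mapping each character code to one cell of a font-atlas texture and deriving the quad’s UV coordinates from it. A sketch assuming a hypothetical 16x16 grid of ASCII glyphs (not the actual atlas layout used on the page):

```javascript
// Map a character code to the UV rectangle of its cell in a font-atlas
// texture laid out as a cols x rows grid. The two triangles of the
// character's quad then sample this rectangle.
function glyphUV(charCode, cols = 16, rows = 16) {
  const col = charCode % cols;
  const row = Math.floor(charCode / cols);
  const w = 1 / cols, h = 1 / rows;
  // top-left corner of the cell plus the cell size, in [0,1] texture space
  return { u: col * w, v: row * h, w, h };
}
```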

A “bokeh” post-processing pass is applied to the resulting page to create a “depth of field” effect (the respective implementation is derived from Dave Hoskins’ work). The “bokeh” post-processing is also used to create some interesting distortion effects on the overlaid title text (which does not use the same z-buffer).
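The core idea of any such depth-of-field pass is that a pixel’s blur radius (its “circle of confusion”) grows with its distance from the focal plane. A sketch with illustrative constants – this is the general formulation, not the page’s actual shader parameters:

```javascript
// Per-pixel circle-of-confusion radius for a depth-of-field pass: zero at
// the focal plane, growing with depth difference, clamped to a maximum so
// far-away pixels don't blur without bound.
function circleOfConfusion(depth, focusDepth, aperture, maxRadius) {
  const coc = aperture * Math.abs(depth - focusDepth) / Math.max(depth, 1e-6);
  return Math.min(coc, maxRadius);
}
```

In the shader this radius then controls how widely the bokeh kernel samples around each pixel; sampling the mismatched title-text overlay with the background’s depth values is exactly what produces the distortion effect mentioned above.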


Finally, the “greetings” scene showcases the combination of “standard” THREE.js elements (particles, extruded texts, etc.) with the shader-generated fractal background: by having the fractal shader propagate its z-buffer information, the regular THREE.js overlays are clipped correctly (thanks to Marius for the respective depth calculation – see boxplorer2).

The “neon signs” are created via a post-processing pass that adds a “glow” as well as a “god rays” effect. A simple random-noise based shader is used to create the purple “northern lights” on the horizon, and 20’000 confetti particles provide for some action.

Thanks again to Wolf Budgenhagen and LMan for letting me use their music.
