A tale of two clocks – Scheduling Web Audio with precision (web.dev)
40 points by lioeters on July 3, 2022 | 22 comments


This really needs a (2013) in the title (or at least a "pre-Spectre/Meltdown").

The "up-and-coming" (back in 2013) high-resolution timer performance.now() has existed in all browsers for a long time now, but is clamped to millisecond precision in some browsers because of the Spectre/Meltdown mitigations. The Web Audio timer is also affected by this precision reduction, but I don't know what the current situation is across browsers (I was actually hoping that the article would talk about this).
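You can probe the effective resolution yourself; a minimal sketch that works in a browser console or Node (where performance is also global):

```javascript
// Probe the effective resolution of performance.now(): spin until the
// returned value changes, repeat, and take the smallest observed step.
function timerResolutionMs(samples = 50) {
  let minStep = Infinity;
  for (let i = 0; i < samples; i++) {
    const t0 = performance.now();
    let t1 = performance.now();
    while (t1 === t0) t1 = performance.now(); // wait for the clock to tick
    minStep = Math.min(minStep, t1 - t0);
  }
  return minStep;
}

console.log(`observed resolution: ~${timerResolutionMs()} ms`);
```

On a clamped browser you'd see steps of 1 ms (or coarser with anti-fingerprinting on); an unclamped environment reports microsecond-range steps.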

Anti-fingerprinting settings can also heavily reduce timer precision.

In short: timing on the web platform is (currently) an incredible mess.


I believe all the major browser vendors now support the notion of a "cross-origin isolated" context, where you need to use TLS plus the Cross-Origin-Opener-Policy: same-origin and Cross-Origin-Embedder-Policy: require-corp headers, and maybe a couple more requirements. You then regain access to high-precision timers, as well as SharedArrayBuffer and everything else that got lost during the Spectre/Meltdown mitigations. It appears to work in my limited testing.

Requiring these headers makes it harder to support various random ad networks, but if you're deploying your app via the web and don't need to monetize it with ads, it's a way forward.


It also doesn't work on popular web hosts that don't let you change the web server configuration (like GitHub Pages).

(there is a workaround however by injecting the required response headers on the client side, but who knows how long that's going to work: http://stefnotch.github.io/web/COOP%20and%20COEP%20Service%2...)


Add to that energy-saving features that throttle or suspend background tabs, and even a one-millisecond timeout can fire arbitrarily late:

    setTimeout(() => { console.log('timers'); }, 1);

    ...

    timers

see: https://developer.chrome.com/blog/timer-throttling-in-chrome...


I made a library based on this article which I use as a starting point for all my music app projects. It's useful as the main timing code for things like drum machines and sequencers: https://github.com/errozero/beatstepper


Interesting!

The way I understand it, Web Audio API only lets one schedule audio source nodes with start and stop methods. As I needed to schedule something that was not audio related, I ended up creating silent oscillators of almost zero length and relying on the 'ended' event.

But running a scheduled callback is better! It's not immediately clear to me how you do it. Can you maybe explain your code a little?


Hey, sure I'll try... if you mean usage of the library:

It runs your callback every 16th note, slightly before it is actually due to play. This timing can vary by a few milliseconds each time, but that doesn't matter: the actual AudioContext start time for the note is passed in to the callback, and you use that to schedule the events for that 16th note, e.g. osc.start(time).

You can schedule 32nd notes etc. too by using the stepLength property that is also passed in; time + (stepLength / 2) would be a 32nd note.

Hope that makes sense? I do need to write a better description on the github page of what it actually is.

The inner workings of the library itself are mostly just as described in the article with a few tweaks.
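The lookahead pattern from the article can be sketched independently of the library. Here the clock is injectable so the logic runs outside a browser; in the real thing, now() would be () => audioCtx.currentTime and scheduleNote would call something like osc.start(noteTime):

```javascript
// Lookahead scheduler: a coarse, jittery timer loop wakes up often
// enough, and everything falling inside the lookahead window is
// scheduled on the sample-accurate audio clock.
function makeStepScheduler({ bpm, now, scheduleNote, lookahead = 0.1 }) {
  const stepLength = 60 / bpm / 4; // seconds per 16th note
  let nextNoteTime = now();        // audio-clock time of the next step
  let step = 0;
  return function tick() {
    // Schedule every step that falls within the lookahead window.
    while (nextNoteTime < now() + lookahead) {
      scheduleNote(step % 16, nextNoteTime); // e.g. osc.start(nextNoteTime)
      nextNoteTime += stepLength;
      step++;
    }
  };
}

// Simulated run with a fake clock instead of audioCtx.currentTime:
let fakeTime = 0;
const scheduled = [];
const tick = makeStepScheduler({
  bpm: 120,
  now: () => fakeTime,
  scheduleNote: (step, t) => scheduled.push([step, t]),
});
tick();          // schedules the first 0.1 s worth of steps
fakeTime = 0.2;
tick();          // catches up cleanly even if the timer fired late
console.log(scheduled); // [[0, 0], [1, 0.125], [2, 0.25]]
```

The point is that the setTimeout jitter only affects *when* scheduling happens, never *what time* the notes are scheduled for.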


> This clock is exposed on the AudioContext object through its .currentTime property, as a floating-point number of seconds since the AudioContext was created. [...] Since there are around 15 decimal digits of precision in a “double”, even if the audio clock has been running for days, it should still have plenty of bits left over to point to a specific sample even at a high sample rate.

It is mindboggling to me that a floating-point representation was designed into this. The required precision is well known and won't change. It is also well understood that this precision is required "locally" at possibly large future values. Even having the value wrap is mostly not a problem. Floating point is the opposite of designing for these things… (and no other audio API I'm aware of does this.)

(Okay, after doing some quick math, a double precision float should be OK for more than 100 years. The mismatch still irks me…)
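The quick math, for the record (assuming a 48 kHz sample rate): the gap between adjacent doubles near a value t is roughly t * 2^-52, which stays well below one sample period even after a century of uptime:

```javascript
// Gap between adjacent doubles ("ulp") near t is about t * Number.EPSILON,
// where Number.EPSILON === 2 ** -52.
const ulpNear = (t) => t * Number.EPSILON;

const samplePeriod = 1 / 48000;           // ~20.8 microseconds per sample
const century = 100 * 365.25 * 24 * 3600; // ~3.16e9 seconds

console.log(ulpNear(century));  // ~7e-7 s: still dozens of times finer
console.log(samplePeriod);      // than a single sample at 48 kHz
```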


JavaScript doesn't have integer types (BigInt aside), only a 'number' type which is equivalent to a 'double', so even if the clock used an integer representation internally, it would need to be exposed as a floating-point value in the Web Audio API.


Just because you have lots of digits available doesn't mean you actually have that much precision or accuracy.


"The Web Audio API exposes access to the audio subsystem’s hardware clock."

I can't speak for Windows and Mac but in ALSA this isn't possible.


> in ALSA this isn't possible.

The ALSA documentation seems to disagree with you?

https://www.alsa-project.org/alsa-doc/alsa-lib/group___timer...

(disclaimer: I have not used this API and don't know if it is missing some functionality — is it?)


If it wasn’t clear from the article, the web is not ready for any kind of low-latency interactive audio yet.


The article is from 2013 and very outdated; in the meantime, Audio Worklets have arrived, which improve the situation somewhat:

https://developer.mozilla.org/en-US/docs/Web/API/AudioWorkle...

...doesn't change the fact that WebAudio is an incredibly overengineered contraption of course.


For anyone else curious about browser support for the AudioWorklet API: https://caniuse.com/mdn-api_audioworklet


This is somewhat insane. WHY is no one looking at how protocols explicitly invented to solve this problem work?

PTP (the Precision Time Protocol), albeit implemented at the wrong layer to be directly useful here, solves it by timestamping frames.


It's not clear to me that any of the challenges the article addresses could be improved by the use of a clock synchronization protocol, such as PTP - it is not within the gift of application software to modify the audio sample rate clock, nor the display refresh rate.

Beyond what the article discusses, I can imagine long-running audio applications where it might be useful to build an estimate of the rate at which the system time drifts relative to the sample rate clock. Even so, we're still in the realm of the application software having to take these slightly divergent timing sources as given.
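A drift estimate of that kind can be as simple as a least-squares fit of audio-clock readings against system-clock readings. Sketched here with synthetic data; in a browser the pairs would come from sampling audioCtx.currentTime alongside performance.now():

```javascript
// Estimate the rate of the audio clock relative to the system clock by
// fitting a line to (systemTime, audioTime) pairs; slope 1.0 = no drift.
function clockRate(pairs) {
  const n = pairs.length;
  const mx = pairs.reduce((s, [x]) => s + x, 0) / n;
  const my = pairs.reduce((s, [, y]) => s + y, 0) / n;
  let num = 0, den = 0;
  for (const [x, y] of pairs) {
    num += (x - mx) * (y - my);
    den += (x - mx) ** 2;
  }
  return num / den; // audio seconds elapsed per system second
}

// Synthetic readings where the audio clock runs 50 ppm fast:
const readings = [0, 1, 2, 3, 4].map((t) => [t, t * 1.00005]);
console.log(clockRate(readings)); // ~1.00005
```

Even with such an estimate, as noted above, the application can only compensate in software; it still takes the hardware clocks as given.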


Or just have a way for a chipset to expose its clock, instead of all the methods people use to guess via buffer fullness.


Isn’t timestamping frames (events) essentially what osc.start/stop (in this example) is doing, by allowing you to pass a timestamp in the future for an event which will then be internally scheduled on the audio thread?

The issue really is that all your code is running on the UI thread with JavaScript, so you can’t accurately time stuff, whereas in C++ you’d write this stuff to execute on the audio thread callback. That said, this is now possible with AudioWorklets - but I don’t think you could interact in a sample accurate way with browser-supplied WebAudio nodes from there, so ultimately you still end up needing to schedule them for a future time if accuracy matters.


This would be way overengineered, all that's needed is a local monotonic time source with a good enough precision of somewhere between 1 to 10 microseconds. There's no need to synchronize the time with other sources, unless the goal is to control peripheral devices (which is out of scope for WebAudio anyway).

The other problem of (traditional) WebAudio is that all work needs to happen on the browser thread which runs in time slices at a very low frequency (usually 60Hz or the display refresh rate).

There's now a solution called audio worklets, but those aren't exactly trivial to work with unless all work can happen in the worklet thread (for instance you need to write your own ring buffer to push commands or sample data from the browser to the audio thread).
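The ring buffer mentioned is typically a single-producer/single-consumer queue over a SharedArrayBuffer, with Atomics guarding the indices. A minimal sketch of the data structure (not the AudioWorklet API itself):

```javascript
// Lock-free SPSC ring buffer over SharedArrayBuffers: the main thread
// pushes samples or commands, the audio worklet thread pops them.
class RingBuffer {
  constructor(capacity) {
    this.data = new Float32Array(new SharedArrayBuffer(capacity * 4));
    this.idx = new Int32Array(new SharedArrayBuffer(8)); // [read, write]
    this.capacity = capacity;
  }
  push(value) { // producer side (main/browser thread)
    const w = Atomics.load(this.idx, 1);
    const next = (w + 1) % this.capacity;
    if (next === Atomics.load(this.idx, 0)) return false; // buffer full
    this.data[w] = value;
    Atomics.store(this.idx, 1, next);
    return true;
  }
  pop() { // consumer side (audio thread)
    const r = Atomics.load(this.idx, 0);
    if (r === Atomics.load(this.idx, 1)) return null; // buffer empty
    const value = this.data[r];
    Atomics.store(this.idx, 0, (r + 1) % this.capacity);
    return value;
  }
}

const rb = new RingBuffer(4);
rb.push(0.5);
rb.push(-0.25);
console.log(rb.pop(), rb.pop(), rb.pop()); // 0.5 -0.25 null
```

Note that SharedArrayBuffer is exactly what requires the cross-origin isolation headers discussed elsewhere in this thread.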


> There's no need to synchronize the time with other sources,

The problem isn't synchronicity, it's syntonicity, albeit only to a limited degree. What you need to know is: how much is your local audio sample clock off from the source audio clock, and from your (distinct) video output clock.

But microseconds precision is indeed perfectly fine, ideally with some way to get simultaneous timestamps from the different clocks at least for the local clocks.

I don't see PTP playing into this either; if anything, the associated hardware timestamping features might be a "nice to have to make it perfect", but no one I know wires the audio clock into the PTP timestamping unit.


No consumer chipsets that I know of allow slewing the clock to match an external clock source, so PTP is a non-starter. And I've not seen many pro sound cards with clock-slewing support either.



