the author literally picked random projects from github tagged as matrix, without considering their prevalence or whether they are actually maintained etc.
if you actually look at % of impacted clients, it’s tiny.
meanwhile, it is very unclear that any side-channel attack on a libolm-based client is practical over the network (which is why we didn’t fix this years ago). After all, the limited primitives have been commented on in the readme and in https://github.com/matrix-org/olm/issues/3 since day 1.
It being unclear that they're exploitable doesn't justify the fact that the reference implementation had known weaknesses that were documented only in the issue tracker and in a readme buried several directories deep, referenced from nowhere else! It also doesn't justify failing to proactively inform any of the clients using olm that these issues had been identified; now anyone who does consider this inside their code's threat model has to rush to port to the new implementation after public disclosure.
Edit: but also, if you're going to argue that it's unclear these issues are remotely exploitable, it'd be helpful to discuss why. Are there external constraints that rate-limit them such that the number of packets required is infeasible in any length of time? Is all the affected key material ephemeral and rotated before it's realistic to collect a large enough sample? Or just vibes?
> the author literally picked random projects from github tagged as matrix, without considering their prevalence or whether they are actually maintained etc.
I was very clear in my methodology: I grabbed everything tagged with that GitHub topic, and filtered out projects that were archived or marked as "old".
> meanwhile, it is very unclear that any side-channel attack on a libolm-based client is practical over the network (which is why we didn’t fix this years ago).
This is not an attitude that inspires confidence.
It's one thing to accidentally ship cryptography code with side-channels. It's another entirely to knowingly do so, and not fix it sooner.
> if you actually look at % of impacted clients, it’s tiny.
it's pretty much any client that has E2EE and is not Element. in my earlier quick look at Alpine Linux repos this included: Fluffychat, Nheko, gomuks, NeoChat, Chatty, weechat-matrix. then i already know that still didn't catch at least Cinny, which also is in Alpine, but includes libolm as wasm.
it's literally all "Featured clients" listed on Matrix.org except Element and Element X.
i'll put it another way. if anything other than New Vector is "tiny" and doesn't matter, is Matrix a "rich ecosystem"?
I usually keep myself out of HN discussions, but some clients' issues are listed at https://github.com/NixOS/nixpkgs/pull/334638 from Nix's side. That list includes Cinny, Fluffychat, and nheko, which are the 3 non-Element "Featured clients" listed on Matrix.org.
In general, can timing attacks on a vulnerable implementation be prevented if something else guarantees the operation takes constant time?
For example, if I add a sleep after a non-constant-time operation to ensure it always takes 1s, would that prevent timing attacks? At least with this you can use a faster algorithm, and let the OS do useful work with the rest of the time.
Theoretically? Sure. By holding your response to a fixed deadline, you're still using a variable-time operation under the hood, but you turn it into a fixed-time operation, being careful not to leak that timing information to an external observer. So, no problem.
Practically? It's a whole lot harder. Having done exactly this in a non-cryptography related situation (sleeping a timer while trying to get time bins as close as possible to a fixed size for binning events without special hardware), you run into all manner of interesting sources of variability. Using hardware is often far simpler if it's available.
Back in a cryptography context, it is exponentially harder yet because you now have to worry about what other side channels are exposing your timing information. For example, if you're "letting the OS do useful work with the rest of the time" - what is it doing? If it's off doing anything else that I can observe, then I can time that instead. And we're back to square one.
Eliminating side channel leaks is entirely non-trivial.
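To make that concrete (this example is mine, not from the thread): the canonical instance of a timing leak is early-exit equality comparison, and the standard fix is a constant-time compare such as Python's `hmac.compare_digest` rather than bolted-on sleeps:

```python
import hmac

def naive_equal(a: bytes, b: bytes) -> bool:
    # Leaks timing: returns as soon as the first mismatching byte
    # is found, so an attacker who can time responses can recover
    # a secret (e.g. a MAC) one byte at a time.
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True

def ct_equal(a: bytes, b: bytes) -> bool:
    # compare_digest examines every byte regardless of where the
    # first difference occurs, so the duration doesn't depend on
    # the position of the mismatch.
    return hmac.compare_digest(a, b)
```

The point is that the leak is removed at the algorithm level, not papered over with delays.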
> If it's off doing anything else that I can observe, then I can time that instead. And we're back to square one.
Wouldn't these kinds of leaks always be present though? The scheduler can swap your process out at any time, regardless of whether it explicitly calls sleep() or not, so even with a constant time algorithm you'll have some variability.
Unless I misunderstood (which is entirely possible, not a crypto expert) that article is answering a different question: whether adding random delays can thwart timing attacks. The GP was asking about implementing a constant-time delay.
Ok, let's assume you know how long an operation is supposed to take, but want to stop it from leaking if it returns faster.
To accomplish this, you would read a high-resolution timer before and after the operation, subtract the two, and sleep for the remainder. And maybe that might work.
But you're not running in kernel space.
The resolution of timing information you'd get from this tactic would be noisy, and may reintroduce the signal you were trying to hide. Especially if you're running on multiple threads or physical processors, and aren't careful enough to constrain the operations and timings to the same core.
All it will do is slow your software down and maybe not actually help.
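To put rough numbers on that noise, here's an illustrative (not authoritative) measurement of how far `time.sleep` overshoots its requested duration from user space; the only guarantee you get is sleeping *at least* that long:

```python
import time

def worst_sleep_overshoot(requested: float, trials: int = 20) -> float:
    # Measure the worst-case gap between how long we asked to sleep
    # and how long the sleep actually took.
    worst = 0.0
    for _ in range(trials):
        start = time.monotonic()
        time.sleep(requested)
        elapsed = time.monotonic() - start
        worst = max(worst, elapsed - requested)
    return worst

# On a loaded desktop OS the overshoot can reach hundreds of
# microseconds to milliseconds - the same order of magnitude as
# the timing signal you were trying to hide.
```

That jitter is exactly the "noise that may reintroduce the signal" described above.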
start = gettime()
variable_time_crypto_operation()
end = gettime()
sleep(1sec - (end - start)) // Use a loop if necessary here
If the operation takes a couple ms and the sleep was 1sec, then how much information would realistically leak here? Sure the sleep might not be perfect, but I'd imagine the actual timing difference would get lost just in how much longer the sleep is.
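A slightly more careful version of that sketch, in Python with a placeholder for the crypto operation: sleep toward an absolute deadline in a loop (since `sleep` can wake early on a signal or return late), rather than sleeping once for a computed remainder:

```python
import time

PAD_TO = 1.0  # seconds; must exceed the operation's worst-case runtime

def variable_time_crypto_operation() -> bytes:
    # Placeholder for the real, variable-time operation.
    return b"ciphertext"

def padded(op, pad_to: float = PAD_TO):
    # Fix the deadline *before* running the operation, then wait
    # until that absolute deadline has passed, looping because a
    # single sleep() may under- or overshoot.
    deadline = time.monotonic() + pad_to
    result = op()
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return result
        time.sleep(remaining)
```

Even so, this only hides the duration from an observer timing the response; it does nothing about the co-residency and scheduling side channels discussed above.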
Somewhat tangential, but there are much better options if you're looking for opportunities for optimization. This is literally trying to improve efficiency by skimping on safety features, like trying to save on vehicle weight by removing unnecessary seatbelts or crumple zones. Eliminating side channels convincingly is very difficult; you're just better off taking the tiny performance hit and virtually* eliminating that vector instead of trying to come up with a novel low-density seatbelt.
(I say virtually, because even constant time crypto isn't bulletproof - GoFetch, a recent Apple M-series CPU vulnerability inadvertently broke the "constant" part because of a quirk of the prefetcher. Side channels are hard, no need to make it harder.)
so theoretically using timers and adding sleeps could solve the problem but practically there are better ways to achieve the goal of no timing information leakage?
This doesn't bode well for it as an option for secure messaging. Wouldn't be a problem if I didn't see people suggesting it as one.