Thanks for open-sourcing this! Roughly, what's the performance overhead from running code under hermit? I'm wondering if this could be used for doing benchmarking with less variance on non-deterministic platforms such as the JVM (I assume hermit is "deterministic enough" that the JIT and GC threads of the JVM will run the same code on every execution?)
Alas, the realtime performance overhead is not great yet. Hermit currently uses ptrace, which often results in a multiple-X slowdown (though at least it doesn't "subscribe" to every syscall like strace does, because some syscalls are naturally deterministic). Reverie is designed to support swappable backends, and the ptrace backend is just the reference implementation. The `experimental/reverie-sabre` directory in the Reverie repo contains our high-performance backend, but it's still a work in progress. It uses binary instrumentation, and in our early experiments it is more than 10X faster than our current backend in the ptrace backend's worst case (i.e. strace is >10X faster when rewritten with reverie-sabre and run on a program that does nothing but syscalls).
But the second part of your question, about deterministic benchmarking, is really a separate question. Hermit defines a deterministic notion of virtual time, which is based on the branches retired and system calls executed by all threads. When you run hermit with `--summary`, it reports a total "Elapsed virtual global time", which is completely deterministic:
$ hermit run --summary /bin/date
...
Elapsed virtual global (cpu) time: 5_039_700ns
Therefore, any program that runs under hermit can get this deterministic notion of performance. We figured that could be useful for setting performance regression tests with very small regression margins (<1%), which you can't do on normal noisy systems. Compilers are one place I've worked where we wanted smaller performance regression alarms (for generated code) than we could achieve in practice. We haven't actually explored this application yet, though. There's a whole small field of people studying performance modeling and prediction, and if one wanted to try this deterministic benchmarking approach, they might want to take some of that knowledge and build a more accurate performance model (one better correlated with wall time) that is more realistic than Hermit's current virtual time.
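To make the regression-test idea concrete, here is a minimal Python sketch of what such a check might look like. It is not part of hermit: the function names `parse_virtual_time_ns` and `is_regression` are hypothetical, the line format is assumed from the single sample output above, and the 1% margin is just the figure mentioned in the comment.

```python
import re

# Matches the summary line shown above, e.g.
#   "Elapsed virtual global (cpu) time: 5_039_700ns"
# The exact format is an assumption based on that one sample line.
_VIRTUAL_TIME_RE = re.compile(r"Elapsed virtual global \(cpu\) time:\s*([\d_]+)ns")


def parse_virtual_time_ns(summary_output: str) -> int:
    """Extract the virtual time in nanoseconds from captured summary output."""
    match = _VIRTUAL_TIME_RE.search(summary_output)
    if match is None:
        raise ValueError("no virtual time line found in summary output")
    # Drop the underscore digit separators before converting.
    return int(match.group(1).replace("_", ""))


def is_regression(baseline_ns: int, current_ns: int, margin: float = 0.01) -> bool:
    """Return True if current_ns exceeds baseline_ns by more than `margin`."""
    return current_ns > baseline_ns * (1.0 + margin)


# Hypothetical usage with the sample output from above as the baseline.
sample = "...\nElapsed virtual global (cpu) time: 5_039_700ns\n"
baseline = parse_virtual_time_ns(sample)
print(is_regression(baseline, 5_060_000))  # within the 1% margin
print(is_regression(baseline, 5_100_000))  # beyond the 1% margin
```

Because the virtual time is deterministic, a check like this would not need repeated runs or statistical smoothing; whether the virtual time tracks wall time well enough to be meaningful is the open question raised above.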
Well, the starting datetime at the beginning of execution in the container is whatever you set it to:
$ hermit run --epoch=2022-01-01T00:00:00Z /bin/date
Fri Dec 31 16:00:00 PST 2021
We, somewhat eccentrically, put it in last millennium by default. It used to default to the original Unix epoch (1970-01-01T00:00:00Z, which shows up as 12/31/1969 in US time zones), but that was causing some software to be very unhappy ;-).
The reproducibility guarantee is that the behavior of the program is a deterministic function of its initial configuration. The epoch setting is one aspect of that initial configuration (as are file system inputs, RNG seeds, etc).
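The `date` output above can look off by a day at first glance, but it is the same instant rendered in the host's local time zone. A small Python sketch of that conversion, assuming a fixed UTC-8 offset as a stand-in for Pacific Standard Time (the helper name `render_like_date` is made up for illustration):

```python
from datetime import datetime, timedelta, timezone

# The epoch passed on the command line above, in UTC.
epoch_utc = datetime(2022, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

# Fixed UTC-8 offset standing in for the host's Pacific Standard Time zone
# (an assumption; a real system would consult its tz database).
pst = timezone(timedelta(hours=-8), "PST")


def render_like_date(moment: datetime, tz: timezone) -> str:
    """Format a timestamp roughly the way /bin/date prints it by default."""
    return moment.astimezone(tz).strftime("%a %b %d %H:%M:%S %Z %Y")


print(render_like_date(epoch_utc, pst))
```

With the default C locale this should print the same string as the `date` line above, which is consistent with the epoch being one input of the initial configuration and the local time zone being another.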
> We, somewhat eccentrically, put it in last millennium by default.
Hah, that is still going to make some TLS software and their certificate tests very unhappy! Does it show that I've run into a similar issue before? ;)
But of course, it's trivial to fix with the `--epoch` parameter :)
I really love(d) Scala for introducing me to the whole idea of Optionals.
I wish for the life of me I felt like I could approach Scala at a time when it wasn't going through huge flux (I have shitty luck). I spent a good amount of time pre-version 2.10 :( and then recently went to have a look but saw Dotty (version 3.0?) coming by the end of 2020 and I was like "well, FML, time to wait a few more years and try again."
Anyone have any tips for using the Scala ecosystem effectively these days? Should I just wait for 3.0? Is it going to be a long winding road of breaking changes until a "3.11" version?
Is there a good resource for what folks are using it for these days? It seems like all the projects I used to know are ghostly on GitHub (but that could also be the fact it has been quite a few years, heh). Or do most folks just pony up and use plain ol' Java libraries while writing their application/business logic in Scala?
> Is there a runtime lib? That includes e.g. Scala collections for the browser for example?
Pretty much all Scala code compiles with Scala.js. That includes the Scala standard library, as well as the parts of the Java standard library that have been reimplemented in Scala.
> How many kilobytes?
Scala.js does whole program optimization and dead code elimination, so there's not a single number that can be given here, you pay for what you use.
> so there's not a single number that can be given here, you pay for what you use.
While true, in practice as soon as you use, for example, `List`, `Map`, and `Option`, along with various basic collection operations (`flatMap`, `map`, `zip`, ...) you've pulled in basically the entire standard lib*
You wind up paying about a 160KB baseline "tax" to use Scala in the browser, which is well worth it considering that you can ditch jQuery and related plugins (DataTables, Validation, etc.), shedding far more weight than Scala.js adds in the process. I find it to be a huge win, and I wish I could use it in the current bloated Angular 5 + TypeScript project :\
* This is due in part to the existing collections library, which is not so amenable to dead code elimination (or at least the Google Closure Compiler can't track the full reach of CanBuildFrom's tentacles :)) Maybe this will change with the Scala 2.13 collections overhaul, but I have my doubts; it doesn't sound like the new design is radically different from the current one.
You can already enable tracking protection globally in the options. It's on by default for private browsing, but it does break a few sites (like any adblocker).