risk? certainty. it's pretty much guaranteed. the most capable models are already behind closed doors for gov/military use and that's never changing. the public versions are always going to be several steps behind whatever they're actually running internally. the question is just how big the gap between the corporate and pleb versions will be
I've built trading bots that run Monte Carlo sims on historical data... numpy works but gets slow on large backtests, and pytorch feels like overkill when I just want fast array math without managing GPU memory. If this can drop in and handle the heavy lifting automatically, I could see a use for it.
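For context, a minimal sketch of the kind of workload I mean, in plain numpy with synthetic data (all names and numbers here are made up, not from any project):

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for a real historical daily-return series (~10 years of trading days)
hist_returns = rng.normal(0.0004, 0.01, size=2520)

n_paths, horizon = 10_000, 252
# Bootstrap Monte Carlo: resample historical daily returns with replacement,
# then compound them into simulated one-year equity curves (start value 1.0)
idx = rng.integers(0, hist_returns.size, size=(n_paths, horizon))
paths = np.cumprod(1.0 + hist_returns[idx], axis=1)

# Distribution of terminal values across the simulated paths
terminal = paths[:, -1]
print(f"median={np.median(terminal):.3f}  5th pctile={np.percentile(terminal, 5):.3f}")
```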
Does the guess cache sync across machines at all, or is it purely local? I jump between my laptop and desktop for AoC and it'd be annoying to resubmit a 'too low' answer I already tried elsewhere.
Sorry for the delay, I missed this! Great question. The guess cache is local by default, but you can set ELF_CACHE_DIR to point it at a shared folder: iCloud, Dropbox, Google Drive, or anything else that syncs across your machines. That keeps your guess history unified, so you don't risk resubmitting a 'too low' or 'too high' answer you already tried. Happy puzzle solving!
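For example, you could export it in your shell profile, or set it from Python before the tool touches the cache (the path below is just an illustration, assuming the variable is read when the cache is first needed):

```python
import os
from pathlib import Path

# Point the guess cache at a folder your sync client already mirrors.
# Any synced location works; this Dropbox path is just an example.
os.environ["ELF_CACHE_DIR"] = str(Path.home() / "Dropbox" / "aoc-guess-cache")
```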
normalized → MATCH
Expected: EF28EA082C882A3F9379A57E05C929D76E98899E151A6746B07D8D899644372F
Actual: EF28EA082C882A3F9379A57E05C929D76E98899E151A6746B07D8D899644372F
kmeans → MATCH
Expected: DA96D0505BCB1A5A2B826CEB1AA7C34073CB88CB29AE1236006FA4B0F0D74C46
Actual: DA96D0505BCB1A5A2B826CEB1AA7C34073CB88CB29AE1236006FA4B0F0D74C46
Hashcheck PASSED — outputs match golden hashes.
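For anyone curious, a check like this is presumably just SHA-256 over the emitted artifact bytes; a minimal sketch of the pattern (file names and the truncated golden values are placeholders, not the project's):

```python
import hashlib
from pathlib import Path

# Placeholder golden hashes; the real ones would ship with the project.
GOLDEN = {
    "normalized.bin": "EF28EA08...",  # truncated placeholder
    "kmeans.bin": "DA96D050...",      # truncated placeholder
}

def sha256_hex(path: Path) -> str:
    """Stream the file through SHA-256 and return an uppercase hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest().upper()

for name, expected in GOLDEN.items():
    actual = sha256_hex(Path("artifacts") / name)
    print(name, "→ MATCH" if actual == expected else "→ MISMATCH")
```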
---------
Next step is probably benchmarking this against sklearn: accuracy comparison, plus measuring the performance hit from all the rounding operations.
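Rough harness I'd start from, with a plain numpy Lloyd's loop standing in for the project's implementation (everything here is a sketch under that assumption, not the project's actual API; swap in the real function):

```python
import time
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

def numpy_kmeans(X, k, seed, n_iter=50):
    """Plain Lloyd's iterations; stand-in for the project's deterministic kmeans."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):  # leave empty clusters at their old center
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

X, _ = make_blobs(n_samples=50_000, centers=8, random_state=0)

t0 = time.perf_counter()
ref = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X)
t_sk = time.perf_counter() - t0

t0 = time.perf_counter()
labels = numpy_kmeans(X, k=8, seed=0)
t_np = time.perf_counter() - t0

print(f"ARI vs sklearn: {adjusted_rand_score(ref, labels):.4f}")
print(f"sklearn: {t_sk:.2f}s   numpy stand-in: {t_np:.2f}s")
```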
Anyone here working in maritime auditing, medical data, or other regulated stuff: would you actually use something like this? Trying to figure out if crypto-verifiable analytics solves a real problem or is just a cool technical exercise.
Author here, appreciate you running it and posting the hashes.
Re: whether this is useful beyond being a cool exercise:
sklearn:
Yeah, sklearn is obviously faster and great for day-to-day work. This project doesn't use it because, even with fixed seeds, sklearn can still produce different results across machines due to BLAS differences, CPU instruction paths, etc.
Here the goal isn’t speed, it’s to make sure the same dataset always produces the exact same artifacts everywhere, down to the byte.
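The general pattern (my sketch of it, not the project's exact code) is to quantize floats to a fixed precision, serialize with a fixed dtype and byte order, and hash the resulting bytes:

```python
import hashlib
import numpy as np

def artifact_hash(arr: np.ndarray, decimals: int = 9) -> str:
    """Round to fixed decimals, pin dtype/endianness, hash the raw bytes.
    Rounding absorbs last-bit fp noise (unless a value sits right on a
    rounding boundary), so equal inputs hash identically across machines."""
    canon = np.ascontiguousarray(np.round(arr, decimals), dtype="<f8")
    return hashlib.sha256(canon.tobytes()).hexdigest().upper()

x = np.linspace(0.0, 1.0, 1000) ** 2
print(artifact_hash(x))  # same digest anywhere the rounded values agree
```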
Where that matters:
A few examples from my world:
Maritime/industrial auditing: a lot of equipment logs and commissioning data get “massaged” early on. If later analysis depends on that data, you need a way to prove the ingest + transformations weren’t affected by the environment they ran on.
Medical/regulatory work: clinical models frequently get blocked because the same run on two different machines gives slightly different outputs. Determinism makes it possible to freeze analytics for compliance.
Any situation where you have to defend an analytical result (forensics, safety investigations, audits, etc).
People assume code is reproducible, but floating-point libraries, OS updates, and dependency drift break that all the time.
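This is easy to demonstrate: even inside a single process, changing the accumulation order of a float32 sum usually changes the result in the last bits.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(1_000_000).astype(np.float32)

# Same values, different accumulation order -> usually different rounding.
fwd = float(x.sum())
srt = float(np.sort(x).sum())
print(fwd == srt, abs(fwd - srt))
```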
So yeah, sklearn is better if you just want clustering.
This is more like a “reference implementation” you can point to when you need evidence that the result wasn’t influenced by hardware or environment.
for backtesting trading strategies this could be useful... i've had sims give different results across machines and never knew if it was real or fp drift. how's the performance on real-world-sized datasets?