LLMs need to compress information to be able to predict the next word in as many contexts as possible.
Chess moves are simply tokens like any others.
Given enough chess training data, it would make sense for part of the network to be trained to handle chess specifically, instead of simply encoding basic lists of moves and follow-ups. The result would be a general-purpose sub-network trained on chess.
Adding to my sibling comment, time is also mostly used as a coordination system.
Being offset by a few minutes would make aligning meetings with your remote coworker an even bigger nightmare than it is now.
Using caps doesn't make this assertion any more true.
While you're correct that it massively helps, money is only a resource: you can trade it for a lot of things, but there are people, things, and abstract concepts that money can't buy.
Money buys recovery from failure, which is a double-edged sword. It doesn't buy the ability to learn the right lessons, and eventually the money teaches that failure doesn't really matter, because there's always another chance to get it right.
Which is why it is probably better to be just constrained enough financially that your first attempt really matters to your bottom line, while still having enough to be able to pull it off.
Smart people don’t consider themselves better than people around them.
This article is so annoying.
It has some truth to it: the mismatch between incentives and success, and the corporate bullshit used to avoid saying hard truths, are both on point.
But the second part lacks the proper reasoning needed to establish the self-proclaimed intelligence of its author.
Comparing product building in software development with fencing competitions is far-fetched, yet most of the arguments of the second part build on this premise.
You don't need amazing individuals to build a great thing; what you need is a great team. That's why most people try to address the system. It's hard to do when you factor in individual incentives, but it's not because everybody is dumb, it's because everybody (smartly) puts their own self-interest above the company's.
>Smart people don’t consider themselves better than people around them.
He probably doesn't work in software. There's hubris in spades in that realm. The worst is when people's egos are significantly bigger than their actual abilities.
Of course you patch it, but you don’t assume that every system affected by this 0-day got exploited.
You try to check whether some were, and it's obvious that people at Microsoft are doing exactly that.
Not saying that MS's response was great, but I agree with GP that the whole thing is hyperbolic.
> Of course you patch it, but you don’t assume that every system affected by this 0-day got exploited.
Uhh, what? Of course you do. Why give the benefit of the doubt to hackers who hacked you with malicious intentions? That's the type of security nonsense that I'd expect from... Well, Microsoft lol
If you find yourself owned, and not only by a 0-day, then yes, you wipe everything clean and rebuild with mitigations in place from the start, so as not to get reinfected in the process.
That's pretty much the only option if you safeguard valuable data for your customers. Yes, it's expensive to get breached, so take precautions to make it a rare event and contain it as much as possible when it happens.
I don't think the article is unreasonable. This is cloud infrastructure sold to companies with defense industry contracts where breaches are taken seriously.
I mean, yes, obviously, if you have malware on a box you rotate that box. They had keys and they rotated the keys. But the implication here is that the attacker could have done anything and therefore they have to destroy everything, which is unreasonable.
Rotating keys is far from enough. If your keys are compromised, you need to revoke everything. Then you need to assess the impact and wipe anything the compromised keys had access to during that period.
This is not theoretical. When the OpenSSL fiasco hit, I worked in a place under financial regulation, not even the defense sector, which is under much stricter rules. We had to go through all the logs to ascertain that customer data was intact, and since leaking private keys did not leave a trace in the logs, we then wiped clean all the systems those keys secured.
This was a massive undertaking to coordinate and minimize downtime for customers, but it was deemed necessary to comply with security regulations. To hear that a big juggernaut such as Microsoft doesn't even do this, without facing much consequence, is mind-boggling. I cannot understand how that would ever pass an audit.
Everything a potentially compromised key has signed, yes. What are we discussing here? This is standard procedure in every compliance process I have ever had the misfortune to work with, and for quite good reasons. Hope alone won't pass an audit.
Every time a 0-day that granted privilege escalation was found in installed bins/libs, we ran a script that looked at setuids on anything and everything and produced a report on what was found. We managed to find a crypto miner once.
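Something along these lines, though not the exact script (the baseline file and report path here are just illustrative):

```sh
# List every setuid/setgid file on local filesystems and diff it against a known-good baseline.
find / -xdev \( -perm -4000 -o -perm -2000 \) -type f \
  -printf '%m %u %g %p\n' 2>/dev/null | sort > /tmp/setuid-report.txt
diff /var/lib/setuid-baseline.txt /tmp/setuid-report.txt && echo "no new setuid/setgid binaries"
```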
Obviously I won't run it on my personal computer, but I'm not renting my PC to anyone.
Great article, the example makes the point really clear.
It probably applies just as well to other languages by the way.
Two small notes after completing my read:
1. You're using `npm prune --omit=dev` after the build. It's probably fast enough, but I'm wondering if it wouldn't be worth it to install the dependencies in two steps: first the `dependencies` in one image, then, in a second image, copy those and rerun `npm install` to also install the `devDependencies` (see the sketch after point 2).
You could then copy into the final image:
- The production dependencies from the `deps` image
- The built code from the `build` image
2. Also, it's probably a detail, but you're copying the entire app folder into the final image; maybe you could save some extra bytes by only copying the built artifacts (e.g. `dist/` in most setups).
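For what it's worth, a rough sketch of the layout I have in mind, assuming a fairly standard setup (the stage names, the `node:20-alpine` base image, `npm run build`, and the `dist/index.js` entry point are all placeholders for whatever the project actually uses):

```dockerfile
# deps stage: production dependencies only
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# build stage: reuse the production deps, add devDependencies, then build
FROM node:20-alpine AS build
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY package.json package-lock.json ./
RUN npm install
COPY . .
RUN npm run build

# final image: production deps + built artifacts only
FROM node:20-alpine
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
COPY package.json ./
CMD ["node", "dist/index.js"]
```

That way the final image never needs `npm prune` at all, since the `devDependencies` never enter it.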
1. That's interesting, I actually never thought about that. It could work in practice and could even be parallelised! It would just download some, or most, packages twice. I'll test that, thanks.
2. Indeed, it's easier when building a small repo, but with a monorepo you need to be careful about copying `/dist` to the right place, plus the `package.json`, so as not to break runtime imports. I think once again it's a balance between what would be perfect and what we can live with.