Far from solved! Though, like seemingly everything, it has benefited from the transformer architecture. And computer vision is kind of the "input"; it usually intersects with some other field, e.g. CV for medical analysis is different from CV for self-driving, which is different from reconstruction for games/movies.
I think the terminology is just dogshit in this area. LLMs are great semantic searchers and can reason decently well; I'm using them to self-teach a lot of fields. But I inevitably reach a point where I come up with some new thoughts and the LLM can't keep up, and I start looking at what real people are saying right now, today, trusting the LLM less and going to primary sources and real experts instead. But I would never have had the time, money, or access to expertise without the LLM.
Constantly worrying "is this a superset? is this a superset?" is exhausting. Just use the damn tool, and stop arguing about whether this LLM can handle all possible out-of-distribution things that you would care about or whatever. If it sucks, don't make excuses for it: it sucks. We don't give Einstein a pass for saying dumb shit either, and the LLM ain't no Einstein.
If there's one thing to learn from philosophy, it's that asking the question often smuggles in the answer. Ask "is it possible to make an unconstrained deity?" and you get arguments about God.
They seem to have some notions of pipelines and metrics, though. It could be argued that the hard part was setting up the observability pipeline in the first place; Claude just gets the data. Though if Claude is failing as spectacularly as the report claims, yes, it is pretty funny that the report was also written by Claude, since this looks like regressing back to GPT-4o-level reasoning.
There's no way the AI is a priori understanding codebases with millions of LoC now. We've tried that already; it failed. What it is doing now is setting up its own extremely powerful test harnesses, extracting the information it needs, and testing efficiently.
Sure, its semantic search is already strong, but the real lesson we've learned from 2025 is that tooling is far more powerful.
That's cool! As someone who's dabbled in kernel dev for his job, I've always wanted to learn how kernel devs properly test stuff reliably, but it seemed hard. Like testing on real, variable hardware, and not just manual testing shit.
Honestly, AI has only helped me become a better SWE because no one else has the time or patience to teach me.
No you're right. I initially thought you were wrong but it is sus.
My intuition for the a priori cut was something along the lines of: even if you had the entire source code in your head at once, there are limits to reasoning about it. Computability is one hard result. You also have to interact with the real world on a wide variety of hardware systems, or even just a wide variety of systems if you create an API; how do you reliably reason past the abstraction boundary without actually running tests, interacting with systems, and getting feedback? Not really possible unless LLMs control everything. For the more philosophical questions (such as "is our 'correct' actually the right thing?"), we can grant the easy case where everybody's in consensus; the "easier" problems show up either way.
But getting to the point of "understanding in principle every piece of Linux" is pretty ill-defined and practically doesn't seem possible for a single LLM or a human. This also seems really hairy: you can smuggle in whatever implicit premises you want to swing the issue either way.
But personally I (and many other people) have seen late-2025 models get extremely good, and that is precisely because they actually started doing deep tooling and, like, actually running and testing their code. I was not getting nearly as much value out of them (still a decent amount of value!) prior to the tooling explosion; not even MCPs were good. The jump came when they started aggressively spawning subshells and executing live tests. But I guess the a priori / a posteriori split isn't really useful here?
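To be concrete, the "tooling" loop I mean is roughly: spawn a subshell, run the code, and read the output back, instead of reasoning about the code purely in your head. A minimal sketch of that run-and-observe loop, assuming nothing beyond the standard library (the helper name and the toy command are mine, just for illustration):

```python
import subprocess
import sys

def run_and_observe(cmd):
    """Run a command in a subshell and capture what an agent would 'see'."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
    return result.returncode, (result.stdout + result.stderr).strip()

# Instead of guessing what code does, execute it and read the feedback.
code, output = run_and_observe([sys.executable, "-c", "print(2 + 2)"])
print(code, output)  # exit code 0 plus the observed stdout: 0 4
```

The point isn't the ten lines of Python; it's that the exit code and captured output are a posteriori evidence the model can condition on, rather than something it has to derive from the source alone.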
Yeah, maybe you are right. But is doing math and reasoning about Turing machines a priori? If so, then it seems plausible to me that reasoning about a codebase (without running it) is also ‘a priori’.
The fact that it's millions of LoC is borderline irrelevant in that context, you don't need to have it all in context to find bugs in a handful of files.
There are so many heavy-hitting cracked people, like Daniel from Unsloth and Chris Lattner, coming out of the woodwork for this with their own custom stuff.
How does the ecosystem work? Have things converged and standardized enough that it's "easy" (lol, with tooling) to swap out parts, such as weights, to fit your needs? Do you need to autogen new custom kernels to fix such things? Super cool stuff.
But the users would have to maintain their own forks then. Unless you stream back patches into your forks, which implies there's some upstream being maintained. Software doesn't interoperate and maintain itself for free - somebody's gotta put in the time for that.
I think as long as AI isn't literal AGI, social pressures will keep projects alive, in some state. There definitely is something scary about stealing entire products as a means of new market domination, e.g. steal Linux, make a corporate Linux, force everybody to contribute only to corporate Linux (many Linux contributors are paid by corporations, after all), and make that the new central pointer. The worst-case scenario might be Microsoft, in collusion (which I admit is far-fetched, but definitely possible), completely adopting Linux for servers and headless compute while enforcing hardware restrictions so strict that only Windows works.
> But the users would have to maintain their own forks then.
I suppose the idea would be, they don't have to maintain it: if it ever starts to rot from whatever environmental changes, then they can just get the LLM to patch it, or at worst, generate it again from scratch.
(And personally, I prefer writing code so that it isn't coupled so tightly to the environment or other people's fast-moving libraries to begin with, since I don't want to poke at all of my projects every other year just to keep them functional.)
The LLM can a priori test on all possible software and hardware environments, test all possible edge cases for deployment, get feedback from millions of eyes on the project explicitly or implicitly via bug reports and usage, find good general case use features given the massive amounts of data gathered through the community of where the project needs to go next, etc?
Even in a world with pure LLM coding, it's more likely that LLMs maintain an open source place for other LLMs to contribute to.
You're forgetting that code isn't just a technical problem (and even if it were, that would be a wild claim going against all hardness results known to humans, given the limits of a priori reasoning...)
> The LLM can a priori test on all possible software and hardware environments, test all possible edge cases for deployment, get feedback from millions of eyes on the project explicitly or implicitly via bug reports and usage, find good general case use features given the massive amounts of data gathered through the community of where the project needs to go next, etc?
Even if that's the ideal (and a very expensive one in terms of time and resources), I really don't think it accurately describes the maintainers of the very long tail of small open-source projects, especially those simple enough for the relevant features to be copied into a few files' worth of code.
Like, sure, projects like Linux, LLVM, Git, or the popular databases may fit that description, but people aren't trying to vendor those via LLMs (or so I hope). And in any case, if the project presently fulfills a user's specific use case, then it "going somewhere next" may well be viewed as a persistent risk.
Yeah, the funny thing is that Linux being open source is absolutely in line with capitalism. Just look at the list of maintainers: they are almost all paid employees of gigacorps.
It is just an optimization that makes sense -- writing an OS that is compatible with all sorts of hardware is hard, let alone one that is performant, checked for vulnerabilities, etc.
Why would each gigacorp waste a bunch of money on developing their own, when they could just spend a tiny bit to improve a specific area they deeply care about, and benefit from all the other changes financed by other companies?
And the GPL makes it all work, as no single gigacorp can just take the whole thing and legally run with it for their own gain, as they could if it were, say, MIT- or BSD-licensed.
So you have direct competitors all contributing to a common project in harmony.
Well, the GPL is good, but I think this setup would still be a local optimum for gigacorps even if it were MIT or so. They already use plenty of MIT libraries, e.g. Harfbuzz.
It simply wouldn't make sense for them to let other companies' improvements go out the window, unless they could directly monetize a fork. So it doesn't apply to every project, but these foundational low-level ones in particular would be safe even without any sensible license.
Agents can read the bytes that make up a compiled file and detect behavior directly from them. I've been doing it to inspect my own builds for the presence of a feature.
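For what it's worth, the crudest version of this doesn't even need a disassembler: you can scan the binary for printable runs, the way `strings | grep` does, and check whether a feature's marker made it into the build. A toy sketch; the fake "artifact" bytes and the marker name are invented for illustration:

```python
import re

def extract_strings(blob, min_len=4):
    """Pull out printable ASCII runs, like the Unix `strings` tool."""
    return [m.decode() for m in re.findall(rb"[ -~]{%d,}" % min_len, blob)]

# Fake compiled artifact: opaque bytes with an embedded feature marker.
artifact = (b"\x7fELF\x02\x01\x00" + b"\x90" * 16
            + b"feature_fast_path_enabled\x00" + b"\xc3")

found = extract_strings(artifact)
print(any("feature_fast_path" in s for s in found))  # True: the marker survived the build
```

Real binaries obviously need more than this (symbol tables, disassembly, stripped builds), but it's often enough to confirm "did my flag actually compile in?".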
> it's unbelievable watching the market's seemingly unlimited ability to coopt, repackage and in turn sell literally anything, even a religion and philosophical system which would be completely opposite to a consumer society.
In some sense, this is one manifestation of what Nietzsche said was a good state: a scrappy, anti-metaphysical system that doesn't need to rest on grand notions of reason or morality (not that there is no reason or morality, just not in the traditional Western metaphysics sense; I find people often conflate the two, and I certainly did at one point), and that simply outcompetes, adapts, and comes out on top.
On the other hand, I think Nietzsche would have hated the outcome and would have worked to further refine his philosophy. I wonder what his thoughts would be in the 21st century.
Also, through your comment, I realize I don't actually understand the subtle differences in Eastern philosophy. Confucianism would have been up Nietzsche's alley (no metaphysics), but Buddhism is a weird mix: "metaphysics" in the sense of spirits and gods, yet not "metaphysics" in the Western Platonic tradition; in fact, in many ways it is the opposite of the dualities and boundaries that Western metaphysics creates.
Classic Kafka trap! The mere sign of resistance is taken as a sign of a deeper psychological incompatibility that fundamentally needs to be worked through until you agree with the state.