Reading these comments aren't we missing the obvious?
Claude Code is a lock in, where Anthropic takes all the value.
If the frontend and API are decoupled, they are one benchmark away from losing half their users.
Some other motivations: they want to capture the value. Even if it's unprofitable they can expect it to become vastly profitable as inference cost drops, efficiency improves, competitors die out etc. Or worst case build the dominant brand then reduce the quotas.
Then there's brand - when people talk about OpenCode they will occasionally specify "OpenCode (with Claude)" but frequently won't.
Then platform - at any point they can push any other service.
Look at the Apple comparison. Yes, the hardware and software are tuned and tested together. The analogy here is training the specific harness,caching the system prompt, switching models, etc.
But Apple also gets to charge Google $billions for being the default search engine. They get to sell apps. They get to sell cloud storage, and even somehow a TV. That's all super profitable.
At some point Claude Code will become an ecosystem with preferred cloud and database vendors, observability, code review agents, etc.
Anthropic is going to be on the losing side with this. Models are too fungible, it's really about vibes, and Claude Code is far too fat and opinionated. Ironically, they're holding back innovation, and it's burning the loyalty the model team is earning.
I think you have it exactly backwards, and that "owning the stack" is going to be important. Yes the harness is important, yes the model is important, but developing the harness and model together is going to pay huge dividends.
This coding agent is minimal, and it completely changed how I used models and Claude's cli now feels like extremely slow bloat.
I'd not be surprised if you're right in that this is companies / management will prefer to "pay for a complete package" approach for a long while, but power-users should not care for the model providers.
I have like 100 lines of code to get me a tmux controls & semaphore_wait extension in the pi harness. That gave me a better orchestration scheme a month ago when I adopted it, than Claude has right now.
As far as I can tell, the more you try to train your model on your harness, the worse they get. Bitter lesson #2932.
OpenAI, Anthropic, Google, Microsoft certainly desire path dependence but the very nature of LLMs and intelligence itself might make that hard unless they can develop models which truly are differentiated (and better) from the rest. The Chinese open source models catching up make me suspect that won't happen. The models will just be a commodity. There is a countdown clock for when we can get Opus 4.6+ level models and its measured in months.
The reason these LLM tools being good is they can "just do stuff." Anthropic bans third party subscription auth? I'll just have my other tool use Claude Code in tmux. If third party agents can be banned from doing stuff (some advanced always on spyware or whatever), then a large chunk of the promise of AI is dead.
Amp just announced today they are dumping IDE integration. Models seem to run better on bare-bones software like Pi, and you can add or remove stuff on the fly because the whole things open source. The software writes itself. Is Microsoft just trying to cram a whole new paradigm in to an old package? Kind of like a computer printer. It will be a big business, but it isn't the future.
At scale, the end provider ultimately has to serve the inference -- they need the hardware, data centers & the electricity to power those data centers. Someone like Microsoft can also provide a SLA and price such appropriately. I'll avoid a $200/month customer acquisition cost rant, but one user, running a bunch of sub agents, can spend a ton of money. If you don't own a business or funding source, the way state of the art LLMs are being used today is totally uneconomical (easy $200+ an hour at API prices.)
36+ months out, if they overbuild the data centers and the revenue doesn't come in like OpenAI & Anthropic are forecasting, there will be a glut of hardware. If that's the case I'd expect local model usage will scale up too and it will get more difficult for enterprise providers.
(Nothing is certain but some things have become a bit more obvious than they were 6 months ago.)
Thinking about this a little more -> "nature of LLMs and intelligence"
Bloated apps are a material disadvantage. If I'm in a competitive industry that slow down alone can mean failure. The only thing Claude Code has going for it now is the loss making $200 month subsidy. Is there any conceivable GUI overlay that Anthropic or OpenAI can add to make their software better than the current terminal apps? Sure, for certain edge cases, but then why isn't the user building those themselves? 24 months ago we could have said that's too hard, but that isn't the case in 2026.
Microsoft added all of this stuff in to Windows, and it's a 5 alarm fire. Stuff that used to be usable is a mess and really slow. Running linux with Claude Code, Codex, or Pi is clearly superior to having a Windows device with neither (if it wasn't possible to run these in Windows; just a hypothetical.)
From the business/enterprise perspective - there is no single most important thing, but having an environment that is reliable and predictable is high up there. Monday morning, an the Anthropic API endpoint is down, uh oh! In the longer term, businesses will really want to control both the model and the software that interfaces with it.
If the end game is just the same as talking to the Star Trek computer, and competitors are narrowing gaps rather than widening them (e.g. Anthropic and OpenAI releases models minutes from each other now, Chinese frontier models getting closer in capability not further), then it is really hard to see how either company achieves a vertical lock down.
We could actually move down the stack, and then the real problem for OpenAI and Anthropic is nVidia. 2030, the data center expansion is bust, nVidia starts selling all of these cards to consumers directly and has a huge financial incentive to make sure the performant local models exist. Everyone in the semiconductor supply chain below nvidia only cares about keeping sales going, so it stops with them.
Maybe nvidia is the real winner?
Also is it just me or does it now feel like hn comments are just talking to a future LLM?
That was true more mid last year, but now we have a fairly standard flow and set of core tools, as well as better general tool calling support. The reality is that in most cases harnesses with fewer tools and smaller system prompts outperform.
The advances in the Claude Code harness have been more around workflow automation rather than capability improvements, and truthfully workflows are very user-dependent, so an opinionated harness is only ever going to be "right" for a narrow segment of users, and it's going to annoy a lot of others. This is happening now, but the sub subsidy washes out a lot of the discontent.
You're right, because owning the stack means better options for making tons of money. Owning the stack is demonstrably not required for good agents, there are several excellent (frankly way better than ol' Claude Code) harnesses in the wild (which is in part why so many people are so annoyed by Anthropic about this move - being forced back onto their shitty cli tool).
the fat and opinionated has always been true for them (especially compared to openai), and to all appearances remains a feature rather than a bug. i can’t say the approach makes my heart sing, personally, but it absolutely has augured tremendous success among thought workers / the intelligensia
I think there is a huge gap between people who has a good CLAUDE.md (or similar), or those who doesn’t.
When I first tried, the created code was garbage. Now that I slowly built my memory, several thousands of manually written examples and guidance, it can generate quite reliably, when it doesn’t need literally anything outside of those…
That being said, most of the vibe coded codebases (in reality every single one which I saw) use garbage memory, and consequently have garbage output.
So the same thing is terrible and great at the same time. People who give time, and people who is fine producing garbage (huge majority) says it’s great. People who just tried it out, and don’t have the luxury to potentially waste days and weeks, say that it’s bad. All of these are true at once.
I think their branding is cementing in place for a lot of people, and the lived experience of people trying a lot of models often ends up with a simple preference for Claude, likely using a lot of the same mental heuristics as how we choose which coworkers we enjoy working with. If they can keep that position, they will have it made.
I'm a very experienced developer with a lot of diverse knowledge and experience in both technical and domain knowledge. I've only tried a handful of AI coding agents/models... I found most of them ranging from somewhat annoying to really annoying. Claude+Opus (4.5 when I started) is the first one I've used where I found it more useful than annoying to use.
I think Github Co-Pilot is most annoying from what I've tried... it's great for finishing off a task that's half done where the structure is laid out, as long as you put blinders keeping it focused on it. OpenAI and Google's options seem to get things mostly right, but do some really goofy wrong things from my own experiences.
They all seem to have trouble using state of the art and current libraries by default, even when you explicitly request them.
I've only used the default selection, whatever it is in VS Code. Even paid for a year at one point as I was first using it with some SQL schema generation and it was pretty useful, kind of as a super auto-complete.
If the default option isn't at least arguably the best option I can't really speak to that. I would suggest that maybe metrics on a given set of technologies be done and that based on the project in use, that it should choose the best option dynamically by default. Such as C#+MS-SQL vs Node+Postgres vs Python+Matlab+DuckDB.
The competition angle is interesting - we're already seeing models like Step-3.5-Flash advertise compatibility with Claude Code's harness as a feature. If Anthropic's restrictions push developers toward more open alternatives, they might inadvertently accelerate competitor adoption. The real question is whether the subscription model economics can sustain the development costs long-term while competitors offer more flexible terms.
I don't think many are confused about why Anthropic wants to do this. The crux is that they appear to be making these changes solely for their own benefit at the expense of their users and people are upset.
There are parallels to the silly Metaverse hype wave from a few years ago. At the time I saw a surprising number of people defending the investment saying it was important for Facebook to control their own platform. Well sure it's beneficial for Facebook to control a platform, but that benefit is purely for the company and if anything it would harm current and future users. Unsurprisingly, the pitch to please think of this giant corporation's needs wasn't a compelling pitch in the end.
"Training the specific harness" is marginal -- it's obvious if you've used anything else. pi with Claude is as good as (even better! given the obvious care to context management in pi) as Claude Code with Claude.
This whole game is a bizarre battle.
In the future, many companies will have slightly different secret RL sauces. I'd want to use Gemini for documentation, Claude for design, Codex for planning, yada yada ... there will be no generalist take-all model, I just don't believe RL scaling works like that.
I'm not convinced that a single company can own the best performing model in all categories, I'm not even sure the economics make it feasible.
> pi with Claude is as good as (even better! given the obvious care to context management in pi) as Claude Code with Claude
And that’s out of the box. With how comically extensible pi is and how much control it gives you over every aspect of the pipeline, as soon as you start building extensions for your own, personal workflow, Claude Code legimitely feels like a trash app in comparison.
I don’t care what Anthropic does - I’ll keep using pi. If they think they need to ban me for that, then, oh well. I’ll just continue to keep using pi. Just no longer with Claude models.
Apple can do those things because they control the hardware device, which has physical distribution, and they lock down the ecosystem. There is no third party app store, and you can't get the Photos app to save to Google Drive.
With Claude Code, just export an env variable or use a MITM proxy + some middleware to forward requests to OpenAI instead. It's impossible to have lock in. Also, coding agent CLIs are a commodity.
> At some point Claude Code will become an ecosystem with preferred cloud and database vendors, observability, code review agents, etc.
i've been wondering how anthropic is going to survive long term. If they could build out an infrastructure and services to complete with the hyperscalers but surfaced as a tool for claude to use then maybe. You pay Anthropic $20/user/month for ClaudeCode but also $100k/month to run your applications.
Using an API key is orders of magnitude more expensive. That's the difference here. The Claude Code subscriptions are being heavily subsidized by Anthropic, which is why people want to use their subscriptions in everything else.
I think the people who use more than they pay for vastly outnumber those who pay for more than they use. It takes intention to sign up (not the default, like health care) and once you do, you quickly get in the habit of using it.
This move feels poorly timed. Their latest ad campaigns about not having ads, and the goodwill they'd earned lately in my book was just decimated by this. I'm sure I'm not the only one who's still just dipping their toes into the AI pool. And am very much a user that under utilizes what I pay for because of that. I have several clients who are scrambling to get on board with cowork. Eliminating API usage for subscription members right before a potentially large wave of turnover not only chills that motivation it signals a lack of faith in their marketing, which from my POV, put out the only AI super bowl campaign to escape virtually unscathed.
> the goodwill they'd earned lately in my book was just decimated by this
That sounds absurd to me. Committing to not building in advertising is very important and fundamental to me. Asking people who pay for a personal subscription rather than paying by the API call to use that subscription themselves sounds to me like it is. Just clarifying the social compact that was already implied.
I WANT to be able to pay a subscription price. Rather like the way I pay for my internet connectivity with a fixed monthly bill. If I had to pay per packet transmitted, I would have to stop and think about it every time I decided to download a large file or watch a movie. Sure, someone with extremely heavy usage might not be able to use a normal consumer internet subscription; but it works fine for my personal use. I like having the option for my AI usage to operate the same way.
The problem with fixed subscriptions in this model is that the service has an actual consumption cost. For something like internet service, the cost is primarily maintenance, unless the infrastructure is being expanded. But using LLMs is more like using water, where the more you use it, the greater the depletion of a resource (electricity in this case, which is likely being produced with fossil fuel which has to be sourced and transported, etc). Anthropic et al would be setting themselves up for a fall if they allow wholesale use at a fixed price.
The people mad about this feel they are entitled to the heavily subsidized usage in any context they want, not in the context explicitly allowed by the subsidizer.
It's kind of like a new restaurant started handing out coupons for "90% off", wanting to attract diners to the restaurant, customers started coming in and ordering bulk meals then immediately packaging them in tupperware containers and taking it home (violating the spirit of the arrangement, even if not the letter of the arrangement), so the restaurant changed the terms on the discount to say "limited to in-store consumption only, not eligible for take-home meals", and instead of still being grateful that they're getting food for 90% off, the cheapskate customers are getting angry that they're no longer allowed to exploit the massive subsidy however they want.
I would not be surprised if the market share breakdown is similar to browsers (eg 70+ percent - more if you ignore that safari is the only real option on iOS).
VSCode has slowly been getting more and more bloated, but the alternatives are all very meh or are missing crucial extensions.
If you do embedded development, things like https://platformio.org/platformio-ide, but also smaller, nice to have extensions for auto-deploying code to cloud providers, etc.
To me that sounds like claiming Arduino IDE is a "crucial extension". Their website[1] lists a bunch of IDEs where it can be integrated, so I wouldn't call it missing. That said both of these are hobbyist toys to make it more approachable and embedded development was fine long before VSCode, they're in no way "crucial".
You're suggesting Waymo isn't deep software? Or Tensorflow? Or Android? The Go programming language? Or MapReduce, AlphaGo, Kubernetes, the transformer, Chrome/Chromium or Gvisor?
You must have an amazing CV to think these are shallow projects.
No, I just realize these for what they are - reasonable projects at the exploitation (rather than exploration) stage of any industry.
I’d say I have an average CV in the EECS world, but also relatively humble perspective of what is and isn’t bleeding edge. And as the industry expands, the volume „inside” the bleeding edge is exploitation, while the surface is the exploration.
Waymo? Maybe; but that’s acquisition and they haven’t done much deep work since. Tensorflow is a handy and very useful DSL, but one that is shallow (builds heavily on CUDA and TPUs etc); Android is another acquisition, and rather incremental growth since; Go is a nth C-like language (so neither Dennis Richie nor Bjarne Stroustrup level work); MapReduce is a darn common concept in HPC (SGI had libraries for it in the 1990s) and implementation was pretty average. AlphaGo - another acquisition, and not much deep work since; Kubernetes is a layer over Linux Namespaces to solve - well - shallow and broad problems; Chrome/Chromium is the 4th major browser that reached dominance and essentially anyone with a 1B to spare can build one.. gVisor is another thin, shallow layer.
What I mean by deep software, is a product that requires 5-10y of work before it is useful, that touches multiple layers of software stack (ideally all from hardware to application) etc. But these types of jobs are relatively rare in the 2020s software world (pretty common in robotics and new space) - they were common in the 1990s where I got my calibration values ;) Netscape and Palm Pilot was a „whoa”. Chromium and Android are evolutions.
> No, I just realize these for what they are - reasonable projects at the exploitation (rather than exploration) stage of any industry.
I get that bashing on Google is fun, but TensorFlow was the FIRST modern end-user ML library. JAX, an optimizing backend for it, is in its own league even today. The damn thing is almost ten years old already!
Waymo is literally the only truly publicly available robotaxi company. I don't know where you get the idea that it's an acquisition; it's the spun-off incarnation of the Google self-driving car project that for years was the butt of "haha, software engineers think they're real engineers" jokes. Again, more than a decade of development on this.
Kubernetes is a refinement of Borg, which Google was using to do containerized workloads all the way back in 2003! How's that not a deep project?
True, for some definition of first and some definition of modern. I’d say it builds extremely heavily on the works inside XTX (and prior to that, XFactor etc) on general purpose linear algebra tooling, and still doesn’t change the fact that it remains shallow, even including JAX. Google TPUs change this equation a bit, as they are starting to come to fruition; but for them to reach the level of depth of NVDA, or even DEC to SUN, they’d have to actually own it from silicon to apps… and they eventually might. But the bulk of work at Google is narrow end-user projects, and they don’t have (at large) a deep engineering excellence focus.
Waymo is an acquihire from ‘05 DARPA challenges, and I’d say Tesla got there too (but with a much stricter hardware to user stack, which ought to bear fruits)
I’d say Kubernetes would be impressive compared to 1970s mainframes ;) Jokes aside, it’s a neat tool to use crappy PCs as server farms, which was sort of Google’s big insight in 2000s when everyone was buying Sun and dying with it, but that makes it not deep, at least not within Google itself.
But this may change. I think Brin recognizes this during the Code Red, and they start very heavily on building a technical moat since OpenAI was the first credible threat to the user behavior moat.
You think that Tesla, which has not accepted liability for a single driverless ride, has "gotten there?" I'm not even going to look up how many Waymo does in a month, I'm sure it's in the millions now.
Come on, man.
> Google's TPUs change this equation a bit
Google has been using TPUs to serve billions of customers for a decade. They were doing it at that scale before anyone else. They use them for training, too. I don't know why you say they don't own the stack "from silicon to apps" because THEY DO. Their kernels on their silicon to serve their apps. Their supply chain starts at TSMC or some third-party fab, exactly like NVIDIA.
Google's technical moat is a hundred miles deep, regardless of how dysfunctional it might look from the outside.
I think this acquisition in reality has more to do with developer goodwill? And a little to do with the shell game of making these AI companies hard to value because they collect assets like this.
Since when is a CLI tool like this a sufficiently demanding technical project that it needs to buy the runtime just to get sufficient support?
This just isn't the hard part of the product.
Like if I was building a Claude Code competitor and I acquired bun, I wouldn't feel like I had an advantage because I could get more support with like fs.read?
Seems there's a lot of conceptual overlap between your WFP, Dynamic Workers, and Sandbox products.
(I guess there's an expectation of at least some permanence with WFP?)
reply