Multiple times I’ve rejected an LLM’s file changes and asked it to do something different, or even just not to make the change. It almost always tries to make the same file edit again. I’ve also noticed that if I make my own edits on top of its changes, it will often try to revert them.
I’ve found the best thing to do is switch back to plan mode to refocus the conversation.
> Now consider what the same analyst does with an LLM agent:
"Show me all software companies with over $1B market cap, P/E under 30, and revenue growing over 20% year over year. Build a DCF model for the top 5. Run sensitivity analysis on discount rate and terminal growth."
While I think LLMs can improve the interface and help users learn or generate domain-specific languages, I don’t see how a professional can trust an LLM to get a technical request like this correct without verification. Wouldn’t a financial professional place more trust in a Bloomberg LLM agent that translates their request into a set of Bloomberg commands?
I think there's a lot of overstatement about LLM capabilities throughout this piece, but I think it's generally directionally correct. There's an attitude of "LLMs are just going to directly perform business logic" or "data extraction and ingestion" or "calculations". The reality is that deterministic human-mediated code is going to do all that stuff (and AI is going to drastically amplify human leverage in building that code), and LLM agents will call into it as tools.
It's like the people who talk about how LLMs can't count the r's in "raspberry" and don't seem to understand that GPT5 can reliably e.g. work out a transformed probability distribution function from a given PDF by integration and derivation --- in part because frontier models are smarter but more importantly because they're all presumably just calling into CAS tooling.
Real financial analysts already have DCF spreadsheets where they can just plug in numbers for any company. An LLM can help with fine tuning or catching errors but it's not a game changer.
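To make the "deterministic code that an agent calls as a tool" idea concrete, here is a minimal sketch of what such a tool could look like for the DCF case. Everything here is an assumption for illustration: the function names, the use of a Gordon-growth terminal value, and the flat list of projected cash flows are not taken from any real analyst's model or vendor API. The agent would only pick the inputs; the arithmetic stays in plain, testable code.

```python
def dcf_value(cash_flows, discount_rate, terminal_growth):
    """Present value of projected cash flows plus a Gordon-growth terminal value.

    cash_flows: projected free cash flows for years 1..N (hypothetical inputs).
    """
    if discount_rate <= terminal_growth:
        raise ValueError("discount_rate must exceed terminal_growth")
    # Discount each explicit-period cash flow back to today.
    pv_explicit = sum(
        cf / (1 + discount_rate) ** year
        for year, cf in enumerate(cash_flows, start=1)
    )
    # Terminal value at the end of year N, then discounted back N years.
    terminal = cash_flows[-1] * (1 + terminal_growth) / (discount_rate - terminal_growth)
    pv_terminal = terminal / (1 + discount_rate) ** len(cash_flows)
    return pv_explicit + pv_terminal


def sensitivity(cash_flows, discount_rates, growth_rates):
    """Grid of DCF values keyed by (discount_rate, terminal_growth)."""
    return {
        (r, g): dcf_value(cash_flows, r, g)
        for r in discount_rates
        for g in growth_rates
    }
```

A call like `sensitivity([100, 110, 121], [0.08, 0.10], [0.02, 0.03])` gives the same grid every time for the same inputs, which is the part a professional can actually verify.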
Agents like Claude Code, Cursor, etc. natively support an agents.md file in the project structure; this is the standard (https://agents.md). That said, nothing stops you from creating a setup where you tell the agent to read the DB to obtain its instructions.
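As a rough sketch of the DB-backed variant, assuming SQLite and a hypothetical `agent_instructions` table (the table name, its columns, and the function name are all made up for illustration, not part of the agents.md standard): the agents.md file would contain little more than a line telling the agent to run a small script like this and follow whatever it prints.

```python
import sqlite3


def load_instructions(conn: sqlite3.Connection) -> str:
    """Join instruction rows from the (hypothetical) agent_instructions table."""
    rows = conn.execute(
        "SELECT body FROM agent_instructions ORDER BY position"
    ).fetchall()
    return "\n\n".join(body for (body,) in rows)


if __name__ == "__main__":
    # Demo with an in-memory database; a real setup would open a project DB file.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE agent_instructions (position INTEGER, body TEXT)")
    conn.executemany(
        "INSERT INTO agent_instructions VALUES (?, ?)",
        [(2, "Run the test suite before proposing a commit."),
         (1, "Never edit generated files.")],
    )
    print(load_instructions(conn))
```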
They built it as a railroady board game instead of a sandbox video game. The rumors from their experimental workshop test and latest announcement make me hopeful for a big update in the spring. Until then, it doesn’t feel worth playing it more than a couple times through. Every game feels the same.
Trying to streamline the series into a board game seems to be a trend. Even Civ6 already felt more like a board game played for points than a sandbox, even though it was still rather enjoyable.
Perhaps not coincidentally, Ed Beach has been a board game designer in the past. Which is not to say he's the wrong guy for the job, he has done some great work on Civ5 BNW and Civ6. But perhaps he went overboard on 7.
I’m currently working through the Linux From Scratch project and think it would be cool if I got good enough to contribute to the Linux kernel or drivers.
I think part of the issue here is that software engineering is a very broad field. If you’re building another CRUD app, your job might only require reading a ticket and copy/pasting from Stack Overflow. If you are working in a regulated industry, you are spending most of your time complying with regulations. If you are building new programming languages or compilers, you are building the abstractions from the ground up. I’m sure there are dozens if not hundreds of other subfields that build software in other ways with different requirements and constraints.
LLMs will trivialize some subfields and be nearly useless in others, but will probably help to some degree in most of them. The range of opinions online about how useful LLMs are probably correlates with which subfield each commenter works in.
The thing is, if you’re working on a CRUD app, you probably have (and should have) a framework that makes all the boilerplate easy. Editor fluency can add an extra boost to your development speed.
I’ve done CRUD, and the biggest proportion of the time went to custom business rules and UI tweaking (updating the design system). Those were commits with small diffs. The huge commits were done by copy-pasting, code generators, and heavy use of the IDE’s refactoring tools.
I mostly agree with your assessment of the industry. However, I think there are still more new and useful products to be built; they are just not “the next big thing”. Big tech management has been screwing this up in several ways:
1. Prioritizing bets on things that could be as profitable as social media or e-commerce instead of betting on products that deliver more incremental improvements.
2. Focusing on pricing everything as recurring revenue, thus increasing the lifetime cost for end users, instead of selling products at a discrete cost and providing end users value.
3. Optimizing for growth and controlling the vision of products instead of letting small groups of talented people slowly build products.
4. Treating people as fungible resources and moving them around all the time rather than letting them develop unique expertise.
As a result, any product that can’t achieve $10+ billion annual revenue within a couple of years with a ship of Theseus team is deemed a failure and scrapped.