My manager thinks if we give it a year or two, no one will write code by hand anymore, we will just generate everything from specifications in English.
At the current rate of progress, that is a reasonable thing to expect. But I'd say give it two to three years, myself. This kind of wholesale paradigm shift tends to take longer than you think it will, and then, once it happens, it tends to happen faster than you think it will.
Except for things like hardware drivers, most of the code that will ever need to be written already has been. It will just need to be refactored and recast for new systems and applications, and current-gen LLMs are already extremely good at that.
The line that separates specifications and source code will get increasingly blurry over the next couple of years, eventually reaching a point where it's no longer worth arguing about.
What are some of your reasons? The usual ones don't seem to be holding up well, but I'm certainly interested in new insights. Obviously you disagree with your manager, but that could be due to any number of things.
That sounds awful... Thankfully our CTO is quite supportive of our team's anti-AI policy and is even supportive of posting our LLM ban on job postings. I honestly don't think that I could operate in an environment with any sort of AI mandate...
I guess time will tell, but so far none of the AI output we've seen is any good. We don't like to adopt technologies based on hype, so if it proves itself it will be adopted, but until then it's a toy.
Exactly! Also, operating "at scale" is only impressive if you can do it with comparable speed and uptime; it doesn't mean much if every page takes seconds to load and it falls over multiple times a day lol
Ironic that the same AI you're mentioning is probably a large part of why this class of outages is increasing. I'd highly recommend folks understand their infrastructure well enough to set up and run it without AI before they put anything critical on it.
Sure. I can agree with that. At the same time, the reason people aren't doing it is not solely a skill issue. It's also a matter of time, energy, and what you want to prioritise.
I believe I have good enough control over it to fix issues that may arise. But then again, CC will probably do it faster. I will most likely not need to fix my own issues, but if needed, I think I will be able to.
"Critical" plays an important role in what you're saying. The true core of any business is something you should have good control over. You should also accept that less important parts are OK for AI to handle.
I think the non-critical part is a larger part than most people think.
We are lagging behind in understanding what AI can handle for us.
I'm an optimistic grey beard, even if the writing makes me sound like a naive youth :)
I mean, not necessarily proprietary, right? There are OSS solutions like Forgejo that make it pretty simple, at least as simple as running a Git system and a standalone CI system.
I mean, that is certainly better, but I still don't like having them coupled. Webhooks were a great idea, and everyone seems to have forgotten about them.
Wasn't GitHub supposed to be doing a feature freeze while they move to Azure?
They certainly could use it, as their stability has plummeted. After moving to a self-hosted Forgejo I'll never go back. My UI is instant, my actions are faster than they ever were on GH (with or without accelerators like Blacksmith.sh), I don't constantly get AI nonsense crammed into my UI, and I have way better uptime, all with almost no maintenance (mostly thanks to uCore)...
GH just doesn't really have much of a value proposition for anything that isn't a non-trivial, star-gathering-obsessed project, IMO...
I mean, I just don't see any evidence of that happening. TBF I'm a SWE so I can only speak to that segment, but it's literally worse than useless for working with anything software-related that's non-trivial...
I see that sentiment here all the time and I don't understand what you must be doing; our projects are far from trivial and we get a lot of benefit from it in the SWE teams. Our software infra was always (almost 30 years) made to work well with outsourcing teams, so maybe that's it, but I can't understand how you can have quite such bad results.
Butting in here, but as I have the same sentiment as monkaiju: I'm working on a legacy (I can't emphasize this enough) Java 8 app that's doing all sorts of weird things with class loaders and dynamic entities which, among other things, is holding it on Java 8. It has over ten years of development cruft all over it, and code coverage of maybe 30-40% depending on when you measure it in the 6+ years I've been working with it.
This shit was legacy when I was a wee new hire.
GitHub Copilot has been great at getting that code coverage up marginally but ass otherwise. I could write you a litany of my grievances with it, but the main one is how it keeps inventing methods when writing feature code. For example, in a given context, it might suggest `customer.getDeliveryAddress()` when it should be `customer.getOrderInfo().getDeliveryInfo().getDeliveryAddress()`. It's basically a dice roll whether it will remember this the next time I need a delivery address (but perhaps no surprises there). I noticed that if I needed a different address in the interim (like a billing address), it's more likely to get confused between getting a delivery address and a billing address. Sometimes it would even think the address is in the request arguments (so it would suggest something like `req.getParam("deliveryAddress")`), and this happens even when the request is properly typed!
I can't believe I'm saying this but IntelliSense is loads better at completing my code for me as I don't have to backtrack what it generated to correct it. I could type `CustomerAddress deliveryAddress = customer` let it hang there for a while and in a couple of seconds it would suggest to `.getOrderInfo()` and then `.getDeliveryInfo()` until we get to `.getDeliveryAddress()`. And it would get the right suggestions if I name the variable `billingAddress` too.
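For the curious, the shape of the model I'm fighting looks roughly like this (all class and method names here are invented for illustration; the real code is nothing I can share). The point is that the address only exists at the end of a chain of typed getters, so the flat `customer.getDeliveryAddress()` Copilot keeps suggesting doesn't even compile, while IntelliSense can only ever offer methods that actually exist on each type:

```java
// Hypothetical sketch of the domain model described above; all names invented.
class CustomerAddress {
    private final String street;
    CustomerAddress(String street) { this.street = street; }
    String getStreet() { return street; }
}

class DeliveryInfo {
    private final CustomerAddress deliveryAddress;
    DeliveryInfo(CustomerAddress deliveryAddress) { this.deliveryAddress = deliveryAddress; }
    CustomerAddress getDeliveryAddress() { return deliveryAddress; }
}

class OrderInfo {
    private final DeliveryInfo deliveryInfo;
    OrderInfo(DeliveryInfo deliveryInfo) { this.deliveryInfo = deliveryInfo; }
    DeliveryInfo getDeliveryInfo() { return deliveryInfo; }
}

class Customer {
    // Note: no getDeliveryAddress() here. The only route to an address
    // is the full chain, which is exactly what Copilot kept forgetting.
    private final OrderInfo orderInfo;
    Customer(OrderInfo orderInfo) { this.orderInfo = orderInfo; }
    OrderInfo getOrderInfo() { return orderInfo; }
}

public class Main {
    public static void main(String[] args) {
        Customer customer = new Customer(
                new OrderInfo(new DeliveryInfo(new CustomerAddress("221B Baker St"))));
        // The only valid path to the delivery address:
        CustomerAddress deliveryAddress =
                customer.getOrderInfo().getDeliveryInfo().getDeliveryAddress();
        System.out.println(deliveryAddress.getStreet());
    }
}
```

Since every hop is a distinct type, a completion engine that just walks the type signatures gets this right every time; a model working from surface statistics keeps hallucinating the shortcut.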
"Of course you have to provide it with the correct context/just use a larger context window" If I knew the exact context Copilot would need to generate working code, that eliminates more than half of what I need an AI copilot in this project for. Also if I have to add more than three or four class files as context for a given prompt, that's not really more convenient than figuring it out by myself.
Our AI guy recently suggested a tool that would take in the whole repository as context. Kind of like Sourcebot (maybe it was Sourcebot?), but the exact name escapes me atm. In any case, it failed: either there were still too many tokens to process or, more likely, the project was too complex for it. The thing with this project is that although it's a monorepo, it still relies on a whole fleet of external services and libraries to do some things. Some of these services we have the source code for, but most we don't, so even in the best case "hunting for files to add to the context window" just becomes "hunting for repos to add to the context window". Scaling!
As an aside, I tried to greenfield some apps with LLMs. I asked Codex to develop a minimal single-page app for a simple internal lookup tool. I emphasized minimalism and code clarity in my prompt. I told it not to use external libraries and rely on standard web APIs.
What it spewed forth is the most polished single-page internal tool I have ever seen. It is, frankly, impressive. But it only managed to do so because it basically spat out the most common Bootstrap classes and recreated the W3Schools AJAX tutorial and put it all in one HTML file. I have no words and I don't know if I must scream. It would be interesting to see how token costs evolve over time for a 100% vibe-coded project.
Copilot is notoriously bad. Have you tried (paid plans of) Codex, Claude, or even Gemini on your legacy project? That's the bare minimum before debating the usefulness of AI tools.
"notoriously bad" is news to me. I find no indication from online sources that would warrant the label "notoriously bad".
https://arxiv.org/html/2409.19922v1#S6 from 2024 concludes it has the highest success rate in easy and medium coding problems (with no clear winner for hard) and that it produces "slightly better runtime performance overall".
> Have you tried (paid plans) codex, Claude or even Gemini on your legacy project?
This is usually the part of the pitch where you tell me why I should even bother especially as one would require me to fork up cash upfront. Why will they succeed where Copilot has failed? I'm not asking anyone to do my homework for me on a legacy codebase that, in this conversation, only I can access---that's outright unfair. I'm just asking for a heuristic, a sign, that the grass might indeed be greener on that side. How could they (probably) improve my life? And no, "so that you pass the bare minimum to debate the usefulness of AI tools" is not the reason because, frankly, the less of these discussions I have, the better.
I'm saying this to help you. Whether you give it a shot makes no difference to me. This topic is being discussed endlessly every day on all major platforms, and for the past year or so the consensus has been strongly against using Copilot.
If you want to see whether your project and your work can benefit from AI, you must use Codex, Claude Code, or Gemini (which wasn't a contender until recently).
> This topic is being discussed endlessly everyday on all major platforms and for the past year or so the consensus is strongly against using copilot.
So it would be easy to link me to something that shows this consensus, right? It would help me see what the "consensus" has to say about the known limitations of Copilot too. It would help me see the "why" that you seem allergic to even hint at.
Look, I'm trying not to be close-minded about LLMs, which is why I'm taking time out of my Sunday to see what I might be missing, and why I don't want to invest time/money in yet-another-LLM just for the "privilege" of debating the merits of LLMs in software engineering. If I'm to invest time/money in another coding LLM, I need a signal, a reason why it might be better than Copilot for helping me do my job. Either tell me where Copilot is lacking or where your "contenders" have the upper hand. Why is it a "must" to use Codex/Claude/Gemini, other than trustmebro?
I couldn't tell you, because I've kept it at arm's length, but over the last year our most enthusiastic "AI guy" (as well as another AI user on the team) has churned through quite a few, usually saying something like "$NEW_MODEL is much better!" before littering garbage PRs all over the project.
I mean, I'm 10-ish years in, so I probably have another couple of decades at least, and I never use AI assistants for anything. I'm also one of the two highest-performing team members, and the other doesn't use it either.
It's a great time to be a non-AI user, and even better to have never been an AI user at all, because it's easier now than ever to differentiate oneself from those who are reliant on it and, over the long run, much less effective because of it.