Right now I'm working two AI jobs. I build agents for enterprises and I teach agent development at a university. So I'm probably too deep in to see straight.
But I think the future of programming is English.
Agent frameworks are converging on a small set of core concepts: prompts, tools, RAG, agent-as-tool, agent handoff, and state/runcontext (an LLM-invisible KV store for sharing state across tools, sub-agents, and prompt templates).
These primitives, by themselves, can cover most low-UX application business use cases. And once your tooling can be one-shotted by a coding agent, you stop writing code entirely. The job becomes naming, describing, and instructing and then wiring those pieces together with something more akin to flow-chart programming.
So I think for most application development, the kind where you're solving a specific business problem, code stops being the relevant abstraction. Even Claude Code will feel too low-level for the median developer.
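The primitives listed above fit in a few lines of code; here is a hypothetical, framework-agnostic sketch (the names `Agent`, `RunContext`, and `lookup_invoice` are made up for illustration, not any specific SDK):

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class RunContext:
    # LLM-invisible KV store shared across tools, sub-agents, and templates
    state: dict[str, Any] = field(default_factory=dict)

@dataclass
class Agent:
    name: str
    instructions: str                        # the prompt: plain English
    tools: list[Callable] = field(default_factory=list)

    def as_tool(self) -> Callable:
        # agent-as-tool: expose a whole agent as one more callable primitive
        def run(ctx: RunContext, task: str) -> str:
            return f"[{self.name}] would handle: {task}"
        run.__name__ = self.name
        return run

def lookup_invoice(ctx: RunContext, invoice_id: str) -> str:
    ctx.state["last_invoice"] = invoice_id   # shared via runcontext, not the LLM
    return f"invoice {invoice_id}: $120, unpaid"

billing = Agent("billing_agent", "Resolve billing questions.", [lookup_invoice])
triage = Agent("triage_agent", "Route the user to the right specialist.",
               [billing.as_tool()])          # handoff by wiring, not by code
```

Everything application-specific here is a name, a description, or a wiring decision, which is the point.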
You think prompting is here to stay? SQL has survived a long time. Servlets haven't. We moved from assembly to higher-level languages. Flash couldn't make it. So I'm not sure how long we will be prompting. Sure, it looks great right now (just like Flash, servlets, and assembly looked back then), but I think another technology will emerge that is perhaps based on prompts behind the curtain but doesn't look like current prompting.
I would say prompting is not here to stay. It’s just temporary “tech”
> The job becomes naming, describing, and instructing and then wiring those pieces together with something more akin to flow-chart programming.
That's precisely what people are bad at. If people don't grasp (even intuitively) the concept of a finite state machine and the difference between states and logic, LLMs are more like a wishing well (vibes) than a code generator (tooling for engineering).
Then there's the matter of technical knowledge. Software is layers of abstraction, and there are already abstractions beneath the ones you work with. Not knowing them will limit your problem-solving capabilities.
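For what it's worth, the states-vs-logic distinction fits in a few lines. A minimal sketch using the classic turnstile (my example, not the parent's):

```python
from enum import Enum, auto

class State(Enum):        # the states: what the system *is* at any moment
    LOCKED = auto()
    UNLOCKED = auto()

# the logic: how events move the system between states
TRANSITIONS = {
    (State.LOCKED, "coin"): State.UNLOCKED,
    (State.UNLOCKED, "push"): State.LOCKED,
}

def step(state: State, event: str) -> State:
    # unknown (state, event) pairs leave the state unchanged
    return TRANSITIONS.get((state, event), state)
```

If you can't separate the `State` enum from the `TRANSITIONS` table in your head, no amount of prompting will wire the pieces together correctly.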
This is beautiful and highly readable but, still, I yearn for a detailed line-by-line explainer like the backbone.js source: https://backbonejs.org/docs/backbone.html
True for coding agents running SotA models where you're the human in the loop approving, less true for deployed agents running on cheap models where you don't see what's being executed.
Probably oversold here because if you read the fine print, the savings only come in cases when you don't need the bytes in context.
That makes sense for some of the examples they described (e.g. a QA workflow asking the agent to take a screenshot and put it into a folder).
However, this is not true for an active dev workflow where you actually do want it to see that elements are not lining up, are overlapping, or are misbehaving. So token savings are possible... if your use case doesn't require the bytes in context (which most active dev use cases probably do).
I’m just curious, what would need to happen for you to change your opinion about this? Are you basically of the opinion that it’s not good enough today, never will be good enough in the future, and we should just wind back the clock 3 years and pretend these tools don’t exist?
It feels to me like a lot of this is dogma. If the code is broken or needs more testing, that can be solved. But it’s orthogonal: the LLM can be used to implement the unit testing and fuzz testing that would beat this library into shape, if it’s not already there. It’s not about adding a human touch, it’s about pursuing completeness. And that’s true for all new projects going from zero to one, you have to ask yourself whether the author drove it to completeness or not. That’s always been true.
You want people to hedge their projects with disclaimers that it probably sucks and isn’t production worthy. You want them to fess up to the fact that they cheated, or something. But they’re giving it away for free! You can just not use it if you don’t want to! They owe you nothing, not even a note in the readme. And you don’t deserve more or less hacker points depending on whether you used a tool to generate the code or whether you wrote it by hand, because hacker points don’t exist, because the value of all of this is (and always will be) subjective.
To the extent that the modern tools and models can’t oneshot anything, they’re going to keep improving. And it doesn’t seem to me like there’s any identifiable binary event on the horizon that would make you change your mind about this. You’re just against LLMs, and that’s the way it is, and there’s nothing that anyone can do to change your mind?
I mean this in the nicest way possible: the world is just going to move on without you.
>I’m just curious, what would need to happen for you to change your opinion about this?
Imagine a machine that can calculate using logic circuits and one that uses a lookup table.
LLMs right now are the latter (please don't take this literally; it's just an example). You can argue that the lookup table is so huge that it works most of the time.
But I (and probably the parent commenter) need it to be the former. And that answers your question.
So it does not matter how huge the lookup table grows in the future, or how much more often it works: it is still a lookup table.
So people are divided into two groups right now. One group that goes by appearance, and one that goes by what the thing actually is fundamentally, despite the appearances.
right, so why are you asking me to imagine one machine that can calculate using logic circuits and another that can calculate using a lookup table when we’re in agreement that they’re the same thing?
I think you will get a better response to a slightly different analogy. In genetic programming (and in machine learning), we have a concept of "overfitting". Overfitting can be understood as a program memorizing too much of its test/training data (i.e. so it is acting more like an oracle than a computation). This, intuitively, becomes less of a problem the greater the training-dataset becomes, but the problem will always be there. Noticing the problem is like noticing the invisible wall at the edge of the game-world.
The most insightful thing about LLMs is just how _useful_ overfitting can be in practice, when applied to the entire internet. In some sense, stack-overflow-driven development (widespread throughout the industry since at least 2012) was an indication that much of a programmer's job was finding specific solutions to recurring problems that never seem to get permanently fixed (mostly for reasons of culture, conformity, and churn in the ranks).
The more I see the LLM-ification of software unfold (essentially an attempted controlled demolition of our industry and our culture), the more I think about Arthur Whitney (inventor of the K language and others). In this interview[1], he said two interesting things: (1) he likened programming to poetry, and (2) he said that he designed his languages to not have libraries, and everybody builds from the 50 basic operators that come with the language, resulting in very short programs (in terms of both source code size and compiled/runtime code size).
I wonder if our tendency to depend on libraries of functions, counterintuitively results in more source code (and more compiled/runtime code) in the long run -- similarly to how using LLMs for coding tends to be very verbose as well. In principle, libraries are collections of composable domain-verbs that should allow a programmer to solve domain-problems, and yet, it rarely feels that way. I have ripped out general libraries, and replaced them with custom subroutines more times than I can count, because I usually need a subset of functionality, and I need it to be correct (many libraries are complex and buggy because they have some edge-cases [for example, I once used an AVL library that would sometimes walk the tree in reverse instead of from least to greatest -- unfortunately, the ordering mattered, and I wrote a simpler bespoke implementation]).
Arguably, a buggy program or a buggy library or a buggy function, is just an overfit program, or library, or function (it is overfit to the mental-model of the problem-space in the library writer's mind). These overfit libraries, which are often used as blackboxes by someone rushing to meet a deadline, often result in programs that are themselves overfit to the buggy library, creating _less_ modularity instead of more. _Creating_ an abstraction is practically free, but maintaining it and (most disappointingly) _using_ it has real, often permanent long term costs. I have rarely been able to get two computers, that were meant to share data with NFS, to do so reliably, if they were not running the same exact OS (because the NFS client and server of each OS are bug-compatible, are overfit to each other).
In fact the rise of VMWare, and the big cloud companies, and containerization and virtualization technologies is, conceivably, caused by this very tendency to write software that is overfit to other software (the operating system, the standard library [on some OSes emacs has to be forced to link to glibc, because using any other memory allocator causes it to SEGFAULT, and don't get me started on how no two browser-canvases return the same output in different browser _nor_ on the same browser in a different OS]). (Maybe, just as debt keeps the economy from collapsing, technical debt is the only thing that keeps Silicon Valley from collapsing.)
In some ways, coding-LLMs exaggerate this tendency towards overfitting in comical ways, like fun-house mirrors. And now, a single individual, with nothing but a dream, can create technical debt at the same rate as a thousand employee software company could a decade ago. What a time to be alive.
You’ve gotta read the code. It doesn’t matter how it got there but if you don’t fully understand it (which implies reading it) don’t get mad when you try to push slop on people. It’s the equivalent of asking an LLM to write an email for somebody else to read that you didn’t read yourself. It’s basic human trust - of course people get annoyed with you. You’re untrustworthy.
I tried to control LLM output quality by different means, including fuzzing. I've had several cases where the LLM "cheated" on that too. So I have my own shades and grades of confidence that the code is not BS.
But once you told it to stop cheating, did it eventually figure it out? I mean, correctly implementing fuzzer support for a project is entirely within the wheelhouse of current models. It’s not rocket science.
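A property-based harness is the kind of "fuzzer support" in question: properties are harder to cheat than example tests. A minimal stdlib-only sketch (the function under test, `slug`, is a hypothetical stand-in for LLM-generated code):

```python
import random
import string

def slug(s: str) -> str:
    # hypothetical LLM-written function under test
    return "".join(c.lower() if c.isalnum() else "-" for c in s)

def fuzz(rounds: int = 1000, seed: int = 0) -> None:
    rng = random.Random(seed)
    for _ in range(rounds):
        s = "".join(rng.choice(string.printable)
                    for _ in range(rng.randrange(40)))
        out = slug(s)
        # properties, not memorized examples: length is preserved, and the
        # output alphabet is restricted to lowercase alphanumerics and "-"
        assert len(out) == len(s)
        assert all(c.isalnum() or c == "-" for c in out)
```

A model can special-case a fixed expected output; it's much harder to fake its way past a thousand random inputs checked against invariants.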
There never was a cohesive generic open source community. There are no meaningful group norms. This was and always will be a fiction.
I’m tempted to just start putting co-authored-by: Claude in every commit I make, even the ones that I write by hand, just to intentionally alienate people like you.
The best guardrails are linters, autoformatters, type checkers, static analyzers, fuzzers, pre-commit rules, unit tests and coverage requirements, microbenchmarks, etc. If you genuinely care about open source code quality, you should be investing in improving these tools and deploying them in the projects you rely on. If the LLMs are truly writing bad or broken code, it will show up here clearly.
But if you can’t rephrase your criticism of a patch in terms of things flagged by tools like those, and you’re not claiming there’s something architecturally wrong with the way it was designed, you don’t have a criticism at all. You’re just whining.
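A guardrail stack like that usually gets wired together with some kind of pre-commit runner. A stripped-down sketch (the commands below are harmless placeholders; swap in your actual linter, type checker, and test runner):

```python
import subprocess
import sys

# placeholder commands; in a real project these would be e.g.
# ["ruff", "check", "."], ["mypy", "src"], ["pytest", "-q"]
GUARDRAILS = [
    ("lint",  [sys.executable, "-c", "pass"]),
    ("types", [sys.executable, "-c", "pass"]),
    ("tests", [sys.executable, "-c", "assert 1 + 1 == 2"]),
]

def run_guardrails(checks) -> list[str]:
    # run every check and collect the names of the ones that failed
    failures = []
    for name, cmd in checks:
        if subprocess.run(cmd, capture_output=True).returncode != 0:
            failures.append(name)
    return failures
```

The point is that the output is mechanical: either the failure list is empty or it names exactly which tool flagged the patch, which is the form of criticism being asked for.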
> There never was a cohesive generic open source community. There are no meaningful group norms. This was and always will be a fiction.
It's always been a bit splintered, but it was generally composed of 95%+ people who know how to program. That is no longer the case in any sense.
> I’m tempted to just start putting co-authored-by: Claude in every commit I make, even the ones that I write by hand, just to intentionally alienate people like you.
I mean it sounds like you are already using claude for everything so this is probably a bit of a noop lol.
> But if you can’t rephrase your criticism of a patch in terms of things flagged by tools like those, and you’re not claiming there’s something architecturally wrong with the way it was designed, you don’t have a criticism at all. You’re just whining.
No, because doing that requires MORE rigor and work than the LLM-driven project had put into it. That difference in effort is not tenable: it's shallow work being shown, so it's shallow criticism being thrown at it.
All sense of depth and integrity is gone.
Yes, it was always about writing useful programs for computers. Which is why people moan about the use of LLMs: because then the writing aspect is gone!
Anyway, this stuff will resolve itself, one way or another.
Sorry but these are just not accurate as blanket statements anymore, given how good the models have gotten.
As other similar projects have pointed out, if you have a good test suite and a way for the model to validate its correctness, you can get very good results. And you can continue to iterate, optimize, code review, etc.
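Concretely, "a way for the model to validate its correctness" just means a runnable suite it can execute after each change and iterate against. A minimal stdlib sketch (`median` is a hypothetical stand-in for whatever the model generated):

```python
import unittest

def median(xs):
    # candidate implementation the model must keep green
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2

class TestMedian(unittest.TestCase):
    def test_odd(self):
        self.assertEqual(median([3, 1, 2]), 2)

    def test_even(self):
        self.assertEqual(median([4, 1, 2, 3]), 2.5)

def suite_passes() -> bool:
    # the agent runs this after every edit and keeps going until it's True
    tests = unittest.defaultTestLoader.loadTestsFromTestCase(TestMedian)
    return unittest.TextTestRunner(verbosity=0).run(tests).wasSuccessful()
```

The quality of the result then tracks the quality of the suite, which is exactly where the human effort goes.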
I'm not really trolling. I'm trying to push people to consider that the world is already in a state where "I used AI" is neither binary nor dispositive. I think we're used to a 2023 to mid-2025 framing where outside of some narrow, highly structured cases, the code is garbage.
If that's still true as a binary now, it won't be for long. As the robot likes to say, some of these changes are "in flight".
Showing the prompts is not feasible when using agentic coding tools. I suppose one could persist all chat logs ever used in the project, but is that even useful?
I think it would be useful. I see lots of comments like "it one-shotted this" and am curious if they just had to write one sentence or many pages of instructions.
never mind the fact that the model is constantly reseeding itself against the files it’s reading from your working directory, so the prompts are useless on their own.
AI often produces nonsense that a human wouldn't. If a project was written using AI the chances that it is a useless mess are significantly higher than if it was written by a human.
I craft a detailed and ordered set of lecture notes in a Quarto file and then have a dedicated claude code skill for translating those notes into Slidev slides, in the style that I like.
Once that's done, much like the author, I go through the slides and make commented annotations like "this should be broken into two slides" or "this should be a side-by-side" or "use your generate clipart skill to throw an image here alongside these bullets" and "pull in the code example from ../examples/foo." It works brilliantly.
And then I do one final pass of tweaking after that's done.
But yeah, annotations are super powerful. Token distance in-context and all that jazz.
Quarto can be used to output slides in various formats (Powerpoint, beamer for pdf, revealjs for HTML, etc.). I wonder why you use Slidev as you can just ask Claude Code to create another Quarto document.
It looks like Slidev is designed for presentations about software development, judging from its feature set. Quarto is more general-purpose. (That's not to say Quarto can't support the same features, but currently it doesn't.)
I'm not affiliated with Slidev. I was just curious.
Not yet... but also I'm not sure it makes a lot of sense to be open source. It's super specific to how I like to build slide decks and to my personal lecture style.
But it's not hard to build one. The key for me was describing, in great detail:
1. How I want it to read the source material (e.g., H1 means new section, H2 means at least one slide, a link to an example means I want code in the slide)
2. How to connect material to layouts (e.g., "comparison between two ideas should be a two-cols-title," "walkthrough of code should be two-cols with code on right," "learning objectives should be side-title align:left," "recall should be side-title align:right")
Then the workflow is:
1. Give all those details and have it do a first pass.
2. Give tons of feedback.
3. At the end of the session, ask it to "make a skill."
4. Manually edit the skill so that you're happy with the examples.
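The reading rules in step 1 are mechanical enough to sketch. A hypothetical parser for the H1/H2 convention described above (Slidev/Quarto specifics omitted):

```python
def outline_to_slides(markdown: str) -> list[dict]:
    # H1 -> new section slide, H2 -> at least one content slide,
    # per the convention in step 1 above
    slides = []
    section = None
    for line in markdown.splitlines():
        if line.startswith("# "):
            section = line[2:].strip()
            slides.append({"layout": "section", "title": section})
        elif line.startswith("## "):
            slides.append({"layout": "default",
                           "title": line[3:].strip(),
                           "section": section})
    return slides
```

In practice the skill expresses these rules in prose and the model applies them; writing them down this precisely is what makes the first pass reliable.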
As a SE with over 15 years' professional experience, I find myself pointing out dumb mistakes to even the best frontier models in my coding agents to refine the output. A "coder" who is not doing this on the regular is only a tool of their tool.
(in my mental model, a "vibe coder" does not do this, or at least does not do it regularly)
Well, the term lacks clarity and its meaning has shifted.
If you define "vibe-coders" as people who just write prompts and don't look at code - no, they ain't coders now.
But if you mean people who do LLM-assisted coding but still read the code (like all of those who are upset by this change), then sure, they always have been coders.
Sure, but it also guarantees that people will think twice about buying their service. Support should have reached out and informed them about whatever they did wrong, but I can't say that I'm surprised that an AI company wouldn't have any real support.
I'd agree with you that if you rely on an LLM to do your work, you better be running that thing yourself.
Not sure what your point is. They have the right to kick OP out. OP has the right to post about it. We have a right to make decisions on what service to use based on posts like these.
Pointing out whether someone can do something is the lowest form of discourse, as it's usually just tautological. "The shop owner decides who can be in the shop because they own it."
"I can't remember where I heard this, but someone once said that defending a position by citing free speech is sort of the ultimate concession; you're saying that the most compelling thing you can say for your position is that it's not literally illegal to express."
The article describes data showing a correlation between Ozempic use and slowed progression of certain brain conditions. The study aimed to determine whether that effect came from Ozempic itself or simply from weight loss. Once researchers controlled for weight loss, the effect disappeared. In other words, correlation, not causation.
That's an important caveat. But effectively it sounds like Ozempic typically results in a better diet, and a better diet typically results in slowed progression.
The next IDE looks like Google Docs.