I have to give this a try. My current model for backend is the same as how the author does frontend iteration. My friend does the research-plan-edit-implement loop, and there is no real difference between the quality of what I do and what he does. But I do like this just for how it serves as documentation of the thought process across AI/human, and it can be added to version control. Instead of humans reviewing PRs, perhaps humans can review the research/plan document.
On the PR review front, I give Claude the ticket number and the branch (or PR) and ask it to review for correctness, bugs and design consistency. The prompt is always roughly the same for every PR. It does a very good job there too.
Everyone seems to have different ways to deal with AI for coding, and different experiences with it. But Armin's comment quoted in the article is spot on. I have seen a friend do exactly the same thing: he vibe coded an entire product, hooked to Cursor, over three months. It is filled with features no one uses, and he feels very good about everything he built. Ultimately it's his time and money, but I would never want this in my company. While you can get very far with vibe coding, without a guiding hand and someone who understands what's really going on with the code, it ends in disaster.
I use AI for the mundane parts and for brainstorming bugs. It is actually more consistent than me in covering corner cases, making sure guard conditions exist etc. So I now focus more on design/architecture and what to build, and not on minutiae.
I recently asked Claude to make some kind of simple data structure and it responded with something like "You already have an abstraction very similar to this in SourceCodeAbc.cpp line 123. It would be trivial to refactor this class to be more generic. Should I?" I was pretty blown away. It was like a first glimpse of an LLM play-acting as someone more senior and thoughtful than the usual "cocaine-fueled intern."
HA is not about exceeding the limits of a server. It's about still serving traffic when that best server I bought goes offline (or has a failed memory chip, or a disk or... ).
Postgres replication, even in synchronous mode, does not maintain its consistency guarantees during network partitions. It's not a CP system - I don't think it would actually pass a Jepsen test suite in a multi-node setup[1]. No amount of tooling can fix this without a consensus mechanism for transactions.
Same with MySQL and many other "traditional" databases. It tends to work out because these failures are rare and you can get pretty close with external leader election and fencing, but Postgres is NOT easy (likely impossible) to operate as a CP system according to the CAP theorem.
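To make "external leader election and fencing" concrete, here is a minimal sketch of the fencing-token idea. It is a generic illustration, not how Postgres or any particular failover tool implements it; `LeaseService`, `FencedStore` and `StaleLeaderError` are hypothetical names made up for this example. The assumption is that the election service hands out a strictly increasing token with each leadership grant, and the storage layer refuses writes carrying a token older than one it has already seen.

```python
from dataclasses import dataclass, field


class StaleLeaderError(Exception):
    """Raised when a write carries an older fencing token than one already seen."""


@dataclass
class LeaseService:
    """Hypothetical stand-in for an external leader-election service."""
    _token: int = 0

    def acquire(self) -> int:
        # Every new leadership grant gets a strictly larger token.
        self._token += 1
        return self._token


@dataclass
class FencedStore:
    """Hypothetical storage layer that rejects writes from deposed leaders."""
    highest_token: int = 0
    data: dict = field(default_factory=dict)

    def write(self, token: int, key: str, value: str) -> None:
        # A token lower than the highest one seen means the writer lost leadership.
        if token < self.highest_token:
            raise StaleLeaderError(f"token {token} < {self.highest_token}")
        self.highest_token = token
        self.data[key] = value


if __name__ == "__main__":
    lease, store = LeaseService(), FencedStore()
    old_leader = lease.acquire()
    store.write(old_leader, "k", "from old leader")
    new_leader = lease.acquire()  # failover: a new leader is elected
    store.write(new_leader, "k", "from new leader")
    try:
        store.write(old_leader, "k", "late write")  # old leader wakes up
    except StaleLeaderError as exc:
        print("rejected:", exc)
```

This only narrows the window. The old leader can still acknowledge a write before the new token has reached the store, which is why it gets you close but does not replace a consensus mechanism for transactions.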
There are various attempts at fixing this (Yugabyte, Neon, Cockroach, TiDB, ...) which all come with various downsides.
It truly is. I've seen multiple pre-seed startups and founders burn significant amounts of personal capital in order to land an O-1A visa because of how much capital and mentorship is available here.
In my own anecdotal experience, Claude Code found a bug in production faster than I could. I was the author of said code, which was written by hand 4 years ago. GP's claim is perhaps not all that unsubstantiated. My role is moving more towards QA/PM nowadays.
For sure. Not hard fails, but bad fixes. It confidently thought it fixed a bug, but it really didn't. I could only tell (it was fairly complex), because I tried reproducing it before/after. Ultimately I believe there was not sufficient context provided to it. It has certainly failed to do what I asked it to do in round 1, round 2, but eventually got it right (a rendering issue for a barcode designer).
These incidents have become less and less frequent over the last year, and switching to Opus reduced the failures further. Same thing for code reviews. Most of it is fluff, but it does give useful feedback if the instructions are good. For example, I asked for a blind code review of a PR ("Review this PR"), and it gave some generic commentary. I made the prompt more specific ("Follow the API changes across modules and see impact") and it found a serious bug.
The number of times I had to give up in frustration has been going down over the last year. So I tend to believe a swarm of agents could do a decent job of autonomous development/maintenance over the next few years.
Modelwise, Opus 4.6 is scary good!