> [Harper Reed] cautioned against being overly precious about the value of deeply understanding one’s code, which is no longer necessary to ensure that it works.
That just strikes me as an odd thing to say. I’m convinced that this is the dividing line between today’s software engineers and tomorrow’s AI engineers (in whatever form that takes - prompt, vibe, etc.). Reed’s statement feels very much like a justification of “if it compiles, ship it!”
> “It would be crazy if in an auto factory people were measuring to make sure every angle is correct,” he said, since machines now do the work. “It’s not as important as when it was a group of ten people pounding out the metal.”
Except that the machines doing that work aren’t regularly hallucinating angles, spurious welding joints, etc.
Also, you know who did measure every angle to make sure it was correct? The engineers who put together the initial design. They sure as hell took their time getting every detail of the design right before it ever made it to the assembly line.
You misconstrue the analogy. The robot isn’t equivalent to the code in this analogy. It’s the thing that generates the code.
The robot operates deterministically: it has a fixed input and a fixed output. This is what makes it reliable.
Your “AI coder” is nothing like that. It’s non-deterministic on its best day, and it gets everything thrown at it, making it even more of a coin toss. This seriously undermines any expectation of reliability.
The guy’s comparison shows a lack of understanding of both systems.
I totally understand that inversion but I think it's a bad analogy.
Industrial automation works by taking a rigorously specified design developed by engineers and combining it with rigorous quality control processes to ensure the inputs and outputs remain within tolerances. You first have to have a rigorous spec; then you can design a process for manufacturing a lot of widgets while checking 1 out of every 100 of them against their tolerances.
You can only get away with not measuring a given angle on widget #13525 because you're producing many copies of exactly the same thing and you measured that angle on widget #13500 and widget #13400 and so on and the variance in your sampled widgets is within the tolerances specified by the engineer who designed the widget.
There's no equivalent to the design stage or to the QC stage in the vibe-coding process advocated for by the person quoted above.
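To make the sampling point concrete, here's a toy sketch (hypothetical widget counts and tolerances, nothing from a real line):

```python
import random

# Hypothetical spec from the design engineer: nominal angle and allowed tolerance.
NOMINAL_ANGLE = 90.00   # degrees
TOLERANCE = 0.05        # +/- degrees

def measure_angle(widget_id: int) -> float:
    """Stand-in for the metrology step; a real line uses a CMM or laser gauge."""
    return NOMINAL_ANGLE + random.gauss(0, 0.01)

# Check 1 widget out of every 100; everything between samples rides on the
# assumption that the process variance stays inside the engineer's tolerance.
for widget_id in range(0, 20_000, 100):
    angle = measure_angle(widget_id)
    if abs(angle - NOMINAL_ANGLE) > TOLERANCE:
        print(f"widget #{widget_id} out of tolerance at {angle:.3f} deg: stop the line")
```

The sampling only works because the spec and the tolerances came first; there's nothing equivalent to either in the workflow being advocated.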
I don't know what you mean by "the code it creates is deterministic", but the process an LLM uses to generate code based on an input is definitely not entirely deterministic.
To put it simply, the chances that an LLM will output the same result every time given the same input are low. The LLM does not operate deterministically, unlike the manufacturing robot, which will output the same door panel every single time. Or as ChatGPT put it:
> The likelihood of an LLM like ChatGPT generating the exact same code for the same prompt multiple times is generally low.
For any given seed value, the output of an LLM will be identical: it is deterministic. You can try this at home with Llama.cpp by specifying a seed value when you load an LLM, and then seeing that for a given input the output will always be the same. Of course there may be some exceptions (cosmic ray bit flips). Also, if you are only using online models, you can't set the seed value, plus there are multiple models, so multiple seeds. In summary, LLMs are deterministic.
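If you want to try it yourself, here's a minimal sketch using the llama-cpp-python bindings (the model path and seed are placeholders; temperature 0 makes the decoding greedy, and with sampling turned on a fixed seed plays the same role):

```python
from llama_cpp import Llama

# Load the model with a fixed seed; greedy decoding removes sampling randomness.
llm = Llama(model_path="./model.gguf", seed=80085, verbose=False)

out1 = llm("hi", max_tokens=32, temperature=0.0)["choices"][0]["text"]
out2 = llm("hi", max_tokens=32, temperature=0.0)["choices"][0]["text"]

# Same weights, same input, same settings, same hardware -> identical output.
print(out1 == out2)
```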
> the process an LLM uses to generate code based on an input is definitely not entirely deterministic
Technically correct is the least useful kind of correct when it's wrong in practice. And in practice the process AI coding tools use to generate code is not deterministic, which is what matters. To make matters worse for the comparison with a manufacturing robot, even the input is never the same. While a robot gets the exact command for a specific motion and the exact same piece of sheet metal, in the same position, a coding AI is asked to work with varied inputs and on varied pieces of code.
Even stamping metal could be called "non-deterministic" since there are guaranteed variations, just within determined tolerances. Does anyone define tolerances for generated code?
That's why the comparison shows a lack of understanding of both systems.
I don't really understand your point.
An LLM is loaded with a seed value, which is a number. The number may be chosen through some pseudo- or random process, or specified manually. For any given seed value, say 80085, the LLM will always and exactly generate the same tokens. It is not like stamped sheet metal, because it is digital information not matter. Say you load up R1, and give it a seed value of 80085, then say "hi" to the model. The model will output the exact same response, to the bit, same letters, same words, same punctuation, same order. Deterministic.
There is no way you can say that an LLM is non-deterministic, because that would be WRONG.
First you're assuming a brand new conversation: no context. Second you're assuming a local-first LLM because a remote one could change behavior at any time. Third, the way the input is expressed is inexact, so minor differences in input can have an effect. Fourth, if the data to be operated on has changed you will be using new parts of the model that were never previously used.
But I understand how nuance is not as exciting as using the word WRONG in all caps.
Arguing with "people" on the internet...
Nuance is definitely a word of the year, and if you look at many models you can actually see its high probability.
Addressing your comment: there was no assumption or indication on my part that determinism only applies to a new "conversation". Any interaction with any LLM is deterministic, for the same conversation and any given seed value. Yes, I'm talking about local systems, because how are you going to know what is going on on a remote system?
On a local system, a local LLM, if the input is expressed in the same way, the output will be generated in the same way, for all of the token context and so on.
That means, for a seed value, after "hi", the model may say "hello", and then the human's response may be "how ya doin'", and then the model would say "so so, how ya doin?", and every single time, if the human or agent inputs the same tokens, the model will output the same tokens, for a given seed value. This is not really up for question, or in doubt, or really anything to disagree about. Am I not being clear? You can ask your local LLM or remote LLM and they will certainly confirm that the process by which a language model generates is deterministic, by definition. Same input means same output; again I must mention that the exception is hardware bit flips, such as those caused by cosmic rays, and that's just to emphasize how very deterministic LLMs are. Of course, as you may know, online providers stage and mix LLMs, so for sure you are not going to be able to know that you are wrong by playing with chatgpt, grok/q, gemini, or whatever other online LLMs you are familiar with. If you have a system capable of offline or non-remote inference, you can see for yourself that you are wrong when you say that LLMs are non-deterministic.
I feel this is technically correct but intentionally cheating. No one - including the model creators - expects that to be the interface; it undermines the entire value proposition of using an LLM in the first place if I need to engineer the inputs to ensure reproducibility. I'd love to hear some real world scenarios that do this where it wouldn't be simpler to NOT use AI.
When should a model's output be deterministic?
When should a model's output be non-deterministic?
When many humans interact with the same model, then maybe the model should try different seed values, and make measurements.
When model interaction is limited to a single human, then maybe the model should try different seed values, and make measurements.
An entire generation of devs grew up using unaudited, unverified, unknown-license code, which at a moment's notice can be sold to a threat actor.
And I've seen devs try to add packages to the project without even considering the source. Using forks of forks of forks, without considering the root project. Or examining whether it's just a private fork, or which fork is most active and updated.
If you don't care about that code, why care about AI code? Or even your own?
After putting off learning JS for a decade, I finally bit the bullet since I can talk to an LLM about it while going through the slog of getting a mental model up and running.
After a month, I can say that the inmates run that whole ecosystem, from the language spec, to the interpreter, to packaging. And worse, the tools for everyone else have to cater to them.
I can see why someone who has never had a stable foundation to build a project on would view vibe coding as a good idea. When you're working in an ecosystem where any project can break at any time because some dependency pushed a breaking minor version bundled with a security fix for a catastrophic exploit, rolling the LLM gacha to see if it can get it working isn't the worst idea.
Since you mention JS specifically, I think it's important to separate that from the framework ecosystem. I'd suspect that most LLMs don't, which is part of the problem. I had a similar experience with Python lately, where the LLM-generated code (once I could get it to run) was what I would generously evaluate as "Excel VBA Macro quality". It does the task - for now - but I didn't learn much about what production-grade Python would look like.
This is an underrated comment. Whose job is it to do the thinking? I suppose it's still the software engineer, which means the job comes down to "code prompt engineer" and "test prompt engineer".
Wild times where a task that used to be described as "good at using google" now gets the title of "Engineer". It was bonkers enough when software devs co-opted the title.
I mean, building applications that are maintainable, fail gracefully, and keep costs low has all the same needs as any classic engineering discipline. You could spend just as much time designing a well thought out CLI as it could take to design a bridge or a sewer system.
Whether people do, or not, is a different question.
I just finished creating a multiplayer online party game using only Claude Code. I didn't edit a single line. However, there is no way someone who doesn't know how to code could get where I am with it.
You have to have an intuition for the sources of a problem. You need to be able to at least glance at the code and understand when and where the AI is flailing, so you know to backtrack or reframe.
Without that you are just as likely to totally mess up your app. Which also means you need to understand source control, when to save, and how to test methodically.
I was thinking of that, but asking the right questions and learning the problem domain just a little bit ("getting the gist of things") will help a complete newbie generate code for complex software.
For example in your case there is the concept of message routing where a message that gets sent to the room is copied to all the participants.
You have timers, animation sheets, events, triggers, etc.
A question that extracts such architectural decisions and relevant pieces of code will help the user understand what they are actually doing and also help debug the problems that arise.
It will of course take them longer, but it is possible to get there.
So I agree, but we aren't at that level of capability yet. Because at some point currently it inevitably hits a wall and you need to dig deeper to push it out of the rut.
Hypothetically, if you codified the architecture as a form of durable meta tests, you might be able to significantly raise the ceiling.
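By "durable meta tests" I mean something like encoding the architectural rules as a test the agent has to keep green; a rough sketch (layer and package names are made up):

```python
# test_architecture.py - a sketch of "architecture as durable tests".
# The layering rules live in a test the agent must keep green, not in a doc it never reads.
import ast
from pathlib import Path

# directory -> import prefixes that code in that directory must not use
FORBIDDEN = {
    "app/domain": ("app.api", "app.db"),   # domain layer must not reach outward
    "app/api": ("app.db",),                # handlers go through the domain, not the db
}

def imports_of(path: Path) -> set[str]:
    """Collect every module name imported by one source file."""
    names = set()
    for node in ast.walk(ast.parse(path.read_text())):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module)
    return names

def test_layering():
    for directory, banned in FORBIDDEN.items():
        for source in Path(directory).rglob("*.py"):
            bad = {m for m in imports_of(source)
                   if any(m == b or m.startswith(b + ".") for b in banned)}
            assert not bad, f"{source} imports forbidden layers: {bad}"
```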
Decomposing to interfaces seems to actually increase architectural entropy instead of decrease it when Claude Code is acting on a code base over a certain size/complexity.
So yes and no. I often just let it work by itself. Towards the very end when I had more of a deadline I would watch and interrupt it when it was putting implementations in places that broke its architecture.
I think only once did I ever give it an instruction that was related to a handful of lines (There certainly were plenty of opportunities, don't get me wrong).
When troubleshooting I did occasionally read the code. There was an issue with player-to-player matching where it was just kind of stuck, and I gave it a simpler solution (conceptually, not actual code) that worked for the design constraints.
I did find myself hinting/telling it to do things like centralize the CSS.
It was a really useful exercise in learning. I'm going to write an article about it. My biggest insight is that "good" architecture for a current-generation AI is probably different than for humans because of how attention and context work in the models/tools (at least for the current Claude Code). Essentially "out of sight, out of mind" creates a dynamic where decomposing code leads to an increase in entropy when a model is working on it.
I need to experiment with other agentic tools to see how their context handling impacts possible scope of work. I extensively use GitHub Copilot, but I control scope, context, and instructions much tighter there.
I hadn't really used hands-off automation much in the past because I didn't think the models were at a level where they could handle a significantly sized unit of work. Now they can, with large caveats. There also is a clear upper bound with Claude Code, but that can probably be significantly improved by better context handling.
So if you're an experienced, trained developer you can now add AI as a tool to your skill set? This seems reasonable, but it is also a fundamentally different statement than what every. single. executive. is parroting to the echo chamber.
I have a strong memory from the start of my career, when I had a job setting up Solaris systems and there was a whispered rumour that one of the senior admins could read core files. To the rest of us, they were just junk that the system created when a process crashed and that we had to find and delete to save disk space. In my mind I thought she could somehow open the files in an editor and "read" them, like something out of the Matrix. We had no idea that you could load them into a debugger which could parse them into something understandable.
I once showed a reasonably experienced infrastructure engineer how to use strace to diagnose some random hangs in an application, and it was like he had seen the face of God.
(Anecdote) Best job I ever had: I walked in and they were like "yeah, we don't have any training or anything like that, but we've got a fully set up lab and a rotating library of literature." <My Boss> "Yeah, I'm not going to be around, but here are the office keys" - don't blow up the company, pretty much.
To be honest, I find most manuals (man pages) horrible for quickly getting information on how to do something, and here LLMs do shine for me (as long as they don't mix up version numbers).
For man pages, you have to already know what you want to do and just want information on how exactly to do it. They're not for learning about the domain. You don't read the find manual to learn the basics of filesystems.
I mean the process either works, or it doesn’t. Meaning it either brings in the expected value with acceptable level of defects or it doesn’t.
From a higher up’s perspective what they do is not that different from vibe coding anyway. They pick a direction, provide a high level plan and then see as things take shape, or don’t. If they are unhappy with the progress they shake things up (reorg, firings, hirings, adjusting the terminology about the end goal, making rousing speeches, etc)
They might realise that they bet on the wrong horse when the whole site goes down and nobody inside the company can explain why. Or when the hackers eat their face and there are too many holes to even say which one they came through. But these things regularly happen already with the current processes too. So it is more of a difference in degree, not kind.
I agree with this completely. I get the impression that a lot of people here think of software development as a craft, which is great for your own learning and development but not relevant from the company's perspective. It just has to work good enough.
Your point about management being vibe coding is spot on. I have hired people to build something and just had to hope that they built it the way I wanted. I honestly feel like AI is better than most of the outsourced code work I do.
One last piece, if anyone does have trouble getting value out of AI tools, I would encourage you to talk to/guide them like you would a junior team member. Actually "discuss" what you're trying to accomplish, lay out a plan, build your tests, and only then start working on the output. Most examples I see of people trying to get AI to do things fail because of poor communication.
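Concretely, "build your tests" can be as small as pinning the behaviour down before the assistant writes anything; a hypothetical example (the module and function are mine and don't exist yet):

```python
# test_slug.py - written *before* asking the assistant for any implementation.
# The tests are the contract; generated code is only accepted once they pass.
from myapp.text import slugify  # hypothetical module the assistant will create

def test_lowercases_and_hyphenates():
    assert slugify("Hello, World!") == "hello-world"

def test_collapses_whitespace_and_repeated_dashes():
    assert slugify("  a   b--c ") == "a-b-c"

def test_empty_input():
    assert slugify("") == ""
```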
> I get the impression that a lot of people here think of software development as a craft, which is great for your own learning and development but not relevant from the company's perspective. It just has to work good enough.
Building the thing may be the primary objective, but you will eventually have to rework what you've built (dependency changes, requirement changes,...). All the craft is for that day, and whatever goes against that is called technical debt.
You just need to make some tradeoffs between getting the thing out as fast as possible and being able to alter it later. It's a spectrum, but instead of discussing it with the engineers, most executive suites (and their managers) want to hand down edicts from on high.
> Building the thing may be the primary objective, but you will eventually have to rework what you've built (dependency changes, requirement changes,...). All the craft is for that day, and whatever goes against that is called technical debt.
This is so good I just wanted to quote it so it showed up in this thread twice. Very well said.
The whole auto factory thing sounds completely misinformed to me. Just because a machine made it does not mean the output isn't checked in a multitude of ways.
Any manufacturing process is subject to quality controls. Machines are maintained. Machine parts are swapped out long before they lead to out-of-tolerance work. Process outputs are statistically characterised, measured and monitored. Measurement equipment is recalibrated on a schedule. 3d printed parts are routinely X-rayed to check for internal residue. If something can go wrong, it sure as hell is checked.
Maybe things that can't possibly fail are not checked, but the class of software that can't possibly fail is currently very small, no matter who or what generates it.
Additionally, production lines are all about doing the same thing over and over again, with fairly minimal variations.
Software isn't like that. Because code is relatively easy to reuse, novelty tends to dominate new code written. Software developers are acting like integrators in at least partly novel contexts, not stamping out part number 100k of 200k that are identical.
I do think modern ML has a place as a coding tool, but these factory like conceptions are very off the mark imo.
On the auto factory side, the Toyota stuck gas pedal comes to mind, even if it can happen only under worst-case circumstances. But that's the (1 - 0.[lots of nines]) case.
On the software side, the THERAC story is absolutely terrifying - you replace a physical interlock with a software-based one that _can't possibly go wrong_ and you get a killing machine that would probably count as unethical for executions of convicted terrorists.
THERAC was terrible. And intermittent too, for extra horror.
I am a strong proponent of hardware level interlocks for way more mundane things than that. It helps a lot in debugging to narrow down the possible states of things.
A buddy of mine was a director in a metrology integration firm that did nothing but install lidar, structured light and other optical measurement solutions for auto assembly plants. He had a couple dozen people working full time on new model line build outs (every new model requires substantial refurb and redesign to the assembly line) and ongoing QA of vehicles as they were being manufactured at two local Honda plants. The precision they were looking for is pretty remarkable.
> Any manufacturing process is subject to quality controls.
A few things on this illusion:
* Any manufacturer will do everything in their power to avoid meeting anything but the barest minimums of standards due to budget concerns
* QA workers are often pressured to let small things fly and cave easily because they simply do not get paid enough to care and know they won't win that fight unless their employer's product causes some major catastrophe that costs lives
* Most common goods and infrastructure are built by the lowest bidder with the cheapest materials using underpaid labor, so as for "quality" we're already starting at the bottom.
There is this notion that because things like ISO and QC standards exist, people follow them. The enforcement of quality is weak and the reach of any enforcing bodies is extremely short when pushed up against the wall by the teams of lawyers afforded to companies like Boeing or Stellantis.
I see it too regularly at my job not to call out the idea that quality control is anything but smoke and mirrors, deployed with minimal effort and maximum reluctance. Hell, it's arguably the reason why I have a job, since about 75% of the machines I walk in to fix broke because they were improperly maintained, poorly implemented, or sabotaged by an inept operator. It leaves me embittered, to be honest, because it doesn't have to be this way, and the only reason why it is boils down to greed and mismanagement.
> Any manufacturer will do everything in their power to avoid meeting anything but the barest minimums of standards due to budget concerns
Perhaps this is industry dependent?
In my country’s automotive industry, quality control standards have risen a lot in the past few decades. These days consumers expect the doors and sunroof not to leak, no rust even after 15 years being kept outdoors, and the engine to start first time even after two weeks in an airport carpark.
How is this achieved? Lots of careful quality checking.
For context, I am in the US and in a position to see what goes on behind the scenes in most of the major auto-maker factories and some aerospace, but that's about as far as I can talk about it, since some of them are DoD contractors.
Quality Control is a valuable tool when deployed correctly and, itself, monitored for consistency and areas where improvement can happen. There is what I consider a limp-wristed effort to improve QC in the US, but in the end, it's really about checking some bureaucratic box as opposed to actually making better product, although sometimes we get lucky and the two align.
Your comment is adversarial for no reason. "Real" QC? I work for people who make vehicles on the ground and in the air that hold passengers expecting to be delivered safely. Some of them even make parts for very large structures that are expected to remain standing when the wind blows too hard or the ground shakes too much. Let's talk about "real" QC and these other imagined types that must exist to you.
Can you define the differences between "real" QC and other versions? Does this imply a "fake" QC? Does that mean that our auto and aerospace manufacturers can't hold themselves to the same quality standards as Big Pharma, since both are ultimately trying to achieve the same goal in avoiding the litigation that comes with putting your customers at risk?
Let's not pretend that pharma co's have never side-stepped regulation or made decisions that put swaths of the population in a position to harm themselves.
My argument was dispelling the general idea that just because rules are in place, they are being followed. Believe me, I'd love to live in that world, but have seen little evidence that we do.
Just a reader of this thread, but that wasn't my take on it. The text you quoted was, I think, an overgeneralisation (there are certainly manufacturers who perform above the baseline standards), but I don't think it was worded adversarially? It then provided some more information (some of which I have heard from others in the industry, especially around QA being pressured to pass defective items).
The post they are complaining about was a driveby dismissive statement that didn't add anything to the discussion whatsoever.
Huge difference between me saying manufacturers will cut corners in any way they can (maybe you're taking this as consumer vs manufacturer?) and the person who replied to me saying I don't work a job that encounters "real" (read: intentionally vague and diminutive) QC standards. One is a blanket statement that is backed by easily accessed and very public evidence; the other is a personal attack.
I'm not sure how it's a personal attack, it would be like someone who bakes bread for a living saying that the tolerance of products doesn't really matter, compared with a machinist who knows exactly how much it can matter. It's quantifiably true that serious QA is a thing, your industry just doesn't have it. If you choose to turn that into a personal attack I think that says more about your internal state than it does about the actual post I made.
It’s been said that every jetliner that is in the air right now has a 1in hairline fracture in it somewhere? But that the plane is designed for the failure of any one or two parts?
Software doesn’t exactly work the same way. You can make “AI” that operates more like [0,1] but at the end of the day the computer is still going to {0,1}.
Something I've been thinking about is that most claims of AI productivity apply just as well (and more concretely and reliably) to just... better tooling and abstractions
Code already lets us automate work away! I can stamp out ten instances of a component or call a function ten times and cut my manual labor by 90%
I'm not saying AI has nothing to add, but the "assembly line" analogy - where we precisely factor out mundane parts of the process to be automated - is what we've been doing this whole time
AI demands a whole other analogy. The intuitions from automating factories really don't apply, imo.
Here's one candidate: AI is like gaining access to a huge pool of cheap labor, doing tasks that don't lend themselves to normal automation. Something like when manufacturing got offshored to China in the late 20th century
If you're chronically doing something mundane in software development, you're doing something wrong. That was true even before AI.
100%. I keep thinking this, and sometimes saying it.
Sure, if you're stuck in a horrible legacy code base, it's harder. But you can _still_ automate tedious work, given you can manage to put in the proverbial stop for gas. I've seen loads of developers just happily copy paste things together, not stopping to wonder if it was perhaps time to refactor.
Exactly that. Software development isn't about writing code, never was, it's about what code to write. Doesn't matter if I type in the code or tell an AI what code it should type.
I'll admit that assuming it's correct, an AI can type faster than me. But time spent typing represents only a fraction of the software development cycle.
But, it'll take another year or two on the hype cycle for the gullible managers being sold AI to realise this fully.
Worse: If typing in code takes more time (i.e. costs more), there's a larger incentive to refactor.
I spent quite a bit of time as a CTO, and at some point there's a conversation about the business value of refactoring. That's a great conversation to have I think, it should ultimately be about business value, but good code vs bad code is a bit hard to quantify. What I usually reached for is that refactoring brings down lead time of changes, i.e. makes them faster. Tougher story these days I guess :D
I've been telling friends and family - and kids interested in entering this field - this for years (decades actually, at this point): that I don't spend much of my time typing out code.
I've found that it's very hard for people to conceptualize what else it would be that we're spending our time doing.
I think that's a good part of the issue. You have the computer that is doing stuff. And you have the software engineer that was hired to make it do the stuff. And the connection between them is the code. That's pretty much the simplistic picture that everyone has.
But the truth is that the way the computer works is alien, and anything useful becomes very complex. So we've come up with all those abstractions, embedded them in programming languages with which we create more abstractions trying to satisfy real-world constraints. It's an imaginary world which is very hard to depict to other people. It's not purely abstract like mathematics, nor is it fully physical like mechanics.
The issue with LLMs is that whatever they produce has a good chance of being distorted. At first glance it looks correct, but the more you add to it, the more visible the flaws are, until you're left with a Frankenstein monster.
But to your last part, this is why I think the worst fears I see from programmers (here and in real life) are unlikely to be a lasting problem. If you're right - and I think you are - that the direction things may be headed as-is, with increasingly less sophisticated people relying increasingly more on AIs to build an increasingly large portion of software, is going to result in big messes of unworkable software. But if so, people are going to get wise to that, and stop doing it. It won't be tenable for companies to go to market with "Frankenstein monsters", in the long term.
The key is to look through the tumultuous phase and figure out what it's gonna look like after that. Of course this is a very hard thing to predict! But here are the outcomes I personally put the most weight on:
1. AIs might really get good enough that none of us write code anymore, in the same way that it's quite rare to write assembly code now.
In this case, I think entrepreneurship or research will be the way to go. We'll be able to do so much more if software is truly easy to create!
2. We're still writing, editing, and debugging code artifacts, but with much better tools.
In this case, I think actually understanding how software works will be a very valuable skill, as knocking out subtly broken software will be a dime a dozen, while getting things working well will be a differentiator.
Honestly I don't put much weight on the version of this where nobody is doing anything because AI is running everything. I recognize that lots of smart people disagree with me about this, but I remain skeptical.
> AIs might really get good enough that none of us write code anymore, in the same way that it's quite rare to write assembly code now.
I don't have much hope for that, because the move from assembly to higher level programming languages is a result of finding patterns that are highly deterministic. It's the same as metaprogramming currently. It's not much about writing the code to solve a problem, but to find the hidden mechanism behind a common class of problems and then solve that instead. Then it becomes easier to solve each problem inside the class. LLMs are not reliable for that.
> 2. We're still writing, editing, and debugging code artifacts, but with much better tools.
I'd put a lot more weight on that, but we already have a lot of tooling that we don't even use (or replicate across software ecosystems). I'd care much more about a nice debugger for Go than LLM tooling. Or a modern Smalltalk.
But as you point out, the issue is not tooling. It's understanding. And LLMs can't help with anything if you're not improving that.
I probably should have specified: I didn't list those in the order of what I put most weight on. I agree with you that I more heavily weight the one I wrote as #2.
I think you and I probably mostly agree on where things are heading, except that just inferring from your comment, I might be more bullish than you on how much AIs will help us develop those "much better tools".
> It's an imaginary world which is very hard to depict to other people. It's not purely abstract like mathematics, nor is it fully physical like mechanics.
This is one of the reasons I like the movie Hackers - the visualizations are terrible if you take it at face value, but if you think of it as a representation of what's going on inside their minds it works a whole lot better, especially compared to the lines-of-code-scrolling-past version usually shown in other movies/tv.
The correct analogy is that software engineers design and build the _factory_. The software performs the repeatable process as defined by code, and no person sits and watches if each machine instruction is executed correctly.
Do you really want your auto tool makers to not ensure the angle of the tools are correct _before_ you go and build 10,000 (misshaped) cars?
I’m not saying we don’t embrace tooling and automation as appropriate at the next level up, but sheesh that is a misguided analogy.
Do they YOLO the angles of tools and then produce 10,000 misshapen cars? Yes. But do they also sell those cars? Impressively, also yes, at least up until a couple months ago. Prior to Elon's political shenanigans of the last few months consumers were remarkably tolerant of Tesla's QC issues.
> It would be crazy if in an auto factory people were measuring to make sure every angle is correct
They are.
Mechanical engineers measure more angles and dimensions than a consultant might guess - it's a standard part of quality control, although machines often do the measuring, with occasional human sampling as a back-up. You'd be surprised just how much effort goes into getting things correct, such as _packs of kitkats_ or _cans of coke_.
If getting your angles wrong risks human lives, the threat of prosecution usually makes the angles turn out right, but if all else fails, recalls can happen because the gas pedal can get stuck in the driver-side floor carpet.
Assembly-line engineering has in its favour that (A) CNC machines don't randomly hallucinate; they can fail or go out of tolerance, but usually in predictable ways, and (B) you can measure a lot of things on an assembly line with lasers as the parts roll through.
It was thankfully a crazy one-off that someone didn't check that _the plugs were put back into the door_, but that could be a sign of bad engineering culture.
>It would be crazy if in an auto factory people were measuring to make sure every angle is correct
As someone who used to automate assembly plants, this sounds to me like a rationalization from someone who has never worked in manufacturing. Quality people rightly obsess over whether or not the machine is making “every angle” correct. Imagine trying to make a car where parts don’t fit together well. Software tends to have even more interfaces, and more failure modes.
I’ve also worked in software quality and people are great at rationalizing reasons for not doing the hard stuff, especially if that means confronting an undesired aspect of their identity (like maybe they aren’t as great of a programmer as they envision). We should strive to build processes that protect us from our own shortcomings.
What strikes me the most is not even that people are willing to do that, to just fudge their work until everything is green and call it a day.
The thing that gets me is how everyone is attaching subsidized GPU farms to their workflows, organizations and code bases like this is just some regulated utility.
Sooner or later this whole LLM thing will get monetized or die. I know that people are willing to push below-par work. I didn't know people were ready to put on the leash of some untested new sort of vendor lock-in so willingly, and even argue this is the way. Some may even get the worst of both worlds: end up on the hook for a new class of sticker shock, pay up, and later have these products fail out from under them, left out to dry.
Someone will pay for these models, the investors or the users so dependent they'll pay whatever price is charged.
Harper talks a lot about using defensive coding (tests, linters, formal verification, etc.) so that it's not strictly required to craft and understand everything.
This article (and several that follow) explain his ideas better than this out of context quote.
The issue is that (the way I see it happening more and more in the real world):
- tests are run by machines
- linters are (being) replaced by machines/SaaS services (so.. machines)
- formal verification: yes, sure, 5 people will review the thousands of lines of code written every day, in a variety of languages/systems/stacks/scripts/RPAs/etc., or they will simply check that the machines return "green-a-OK" and ask the release team to push it to production.
The other thing that I have noticed is that "(misplaced) trust erodes controls". "Hey, the code hasn't broken for 6 months, so let's remove ABC and DEF controls," and then boom goes the app (because we used to test integration, but "come on, no need for that").
Now.. this is probably the paranoid (audit/sec) in me, but stuff happens, and history repeats itself.
Also.. Devs are a cost center, not a profit center. They are "value enablers", not "value adders". Like everything and everyone else, if something can be replaced with something 'equally effective' and cheaper, it is simply a matter of time.
I feel that companies want to both race toward this new gold rush and, at the same time, go slowly and see if this monster bites (someone else first).
It’s pretty simple. Software quality isn’t on the spreadsheet. The cost to build it is.
The value of the products coming from research and development is not on the spreadsheet everyone is looking at. The cost to develop them is.
If it’s not on the spreadsheet, it doesn’t exist to the people who make the decisions about money. They have orders to cut spending and that’s what they’ll do.
This may sound utterly insane, but business management is a degree and job. Knowledge about what you are managing is secondary to knowing how to manage.
That’s why there is an entire professional class of people who manage the details of a project. They also don’t need to know what they are getting details for. Their job is to make numbers for the managers.
At no point does anyone actually care what these numbers mean for the company as a whole. That’s for executives.
Many executives just look at spreadsheets to make decisions.
A business tends to break its components down into different units or departments, and then (from a financial perspective) largely boils down those units into "how much money do you spend" and "how much money did you bring in as income." Software being a cost center means that the expected income of the unit is $0, and thus it shouldn't be judged on its operating profit. It doesn't mean that software doesn't have value, that investment in software doesn't bring greater rewards.
But it does mean that the value that software brings isn't directly attributable to investment in software (as far as the business can see). And being more invisible means that it tends to get the shaft somewhat on the business side of things, because the costs are still fully visible.
Yes. You either MAKE money or SPEND money (sorry for the caps).
Audit, Security, IT (internal infra people), cleaning personnel, SPEND money.
Sales, Product Development, MAKE money.
Once the "developers" can charge per-hour to the clients, then we love them because they BRING money. But those 'losers' that slow down the 'sales of new features' with their 'stupid' checks and controls and code-security-this and xss-that, are slowing down the sales, so they SPEND money and slow down the MAKING of money.
Now, in our minds, it is clear that those 'losers' who do the code check 'stuff' are making sure that whoever buys today, will come and buy again tomorrow. But as it has been discussed here, the CEOs need to show results THIS quarter, so fire 20% of the security 'losers' to reduce HR costs, hire 5 prompt engineers to pump out new features, and pray to your favourite god that things don't go boom :)
Meanwhile most CEOs have a golden parachute, so it is definitely worth the risk!
More to the point, there are people who carefully ensure that those angles are correct, and that all of the angles result in the car’s occupants arriving at their destination instead of turning into a ball of fire. It’s just that this process takes place at design time, not assembly time.
Software is the same way. It’s even more automated than auto factories: assembly is 100% automated. Design is what we get paid to do, and that requires understanding, just like the engineers at Ford need to understand how their cars work.
I haven't touched CAD for a couple of years, but I get the impression that (inevitably) the generative design hype significantly exceeds the current capability.
Seems like the more distanced we get from actual engineering methods, the more fucked up our software and systems become. Not really surprising to be honest. Just look at the web as an example. Almost everyone is just throwing massive frameworks and a hundred libraries as dependencies together and calls it a day. No wait, must apply the uglyfier! OK now call it a day.
There's no incentive for engineering methods because there's no liability for software defects. "The software is provided “as is”, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability." It's long overdue but most people in the software industry don't want liability because it would derail their gravy train.
I'd love to blame shitty web apps on outsourcing of computing power to your users... But you'll hear countless stories of devs blowing up their AWS or GCP accounts with bad code or architecture decisions (who cares if this takes a lot of CPU to run, throw another instance at it!) , so maybe it's just a lazy/bad dev thing.
Is it though? It could be interpreted as an acknowledgement. Five years from now, testing will be further improved, yet the same people will be able to take over your iPhone by sending you a text message that you don't have to read. It's like expecting AI to solve the spam email problem, only to learn that it does not.
It's possible to say "we take the security and privacy of our customers seriously" without knowing how the code works. That's the beauty of AI. It legitimizes and normalizes stupid humans without measurably changing the level of human stupidity or the quality or efficiency of the product.
And except that the factory analogy of software delivery has always been utterly terrible.
If you want to draw parallels between software delivery and automotive delivery then most of what software engineers do would fall into the design and development phases. The bit that doesn’t: the manufacturing phase - I.e., creating lots of copies of the car - is most closely modelled by deployment, or distribution of deliverables (e.g., downloading a piece of software - like an app on your phone - creates a copy of it).
The “manufacturing phase” of software is super thin, even for most basic crud apps, because every application is different, and creating copies is practically free.
The idea that because software goes through a standardised workflow and pipeline over and over and over again as it’s built it’s somehow like a factory is also bullshit. You don’t think engineers and designers follow a standardised process when they develop a new car?
It would be crazy for auto factory workers to check every angle. It is absolutely not crazy for designers and engineers to have a deep understanding of the new car they’re developing.
The difference between auto engineering and software engineering is that in one your final prototype forms the basis for building out manufacturing to create copies of it, whereas in the other your final prototype is the only copy you need and becomes the thing you ship.
(Shipping cadence is irrelevant: it still doesn’t make software delivery a factory.)
This entire line of reasoning is… not really reasoning. It’s utterly vacuous.
It’s not only vacuous. It’s incompetent, in that it fails to recognize the main value-add mechanisms in software delivery, manufacturing, or both. I’m not sure if the Dunning-Kruger model is scientifically valid or not, but this would be a very Dunning-Kruger thing to say (high confidence, total incompetence).
> The “manufacturing phase” of software is super thin, even for most basic crud apps, because every application is different, and creating copies is practically free.
This is not true from a manager's perspective (indoctrinated by Taylorism). From a manager's perspective, development is manufacturing, and underlying business process is the blueprint.
I don't know about that: I'm a manager, I'm aware of Taylorism (and isn't that guy discredited by sensible people anyway?), and I don't think the factory view holds up. Manufacturing is about making the same thing (or very similar things) over and over and over again at scale. That almost couldn't be further from software development: every project is different, every requirement is different, the effort in every new release goes into a different area of the software. Just because, after it's gone through the CD pipeline the output is 98% the same is irrelevant, because all the software development effort for that release went into the 2%.
> The idea that because software goes through a standardised workflow and pipeline over and over and over again as it’s built it’s somehow like a factory is also bullshit.
I don't think it's bs. The pipeline system is almost exactly like a factory. In fact, the entire system we've created is probably what you get when cost of creating a factory approaches instantaneous and free.
The compilation step really does correspond to the "build" phase in the project lifecycle. We've just completely automated it by this point.
What's hard for people to understand is that the bit right before the build phase, the part that takes all the man-hours, isn't part of the build phase. This is an understandable mistake, as the build phase in physical projects takes most of the man-hours, but it doesn't make it any more correct.
You're misunderstanding my meaning with pipeline: you're thinking it's just the CD part of the equation. I'm thinking about it as the whole software development and delivery process (planning, design, UX, dev, test, PR reviews, CD, etc.), which can be standardised (and indeed some certifications require it to be standardised). In that context, even when the development pipeline follows a standardised process, most of it is nothing like a factory: just the CD part, as you've correctly identified. Because the output of CD will be, for mature software, 99+% similar to the output of the previous build, it is definitely somewhat analogous to manufacturing, although if you think about adding tests, etc., the process evolves a lot more often and rapidly than many production lines.
It's a reckless stance that should never ever come from a software professional. "Let's develop modern spy^H^H^Hsoftware in the same way as the 737 Max, what could possibly go wrong?"
One of the reasons outsourcing for software fizzled out some compared to manufacturing is because in a factory you don't have "measuring to make sure every angle is correct" because (a) the manufacturing tools are tested and repeatable already and (b) the heavy lifting of figuring out all the angles was done ahead of time. So it was easy to mechanize and send to wherever the labor was cheapest since the value was in the tools and in the plans.
The vast majority of software, especially since waterfall methods were largely abandoned, has the planning being done at the same time as the "execution". Many edge cases aren't discovered until the programmer says "oh, huh, what about this other case that the specs didn't consider?" And outsourcing then became costly because that feedback loop for the spec-refinement ran really slowly, or not at all. Spend lots of money, find out you got the wrong thing later. So good luck with complex, long-running projects without deeply understanding the system.
Alternately, compare to something more bespoke and manual like building a house, where the tools are less precise and more of the work is done in the field. If you don't make sure all those angles are correct, you're gonna get crappy results.
(The most common answer here seems to be "just tell the agent what was wrong and let it iterate until it fixes it." I think it remains to be seen how well "find out everything that is wrong after all the code is written, and then tell the coding agent(s) to fix all of those" will work in practice. If nothing else, it will require a HUGE shift in manual testing appetite. Maybe all the software engineers turn into QA engineers + deployment engineers.)
Any data on that? I see everyone trying to outsource as much as they can. Sure, now it is moving toward AI, but every company I walk into has tens to thousands of FTEs in outsourcing countries.
I see most Fortune 1000 companies here doing some type of agile planning/execution which is in fact more waterfall. The people here in the west are more management and client facing; the rest is 'thrown over the fence'.
FTE for lack of a better word: they are not employees of the company; they work full time for the company but are employed by some outsourcing firm. FTE I guess implies an employee in the western country, which they are not, but what would be the term? Full-time remote worker?
The point being made here is that the biggest software companies still employ lots of programmers but the biggest manufacturing companies don't employ lots of factory workers. I don't think you need data just think about Microsoft vs General Motors etc etc
The lead poisoning of our time: companies getting high on hype tech, killed off because the "freedom from programmers" no-code tools create Gordian project knots.
And all because the MBAs yearn for freedom from dependencies and thus reality.
That's wild. You can't say this statistical, hyper-generalised system of "AI" is in any way comparable to the outputs of highly specific, highly deterministic machinery. It's like comparing a dice roll to a clock. If anything reviewing engineers now need to "measure the angles" more closely than ever.
> It would be crazy if in an auto factory people were measuring to make sure every angle is correct
That could not be any further from the truth.
Take a decent enterprise CNC machine (look on YouTube, lots of videos) that is based on servos, not the stepper-motor amateur machines. That servo-based machine is measuring distances and angles hundreds of times per second, because that is how it works. Your average factory has a bunch of those.
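That closed loop is the whole point. Roughly, in toy form (a bare proportional controller with made-up numbers, nothing like real servo firmware, which runs full PID at kHz rates against encoder feedback):

```python
# Toy closed-loop position control: measure, compare to the setpoint, correct.
setpoint = 90.000    # target angle, degrees
position = 87.500    # current measured angle, degrees
KP = 0.4             # proportional gain

for _ in range(200):                 # 200 control cycles
    error = setpoint - position      # a fresh measurement every cycle
    position += KP * error           # command a correction proportional to the error

print(f"final position: {position:.4f} deg")   # converges on the setpoint
```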
Whoever said that should try getting their head out of their ass at least every other year.
> Reed’s statement feels very much like a justification of "if it compiles, ship it!"
Not really. More like, if the fopen works fine, don't bother looking how it does so.
SWE is going to look more like QA. I mean, as a SWE if I use the webrtc library to implement chat and it works almost always but just this once it didn't, it is likely my manager is going to ask me to file a bug and move on.
>It would be crazy if in an auto factory people were measuring to make sure every angle is correct
Yeah, but there's still something checking the angles. When an LLM writes code, if it's not the human checking the angles, then nothing is, and you just hope that the angles are correct, and you'll get your answer when you're driving 200 km/h on the Autobahn.
The ultimate irony being that they actually do measure all kinds of things against datum points as the cars move down the line as the earlier you junk a faulty build the less it's cost you.
To me the biggest difference between a software engineer and an assembly worker is the worker makes cars, the software engineer makes the assembly line.
"The fact that they used the auto industry as an example is funny, because the Toyota way and six sigma came out of that industry."
It's even funnier when you consider that Toyota has learned how bad an idea lean manufacturing/6-Sig/5S can be thanks to the pandemic - they're moving away from it to some degree now.
Technically, Toyota doesn’t use 6sig, and when you say they are moving away from it, what do you mean? Because I would be deeply amused (and shocked) if the Muda, Muri people would be giving up on quality control.
Idk about Toyota moving away from Kaizen, but they certainly have moved away from JIT. Toyota pioneered Just In Time (JIT) part inventory which dramatically lowers inventory costs and makes balance sheets look far more attractive.
What Toyota realized in 2011 due to the Fukushima disaster however is that this completely fails for computer chips because the pipeline is too long. So they kept JIT for steel, plastic parts etc but for microcontrollers, power supply chips, etc they stockpile large quantities.
It's wild to think that someone who purports to be an expert would compare an assembly line to AI, where the first is the ultra-optimization of systems management with human-centric processes thoughtfully layered on top and the latter a non-deterministic black box to everybody. It's almost like they are willfully lying...
I keep feeling like there's a huge disconnect between the owners/CTOs/managers with how useful they think LLMs _are supposed to be_, vs the people working on the product and how useful they think LLMs _actually are_. The article describes Harper Reed as a "longtime programmer", so maybe he falsifies my theory? From Wikipedia:
>Harper Reed is an American entrepreneur
Ah, that's a more realistic indicator of his biases. Either there's some misunderstanding, or he's incorrect, or he's being dishonest; it's my job to make sure the code that I ship is correct.
This is along the same lines as why I don't expect a language's syntactic features to break. I assume they work. You have to accept that some abstractions work, and build on top of them.
We will reach some point where we will have to assume AI is generating the correct code.
Car factory is a funny example because that's also one of the least likely places you will see AI assisted coding. Safety critical domains are a whole different animal with SysML everywhere and thousand page requirements docs.
also software is a factory and the LLM is a workshop building factories. Also I strongly believe that people building factories still do a lot of a) ten people pounding out the metal AND b) measuring to check
Companies with a lack of engineering rigor are basically pre-filtered customers, packaged for a buyout by a company/VC able to afford a more rigorous company structure.
Then one wonders what the agenda of the NYT is here. Does this also misrepresent Willison's writing?
> And just as the proliferation of factories abroad has made it cheap and easy for entrepreneurs to manufacture physical products, the rise of A.I. is likely to democratize software-making, lowering the cost of building new apps. “If you’re a prototyper, this is a gift from heaven,” Mr. Willison said. “You can knock something out that illustrates the idea.”
Why do they cite bloggers who relentlessly push this technology rather than interviewing a representative selection of programmers?
Probably because the author's main focus was about potential change in working conditions and not the veracity of the AI hype. But disappointing nonetheless.
To be clear, I'm sure some critical software engineering jobs will be replaced by AI though. Just not in the way that zealots want us to think. From the looks of it right now, AI is far from replacing software engineers in terms of competence. The utter incompetence was on full public display just last week [1]. But none of that will matter to greedy corporate executives, who will prioritize short-term cost savings. They will hop from company to company, personally reaping the benefits while undermining essential systems that users and society rely on with AI slop. That's part of the reason why the C-suites are overhyping the technology. After all, no rich executive has faced consequences for behaving this way.