> significant extra effort is required to make them reproducible.
Zero extra effort is required. It is reproducible. The same input produces the same output. The "my machine" in "Works on my machine" is an example of input.
> Engineering in the broader sense often deals with managing the outputs of variable systems to get known good outcomes to acceptable tolerances.
You can have unreliable AIs building a thing, with some guidance and self-course-correction. What you can't have is outcomes also verified by unreliable AIs who may be prompt-injected to say "looks good". You can't do unreliable _everything_: planning, execution, verification.
If an AI decides to code an implementation that is itself AI-bound, then even tolerance verification could be completely out of whack. Your system could pass today and fail tomorrow. It's layers and layers of moving ground. You have to put the stake down somewhere. For software, I say it has to be code. Otherwise, AI shouldn't build software, it should replace it.
That said, you can build seemingly working things on moving ground, that bring value. It's a brave new world. We're yet to see if we're heading for net gain or net loss.
If we want to get really narrow, I'd say real determinism is possible only in abstract systems, to which you'd reply that that's just my ignorance of all the factors involved and hence the incompleteness of the model. To which I'd point out the practical limitations involved in accounting for all of them. And for that reason, even though it is incorrect and I don't use it this way, I understand why some people use the quantifiers more/less with the term "deterministic", probably for lack of a better construct.
I don't think I'm being pedantic or narrow. Cosmic rays, power spikes, and falling cows can change the course of deterministic software. I'm saying that your "compiler" either has intentionally designed randomness (or "creativity") in it, or it doesn't. Not sure why we're acting like these are more or less deterministic. They are either deterministic or not inside normal operation of a computer.
To be clear: I'm not engaging with your main point about whether LLMs are usable in software engineering or not.
I'm specifically addressing your use of the concept of determinism.
An LLM is a set of matrix multiplies and function applications. The only potentially non-deterministic step is selecting the next token from the final output and that can be done deterministically.
By your strict use of the definition they absolutely can be deterministic.
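To make that concrete, here's a minimal sketch (mine, illustrative only, not from the thread): with greedy decoding, the token-selection step collapses to an argmax, and the whole pipeline becomes a fixed function of its input.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    # Subtract the max for numerical stability before exponentiating.
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

rng = np.random.default_rng(seed=0)
logits = rng.normal(size=50_000)  # stand-in for a model's final-layer logits
probs = softmax(logits)

# Greedy decoding: a pure argmax, the same token on every run.
greedy_token = int(np.argmax(probs))

# Temperature sampling: the one step where randomness actually enters.
sampled_token = int(rng.choice(len(probs), p=probs))

print(greedy_token, sampled_token)
```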
But that is not actually interesting for the point at hand. The real point has to do with reproducibility, understandability, and tolerances.
3blue1brown has a really nice set of videos showing how the LLM machinery fits together.
They _can_ be deterministic, but they usually _aren't_.
That said, I just tried "make me a haiku" via Gemini 3 Flash with T=0 twice in different sessions, and both times it output the same haiku. It's possible that T=0 does indeed enable a deterministic mode, and in that case perhaps we can treat it like a compiler.
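For what it's worth, the repeat test was roughly this shape. A sketch using the google-generativeai Python SDK; the API key is a placeholder and the model id is an assumption, so substitute whatever you have access to:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")          # placeholder
model = genai.GenerativeModel("gemini-3-flash")  # assumed model id

# Temperature 0 requests (near-)greedy decoding.
config = genai.GenerationConfig(temperature=0)

outputs = [
    model.generate_content("make me a haiku", generation_config=config).text
    for _ in range(2)
]
print(outputs[0] == outputs[1])  # True in my two trials; not a guarantee
```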
Depends if you're using the botanical definition or the (more common) culinary definition[0].
I would argue _fruit_ and _fruit_ are two words, one created semasiologically and the other created onomasiologically. Had we chosen a different pronunciation for one of those words, there would be no confusion about what fruits are.
Yup. Though rather than say "fruit and fruit" are two words, or focusing on "definitions" (which tend to morph over time anyway), I think the more straightforward and typical approach is to just recognize that the same word can have different meanings in different contexts.
This is such a basic and universal part of language, it is a mystery to me why something so transparently clueless as "actually, tomato is a fruit" persists.
I mean, a jelly is just broadly any thickened sweet goop (doesn't even have to be fruit, and is often allowed to have some savoury/umami, e.g. mint jelly or red pepper jelly). Usually a jelly also is relatively clear and translucent, as it is made with puree / concentrate strained to remove large fibers, but this isn't really a strict requirement, and the amount of straining / translucency is generally just a matter of degree. There are opaque jellies out there, and jellies with bits and pieces.
Ketchup has essentially all the key defining features of a jelly, technically, just is more fibrous / opaque and savoury than most typical jellies.
But, of course, calling a ketchup "jelly", due to such technical arguments, is exactly as dumb as saying "ayktually, tomato is a fruit": both are utterly clueless to how these words are actually used in culinary contexts.
> Consider what happens when you build software professionally. You talk to stakeholders who do not know what they want and cannot articulate their requirements precisely. You decompose vague problem statements into testable specifications. You make tradeoffs between latency and consistency, between flexibility and simplicity, between building and buying. You model domains deeply enough to know which edge cases will actually occur and which are theoretical. You design verification strategies that cover the behaviour space. You maintain systems over years as requirements shift.
I'm not sure why he thinks current LLM technologies (with better training) won't be able to do more and more of this as time passes.
To genuinely "talk to stakeholders" requires being part of their social world. To be part of their social world you have to have had a social past - to have been a vulnerable child, to have experienced frustration and joy. Efforts to decouple human development from human cognition betray a fundamental misunderstanding.
And, again, is one person going to develop those? A person with access to elastic rope might invent the slingshot, but I wouldn't expect them to invent the far superior sling: it's not obvious that the sling is better, since the learning curve is steeper. And a slingshot is not a particularly effective weapon: it's an inefficient bow that can't fire arrows.
You're still thinking in terms of "sighted society versus blind society", which is not what we are discussing. (Unless you're thinking "sighted and superintelligent", in which case I'd say sight is probably redundant.)
Ok. Just evading blind people would be absurdly easy if you can see. You could accurately throw rocks and run away from them all day. And being attacked from a distance would be terrifying to blind people.
Blind people are no less capable of throwing stones, and you only have the flight advantage if the ground is potentially-treacherous (e.g. unmanaged forest, scrubland) or you're that much faster. Any inhabited area will have been engineered to be safe for people to navigate – and it will not be lit well at night, where your reliance on vision will put you at a skill disadvantage.
The main advantage in an urban combat environment, I think, would be the ability to detect quiet people at a distance. Not needing to see makes it easier to hide yourself from visual inspection, but why would anyone develop this skill if nobody can see? Then, if the only person to practice with is the enemy you're trying to hide from… Also, you'd be able to dodge projectiles by watching the person throwing them, who might not telegraph their throws audibly, but would probably do so visually. This would let you defeat a single ranged opponent, possibly two – though I doubt your ability to dodge the rocks from three people at once for long enough to take one down.
But what do you gain from winning fights against small numbers of people? (I doubt very much you could win against a group of 30 or 40 opponents, with only sight as your advantage.) You would run out of food, shelter would be hard to come by, and every theft of resources would risk defeat: and one defeat against a society means it's over. Either you're killed, imprisoned, or they decide to do something else with you, presumably depending how much of a menace you've been. Your only options are to attempt a self-sufficient lifestyle (which you probably won't survive for long), to flee somewhere they haven't heard of your deeds, or to put yourself at the mercy of the justice system (and hope it isn't too retributive).
"Blind people are no less capable of throwing stones"
They sure suck at aiming.
But the best way to exploit the ability to see when everyone else is blind is to provide a service blind people can't. You could be a much better doctor, diagnosing diseases based on sight and performing surgery much better.
> This experiment was inspired by @swyx’s tweet about Ted Chiang’s short story “Understand” (1991). The story imagines a superintelligent AI’s inner experience—its reasoning, self-awareness, and evolution. After reading it and following the Hacker News discussion, ...
Umm...
I <3 love <3 Understand by Ted Chiang,
But the story is about superintelligent *humans*.
I'm glad the author spent some time thinking about this, clarifying his thoughts and writing it down, but I don't think he's written anything much worth reading yet.
He's mostly in very-confident-but-not-even-wrong kind of territory here.
One comment on his note:
> As an example, let’s say an LLM is correct 95% of the time (0.95) in predicting the “right” tokens to drive tools that power an “agent” to accomplish what you’ve asked of it. Each step the agent has to take therefore has a probability of being 95% correct. For a task that takes 2 steps, that’s a probability of 0.95^2 = 0.9025 (90.25%) that the agent will get the task right. For a task that takes 30 steps, we get 0.95^30 = 0.2146 (21.46%). Even if the LLMs were right 99% of the time, a 30-step task would only have a probability of about 74% of having been done correctly.
The main point, that errors can accumulate across sequential steps and that this needs to be handled, is valid and pertinent, but the model used to "calculate" this is quite wrong: steps don't fail probabilistically independently.
Given that actions can depend on the outcomes of previous steps, and given that we only care about final outcomes and not intermediate failing steps, errors can be corrected. Thus even steps that "fail" can still lead to success.
(This is not a Bernoulli process.)
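A toy simulation (my own illustration, not the author's model) shows how much self-correction changes the picture: give each step one detected-failure retry and the 30-step success rate jumps from the Bernoulli estimate of ~21% to over 90%.

```python
import random

def naive(p: float, n: int) -> float:
    """Bernoulli model: any single step failure sinks the whole task."""
    return p ** n

def with_retries(p: float, n: int, retries: int, trials: int = 50_000) -> float:
    """Monte Carlo: each failed step may be re-attempted after detection."""
    wins = 0
    for _ in range(trials):
        for _ in range(n):
            # The step succeeds if any of its (1 + retries) attempts lands.
            if not any(random.random() < p for _ in range(1 + retries)):
                break  # failed even after retries: the whole task fails
        else:
            wins += 1  # all n steps eventually succeeded
    return wins / trials

p, n = 0.95, 30
print(f"independent-failure model: {naive(p, n):.4f}")           # ~0.2146
print(f"one retry per step:        {with_retries(p, n, 1):.4f}")  # ~0.93
```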
I think he's referencing some nice material, and he's starting in a good direction by defining agency as goal-directed behaviour, but otherwise his confidence far outstrips the firmness of his conceptual foundations or the clarity of his deductions.
Part of the problem seems to be that he’s trying to derive a large portion of philosophy from first principles and low-n observations.
This stuff has been well-trodden by Dennett, Frankfurt, Davidson, and even Hume. I don’t see any engagement with the centuries (maybe millennia) of thought on this subject, so it’s difficult to determine whether he thinks he’s the first to notice these challenges or what new angle he’s bringing to the table.
> I don’t see any engagement with the centuries (maybe millennia) of thought on this subject
I used to be that person, but then someone pointed me to the Stanford Encyclopedia of Philosophy, which was a real eye-opener.
Every set of arguments I read, I thought "ya, exactly, that makes sense", and then I read the counters in the next few paragraphs: "oh man, I hadn't thought of that, that's true also". Good stuff.
I've wanted something like this for a while to use with architectural PlantUML diagrams rendered to SVG with hyperlinks linking to their implementations.