
They took their time precisely specifying the robot movements so they're always repeatable. Almost like...some sort of carefully crafted code.


You misconstrue the analogy. The robot isn’t equivalent to the code in this analogy. It’s the thing that generates the code.

The robot operates deterministically: it has a fixed input and a fixed output. That is what makes it reliable.

Your “AI coder” is nothing like that. It’s non-deterministic on its best day, and it gets everything thrown at it, which makes it even more of a coin toss. This seriously undermines any expectation of reliability.

The guy’s comparison shows a lack of understanding of both systems.


> The robot isn’t equivalent to the code in this analogy. It’s the thing that generates the code.

I think this inversion is what a lot of people are missing, or just don't understand (because they don't understand what code is or how it works).


I totally understand that inversion but I think it's a bad analogy.

Industrial automation works by taking a rigorously specified design developed by engineers and combining it with rigorous quality-control processes to ensure the inputs and outputs remain within tolerances. You first have to have a rigorous spec; then you can design a process for manufacturing a lot of widgets while checking 1 out of every 100 of them against their tolerances.

You can only get away with not measuring a given angle on widget #13525 because you're producing many copies of exactly the same thing, because you measured that angle on widget #13500 and widget #13400 and so on, and because the variance in your sampled widgets is within the tolerances specified by the engineer who designed the widget.
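
A minimal sketch of that sampling discipline (the nominal dimension, tolerance, and sampling rate here are all invented for illustration):

    import random

    NOMINAL_MM   = 120.00   # hypothetical nominal dimension from the design spec
    TOLERANCE_MM = 0.05     # hypothetical tolerance set by the design engineer
    SAMPLE_EVERY = 100      # measure 1 out of every 100 widgets

    def measure(widget_id):
        # stand-in for a real gauge reading on the line
        return NOMINAL_MM + random.gauss(0, 0.01)

    for widget_id in range(1, 20001):
        if widget_id % SAMPLE_EVERY != 0:
            continue  # widget ships unmeasured, like widget #13525
        reading = measure(widget_id)
        if abs(reading - NOMINAL_MM) > TOLERANCE_MM:
            raise SystemExit(f"widget #{widget_id} out of tolerance: {reading:.3f} mm")

The point is that skipping widget #13525 is only safe because the spec and the sampling plan exist first.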

There's no equivalent to the design stage or to the QC stage in the vibe-coding process advocated for by the person quoted above.


Yea I think we're saying the same thing


That was exactly their point.


> The robot isn’t equivalent to the code in this analogy

I never said it is. The code is the code that controls the robot and makes it behave deterministically.


Except the code it creates is deterministic.


I don't know what you mean by "the code it creates is deterministic", but the process an LLM uses to generate code from an input is definitely not entirely deterministic.

To put it simply, the chances that an LLM will output the same result every time given the same input are low. The LLM does not operate deterministically, unlike the manufacturing robot, which will output the same door panel every single time. Or as ChatGPT put it:

> The likelihood of an LLM like ChatGPT generating the exact same code for the same prompt multiple times is generally low.
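
To see the mechanism: at each step the model samples the next token from a probability distribution, so without a pinned random source two runs over the same prompt are free to diverge. A toy illustration, with an invented vocabulary and invented probabilities:

    import random

    # toy next-token distribution after the prompt "def add(a, b):"
    vocab = ["return", "print", "pass", "raise"]
    probs = [0.70, 0.15, 0.10, 0.05]

    # two unseeded draws from the same distribution: they usually
    # agree on "return", but nothing forces them to
    print(random.choices(vocab, weights=probs, k=1)[0])
    print(random.choices(vocab, weights=probs, k=1)[0])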


For any given seed value, the output of an LLM will be identical: it is deterministic. You can try this at home with llama.cpp by specifying a seed value when you load a model, and then seeing that for a given input the output will always be the same. Of course there may be some exceptions (cosmic-ray bit flips). Also, if you are only using online models, you can't set the seed value, plus there are multiple models, so multiple seeds. In summary, LLMs are deterministic.
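
A minimal sketch of that experiment using the llama-cpp-python bindings (the model path and prompt are placeholders):

    from llama_cpp import Llama

    # load the same local model twice with the same seed
    for run in range(2):
        llm = Llama(model_path="model.gguf", seed=42, verbose=False)
        out = llm("Write a haiku about determinism.",
                  max_tokens=32, temperature=0.8)
        print(out["choices"][0]["text"])

Reloading inside the loop resets the sampler state, so both runs print identical text.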


> the process an LLM uses to generate code based on an input is definitely not entirely deterministic

Technically correct is the least useful kind of correct when it's wrong in practice. And in practice, the process AI coding tools use to generate code is not deterministic, which is what matters. To make matters worse for the comparison with a manufacturing robot, even the input is never the same. While a robot gets the exact command for a specific motion and the exact same piece of sheet metal, in the same position, a coding AI is asked to work with varied inputs and on varied pieces of code.

Even stamping metal could be called "non-deterministic", since there are guaranteed variations, just within defined tolerances. Does anyone define tolerances for generated code?
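
The closest software analogue to a tolerance check would probably be an automated test over sampled inputs. A hypothetical sketch with the hypothesis library, where add() merely stands in for LLM-generated code:

    from hypothesis import given, strategies as st

    # pretend add() was generated by an LLM
    def add(a, b):
        return a + b

    # the "tolerance spec": properties the generated code must satisfy
    @given(st.integers(), st.integers())
    def test_add_commutes(a, b):
        assert add(a, b) == add(b, a)

    test_add_commutes()  # hypothesis runs many sampled cases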

That's why the comparison shows a lack of understanding of both systems.


I don't really understand your point. An LLM is loaded with a seed value, which is a number. The number may be chosen through some pseudo- or truly random process, or specified manually. For any given seed value, say 80085, the LLM will always and exactly generate the same tokens. It is not like stamped sheet metal, because it is digital information, not matter. Say you load up R1, give it a seed value of 80085, and then say "hi" to the model. The model will output the exact same response, to the bit: same letters, same words, same punctuation, same order. Deterministic. There is no way you can say that an LLM is non-deterministic, because that would be WRONG.
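
Mechanically, that's because the sampler draws from the softmax distribution with a seeded RNG. A stripped-down sketch in PyTorch, with invented logits standing in for the model's output:

    import torch

    logits = torch.tensor([2.0, 1.0, 0.5, 0.1])  # invented next-token logits
    probs = torch.softmax(logits, dim=0)

    # same seed -> same sampled token, every time
    for run in range(2):
        g = torch.Generator().manual_seed(80085)
        token = torch.multinomial(probs, num_samples=1, generator=g)
        print(token.item())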


WRONG lol.

First, you're assuming a brand-new conversation: no context. Second, you're assuming a local-first LLM, because a remote one could change behavior at any time. Third, the way the input is expressed is inexact, so minor differences in input can have an effect. Fourth, if the data to be operated on has changed, you will be exercising parts of the model that were never used before.

But I understand how nuance is not as exciting as using the word WRONG in all caps.


Arguing with "people" on the internet... Nuance is definitely a word of the year, and if you look at many models you can actually see it has high probability.

Addressing your comment: there was no assumption or indication on my part that determinism only applies to a new "conversation". Any interaction with any LLM is deterministic for a given seed value, same conversation included. Yes, I'm talking about local systems, because how are you going to know what is going on on a remote system?

On a local system, with a local LLM, if the input is expressed in the same way, the output will be generated in the same way, for the whole token context and so on. That means that, for a given seed value, after "hi" the model may say "hello", the human's response may be "how ya doin'", and the model would then say "so so, how ya doin?". Every single time, if the human or agent inputs the same tokens, the model will output the same tokens, for that seed value. This is not really up for question, or in doubt, or really anything to disagree about. Am I not being clear?

You can ask your local or remote LLM and it will certainly confirm that the process by which a language model generates text is deterministic by definition: same input means same output. Again, I must mention that the exception is hardware bit flips, such as those caused by cosmic rays, and that's just to emphasize how very deterministic LLMs are. Of course, as you may know, online providers stage and mix LLMs, so for sure you are not going to be able to discover that you are wrong by playing with ChatGPT, Grok/Q, Gemini, or whatever other online LLMs you are familiar with. If you have a system capable of offline or non-remote inference, you can see for yourself that you are wrong when you say that LLMs are non-deterministic.


I feel this is technically correct but intentionally cheating. No one, including the model creators, expects that to be the interface; it undermines the entire value proposition of using an LLM in the first place if I need to engineer the inputs to ensure reproducibility. I'd love to hear some real-world scenarios that do this where it wouldn't be simpler to NOT use AI.


When should a model's output be deterministic? When should a model's output be non-deterministic?

When many humans interact with the same model, then maybe the model should try different seed values and make measurements. When model interaction is limited to a single human, maybe the model should do the same.


It's simple: if you run the code generated by the LLM, you will get a deterministic result from said code.


LLMs almost always operate at least partly stochastically.


Of course, but the code they generate only operates in one way.


Who determines if that code operates in the intended way?



