> By total coincidence, some hallucinations happen to reflect the truth, but only because the training data happened to generally be truthful sentences.
It's not a "total coincidence". It's the default. Thus, the model's responses aren't "divorced from any concept of truth or reality" - the whole distribution from which those responses are pulled is strongly aligned with reality.
(Which is why people started using the term "hallucinations" to describe the failure mode, instead of "fishing a coherent and true sentence out of line noise" to describe the success mode - because success mode dominates.)
Humans didn't invent language for no reason. They don't communicate to entertain themselves with meaningless noises. Most of communication - whether spoken or written - is deeply connected to reality. Language itself is deeply connected to reality. Even the most blatant lies, even all of fiction writing, they're all incorrect or fabricated only at the surface level - the whole thing, accounting for the utterance, what it is about, the meanings, the words, the grammar - is strongly correlated with truth and reality.
So there's absolutely no coincidence that LLMs get things right more often than not. Truth is thoroughly baked into the training data, simply because it's a data set of real human communication, instead of randomly generated sentences.
The problem - as end users understand it - is that the model itself doesn't know the difference, and will proclaim bullshit with the same level of confidence as it does accurate information.
That's how you end up with grocery store chatbots recommending mixing ammonia and bleach for a cocktail, or lawyers using chatbots to cite entirely fictional case law before a judge in court.
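To make that concrete, here's a small illustrative sketch - my own example, not something from this thread - using the Hugging Face transformers package and the public gpt2 checkpoint (both my choice of tools). It scores a true and a false statement by the average log-probability the model assigns to their tokens; the point is that such scores track fluency, not factual accuracy.

    # Illustrative sketch (assumptions: the `transformers` package and the
    # public "gpt2" checkpoint, chosen purely for demonstration).
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    def avg_logprob(sentence: str) -> float:
        """Average per-token log-probability the model assigns to `sentence`."""
        ids = tokenizer(sentence, return_tensors="pt").input_ids
        with torch.no_grad():
            # With labels=input_ids, the returned loss is the mean negative
            # log-likelihood over the sequence, so negate it to get a score.
            loss = model(ids, labels=ids).loss
        return -loss.item()

    print(avg_logprob("The capital of France is Paris."))
    print(avg_logprob("The capital of France is Lyon."))
    # The two scores typically land close together: the model is measuring
    # how plausible the sentence sounds, not whether it is true.

Swap in any pair of statements you like; nothing in the scoring distinguishes fact from fiction.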
Nothing that comes out of an LLM can be implicitly trusted, so your default assumption must be that everything it gives you needs verification from another source.
Telling people "the truth is baked in" is just begging for a disaster.
> your default assumption must be that everything it gives you needs verification from another source
That depends entirely on what you're doing with the output. If you're using it as a starting point for something that must be true (whether for legal reasons, your own reputation as the ostensible author of the content, your own education, etc.), then yes, verification is required. But if you're using it for something low-stakes that just needs some semblance of coherent verbiage (like the summary of customer reviews on Amazon, or the SEO junk that comes before the recipe on cooking websites, which have plenty of fiction in them whether or not an LLM was involved), then you can totally meet your goals without any verification.
People have been capable of bullshitting at scale for a very long time. There are occasional consequences (hoaxes, scams, etc.), but the guidance behind fide, sed cui vide ("trust, but mind whom") is ancient; this is just the latest addendum.
This is just moving the goalposts. The post I replied to was claiming that models "have the truth baked in". Real people in the real world are misusing them, in no small part because they don't know that the models are unreliable, and OP's claims only make that worse.
> It's not a "total coincidence". It's the default. Thus, the model's responses aren't "divorced from any concept of truth or reality" - the whole distribution from which those responses are pulled is strongly aligned with reality.
One big caveat here - the responses are strongly aligned with the training data. We can't necessarily say that the training data itself is strongly aligned with reality.
And even if it were "absolutely no coincidence", that still only reflects reality as perceived by the average of all the people represented in the training set.
> Even the most blatant lies, even all of fiction writing, they're all incorrect or fabricated only at the surface level - the whole thing, accounting for the utterance, what it is about, the meanings, the words, the grammar - is strongly correlated with truth and reality.
I would reject this pretty firmly. As you said, people write whole novels about imagined worlds and people, about magic or technology that doesn't or can't exist. The LLM may understand what words mean and "know" how to string them together into a meaningful, grammatical sentence, but that's entirely different from producing a truthful sentence.
Truth requires some mechanism of fact finding, or chains of evidence, or admitting when those chains don't exist. LLMs have nothing like that.