Hacker News

The problem is that it doesn’t stay simplified; once you leave the room those people start using that model to explain details of actual LLM behavior. I’m constantly arguing with people who use this mental model to explain why LLMs can’t do things that they absolutely can do.

I frequently see people underestimate the danger of LLMs to humanity in the long term and to people’s jobs in the medium term because they follow this chain of reasoning:

- An LLM is basically a fancy Markov chain (highly dubious, even if there's a tiny kernel of truth in there)

- Therefore Markov chains and LLMs have a similar skill ceiling (definitely wrong)

- My job could never be done by a Markov chain; it's much too complex (this is presented as self-evident; no one ever feels the need to back it up)

- Therefore, since LLMs are basically Markov chains, and a Markov chain could never do my job, I have nothing to worry about.

I’ll grant that you’re not necessarily responsible for these people using your model in a way you didn’t intend. But I do think at this point it’s time to start pushing people to stop trying to reason mechanically about how these things work.



1. An LLM is absolutely a Markov chain. That is not meant to dismiss the fact that they are a remarkable feat of generating and compressing the representation of really cool Markov chains.

2. Because they are Markov chains, their skill ceiling is bounded by whatever the skill ceiling of Markov chains happens to be. What LLMs demonstrate is that the skill ceiling of Markov chains is higher than previously understood.

3. If the skill ceiling of Markov chains is high enough, then a Markov chain could take over some or all of someone's job.
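Point 1 can be made concrete: fix a context window of k tokens and treat each k-token window as the state; any next-token sampler, an LLM included, then defines a Markov chain over those states. A minimal toy sketch (the tiny transition table here is made up for illustration; a real model's "table" is implicit in its weights and astronomically large):

```python
import random

K = 2  # context window size (toy value; real models use thousands of tokens)

def next_token_dist(context):
    # Hypothetical stand-in for a model forward pass: a function from the
    # Markov state (the last K tokens) to a next-token distribution.
    table = {
        ("the", "cat"): {"sat": 0.7, "ran": 0.3},
        ("cat", "sat"): {"down": 1.0},
        ("cat", "ran"): {"away": 1.0},
    }
    return table.get(tuple(context[-K:]), {"<eos>": 1.0})

def sample_chain(context, steps, rng):
    # Sampling token-by-token is exactly a walk on a Markov chain:
    # the next-token distribution depends only on the current state.
    out = list(context)
    for _ in range(steps):
        dist = next_token_dist(out)
        tokens, weights = zip(*dist.items())
        out.append(rng.choices(tokens, weights=weights)[0])
    return out

print(sample_chain(["the", "cat"], 3, random.Random(0)))
```

Nothing about this framing caps the model's capability; it only says the sampler is memoryless *given* the full window.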


I think there’s an equivocation being accidentally made between n-gram models and Markov processes, because “Markov chain” is used to mean both things.

N-gram models are not useful in many of the ways LLMs are.

N-gram models are very limited.

On the other hand, basically any process can be considered a Markov process if you make the state include enough of the history.

So calling both the very limited n-gram models and the nearly unlimited Markov processes by the same name, “Markov chain,” is just super confusing.
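To pin down the limited sense: a classic bigram model is nothing more than normalized counts of adjacent word pairs from a corpus. A minimal sketch (the toy corpus is made up for illustration):

```python
from collections import Counter, defaultdict

# Bigram model in the narrow, count-based sense: P(next | prev) is just
# the normalized count of the pair (prev, next) in the training text.
corpus = "the cat sat on the mat the cat ran".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def bigram_probs(word):
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

print(bigram_probs("the"))  # distribution over words following "the"
```

The state here is a single word, so the model can never use longer-range context; that limitation belongs to this narrow sense, not to Markov processes in general, where the state can be made as rich as you like.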


> I frequently see people underestimate the danger of LLMs to humanity in the long term

My experience with people making claims about the inherent danger has been along the lines of this later quote:

> this is presented self-evidently, no one ever feels the need to back this up




