Hacker News

If the realistic model it builds of the information you feed it is how we measure its intelligence, then why are the answers it gives sometimes so "smart," yet in other instances, where the question is simpler, it seems "dumb" (confidently providing an incorrect answer)?


Don't know. But the fact that it often produces a "smart" answer is a phenomenon that needs investigation.

You cannot simply dismiss this thing that passes a Google L3 interview and bar exam just because it got some addition problem wrong. That would be bias.


No, it's not bias to understand that a stopped clock is right twice a day. The reason LLMs sometimes generate accurate things and sometimes don't is that what they have been trained to do is choose a word to add next. Truth and lies are not properties of the structure of language, but rather information conveyed by language. Therefore, both truth and lie are perfectly valid continuations for making "valid" sentences. LLMs develop no understanding of truth or lie. They just have a statistical model of what words go with what other words, and pluck the next word based on that and whatever magical hyperparameters are being tweaked by the data scientists in charge.
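The "statistical model of what words go with what other words" can be sketched with a toy bigram model (my own minimal illustration, nothing like a real LLM's architecture or scale): it continues a sentence with either a true or a false word, weighted only by how often each followed in training data.

```python
import random
from collections import defaultdict

# Toy illustration (not a real LLM): a bigram model that, like an LLM,
# only learns which tokens tend to follow which -- no notion of truth.
corpus = "the clock is right the clock is wrong the clock is right".split()

counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token(prev):
    # Sample the next word in proportion to how often it followed `prev`
    # in training -- "right" and "wrong" are both valid continuations.
    options = counts[prev]
    words = list(options)
    weights = [options[w] for w in words]
    return random.choices(words, weights=weights)[0]

print(next_token("is"))  # either "right" or "wrong"; the model cannot tell which is true
```

The point of the sketch: nothing in the model distinguishes the true continuation from the false one, only their relative frequencies.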

This is also why it is so good at programming. Programming languages are intentionally designed, often to be very regular and easy to learn, usually with very strict and simple structures. The syntax can often be diagrammed on a single sheet of paper. It makes perfect sense that "add a token to this set of tokens based on the statistical likelihood of what would be a common next token" often produces syntactically correct code, but more thorough observers note that the code is often syntactically convincing while semantically not even a little correct. It's trained on a bunch of programming textbooks, a bunch of "Let's do the 10 common beginner Arduino projects" books, a bunch of Stack Overflow posts, probably a bunch of open source code, etc.
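A minimal illustration of "syntactically convincing but not correct" (a made-up example, not one the thread cites): this Python parses and runs cleanly and skims as plausible, but its logic is simply wrong.

```python
# Parses fine, runs without error, and looks plausible at a glance,
# but the logic is broken: it returns the minimum, not the average.
def average(xs):
    total = min(xs)   # bug: should be sum(xs)
    return total / 1  # bug: should divide by len(xs)

print(average([2, 4, 6]))  # prints 2.0, but the correct average is 4.0
```

This is the kind of output that passes a first-glance syntax check while failing the actual task.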

OF COURSE it can sometimes pass a coding interview, because programming interviews are TERRIBLE at actually filtering for who will be a good software developer, and instead are great at finding people who can act confident and write first-glance-correct code.



