Hacker News

Or it is you who are confused. And I want to remind you that you can't retcon historical word use.



Fei-Fei Li was annotating images... the second L in LLM is for "language". The first language models to be called LLMs were trained on language data, with the objective of predicting the next token. They had nothing to do with the ImageNet data. ImageNet data was used in... vision models.

The "Attention Is All You Need" paper never used the term LLM or "large language model", because the phrase didn't exist in the industry yet.

Why comment on a field you know nothing about?



