Hacker News

It makes sense if you think of the LLM as building a data-aware model that compresses noisy data via parsimony (the principle that the simplest explanation that fits is best). Typical text compression algorithms are neither data-aware nor robust to noise.

In lossy compression the compression itself is the goal. In prediction, compression is the road that leads to parsimonious models.
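A toy sketch of that distinction (my own illustration, not from the thread): generate a sequence that is a deterministic ramp plus one random bit of noise per sample. A generic byte-level compressor like zlib sees near-random bytes, while a model that knows the structure only has to pay for the noise, at the ideal rate of -log2(p) bits per symbol (assuming an arithmetic coder to realize that rate).

```python
import math
import random
import zlib

random.seed(0)
n = 4096

# Synthetic "structure + noise": a deterministic ramp (the signal)
# plus one random bit per sample (the noise).
noise = [random.randint(0, 1) for _ in range(n)]
data = bytes(((2 * i) % 256 + noise[i]) % 256 for i in range(n))

# Data-agnostic compressor: sees pseudo-random-looking bytes,
# so it compresses poorly.
zlib_bits = 8 * len(zlib.compress(data, 9))

# Data-aware model: it already knows the ramp, so each byte reduces
# to a single coin flip. Ideal code length is -log2(1/2) = 1 bit per
# sample, achievable in principle with an arithmetic coder.
model_bits = sum(-math.log2(0.5) for _ in range(n))

print(f"zlib : {zlib_bits / n:.2f} bits/sample")
print(f"model: {model_bits / n:.2f} bits/sample")
```

The model's rate sits at the true entropy of the noise (1 bit/sample), a floor no lossless coder can beat, while zlib pays for structure it cannot see.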



