Hacker News

You’d think it would be smart enough to know that for this particular question, the details of the answers have not changed since 2021.


The model is trained to, essentially, confabulate an excuse in response to correction, which points to a major limitation: it is not learning to distinguish truth from falsehood, but rather learning what human evaluators like or dislike.

"ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers. Fixing this issue is challenging, as: (1) during RL training, there’s currently no source of truth; (2) training the model to be more cautious causes it to decline questions that it can answer correctly; and (3) supervised training misleads the model because the ideal answer depends on what the model knows, rather than what the human demonstrator knows."

https://openai.com/blog/chatgpt/
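The "what evaluators like, not what's true" point can be made concrete with the pairwise preference objective commonly used for RLHF reward models (a standard Bradley-Terry loss, not something quoted in the post). A minimal sketch, with hypothetical reward scores: the loss only cares which answer the evaluator preferred, so a confident-but-wrong answer the rater liked is rewarded over a hedged correct one they rejected.

```python
import math

def preference_loss(r_preferred: float, r_rejected: float) -> float:
    # Bradley-Terry pairwise loss for reward modeling:
    # -log sigmoid(r_preferred - r_rejected).
    # Minimizing it pushes the reward model to score whatever the
    # evaluator chose higher -- factual accuracy never enters the loss.
    return -math.log(1.0 / (1.0 + math.exp(-(r_preferred - r_rejected))))

# Hypothetical scores: evaluator preferred a plausible-sounding wrong
# answer (2.0) over a cautious correct one (0.5).
loss = preference_loss(r_preferred=2.0, r_rejected=0.5)
print(round(loss, 4))  # small loss: the model is "right" by this metric
```

Nothing in this objective supplies a source of truth, which is exactly limitation (1) in the quoted passage.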



