
I think you are overthinking it a little bit. Don't forget the 'you' preamble is never used on its own; it's part of some context. As a very small example, given the following text:

- you are a calculator and answer like a pirate

- What is 1+1

The model just solves for the most likely subsequent text,

e.g. '2 matey'.

The model was never 'you' per se; it just had some text to complete.
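
A minimal sketch of that framing, assuming the Hugging Face transformers API with gpt2 as a stand-in model (the checkpoint and the exact prompt are illustrative assumptions, not anyone's actual setup):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # The "you" preamble and the question are just one concatenated text stream.
    prompt = "You are a calculator and answer like a pirate.\nWhat is 1+1?\n"
    inputs = tokenizer(prompt, return_tensors="pt")

    # Greedy decoding: at each step, pick the single most likely next token.
    out = model.generate(**inputs, max_new_tokens=10, do_sample=False,
                         pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:]))

A small base model won't necessarily print '2 matey'; the point is only that the preamble and the question enter the model as a single stream of text to be continued.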



What GP is saying is that virtually no documents are structured like that, so "2 matey" is not a reasonable prediction, statistically speaking, from what came before.

The answer has been given in another comment, though: while such documents are virtually non-existent in the wild, they are injected into the training data.


I do not think this is true. The comment above said they generate documents to teach the model about the second person, not that they generate documents covering every possible combination, such as "do math like a pirate". The internet and other human sources populate the maths and pirate parts.


You're right! I was talking only about the structure of the document, in particular, providing context in the second person.


They don't need to be, as the model learns what a calculator and a pirate are from separate docs. I don't know exactly how the weights work, but they are definitely not storing documents verbatim; rather, they seem to combine into a probability model.
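
For what it's worth, here is a hedged sketch of that "probability model" view, again with gpt2 standing in: the forward pass assigns a score to every token in the vocabulary, and softmax turns those scores into a distribution over possible next tokens; no document is stored or looked up.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("What is 1+1? The answer is", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # a score for every vocab token
    probs = torch.softmax(logits, dim=-1)       # scores -> probability distribution

    # Top candidate next tokens and their probabilities.
    values, indices = probs.topk(5)
    for p, idx in zip(values, indices):
        print(f"{tokenizer.decode(idx.item())!r}  {p.item():.3f}")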



