When I started playing with this stuff in the GPT-4 days (8K context!), I wrote a script that would search for a relevant passage in a book by shoving the whole book into GPT-4, in roughly context-sized chunks.
I think it was like a dollar per search or something in those days. We've come a long way!
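The brute-force approach can be sketched in a few lines. This is a hypothetical reconstruction, not the original script: `chunk_text`, `search_book`, and the `ask_model` callback are illustrative names, and the chunk size just approximates an 8K-token window.

```python
# Hypothetical sketch of the chunked brute-force search described above.
# ask_model(prompt) -> str stands in for any LLM API call.

def chunk_text(text, max_chars=24000):
    """Split text into roughly context-sized pieces (~8K tokens ≈ 24K chars)."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def search_book(book, query, ask_model):
    """Run the query against every chunk and collect any non-empty answers."""
    hits = []
    for n, chunk in enumerate(chunk_text(book)):
        answer = ask_model(
            f"Passage {n}:\n{chunk}\n\n"
            f"Does this passage help answer: {query}? "
            "Reply with the relevant excerpt, or NONE."
        )
        if answer.strip() != "NONE":
            hits.append((n, answer))
    return hits
```

One model call per chunk is exactly why it cost on the order of a dollar per search back then.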
Anthropic, in their RAG article, actually say that if your thing fits in context, you should probably just put it there instead of using RAG.
I don't know where the optimal cutoff is though, since quality does suffer with long contexts. (Not to mention price and speed.)
https://www.anthropic.com/engineering/contextual-retrieval
Context sizes and pricing have come so far! Now the whole book fits in context, and it's like 1 cent to put the whole thing in.
(Well, a little more with Anthropic's models ;)