
When I started playing with this stuff in the GPT-4 days (8K context!), I wrote a script that would search for a relevant passage in a book by shoving the whole book into GPT-4 in roughly context-sized chunks.

I think it was like a dollar per search or something in those days. We've come a long way!
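Roughly the shape of it (a sketch from memory, using the current OpenAI Python SDK rather than whatever I had back then; the chunking just counts characters as a stand-in for real tokens, and the prompt wording is illustrative):

    # Split a book into ~6K-token chunks and ask the model, chunk by chunk,
    # whether it contains a passage relevant to the query.
    from openai import OpenAI

    client = OpenAI()

    def chunks(text, size=24_000):  # ~6K tokens at ~4 chars/token
        for i in range(0, len(text), size):
            yield text[i:i + size]

    def search_book(book_text, query):
        for n, chunk in enumerate(chunks(book_text)):
            resp = client.chat.completions.create(
                model="gpt-4",
                messages=[
                    {"role": "system",
                     "content": "If the excerpt contains a passage relevant "
                                "to the question, answer YES followed by the "
                                "quote. Otherwise answer NO."},
                    {"role": "user",
                     "content": f"Question: {query}\n\nExcerpt:\n{chunk}"},
                ],
            )
            answer = resp.choices[0].message.content
            if answer.strip().upper().startswith("YES"):
                return n, answer  # chunk index and the quoted passage
        return None, None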

Anthropic, in their contextual retrieval article, actually say that if your whole knowledge base fits in context (they suggest under ~200K tokens), you should probably just put it in the prompt instead of using RAG.

I don't know where the optimal cutoff is though, since quality does suffer with long contexts. (Not to mention price and speed.)

https://www.anthropic.com/engineering/contextual-retrieval

Context sizes and pricing have come so far! Now a whole book fits in context, and it costs something like a cent to put the whole thing in.

(Well, a little more with Anthropic's models ;)
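Back of the envelope (assumed numbers, not any one provider's price sheet): a ~100K-word novel is around 130K tokens, and at roughly $0.10 per million input tokens that works out to about a cent:

    # Rough cost of stuffing one book into context (all figures assumed).
    words = 100_000             # a typical novel
    tokens = words * 1.3        # ~1.3 tokens per English word
    price_per_m = 0.10          # $ per 1M input tokens, cheap-model ballpark
    cost = tokens / 1_000_000 * price_per_m
    print(f"${cost:.3f}")       # ~$0.013, i.e. about a cent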




