
Although RAG is often implemented via vector databases to find 'relevant' content, I'm not sure that's a necessary component. I've been doing what I call RAG by finding 'relevant' content for the current prompt context via a number of different algorithms that don't use vectors.

Would you define RAG only as 'prompt optimisation that involves embeddings'?
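To illustrate the point that retrieval doesn't require embeddings: a minimal sketch of one such algorithm, ranking chunks by simple keyword overlap (Jaccard similarity). All names here are illustrative, not from any particular library, and this is just one of many possible non-vector approaches.

```python
# Sketch: retrieving "relevant" chunks for a prompt without embeddings
# or a vector database, using keyword-overlap (Jaccard) scoring.
import re

def tokenize(text):
    # Lowercase and split into alphanumeric tokens.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(query, chunk):
    # Jaccard similarity between query tokens and chunk tokens.
    q, c = tokenize(query), tokenize(chunk)
    return len(q & c) / len(q | c) if q | c else 0.0

def retrieve(query, chunks, k=2):
    # Return the k highest-scoring chunks for this query.
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

docs = [
    "Vector databases store embeddings for similarity search.",
    "Keyword overlap can rank documents without any embeddings.",
    "The context window caps how much retrieved text fits in a prompt.",
]
print(retrieve("rank documents by keyword overlap", docs, k=1))
```

In practice BM25 or similar lexical scoring would be a more robust choice than raw Jaccard, but the structure is the same: score, sort, take the top k, prepend to the prompt.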



Sure, your RAG approach sounds intriguing, especially since you're sidestepping vector databases. But doesn't the input context length cap affect it? (ChatGPT Plus at 32K [0], or GPT-4 via the OpenAI API at 128K [1].) Seems like those cases would be pretty rare, though.

[0]: https://openai.com/chatgpt/pricing#:~:text=8K-,32K,-32K

[1]: https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turb...


Yes, the context window is a limiting factor, but that's true however you identify the content to augment generation.
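Whatever retrieval method is used, the retrieved chunks still have to be packed into the model's window. A minimal sketch, assuming chunks are already sorted by relevance and approximating token count by word count (a real implementation would use the model's tokenizer; `budget` stands in for limits like the 32K/128K figures above):

```python
# Sketch: greedily packing relevance-sorted chunks into a fixed
# context budget. Word count is a rough stand-in for token count.
def pack(chunks, budget):
    packed, used = [], 0
    for chunk in chunks:  # assumed pre-sorted, most relevant first
        cost = len(chunk.split())
        if used + cost > budget:
            continue  # skip chunks that would overflow the window
        packed.append(chunk)
        used += cost
    return packed

chunks = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"]
print(pack(chunks, budget=5))  # first two chunks fit (3 + 2 words)
```

This greedy skip-and-continue policy is a deliberate simplification; one could also stop at the first overflow, or truncate the overflowing chunk instead of dropping it.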





