constantinum on July 31, 2024 | on: Ask HN: What are you using to parse PDFs for RAG?

Indeed! Accuracy is only part of the problem. One way to crack this is to preserve the layout during extraction. A preserved layout gives the LLM more context, so it interprets the content better. A write-up is here if you are curious: https://unstract.com/blog/extract-table-from-pdf/
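
To make "maintaining the layout" concrete, here is a minimal sketch in plain Python. It is not Unstract's implementation or the method from the linked write-up. It assumes you already have words with page coordinates from some PDF library; the `(x, y, text)` tuple shape, the `render_layout` name, and the `char_width` / `line_height` values are all hypothetical choices for illustration. The idea is to place each word on a character grid so table columns stay aligned in the text handed to the LLM.

```python
from collections import defaultdict


def render_layout(words, char_width=6.0, line_height=12.0):
    """Render (x, y, text) words as plain text with columns kept aligned.

    Assumptions (hypothetical, for illustration only): x and y are in
    points, y grows downward, and one character is about char_width wide.
    """
    rows = defaultdict(list)
    for x, y, text in words:
        # Words whose y rounds to the same line index share a text row.
        rows[round(y / line_height)].append((x, text))

    lines = []
    for row_key in sorted(rows):
        line = ""
        for x, text in sorted(rows[row_key]):
            col = int(x / char_width)
            if line and len(line) >= col:
                # Target column is already taken: keep one separating space.
                line += " "
            else:
                # Pad with spaces so the word starts at its column.
                line = line.ljust(col)
            line += text
        lines.append(line)
    return "\n".join(lines)


# Hypothetical sample: a two-column table with a header row.
words = [
    (0, 0, "Item"), (120, 0, "Qty"),
    (0, 12, "Widget"), (120, 12, "3"),
]
print(render_layout(words))
```

With this sample, "Qty" and "3" start at the same character column, so the row and column relationship survives in the prompt text instead of collapsing into "Item Qty Widget 3".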