I built FinSight because I wanted to analyze my spending: inflows and outflows. But I didn't want to upload a statement to a cloud LLM, for privacy reasons.
FinSight provides LLM-assisted transaction categorization without uploading bank or credit card statements to a third-party service.
Architecture: PDF parsing client-side via pdfjs-dist, AI inference via a local Ollama/LM Studio API, storage in localStorage/sessionStorage via Zustand. No backend (yet).
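To make the "local inference" part concrete, a browser call to a local Ollama server looks roughly like this. The endpoint and response shape follow Ollama's standard /api/chat API; the model name, category list, and prompt are placeholders, not FinSight's actual code:

```javascript
// Sketch: categorize one transaction via a local Ollama server.
// Model name and categories are illustrative assumptions.
const OLLAMA_URL = "http://localhost:11434/api/chat";

function buildRequest(description) {
  return {
    model: "qwen3",   // placeholder: any locally pulled chat model
    stream: false,
    messages: [
      {
        role: "system",
        content: "Categorize the transaction. Reply with one word: " +
                 "groceries, rent, travel, dining, or other.",
      },
      { role: "user", content: description },
    ],
  };
}

async function categorize(description) {
  const res = await fetch(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRequest(description)),
  });
  const data = await res.json();
  // Ollama's /api/chat returns { message: { role, content }, ... }
  return data.message.content.trim();
}
```

Since the request goes to localhost, nothing leaves the machine.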
A few things I found technically interesting:
1. Context window management is the main challenge with long statements. I'm handling it by chunking transactions and doing a second-pass aggregation. It works, but it's the messiest part of the codebase — would genuinely value feedback on better approaches.
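The chunk-then-aggregate idea can be sketched like this. The chunk size and the categorizeChunk function are placeholders (the real LLM call would go there); this is an assumed shape, not FinSight's actual implementation:

```javascript
// Split transactions into fixed-size chunks so each prompt stays
// well under the model's context window.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// First pass: categorize each chunk independently.
// `categorizeChunk` stands in for the per-chunk LLM call.
async function categorizeAll(transactions, categorizeChunk, chunkSize = 25) {
  const results = [];
  for (const c of chunk(transactions, chunkSize)) {
    results.push(...(await categorizeChunk(c)));
  }
  return results;
}

// Second pass: aggregate per-chunk results into category totals,
// since no single prompt ever sees the whole statement.
function aggregate(categorized) {
  const totals = {};
  for (const { category, amount } of categorized) {
    totals[category] = (totals[category] || 0) + amount;
  }
  return totals;
}
```

The weakness of this shape (and presumably the messiness mentioned above) is that categorization decisions can't see context outside their own chunk, e.g. a recurring merchant split across chunks.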
2. 1B-parameter models are sufficient for parsing; 7B models give meaningfully better categorization accuracy. The main constraint isn't model capability — it's context window length with large statements, and speed.
3. Personally, Qwen 3 gave me the best results but was the slowest at processing a large file. gpt-oss-20b was faster, but the categorization wasn't as good. Speed is, of course, hardware-dependent.
4. PDF statement formats vary enormously between banks. LLM-based extraction handles this variation better than any regex approach I've tried.
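Part of why regex is so brittle here: pdfjs-dist doesn't return rows, it returns positioned text fragments, so even reconstructing lines takes work before any parsing starts. A minimal sketch of the usual grouping step (the item shape follows pdf.js's getTextContent output, where transform[4]/transform[5] are the x/y offsets; the tolerance value is arbitrary):

```javascript
// Sketch: rebuild text lines from pdf.js text items. Items whose
// y offsets fall within `tolerance` are treated as one visual row.
function itemsToLines(items, tolerance = 2) {
  const rows = [];
  for (const item of items) {
    const y = item.transform[5];
    let row = rows.find((r) => Math.abs(r.y - y) <= tolerance);
    if (!row) {
      row = { y, items: [] };
      rows.push(row);
    }
    row.items.push(item);
  }
  return rows
    .sort((a, b) => b.y - a.y) // top of page first (PDF y grows upward)
    .map((r) =>
      r.items
        .sort((a, b) => a.transform[4] - b.transform[4]) // left to right
        .map((i) => i.str)
        .join(" ")
    );
}
```

Even after this, column positions and date formats differ per bank — which is where handing the reconstructed lines to an LLM beats maintaining per-bank regexes.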
Caveats: setup requires Ollama or LM Studio plus a model download, which takes 20-30 minutes on a fresh machine.
Installation & Demo Video - https://youtu.be/VGUWBQ5t5dc
GitHub - https://github.com/AJ/FinSight?utm_source=hackernews&utm_med...
MIT licensed.