Yeah, and fails quickly at anything handwritten.

hakunin · 2025-10-20T15:14:22 1760973262

I mostly OCR English, so Japanese (as mentioned by parent) wouldn't be an issue for me, but I do care about handwriting. See, these insights are super helpful. If only there was, say, a benchmark to show these.

My main question really is: what practical OCR tools can I string together on my MacBook Pro M1 Max with 64GB RAM to maximize OCR quality for the lots of mail and schoolwork coming into my house, almost all of it in English?

I use ScanSnap Manager with its built-in OCR tools, but that's probably super outdated by now. Apple Vision does a way better job than that. I've also heard people say that Apple Vision is better than Tesseract. But is there something better still that's also practical to run in a scripted environment on my machine?

wahnfrieden · 2025-10-20T19:25:17 1760988317

LiveText too? It has a newer engine

hakunin · 2025-10-20T19:39:25 1760989165

This is the second comment of yours about LiveText (this is the older one: https://news.ycombinator.com/item?id=43192141). I found that one by complete coincidence because I'm trying to provide a Ruby API for these frameworks. However, I can't find much info on LiveText. What framework is it part of? Do you have any links or any additional info? I found a source where they say it's specifically for screen and camera capturing.

wahnfrieden · 2025-10-20T19:43:30 1760989410

https://developer.apple.com/documentation/visionkit/imageana... It's part of VisionKit. It's Swift-only (as with many new APIs), so lots of people stuck on ObjC bridges simply ignore it.

It does not provide bounding boxes but you can get text.

hakunin · 2025-10-20T20:08:31 1760990911

That's great, I'm going to give this a shot. If you have any more resources please do share. I don't mind Swift-only, because I'm writing little shims with `@_cdecl` for the bridge (don't have much experience here, but hoping this is going to work, leaning on AI for support).
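
For anyone following along, here is a minimal sketch of what such a `@_cdecl` shim might look like. To keep it short and synchronous it uses the Vision framework's `VNRecognizeTextRequest` rather than the VisionKit API linked above, which is async and would need extra bridging. The exported names `ocr_recognize_text` and `ocr_free_string` are hypothetical, and it assumes a recent macOS SDK where `request.results` is already typed as text observations.

```swift
import Foundation
import Vision

// Hypothetical C-callable entry point (the exported name is an assumption).
// Takes a path to an image file and returns a malloc'd UTF-8 string with one
// recognized line per row, or NULL if the file is missing or recognition fails.
@_cdecl("ocr_recognize_text")
public func ocrRecognizeText(_ path: UnsafePointer<CChar>?) -> UnsafeMutablePointer<CChar>? {
    guard let path = path else { return nil }
    let filePath = String(cString: path)
    guard FileManager.default.fileExists(atPath: filePath) else { return nil }

    // Favor accuracy over speed; language correction tends to help with English text.
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    request.usesLanguageCorrection = true

    let handler = VNImageRequestHandler(url: URL(fileURLWithPath: filePath), options: [:])
    do {
        // perform(_:) is synchronous, so results are available once it returns.
        try handler.perform([request])
    } catch {
        return nil
    }

    // Keep only the top candidate string for each detected text region.
    let lines = (request.results ?? []).compactMap { $0.topCandidates(1).first?.string }

    // strdup allocates with malloc, so the caller must hand the pointer back
    // to ocr_free_string when done.
    return strdup(lines.joined(separator: "\n"))
}

// Hypothetical companion that frees a string returned by ocr_recognize_text.
@_cdecl("ocr_free_string")
public func ocrFreeString(_ pointer: UnsafeMutablePointer<CChar>?) {
    free(pointer)
}
```

Something like `swiftc -emit-library shim.swift -o libocrshim.dylib` should produce a library that a C-compatible FFI layer (for example Ruby's Fiddle) can load. Returning a malloc'd string paired with an explicit free function keeps memory ownership unambiguous across the language boundary.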