I’ll join some other commenters, to add my favorite difficult pdf problem that I...

Froodle · on Dec 25, 2023

Dev here for the above Stirling PDF app. Please raise features like this as a feature request issue on GitHub and we can try to address it in future!

kpandit · on Dec 25, 2023

I would do exactly what you have done here if I were the dev of said app. But with the luxury of being an outsider: a user has expressed an inconvenience and the request seems to make sense, so if I were the dev, wouldn't I go and create the ticket myself in whatever system, with a link to this post, instead of asking the user to go through the red tape? I know there are places where this is not incentivised, so this is a question for your org and not for you.

Froodle · on Dec 25, 2023

I see what you're saying, and for simple features I agree. However, without the OP creating the ticket there can be no feedback loop on the feature. I'd want it tested for their use case, with their input and confirmation on whether it's what they wanted, improvements to the workflow, etc. If I base the whole feature on this comment, it could end up only doing half a job. I'd rather have that communication loop open!

d4rkp4ttern · on Dec 25, 2023

I tend to agree. As an open source dev myself, I avoid asking folks to create issues, as it puts a burden on the user. I’ve seen some highly respected open source leads do this, and I’m not faulting them, as I think they’re coming from a good place; it may be a difference of opinion on what’s best practice.

Sai_ · on Dec 26, 2023

Not OP. My take is that if the requester can’t be bothered to create a GH issue, it’s likely that this isn’t really a problem for them. An annoyance, possibly, but one that has not risen to “pain” levels.

cyanydeez · on Dec 26, 2023

This is open source software, sir. It needs multiple steps to ensure users actually need these features and are willing to use them.

ylk · on Dec 25, 2023

This could be a paid option for parsing forms (not sure about ocr): https://demos.textcontrol.com/chapter/topic/PDF/PDFFormData https://www.textcontrol.com/technologies/pdf/

dave8088 · on Dec 26, 2023

Their scummy website doesn’t list their prices in any way I can see. Hard pass.

Closi · on Dec 25, 2023

Have you tried Azure AI Document Intelligence?

In theory it's exactly this...

brianjking · on Dec 25, 2023

I second this. That, or have you tried GPT-4 Vision or Donut?

d4rkp4ttern · on Dec 25, 2023

Still waiting for GPT4V but doubt it will do this. Yes I’ve tried Donut and other options but this is a very gnarly problem.

One option is to extract text blocks along with their coordinates (unstructured.io gives this, probably based on another pkg because it’s basically a container for many pkgs). Then do the same with a blank template, and you then have an algorithmic problem of matching the filled values spatially with the key locations from the template.
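
A rough sketch of that matching step, just to make the idea concrete. Everything here is an assumption for illustration: the `Block` type and `match_values_to_keys` are hypothetical names, not part of unstructured.io or any other package, and the blocks are assumed to already be extracted as text plus bounding-box coordinates. It treats any text that also appears in the blank template as a label, and assigns every remaining block to the nearest label by center-to-center distance, which is a naive rule that real forms will likely break.

```python
import json
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical text block: extracted text plus its bounding box."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center(self):
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)


def match_values_to_keys(template_blocks, filled_blocks):
    """Return a {label: filled value} dict using nearest-label matching."""
    # Text that also appears in the blank template is a label, not a value.
    template_texts = {b.text for b in template_blocks}
    values = [b for b in filled_blocks if b.text not in template_texts]

    grouped = {}
    for value in values:
        vx, vy = value.center
        # Assumption: a value belongs to the label whose center is closest.
        nearest = min(
            template_blocks,
            key=lambda k: math.hypot(k.center[0] - vx, k.center[1] - vy),
        )
        # Several value blocks may land in one cell; keep them in input order.
        grouped.setdefault(nearest.text, []).append(value.text)
    return {label: " ".join(parts) for label, parts in grouped.items()}


# Made-up coordinates for a two-field form.
template = [Block("Name", 10, 10, 50, 20), Block("Date", 10, 40, 50, 50)]
filled = template + [
    Block("Jane Doe", 60, 10, 120, 20),
    Block("2023-12-25", 60, 40, 130, 50),
]
print(json.dumps(match_values_to_keys(template, filled), indent=2))
```

A value that happens to have the same text as a label gets dropped by this rule, and cells whose value sits below the label rather than beside it may be assigned to the wrong neighbor, so a real version would probably need direction-aware distances or cell boundaries from the template.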

brianjking · on Dec 25, 2023

I'm fairly confident GPT-4V will do this just fine, tbh.

You just need to extract each of the elements into a structured JSON or something, right?

I'll try with your example later today.

d4rkp4ttern · on Dec 26, 2023

Exactly, the form has filled values in named cells, so we need a JSON of cellName -> filledValue mappings.

Let me know how GPT-4V does!

qingcharles · on Dec 25, 2023

I second trying GPT-4 Vision, though they have dumbed it down a bit since launch.