Hacker News

I use this to store archives of scanned documents. The last thing I want is to scan something only to later find some subtle image artifact corruption (remember that case of copy machines modifying numbers by swapping glyphs?). I store checksums and a static flif binary along with the archive. It's definitely overkill, but a huge win compared to stacks of paper sitting around.
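A minimal sketch of the checksum-manifest idea, in Python. All names here (the `*.flif` glob, the `SHA256SUMS` file name, the two-space manifest format) are illustrative choices, not anything the comment prescribes:

```python
# Sketch: keep a SHA-256 manifest next to the scans so silent
# corruption is detectable later. File names are illustrative.
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(archive_dir: Path, manifest: Path) -> None:
    # One "<digest>  <name>" line per image, like sha256sum(1) output.
    lines = [f"{sha256_of(p)}  {p.name}"
             for p in sorted(archive_dir.glob("*.flif"))]
    manifest.write_text("\n".join(lines) + "\n")

def verify_manifest(archive_dir: Path, manifest: Path) -> bool:
    for line in manifest.read_text().splitlines():
        digest, name = line.split("  ", 1)
        if sha256_of(archive_dir / name) != digest:
            return False
    return True
```

The two-space format matches `sha256sum`, so the same manifest can also be checked with `sha256sum -c` if Python isn't around.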

My intuition was informed by choosing FLAC for my music collection ~15 years ago, and that working out fantastically. If a better format does come along, or if I change my mind, I can always transcode.



The issue with copy machines modifying glyphs isn't a problem with all compression algorithms; really, just that one (JBIG2). Rather than simply discarding detail the way a typical lossy codec does, it would spot visually similar regions of the image and encode them as a single shared symbol, so one glyph could be silently replaced by another.
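A toy illustration of that failure mode (this is not JBIG2 itself, just the mechanism): a symbol-dictionary coder that replaces each glyph bitmap with the "closest" dictionary entry. If two different glyphs land within the similarity threshold, the output silently swaps one for the other:

```python
# Toy symbol-substitution coder. Bitmaps are flat tuples of 0/1;
# the dictionary, threshold, and distance metric are all made up
# for illustration.
def closest(glyph, dictionary, threshold):
    dist = lambda a, b: sum(x != y for x, y in zip(a, b))
    best = min(dictionary, key=lambda d: dist(glyph, d))
    # Within threshold: the original glyph is replaced wholesale.
    return best if dist(glyph, best) <= threshold else glyph
```

A glyph one pixel away from a dictionary entry gets substituted; crank the threshold high enough and a "6" can come out as an "8".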

Also, why not PNG?


Yeah, I'll admit that specific example wasn't the most relevant. Really I just want to be able to scan papers and then be confident enough to destroy them without scrutinizing the output. Rather than committing to specific post-processing, I settled on keeping full masters of the 300 dpi greyscale scans. Even at 5 MB/page, that's only 100 GB for 20k pages.
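The storage estimate checks out; a quick back-of-envelope (the 5 MB/page and 20k-page figures come from the comment, the Letter-size raw-page number is my own assumption):

```python
# Back-of-envelope check of the archive size ("MB"/"GB" decimal).
bytes_per_page = 5_000_000           # ~5 MB per greyscale page
pages = 20_000
total_gb = bytes_per_page * pages / 1_000_000_000
print(total_gb)  # 100.0

# For context: a raw Letter page at 300 dpi, 8 bits/pixel
# (8.5 in x 11 in, an assumption about the paper size):
raw_bytes = (300 * 8) * 85 * (300 * 11) // 10  # 2550 x 3300 px
print(raw_bytes)  # 8415000, i.e. ~8.4 MB uncompressed
```

So 5 MB/page implies only modest (but real) lossless compression over raw PGM.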

I don't think PNG provided meaningful compression on the greyscale scans. If FLIF didn't exist I certainly could have used PNG, if only for being nicer than PGM. But using FLIF seemed like a small price to pay for staying lossless.

JPEG would have sufficed, but JPEG artifacts have always bugged me. I also considered JPEG 2000 for a bit, but was left wondering how stable and future-proof the actual implementations are. Lossless is bit-perfect, so that concern is alleviated: any correct decoder must reproduce the original exactly.
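That "bit-perfect" property is what makes a lossless format easy to trust: verifying an implementation needs no judgment calls, because `decode(encode(x))` must equal `x` exactly. A sketch, using zlib purely as a stand-in codec pair (not FLIF):

```python
# A lossless codec must round-trip bit-for-bit. `encode`/`decode`
# stand for any codec pair; zlib is only a stand-in here.
import zlib

def roundtrip_ok(raw: bytes, encode, decode) -> bool:
    return decode(encode(raw)) == raw

print(roundtrip_ok(b"\x00\x7f" * 1000, zlib.compress, zlib.decompress))  # True
```

With a lossy codec there is no such clean test; "close enough" always involves a perceptual judgment.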


> Also, why not PNG?

The article claims a 43% size improvement over typical PNGs; if you have a lot of images, that's pretty significant.
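Concretely, at the scale mentioned upthread (the 100 GB figure is borrowed from the earlier comment, purely illustrative):

```python
# What a claimed 43% saving over PNG means for a 100 GB archive.
png_total_gb = 100
flif_total_gb = png_total_gb * (1 - 0.43)
print(round(flif_total_gb))  # 57
```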



