And here I'm still looking for a way to create, with one click, an offline backup of the webpage each of my bookmarks points to, such that the offline version looks and works exactly like the online version in (say) Google Chrome (e.g. CTRL+F still works fine), and such that I can use some key combo and click a bookmark in my bookmarks manager (in Chrome) to open the page from the backup (or the backup can have its own copy of the bookmarks manager... it needs a catalog of some sort or it won't be useful).
I love ArchiveBox, but the headless Chromium it uses has some annoying "will break randomly, and good luck figuring out why or how to fix it" problems. For example, it'll just randomly stop working because the profile is supposedly locked, except the lock file isn't there, and even if you tweak things to make 100% sure the profile lock is removed before and after every archive request, it'll still randomly fail on a locked profile. WHAT THE HELL IS GOING ON?!
Although, to be fair, running it in Docker seems less fraught and breaks less often (and it's a lot easier to restart when it does break).
(I've got a pipeline from Instapaper -> {IFTTT -> {Pinboard -> Linkhut, Dropbox, Webhook -> ArchiveBox}}, which works well most of the time for archiving random pages. It used to start from Pocket, until Mozilla decided to be evil.)
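The webhook leg is the only part that needed anything custom. Here's a rough sketch of the idea; the port, path, and JSON shape are arbitrary choices of mine, not anything IFTTT or ArchiveBox mandates, and only the `archivebox add` CLI call is the real interface:

```python
# Minimal sketch of a webhook receiver that feeds ArchiveBox.
# Assumptions: the endpoint, port, and JSON payload {"url": "..."} are
# arbitrary illustrative choices; only `archivebox add <url>` is real.
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

ARCHIVEBOX_DIR = "/data/archivebox"  # hypothetical path to the ArchiveBox collection

class Hook(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        url = payload.get("url", "")
        if url:
            # Hand the URL to ArchiveBox; it does the actual snapshotting.
            subprocess.run(["archivebox", "add", url], cwd=ARCHIVEBOX_DIR, check=False)
            self.send_response(202)
        else:
            self.send_response(400)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8008), Hook).serve_forever()
```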
I used SingleFile for a while but now I've switched to WebScrapBook, because a lot of the pages I save share the same images and WebScrapBook can keep assets as separate files instead of baking everything into one HTML. I then run rdfind to hard link all the identical files and save space.
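The rdfind step is essentially "hash everything, hard-link the duplicates to one copy." For anyone curious, here's roughly what `rdfind -makehardlinks true <dir>` boils down to in Python; the real tool is much faster since it filters by size and first/last bytes before hashing, and this is only to show the idea:

```python
# Rough illustration of the rdfind -makehardlinks step: find byte-identical
# files under a directory and replace duplicates with hard links to a single
# copy. rdfind itself is faster and safer; this only shows the concept.
import hashlib
import os
import sys

def file_hash(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def hardlink_duplicates(root):
    seen = {}  # (size, sha256) -> canonical path
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            key = (os.path.getsize(path), file_hash(path))
            if key in seen:
                original = seen[key]
                if not os.path.samefile(original, path):
                    os.unlink(path)          # drop the duplicate...
                    os.link(original, path)  # ...and hard-link the original in its place
            else:
                seen[key] = path

if __name__ == "__main__":
    hardlink_duplicates(sys.argv[1])
```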
Anecdotally (not to diminish any bug the parent had), SingleFile is one of my favorite extensions. Been using it for years and it's saved my ass multiple times. Thank you!
Edit: What's the best way to support the project? I see there's an option through the Mozilla store and through GitHub. Is there a preference?
I have SingleFile configured to post full archives to Karakeep with an HTTP POST; this enables archiving pages from my browser that Karakeep cannot scrape and bookmark due to paywalls or bot protection.
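If you'd rather push an already-saved page into a service from a script instead of having the extension do the POST, the client side is tiny. In this sketch the endpoint and auth header are placeholders, not Karakeep's actual API, so check its docs for the real route and payload:

```python
# Hedged sketch: pushing an already-saved SingleFile page to an archive
# service over HTTP POST. The endpoint path and auth header below are
# hypothetical placeholders, not Karakeep's documented API.
import urllib.request
from pathlib import Path

API_URL = "https://karakeep.example.com/api/upload"  # placeholder endpoint
TOKEN = "..."                                        # placeholder API token

def push_archive(html_path: str) -> int:
    data = Path(html_path).read_bytes()
    req = urllib.request.Request(
        API_URL,
        data=data,
        headers={
            "Content-Type": "text/html",
            "Authorization": f"Bearer {TOKEN}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    print(push_archive("saved-page.html"))
```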
I've been using SingleFile for five years and, for what it's worth, I've never had this issue. I keep a directory called Archives on my Synology that I expose with copyparty, and I routinely back up web pages and then drop the result into my copyparty instance for safekeeping.
I would look into what happened with the SingleFile copies you made that didn't work, because that is highly unusual.
WebRecorder [0] is the best implementation of this that I've tested. It runs as an extension in your browser, intercepting HTTP streams, so as long as you open a page in your browser the data is captured to reproduce it exactly. It outputs WARC files that are (in theory) compatible with the rest of the web archiving ecosystem, and it has a WARC explorer interface to browse captured archives.
For pages with dynamic content that can't be trivially reproduced from their HTTP streams (e.g., opening the archive triggers GETs with a mismatched timestamp, even if the file it's looking for is in the WARC under a different URI), there's always SingleFile [1], or Chromium's built-in MHTML Ctrl+S export, both of which "bake" the content into a static page.
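One nice thing about ending up with WARCs is that you can inspect them outside the explorer UI too; Webrecorder's warcio library will walk a capture and tell you what it contains. A small sketch (the file name is just an example):

```python
# Small sketch: list the URLs captured in a WARC file using warcio,
# Webrecorder's Python library (pip install warcio).
# "capture.warc.gz" is just an example file name.
from warcio.archiveiterator import ArchiveIterator

def list_responses(warc_path):
    with open(warc_path, "rb") as stream:
        for record in ArchiveIterator(stream):
            if record.rec_type == "response":
                uri = record.rec_headers.get_header("WARC-Target-URI")
                ctype = record.http_headers.get_header("Content-Type") if record.http_headers else ""
                print(uri, ctype)

if __name__ == "__main__":
    list_responses("capture.warc.gz")
```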
I'm on Firefox, but I still feel the need to reply; you might find this handy, or other readers here might like it. Maybe it's also available for Chrome, I don't know.
I've been using an extension called WebScrapBook to locally save copies of interesting webpages. I use the basic functionality, but it comes with tons of options and settings.
I happened upon a bit of an unconventional approach to this with Zotero. It's obviously more focused on academic research, but it takes snapshots and works really well as a more general-purpose archiving tool.
FWIW I've had success with self-hosted [LinkDing](https://github.com/sissbruecker/linkding) and the Firefox SingleFile plugin (so it archives what I'm seeing / gets around logins, etc.). LinkDing also links directly to the Internet Archive for any URL.
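linkding also exposes a small REST API, so you can push URLs in from scripts as well as from the browser. A quick sketch; the host and token are placeholders, and I'm writing the endpoint and payload from memory, so double-check the linkding API docs before relying on it:

```python
# Quick sketch: adding a bookmark to a self-hosted linkding instance via
# its REST API. Host and token are placeholders; the /api/bookmarks/ route
# and Token auth scheme are from memory, so verify against the docs.
import json
import urllib.request

LINKDING_URL = "https://linkding.example.com"  # placeholder host
API_TOKEN = "..."                              # placeholder token

def add_bookmark(url, tags=()):
    payload = json.dumps({"url": url, "tag_names": list(tags)}).encode()
    req = urllib.request.Request(
        f"{LINKDING_URL}/api/bookmarks/",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {API_TOKEN}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(add_bookmark("https://example.com/some-article", tags=["archive"]))
```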