
Hello everyone, I'm the primary author for WAL-G and would be happy to answer any questions.


Thanks! Does WAL-G provide some kind of "continuous backup", where changes committed to the database are continuously streamed to the backup storage? Or does it work step by step, for example by backing up every 5 minutes or every 10 MB?


It does continuous backup like WAL-E.

Both back up PG's WAL (Write Ahead Log) files and allow restoring your database to its state as of a specific time, or just after a specific transaction committed. This is known as point-in-time recovery (PITR) [0].

Users and admins make mistakes and accidentally delete or overwrite data. With PITR you can restore, in a new environment, to just before the mistake occurred and recover the data from there.
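For context, a PITR restore like this is driven by a recovery.conf on the restored instance. A minimal sketch (PostgreSQL 9.6 era, using WAL-E's wal-fetch as the restore tool; the target timestamp is purely illustrative):

```
# recovery.conf sketch -- WAL-G has an analogous wal-fetch command
restore_command = 'wal-e wal-fetch "%f" "%p"'
recovery_target_time = '2017-08-20 14:30:00'
```

Postgres then replays archived WAL up to the target time and stops, leaving the database as it was just before the mistake.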

[0] https://www.postgresql.org/docs/9.6/static/continuous-archiv...


What I meant is that archive_command is run only when a WAL segment is completed or when archive_timeout is reached; in the meantime, nothing is backed up. On a low-traffic database this can be a problem. I'm wondering if there is a way to continuously stream the WAL to an object store like S3, without waiting for a complete segment.


S3 is a block store, not something you can really stream to.

However it might be interesting to stream WAL logs to e.g. AWS Kinesis....


You can open a multipart upload and complete it when you're ready, which gets very close to streaming; for this case it's perhaps close enough to try with WAL-G, if it otherwise supports it.


You're right. S3 is an object store and doesn't support the append operation, which is required for what I want to do. Thanks!


That's the use case for archive_timeout. I set it to 60 seconds, so at most I'll have lost 60s plus the time to transfer the file to S3, which shouldn't be more than a couple of seconds.
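For reference, the knobs discussed here live in postgresql.conf. A sketch of the setup described above (values and archiver choice are illustrative; wal-g wal-push works the same way):

```
# postgresql.conf sketch
wal_level = replica                      # "archive" on pre-9.6 versions
archive_mode = on
archive_command = 'wal-e wal-push %p'    # ship each completed segment to S3
archive_timeout = 60                     # force a segment switch at least every 60s
```

With archive_timeout set, even an idle database hands a segment to the archiver every 60 seconds, bounding data loss to roughly that window.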


According to PostgreSQL documentation, "archived files that are archived early due to a forced switch are still the same length as completely full files".

I'm afraid to use a lot of storage for WAL segments that are mostly empty:

16 MB per segment x 60 minutes x 24 hours x 7 days = 161 GB/week
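For anyone checking the arithmetic, the worst case at a 60-second archive_timeout (one full-size segment per minute) works out as:

```python
# Worst case with archive_timeout = 60: one forced switch per minute,
# and each archived segment is the full 16 MB regardless of real content.
segment_mb = 16
minutes_per_week = 60 * 24 * 7

total_mb = segment_mb * minutes_per_week
print(total_mb)  # 161280 MB, i.e. ~161 GB/week
```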

Does WAL-G/WAL-E compression help?


Yes, the lzop compression helps a lot, and I imagine that mostly-empty files will be even more compressible.

On a staging server with little activity, the compressed WAL-E WAL files go as low as 1.9 MB per 10 minutes (~2 GB/week).

The production server has files between 4 and 12 MB per minute or less (~220 GB/week).
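As a rough illustration of why force-switched, mostly-empty segments shrink so dramatically (using Python's gzip as a stand-in for lzop; the sizes are illustrative):

```python
import gzip
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # WAL segments are 16 MiB

# Simulate a segment that was force-switched early: a little real
# (incompressible) data up front, zero padding for the rest.
real_data = os.urandom(256 * 1024)
segment = real_data + bytes(SEGMENT_SIZE - len(real_data))

compressed = gzip.compress(segment)
print(f"{len(segment)} -> {len(compressed)} bytes")  # the zero padding all but vanishes
```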

WAL-E has a good `wal-e delete retain` command that removes older base backups and wal files.


2 GB/week is so much better than 161 GB/week! It looks like compression helps a lot. Thanks for sharing these numbers.


Seriously awesome work on this! I was expecting some solid improvement when I heard you were rewriting this in Go, but this is beyond what I could have expected. 7x improvement on high end instance types!

Also, what an impressive project to have on the resume as a college intern. I don't think many interns get to tackle something so meaningful.


Nor would many choose to, given the opportunity. Way to pick an interesting and challenging project! :-)


Thanks for making this! To someone who's unfamiliar with Postgres tooling, what's the difference between WAL-G and Barman? What're the advantages of using one over the other?


I was kafkes's guide on this project, and I'm the WAL-E author and maintainer...

I'd say the differences between WAL-G and Barman are similar to WAL-E and Barman, which comes up relatively frequently. https://news.ycombinator.com/item?id=13573481

In summary, WAL-E is a simpler program all around that focuses on cloud storage; Barman does more around inventories of backups, file-based backups, and configuring Postgres, though there are integrative downsides to that span. WAL-E also happens to predate Barman.


WAL-G (and WAL-E) are expected to run next to the main database, while Barman is meant to run on a separate machine. Barman can also back up many databases. It is essentially the difference between a central backup service and local backups.


Is this production ready, or just an early dev snapshot?


We're currently testing it at Citus, but have not flipped it live for our disaster recovery yet.

We're going to start rolling it out for forks/point-in-time recoveries first, which present less risk. Later we'll explore either parallel restores from WAL-E and WAL-G, or possibly just flipping the switch based on the results.

On restoration there's really no risk to data. Further, we page our on-call for any issues that happen, such as WAL not progressing or servers not coming online out of restore.


WAL-G is not yet production ready, but it has been used in a staging environment for the past few weeks without any issues. Once fdr adds parallel WAL support, he plans to take it into production.


Cool cool.

Is Google Cloud Storage on the roadmap?


WAL-G looks like it should be able to talk to Minio (minio.io, an S3-compatible store, also written in Go) as a backend instead of S3 itself.

Minio has an interesting feature where it can be a "gateway" to other cloud storage. Google Cloud Storage is one of their specific examples:

https://docs.minio.io/docs/minio-gateway-for-gcs

So WAL-G would talk to Minio, and Minio would transparently proxy that to GCS.


Neat. My concern out of the gate is what the perf hit would be.

I assume I'd be switching from WAL-E to WAL-G for more perf. But WAL-E speaks GCS; if WAL-G needs an extra hop to do so, that may lose some of the point of it.


Yeah, no idea personally. Haven't used the gateway functionality in Minio at all.

That being said, the Minio team seems pretty good at writing performance-optimised code. Frank Wessels (on the Minio team) has been writing articles about Go assembler and other Go optimisation topics recently, e.g.:

https://blog.minio.io/accelerating-blake2b-by-4x-using-simd-...

https://blog.minio.io/golang-internals-part-2-nice-benefits-...

So the performance impact might not be such a problem. :)


There was some mention of resumable uploads in the blog post, which sadly each provider handles differently (that is, the GCS layer that supports the S3 API does not accept resumable uploads).

Disclosure: I work on Google Cloud (so I'd love to see this tool point at GCS).


I answered this question in a couple of other places, but: unknown, because I don't have a use for that yet. https://news.ycombinator.com/item?id=15049527


Do you have plans to support encrypted backups?


Maybe. Depends what you mean by that: https://news.ycombinator.com/item?id=15049691


It would be nice if you could sign the tarball with the binary. I see the tag is already signed so hopefully it's not much trouble: https://wiki.debian.org/Creating%20signed%20GitHub%20release...


Done.


Any plans to support backup to Google Cloud Storage instead of just S3?


Or some sort of pluggable storage system.

For now, since I'm also on GCP, I'm using PGHoard: https://github.com/ohmu/pghoard


Unknown, but I don't have an immediate plan to implement them: https://news.ycombinator.com/item?id=15049527


Can you elaborate on how much automated testing is behind this? When it comes to backup tools, I am very cautious.


WAL-G has a number of unit tests, and has been tested manually in a staging environment for a number of weeks without issues. We are looking to implement more integration tests in the future.


What's the min version of the Postgres server this can be used with?


WAL-G uses non-exclusive backups, so it requires at least 9.6.


Are there plans to support exclusive backups / older PG versions ?


Not right now. Maybe use WAL-E until you upgrade?



