> There are plenty of codecs that give you either obscene ratios or low CPU usage, but none that support both in combination.
I still find lbzip2 (which is a bzip2 reimplementation with better algorithms and support for multithreading) quite competitive for highly compressible data. Here's a quick and unscientific test showing that lbzip2 (-9) is still both faster and has a better ratio than zstd (-12, --long or not) while also using the least RAM (tmpfs, multithreaded compression on a 4-core Xeon E5-2603 v1):
$ time lbzip2 -k linux-5.0.8.tar
real 24.21 user 84.87 sys 4.07 maxrss 105472
$ time ~/zstd-1.4.0/zstd -T0 -k -12 linux-5.0.8.tar -o linux-5.0.8.tar.zst-12
real 30.69 user 105.27 sys 0.61 maxrss 942416
$ time ~/zstd-1.4.0/zstd -T0 -k -12 --long linux-5.0.8.tar -o linux-5.0.8.tar.zst-12-long
real 31.28 user 107.90 sys 0.86 maxrss 1532432
$ time xz -T0 -k -2 linux-5.0.8.tar
real 34.40 user 123.59 sys 0.57 maxrss 410192
$ stat -c '%s %n' linux-5.0.8.tar* | sort -n
126382954 linux-5.0.8.tar.bz2
126394210 linux-5.0.8.tar.zst-12-long
128003669 linux-5.0.8.tar.zst-12
131418488 linux-5.0.8.tar.xz
863426560 linux-5.0.8.tar
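To make the size differences easier to compare, here is a small sketch that turns the byte counts from the stat output above into compression ratios. The sizes are copied from the listing; the labels and the `ratio` helper are just illustrative names, not part of any of the tools.

```python
# Sizes in bytes, copied from the stat output above.
ORIGINAL = 863426560
compressed = {
    "lbzip2 -9": 126382954,
    "zstd -12 --long": 126394210,
    "zstd -12": 128003669,
    "xz -2": 131418488,
}


def ratio(original, size):
    """Compression ratio: uncompressed bytes divided by compressed bytes."""
    return original / size


# Print smallest output first, with the ratio and the share of the original size.
for name, size in sorted(compressed.items(), key=lambda item: item[1]):
    share = 100 * size / ORIGINAL
    print(f"{name:16s} ratio {ratio(ORIGINAL, size):.3f}  ({share:.2f}% of original)")
```

On this one tarball all four results land within about 4% of each other in size, so the timing and memory columns probably matter more than the ratio.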
The only clear advantage zstd has is decompression speed:
$ time xzcat -T0 linux-5.0.8.tar.xz >/dev/null
real 17.25 user 17.06 sys 0.17 maxrss 17312
$ time ~/zstd-1.4.0/zstd -dc -T0 linux-5.0.8.tar.zst-12 >/dev/null
real 2.08 user 1.97 sys 0.08 maxrss 27088
$ time ~/zstd-1.4.0/zstd -dc -T0 linux-5.0.8.tar.zst-12-long >/dev/null
real 2.26 user 2.03 sys 0.17 maxrss 535360
$ time lbzcat linux-5.0.8.tar.bz2 >/dev/null
real 10.34 user 33.74 sys 3.53 maxrss 127088
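As a rough illustration of the gap, the wall-clock ("real") times above can be converted into decompression throughput, measured in megabytes of uncompressed output per second. The numbers come from the timings in this comment; the helper name and labels are made up for the example, and this is wall-clock throughput only, so it does not account for how many cores each tool kept busy.

```python
# Uncompressed size of linux-5.0.8.tar in bytes, and the "real" (wall-clock)
# decompression times in seconds, both copied from the output above.
ORIGINAL_BYTES = 863426560
decompress_real_seconds = {
    "xzcat": 17.25,
    "zstd -12": 2.08,
    "zstd -12 --long": 2.26,
    "lbzcat": 10.34,
}


def throughput_mb_s(nbytes, seconds):
    """Uncompressed output rate in MB/s (1 MB = 10**6 bytes)."""
    return nbytes / seconds / 1e6


# Fastest decompressor first.
for name, seconds in sorted(decompress_real_seconds.items(), key=lambda item: item[1]):
    print(f"{name:16s} {throughput_mb_s(ORIGINAL_BYTES, seconds):7.1f} MB/s")
```

Note that the user times suggest lbzcat needed several cores to reach its figure, while zstd's real and user times are nearly equal, so the per-core difference is larger than the wall-clock one.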
Given that decompression speed is typically the "user-facing" part in many contexts (with compression being done by automatic jobs etc.), such a difference in decompression speed is pretty awesome indeed.
(which is also what makes it a near-perfect codec for HDF5, via blosc-hdf5)
I mostly use lbzip2 for day-to-day tasks too. But shouldn't you compare to pzstd to be fair? It's not very surprising that running 4 threads is faster than 1 thread in wall-clock time.
EDIT: no, I was wrong, `zstd -T0` is basically the same as `pzstd`.
I'll just add that I tried it and pzstd -12 is still significantly slower than lbzip2 -9 on my machine, with approximately the same compression ratio for linux-5.0.8.tar.
EDIT: no surprise, as -T0 also enables multithreading.