
> There are plenty of codecs that give you either obscene ratios or low CPU usage, but none that support both in combination.

I still find lbzip2 (a bzip2 reimplementation with better algorithms and multithreading support) quite competitive for highly compressible data. Here's a quick and unscientific test showing that lbzip2 (-9) is still both faster and has a better ratio than zstd (-12, with or without --long), while also using the least RAM (tmpfs, multithreaded compression on a 4-core Xeon E5-2603 v1):

    $ time lbzip2 -k linux-5.0.8.tar 
    real 24.21 user 84.87 sys 4.07 maxrss 105472

    $ time ~/zstd-1.4.0/zstd -T0 -k -12 linux-5.0.8.tar -o linux-5.0.8.tar.zst-12
    real 30.69 user 105.27 sys 0.61 maxrss 942416

    $ time ~/zstd-1.4.0/zstd -T0 -k -12 --long linux-5.0.8.tar -o linux-5.0.8.tar.zst-12-long
    real 31.28 user 107.90 sys 0.86 maxrss 1532432

    $ time xz -T0 -k -2 linux-5.0.8.tar
    real 34.40 user 123.59 sys 0.57 maxrss 410192

    $ stat -c '%s %n' linux-5.0.8.tar* | sort -n
    126382954 linux-5.0.8.tar.bz2
    126394210 linux-5.0.8.tar.zst-12-long
    128003669 linux-5.0.8.tar.zst-12
    131418488 linux-5.0.8.tar.xz
    863426560 linux-5.0.8.tar
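
To put those sizes in perspective, here is a minimal sketch that turns them into compression ratios. The byte counts are copied from the listing above; the `ratio` helper and the labels are just illustrative names. The gap between lbzip2 and zstd --long works out to roughly 11 kB, so on ratio those two are effectively tied.

```python
# Sizes in bytes, copied from the `stat` output above.
ORIGINAL = 863426560
sizes = {
    "lbzip2 -9": 126382954,
    "zstd -12 --long": 126394210,
    "zstd -12": 128003669,
    "xz -2": 131418488,
}


def ratio(compressed, original=ORIGINAL):
    """Compression ratio: original size divided by compressed size."""
    return original / compressed


for name, size in sizes.items():
    pct = 100 * size / ORIGINAL
    print(f"{name:16s} {ratio(size):.3f}x  ({pct:.2f}% of original)")

# Absolute difference between the two smallest outputs, in bytes.
gap = sizes["zstd -12 --long"] - sizes["lbzip2 -9"]
print(f"lbzip2 vs zstd --long: {gap} bytes apart")
```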

The only clear advantage zstd has is decompression speed:

    $ time xzcat -T0 linux-5.0.8.tar.xz >/dev/null
    real 17.25 user 17.06 sys 0.17 maxrss 17312

    $ time ~/zstd-1.4.0/zstd -dc -T0 linux-5.0.8.tar.zst-12 >/dev/null
    real 2.08 user 1.97 sys 0.08 maxrss 27088

    $ time ~/zstd-1.4.0/zstd -dc -T0 linux-5.0.8.tar.zst-12-long >/dev/null
    real 2.26 user 2.03 sys 0.17 maxrss 535360

    $ time lbzcat linux-5.0.8.tar.bz2 >/dev/null
    real 10.34 user 33.74 sys 3.53 maxrss 127088
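
As a rough illustration of how large that gap is, the sketch below converts the `real` times above into output throughput and speedup factors. It assumes 1 MB = 10^6 bytes and uses the uncompressed size from the `stat` listing; the helper names are made up for this example.

```python
# Uncompressed size in bytes (from the `stat` listing) and wall-clock
# decompression times in seconds (the `real` column above).
UNCOMPRESSED_BYTES = 863426560
real_seconds = {
    "xzcat": 17.25,
    "zstd -12": 2.08,
    "zstd -12 --long": 2.26,
    "lbzcat": 10.34,
}


def throughput_mb_s(seconds, size=UNCOMPRESSED_BYTES):
    """Decompressed output rate in MB/s (1 MB = 10**6 bytes)."""
    return size / seconds / 1e6


def speedup(slow, fast):
    """How many times faster `fast` finished than `slow` (wall clock)."""
    return real_seconds[slow] / real_seconds[fast]


# Print fastest first.
for name, secs in sorted(real_seconds.items(), key=lambda kv: kv[1]):
    print(f"{name:16s} {throughput_mb_s(secs):7.1f} MB/s")

print(f"zstd -12 vs lbzcat: {speedup('lbzcat', 'zstd -12'):.1f}x faster")
print(f"zstd -12 vs xzcat:  {speedup('xzcat', 'zstd -12'):.1f}x faster")
```

Note that lbzcat reached its time using several cores (33.74 s of user time against 10.34 s real), whereas zstd's user time is about equal to its real time.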


Given that decompression speed is typically the "user-facing" part in many contexts (with compression done by automated jobs, etc.), such a difference in decompression speed is pretty awesome indeed.

(which is also what makes it a near-perfect codec for HDF5, via blosc-hdf5)


I mostly use lbzip2 for day-to-day tasks too. But shouldn't you compare to pzstd to be fair? It's not very surprising that running 4 threads is faster than 1 thread in wall-clock time.

EDIT: no, I was wrong, `zstd -T0` is basically the same as `pzstd`.
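
The compression timings posted above are consistent with this. A quick sketch, dividing CPU time (user + sys) by wall-clock time to estimate how many cores each run kept busy on average; the numbers are copied from the benchmark, and `avg_cores` is just an illustrative helper:

```python
# (real, user, sys) in seconds, copied from the compression runs above.
runs = {
    "lbzip2": (24.21, 84.87, 4.07),
    "zstd -T0 -12": (30.69, 105.27, 0.61),
    "zstd -T0 -12 --long": (31.28, 107.90, 0.86),
    "xz -T0 -2": (34.40, 123.59, 0.57),
}


def avg_cores(real, user, sys):
    """Average number of busy cores: total CPU time over wall-clock time."""
    return (user + sys) / real


for name, (real, user, sys) in runs.items():
    print(f"{name:20s} ~{avg_cores(real, user, sys):.2f} cores busy")
```

Every run lands between 3 and 4 on the 4-core machine, so all three tools were compressing in parallel and the wall-clock comparison is not one thread against four.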


I'll just add that I tried it and pzstd -12 is still significantly slower than lbzip2 -9 on my machine, with approximately the same compression ratio for linux-5.0.8.tar.

EDIT: no surprise, as -T0 also enables multithreading.



