Cryptsetup 2.0.0 introduces new on-disk LUKS2 format (kernel.org)
100 points by tyngde on Dec 25, 2017 | hide | past | favorite | 35 comments


I'm surprised to see so many comments here dumping on the use of JSON. Have we really forgotten the Unix philosophy?

> Write programs to handle text streams, because that is a universal interface.

A lot of the data formats I've written recently have used JSON. Partly because of that aforementioned principle, but mostly because it's just easier and there are very few downsides. Almost every language has support for it, so it's fairly universal; it's self-documenting, easy to debug, easy to manipulate by hand, and easy to maintain.

Put more simply: I've implemented several data formats using custom binary and several data formats using JSON. JSON was easier and faster every single time.

My recommendation for data formats: just use JSON; unless you have a really good reason not to.

And I do mean really good reason. For example, many might think concerns about data size would be a reason not to use JSON. But go ahead and run some JSON through a compression algorithm some time. You'll be amazed. Compressed JSON is very competitive versus a custom binary format.
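The compression claim is easy to check yourself. A quick stdlib-only sketch (the record layout here is made up purely to generate some realistic, repetitive JSON):

```python
import json
import random
import zlib

# Hypothetical records, just to produce JSON with repeated keys and structure.
random.seed(0)
records = [{"id": i, "name": f"user{i}", "active": bool(i % 2),
            "score": random.random()} for i in range(1000)]

raw = json.dumps(records).encode("utf-8")
compressed = zlib.compress(raw)
print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes")
```

Because the keys and punctuation repeat on every record, the compressor removes most of JSON's verbosity, which is why compressed JSON tends to land in the same ballpark as a hand-rolled binary format.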


> several data formats using custom binary and several data formats using JSON

You may not realize it, but you've presented a false dichotomy here.

The alternative to JSON isn't "custom binary"; it's standardized binary encodings that allow the same flexibility as JSON (or more). Some examples include CBOR and msgpack.

>> Write programs to handle text streams, because that is a universal interface.

cryptsetup manipulates block devices. I don't think that line can or should really be applied to something that is fairly similar to a filesystem's on-disk format.
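To give a sense of what a standardized binary encoding buys you, here is a toy subset of the msgpack encoding, hand-rolled to stay dependency-free (a real program would use a library; this only covers small maps, short strings, and small non-negative ints):

```python
import json

def msgpack_encode(obj):
    # Minimal msgpack subset: fixmap (0x80|n), fixstr (0xa0|len),
    # and positive fixint (0x00-0x7f). Everything else is unsupported.
    if isinstance(obj, dict):
        assert len(obj) < 16
        out = bytes([0x80 | len(obj)])
        for k, v in obj.items():
            out += msgpack_encode(k) + msgpack_encode(v)
        return out
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        assert len(data) < 32
        return bytes([0xA0 | len(data)]) + data
    if isinstance(obj, int) and 0 <= obj < 128:
        return bytes([obj])
    raise TypeError(f"unsupported: {obj!r}")

meta = {"version": 2, "type": "luks2"}
packed = msgpack_encode(meta)
print(len(packed), len(json.dumps(meta)))  # → 21 31
```

Same data model as JSON, but with explicit length prefixes instead of delimiters, so a parser never has to scan for closing quotes or braces.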


Doesn't solve my most important problem with LUKS: Allowing Trim passthrough on SSDs without impacting vulnerability.

Also, why use JSON for metadata instead of simple C structs? No human is supposed to read this kind of data anyway.


> Allowing Trim passthrough on SSDs without impacting vulnerability.

Unfortunately I don’t think that’s possible. If I scan your encrypted disk and see that 10% of blocks are zeroed out, I can assume that your disk is 90% full. So information has been leaked, which is arguably a vulnerability.
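The leak is simple to demonstrate. On many SSDs a trimmed LBA reads back as zeros, so anyone who can scan the raw device can estimate how full the volume is (a sketch; block size and the simulated disk layout are made up):

```python
# Estimate fill level from zero-reading (trimmed) blocks on a raw device image.
BLOCK = 4096

def estimate_fill(device_bytes: bytes) -> float:
    zero = bytes(BLOCK)
    total = len(device_bytes) // BLOCK
    zeroed = sum(1 for i in range(total)
                 if device_bytes[i * BLOCK:(i + 1) * BLOCK] == zero)
    return 1 - zeroed / total

# Simulated device: 90 "encrypted" (non-zero) blocks, 10 trimmed ones.
disk = b"\xab" * (90 * BLOCK) + b"\x00" * (10 * BLOCK)
print(estimate_fill(disk))  # → 0.9
```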


This is an engineering problem where tradeoffs have to be made. E.g. you could make it appear that the disk is about 90% full all the time if you queue up the trim commands in some deterministic way; that way an observer could only determine whether the disk is more or less than about 90% full at any given time. I think good engineers would come up with something even better.

This would be at the level of a nice CS Bachelor's/Master's thesis.
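One possible policy along those lines, as a toy sketch (the class, threshold, and batching rule are all hypothetical, just to make the tradeoff concrete): hold TRIMs back and only release them in large batches, so an observer learns at most which side of a coarse threshold the free space is on.

```python
class TrimQueue:
    """Toy sketch: batch TRIM commands so they only reach the device
    once pending free space crosses a fixed fraction of the disk."""

    def __init__(self, total_blocks: int, threshold: float = 0.10):
        self.pending: set[int] = set()
        self.total = total_blocks
        self.threshold = threshold

    def trim(self, block: int) -> set[int]:
        self.pending.add(block)
        if len(self.pending) / self.total >= self.threshold:
            released, self.pending = self.pending, set()
            return released  # pass this batch of TRIMs to the device
        return set()         # hold back; nothing visible to an observer

tq = TrimQueue(total_blocks=100)
for b in range(9):
    assert tq.trim(b) == set()      # first 9 trims stay queued
print(len(tq.trim(9)))              # → 10 (the 10th releases the batch)
```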


I think TRIM is dangerous not because of information leaks, but because of the risk of IV/nonce reuse when SSD data blocks are unmapped but not cleared. This would pose a risk if someone dumps the raw content of the NAND chips and finds two or more data blocks encrypted with the same IV/nonce.


Why does trim make a difference though? You're not going to scan the whole disk on each write for duplicates, so you need to guarantee statistically-unique nonces either way, or make sure reuse doesn't matter. Trim doesn't make this any worse/better.


After some time, I think I get it. If the key/IV is location-specific, trim may result in an abandoned block which is then recreated somewhere else. This results in two blocks from the same logical location sitting in two different flash locations. Unless I misunderstand something, XTS mode encryption uses a location-based tweak.
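That's the core of it: in XTS the per-sector tweak is derived from the logical sector number, so a stale pre-TRIM flash copy and a rewritten copy of the same logical sector share the tweak. An illustration (not real XTS, just the tweak derivation):

```python
def sector_tweak(sector_number: int) -> bytes:
    # XTS derives its tweak from the logical sector number
    # (here encoded as 16 bytes little-endian).
    return sector_number.to_bytes(16, "little")

old_copy = sector_tweak(42)  # stale flash page left behind after TRIM
new_copy = sector_tweak(42)  # same logical sector, rewritten elsewhere
assert old_copy == new_copy  # both ciphertexts used the same tweak
```

So dumping the raw NAND can yield two ciphertexts for the same logical sector under the same tweak, which is exactly the reuse the parent comment worries about.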


Couldn't this be an opt-in feature? For some people, this might be a reasonable trade-off given their threat model.


Fedora decided with release 27 to enable trim by default on newly created encrypted devices. I was not happy about this decision and tried to make some noise, but nobody really seemed to care. There is a reason it is disabled by default in the Linux kernel, and making this kind of decision on behalf of users without any input from the community is pretty fucked up.

Personally, I would like to explore the idea of a secure enclave that keeps a map of which blocks are in use that gets referred to during write operations. This seems like a problem that is going to need to be solved with hardware.


Yes, it can be, and it's already present.


Oops, probably should have checked before asking


Likely for easier future extensions to the metadata without requiring prior agreement among all implementations about how things are laid out.


Is there a good writeup on why this is such a horrific vulnerability? I see a lot of doom and gloom about this without a lot of explanations.


Plausible deniability. For example, from the trimmed sectors you can infer that the disk is being used.


Is there really plausible deniability about use of a hard drive that isn't just all zeros (as, AFAIK, they come from the factory)?

I mean, I guess it depends on what burden of proof is needed in the case. Certainly if it were balance of probabilities, I'd be surprised if a disk that merely looks like it has been written to were enough to defeat plausible deniability.


Is anyone aware of any independent formal security audits of this or any prior version of LUKS?


Section regarding the format:

    LUKS2 format and features
    ~~~~~~~~~~~~~~~~~~~~~~~~~
    The LUKS2 is an on-disk storage format designed to provide simple key
    management, primarily intended for Full Disk Encryption based on dm-crypt.

    The LUKS2 is inspired by LUKS1 format and in some specific situations (most
    of the default configurations) can be converted in-place from LUKS1.
 
    The LUKS2 format is designed to allow future updates of various
    parts without the need to modify binary structures and internally
    uses JSON text format for metadata. Compilation now requires the json-c library
    that is used for JSON data processing.
 
    On-disk format provides redundancy of metadata, detection
    of metadata corruption and automatic repair from metadata copy.
 
    NOTE: For security reasons, there is no redundancy in keyslots binary data
    (encrypted keys) but the format allows adding such a feature in future.
 
    NOTE: to operate correctly, LUKS2 requires locking of metadata.
    Locking is performed by using flock() system call for images in file
    and for block device by using a specific lock file in /run/lock/cryptsetup.
 
    This directory must be created by distribution (do not rely on internal
    fallback). For systemd-based distribution, you can simply install
    scripts/cryptsetup.conf into tmpfiles.d directory.
 
    For more details see LUKS2-format.txt and LUKS2-locking.txt in the docs
    directory. (Please note this is just overview, there will be more formal
    documentation later.)


> JSON text format for metadata

It seems like it would be better to use an easier-to-parse binary format that allows the same flexibility that JSON does, like CBOR, msgpack, or BSON.

Using JSON here is a very strange choice.


To me, the worst part is that it introduces a library dependency, which itself can introduce security issues unless they intend to audit it. I don't understand this choice at all.


> Using JSON here is a very strange choice.

I agree! Perhaps the userspace administrative tool (which links json-c) parses it and converts it to something more like an ioctl struct for the kernel.
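That split is plausible and easy to picture (a hypothetical sketch; the JSON keys and the packed layout are invented for illustration, and dm-crypt actually receives a text mapping-table line rather than a struct):

```python
import json
import struct

# Userspace parses the flexible JSON metadata...
meta = json.loads('{"keyslots": {"0": {"key_size": 64, "priority": 1}}}')
slot = meta["keyslots"]["0"]

# ...and hands the kernel only a rigid, fixed-size binary record.
blob = struct.pack("<II", slot["key_size"], slot["priority"])
print(len(blob))  # → 8
```

So the JSON parser only ever runs in the administrative tool, never in the kernel.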


AFAIK the kernel only contains dm-crypt. The whole key handling takes place in userspace, in cryptsetup(1).

(If you want to have a bad day, you can send a patch to Linus to put a JSON parser in the kernel.)


BSON, despite its name, is really just a format for MongoDB rather than a binary alternative for JSON.


Why can't it be both?


I see no reason why it can't be both; it's just that BSON happens to be a poor format for general interchange, and if you went back and redesigned it you could probably fix many of its flaws. I'm a little skeptical, though; I would rather just throw it out.

It's kind of shitty that it was called BSON, because it falls so far short of the generality that JSON has.


At least it's not XML.


> internally uses JSON text format for metadata.

What the hell? Why would they do this? The only (tenuous) justification for using a parsed human-oriented text format in any protocol is so that humans can edit things by hand, but this will presumably not be the case for file metadata.

I don’t want to read too much into this since there might be a sane explanation, but this seriously makes me question the design of this security-critical system.


JSON is a language agnostic format that is widely supported, so one could read/parse the metadata from virtually any language. This is much more flexible and powerful than a specific binary format. It may also make debugging a lot easier since it is human readable.

I don't know if that's the reason they chose JSON, but that's what I would think about if it were me.


> so one could read/parse the metadata from virtually any language.

This isn’t a use case for file system metadata. It only makes sense internally anyway. People interact with filesystem metadata through standardized APIs, not metadata dumps.

> It may also make debugging a lot easier since it is human readable.

It’s also basically guaranteed to introduce a ton of bugs; parsing and generating JSON is orders of magnitude more complicated than generating and reading an unambiguous tagged binary format.


JSON seems like a weird choice for a kernel-parsed data structure. Or perhaps it's just parsed in userspace and signaled to the kernel in a more direct format.


This isn't without precedent. IIRC the LVM (Logical Volume Manager) metadata is also mostly in a text format:

https://github.com/libyal/libvslvm/blob/master/documentation...


That doesn't really matter as the program parsing it is probably running with full privileges.


I'm pretty sure this JSON is not parsed by kernel.



As noted in the release notes, do not use this for production systems (for now) unless you have a backup of your data.



