I'm surprised to see so many comments here dumping on the use of JSON. Have we really forgotten the Unix philosophy?
> Write programs to handle text streams, because that is a universal interface.
A lot of the data formats I've written recently have used JSON, partly because of that aforementioned principle, but mostly because it's just easier and there are very few downsides. Almost every language has support for it, so it's fairly universal; it's self-documenting, easy to debug, easy to manipulate by hand, and easy to maintain.
Put more simply: I've implemented several data formats using custom binary and several data formats using JSON. JSON was easier and faster every single time.
My recommendation for data formats: just use JSON, unless you have a really good reason not to.
And I do mean a really good reason. For example, many might think concerns about data size would be a reason not to use JSON. But go ahead and run some JSON through a compression algorithm some time; you'll be amazed. Compressed JSON is very competitive with a custom binary format.
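If you want to sanity-check that claim, here's a minimal sketch; the record shape is invented for illustration, but any repetitive record-like JSON shows the same effect:

    # Rough comparison of raw vs. gzip-compressed JSON for repetitive,
    # record-like data. Field names here are made up; the repeated keys
    # are exactly what general-purpose compressors eat for breakfast.
    import gzip
    import json

    records = [
        {"id": i, "name": f"user-{i}", "active": i % 2 == 0, "score": i * 0.5}
        for i in range(10_000)
    ]

    raw = json.dumps(records).encode("utf-8")
    packed = gzip.compress(raw)

    print(f"raw JSON: {len(raw)} bytes")
    print(f"gzipped:  {len(packed)} bytes "
          f"({100 * len(packed) / len(raw):.1f}% of raw)")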
> several data formats using custom binary and several data formats using JSON
You may not realize it, but you've presented a false dichotomy here.
The alternative to JSON isn't "custom binary"; it's standardized binary encodings that offer the same flexibility as JSON (or more). Some examples include CBOR and MessagePack.
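For a concrete comparison, here's a small sketch encoding one record three ways; it assumes the third-party msgpack and cbor2 packages are installed (pip install msgpack cbor2), and the record itself is invented:

    # Same record under three encodings; both binary formats are
    # standardized (CBOR is RFC 8949) and share JSON's data model.
    import json
    import cbor2
    import msgpack

    record = {"slot": 3, "type": "luks2", "offset": 32768, "enabled": True}

    as_json = json.dumps(record).encode("utf-8")
    as_cbor = cbor2.dumps(record)
    as_msgpack = msgpack.packb(record)

    for name, blob in [("JSON", as_json), ("CBOR", as_cbor),
                       ("msgpack", as_msgpack)]:
        print(f"{name:>8}: {len(blob):>3} bytes  {blob!r}")

CBOR in particular also adds things JSON lacks, such as native binary strings and semantic tags.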
>> Write programs to handle text streams, because that is a universal interface.
cryptsetup manipulates block devices. I don't think that line can or should really be applied to something that is fairly similar to a filesystem's on-disk format.
> Allowing TRIM passthrough on SSDs without impacting vulnerability.
Unfortunately I don’t think that’s possible. If I scan your encrypted disk and see that 10% of blocks are zeroed out, I can assume that your disk is 90% full. So information has been leaked, which is arguably a vulnerability.
This is an engineering problem where tradeoffs have to be made. E.g., you could make it appear that the disk is about 90% full all the time by queuing up the trim commands in some deterministic way; then an observer could only determine whether the disk is more or less than about 90% full at any given time (see the sketch below). I think good engineers would come up with something even better.
This would be on the level of a nice CS Bachelor's/Master's thesis.
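To make the idea concrete, here's a toy sketch of such a deterministic trim queue; everything in it (names, batch size, callback) is invented, and a real implementation would live in the kernel or device firmware:

    # Hold TRIM commands back and release them only in fixed-size,
    # deterministically ordered batches, so an observer of the raw
    # device sees free space only in coarse steps.
    BATCH = 4096  # release trims only once this many blocks are pending

    class TrimQueue:
        def __init__(self, issue_trim):
            self.pending = []             # blocks the filesystem has freed
            self.issue_trim = issue_trim  # callback that trims one block

        def trim(self, block: int) -> None:
            self.pending.append(block)
            while len(self.pending) >= BATCH:
                batch = sorted(self.pending[:BATCH])  # deterministic order
                del self.pending[:BATCH]
                for b in batch:
                    self.issue_trim(b)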
I think TRIM is dangerous not because of information leaks, but because of the risk of IV/nonce reuse when SSD data blocks are unmapped but not cleared. This would pose a risk if someone dumps the raw content of the NAND chips and finds two or more data blocks encrypted with the same IV/nonce.
Why does trim make a difference though? You're not going to scan the whole disk on each write for duplicates, so you need to guarantee statistically-unique nonces either way, or make sure reuse doesn't matter. Trim doesn't make this any worse/better.
After some time, I think I get it. If the key/IV is location-specific, trim may result in an abandoned block which is then recreated somewhere else. This results in two copies of the same logical block in two different flash locations. Unless I misunderstand something, XTS mode encryption uses location-based tweaks.
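Here's a minimal sketch of why that matters, using the cryptography package (pip install cryptography). The tweak construction mirrors dm-crypt's aes-xts-plain64 convention (little-endian 64-bit sector number, zero-padded to 16 bytes); the rest is illustrative:

    # Under XTS the tweak comes from the *logical* sector number, so
    # re-encrypting a logical sector always uses the same tweak. Two
    # physical NAND copies of one logical sector (pre-trim and post-
    # remap) can therefore be compared block for block.
    import os
    import struct
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key = os.urandom(64)  # AES-256-XTS uses a 512-bit (double-length) key

    def encrypt_sector(sector_no: int, data: bytes) -> bytes:
        tweak = struct.pack("<Q", sector_no) + b"\x00" * 8  # plain64-style IV
        enc = Cipher(algorithms.AES(key), modes.XTS(tweak)).encryptor()
        return enc.update(data) + enc.finalize()

    sector = b"A" * 512
    old_copy = encrypt_sector(1000, sector)  # before trim/remap
    new_copy = encrypt_sector(1000, sector)  # same logical sector, new cell

    # Identical ciphertexts: an attacker dumping raw NAND learns which
    # 16-byte blocks of the sector changed between the two copies.
    assert old_copy == new_copy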
With release 27, Fedora decided to enable TRIM by default on newly created encrypted devices. I was not happy about this decision and tried to make some noise, but nobody really seemed to care. There is a reason it is disabled by default in the Linux kernel, and making this kind of decision on behalf of the users without any input from the community is pretty fucked up.
Personally, I would like to explore the idea of a secure enclave that keeps a map of which blocks are in use that gets referred to during write operations. This seems like a problem that is going to need to be solved with hardware.
Is there really plausible deniability about use of a hard drive that isn't just all zeros (as, AFAIK, they come from the factory)?
I mean, I guess it depends on what burden of proof is needed in the case. Certainly if it were balance of probabilities, I'd be surprised if a disk that merely looks like it has been written to were sufficient to make denial implausible.
LUKS2 format and features
~~~~~~~~~~~~~~~~~~~~~~~~~
LUKS2 is an on-disk storage format designed to provide simple key
management, primarily intended for Full Disk Encryption based on dm-crypt.
LUKS2 is inspired by the LUKS1 format and in some specific situations (most
of the default configurations) can be converted in-place from LUKS1.
The LUKS2 format is designed to allow future updates of various
parts without the need to modify binary structures and internally
uses JSON text format for metadata. Compilation now requires the json-c library
that is used for JSON data processing.
The on-disk format provides metadata redundancy, detection of metadata
corruption, and automatic repair from the metadata copy.
NOTE: For security reasons, there is no redundancy in the keyslots' binary
data (encrypted keys), but the format allows adding such a feature in the
future.
NOTE: To operate correctly, LUKS2 requires metadata locking.
Locking is performed with the flock() system call for file-backed images,
and for block devices with a per-device lock file in /run/lock/cryptsetup.
This directory must be created by the distribution (do not rely on the
internal fallback). For systemd-based distributions, you can simply install
scripts/cryptsetup.conf into the tmpfiles.d directory.
For more details see LUKS2-format.txt and LUKS2-locking.txt in the docs
directory. (Please note this is just an overview; more formal
documentation will come later.)
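A rough sketch of the two locking strategies those notes describe, in Python for illustration only (the real implementation is C inside libcryptsetup, and the lock-file naming below is assumed):

    # flock()-based metadata locking: lock the image file itself for
    # file-backed images, or a per-device lock file under
    # /run/lock/cryptsetup for block devices.
    import fcntl
    import os

    def lock_image_file(path):
        """File-backed image: flock() the image itself."""
        fd = os.open(path, os.O_RDWR)
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until exclusive lock granted
        return fd  # keep open; closing the fd releases the lock

    def lock_block_device(devname):
        """Block device: flock() a lock file (naming here is assumed)."""
        lockfile = f"/run/lock/cryptsetup/L_{devname}"
        fd = os.open(lockfile, os.O_RDWR | os.O_CREAT, 0o600)
        fcntl.flock(fd, fcntl.LOCK_EX)
        return fd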
To me, the worst part is that it introduces a library dependency, which itself can introduce security issues unless they intend to audit it. I don't understand this choice at all.
I agree! Perhaps the userspace administrative tool (which links json-c) parses it and converts it to something more like an ioctl struct for the kernel.
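For what it's worth, that's essentially how dm-crypt works: the kernel never sees the LUKS metadata; userspace parses it and hands device-mapper a flat, whitespace-separated table line. A sketch of what such a line looks like (values below are placeholders, but the field layout follows the documented dm-crypt table format):

    # dm-crypt "crypt" target table line, as passed via DM ioctls:
    #   <start> <size> crypt <cipher> <key> <iv_offset> <device> <offset>
    start_sector = 0
    size_sectors = 2097152          # 1 GiB in 512-byte sectors
    cipher = "aes-xts-plain64"
    key_hex = "00" * 64             # placeholder volume key; never log real keys
    table = f"{start_sector} {size_sectors} crypt {cipher} {key_hex} 0 /dev/sda2 4096"
    print(table)
    # Roughly equivalent to: dmsetup create mydev --table "<table>"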
I see no reason why it can't be both; it's just that BSON happens to be a poor format for general interchange, and if you went back and redesigned it you could probably fix many of its flaws. I'm a little skeptical, though; I would rather just throw it out.
It's kind of shitty that it was called BSON, because it falls so far short of the generality that JSON has.
What the hell? Why would they do this? The only (tenuous) justification for using a parsed human-oriented text format in any protocol is so that humans can edit things by hand, but this will presumably not be the case for file metadata.
I don’t want to read too much into this since there might be a sane explanation, but this seriously makes me question the design of this security-critical system.
JSON is a language agnostic format that is widely supported, so one could read/parse the metadata from virtually any language. This is much more flexible and powerful than a specific binary format. It may also make debugging a lot easier since it is human readable.
I don't know if that's the reason they chose JSON, but that's what I would be thinking about if it were me.
> so one could read/parse the metadata from virtually any language.
This isn’t a use case for file system metadata. It only makes sense internally anyway. People interact with filesystem metadata through standardized APIs, not metadata dumps.
> It may also make debugging a lot easier since it is human readable.
It’s also basically guaranteed to introduce a ton of bugs; parsing and generating JSON is orders of magnitude more complicated than generating and reading an unambiguous tagged binary format.
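For a sense of the difference, here's a complete reader for a tiny tag/length/value format; the format itself (1-byte tag, 4-byte big-endian length, payload) is invented for illustration, but it shows how little surface area such parsers have: no quoting, escaping, number grammar, or Unicode handling to get wrong.

    import struct

    def read_tlv(buf: bytes):
        """Yield (tag, payload) records from a tag/length/value buffer."""
        off = 0
        while off < len(buf):
            tag, length = struct.unpack_from(">BI", buf, off)
            off += 5
            yield tag, buf[off:off + length]
            off += length

    blob = (struct.pack(">BI", 1, 5) + b"hello" +
            struct.pack(">BI", 2, 2) + b"hi")
    print(list(read_tlv(blob)))  # [(1, b'hello'), (2, b'hi')]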
JSON seems like a weird choice for a kernel-parsed data structure. Or perhaps it's just parsed in userspace and signaled to the kernel in a more direct format.