Key Management

From FedoraProject

Miscellaneous notes

For background information about the specifics of dm-crypt, LUKS, and LVM crypt, see Disk encryption formats.

Smart card and TPM support is mostly unrelated to any of the notes below; perhaps we can use them to encrypt locally stored keys (e.g. for hard disk encryption, or for the locally-held disk image keys).

Please ignore my time estimates, and use your own judgment. I have been very bad at estimates so far (estimating too little, of course)—I'm using this as an opportunity to practice.

Change log

Disk Encryption Key Escrow

The goal is to allow recovering from, e.g., a lost password for an encrypted drive of a company computer, by storing the necessary data about the encryption "centrally" and allowing authorized people to use it to access the encrypted data.

The threat model assumes attacks against the encrypted drives by unauthorized users, and attacks against the central data storage. Users deleting or corrupting data they have legitimate access to are out of scope (in the typical case, the user can overwrite their own hard drive or smash it with a hammer).

In this section, a user is someone who legitimately has access to a passphrase or key that allows using a volume; recovery is the operation by which an administrator restores a user's access.

Features

If the volume format uses separate data encryption keys and key encryption keys, the system should store raw data encryption keys rather than passphrases that allow access to the data encryption keys (by deriving a key encryption key from the passphrase). Unlike a passphrase, a data encryption key can be used to help with data restoration if the area of the disk that stores password-protected keys is corrupted. Further, if the data encryption key is escrowed, the escrow does not need to be updated if the user wishes to use a different passphrase. Finally, the LUKS encrypted volume format only supports a limited number of passphrase slots, which could be used up by legitimate users, leaving no slot for the recovery passphrase.
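The reason an escrowed data encryption key survives a passphrase change can be shown with a toy sketch. This is only an illustration, not how LUKS stores keys: the "wrapping" here is a plain XOR with a PBKDF2-derived key encryption key, chosen so that the example needs nothing beyond the Python standard library, and all function names (derive_kek, make_slot, open_slot) are hypothetical.

```python
import hashlib
import secrets

def derive_kek(passphrase: str, salt: bytes, length: int = 32) -> bytes:
    """Derive a key encryption key (KEK) from a passphrase using PBKDF2."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, 10_000, dklen=length
    )

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_slot(dek: bytes, passphrase: str) -> dict:
    """Toy 'key slot': the DEK wrapped under a passphrase-derived KEK.

    XOR stands in for a real key-wrapping cipher (assumption for brevity).
    """
    salt = secrets.token_bytes(16)
    kek = derive_kek(passphrase, salt, len(dek))
    return {"salt": salt, "wrapped": xor_bytes(dek, kek)}

def open_slot(slot: dict, passphrase: str) -> bytes:
    """Recover the DEK from a slot, given the passphrase."""
    kek = derive_kek(passphrase, slot["salt"], len(slot["wrapped"]))
    return xor_bytes(slot["wrapped"], kek)

dek = secrets.token_bytes(32)      # data encryption key of the volume
escrowed_dek = dek                 # what the escrow would hold

slot = make_slot(dek, "old passphrase")
# The user changes the passphrase: only the slot is rewritten, the DEK is not.
slot = make_slot(open_slot(slot, "old passphrase"), "new passphrase")
```

After the passphrase change, the escrowed key still matches what the new slot unlocks, so the escrow needs no update; an escrowed passphrase would have become stale.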

Storing data encryption keys rather than passphrases means that it must be possible to boot the computer into a rescue mode without knowing the passphrase for the root volume. To simplify handling of remote users that forgot their passwords, an additional random LUKS password can be set up and escrowed together with the data encryption key; the random password is revealed to the user (perhaps letting the user change the password), and the data encryption key is then used to generate a new random password.

The system should allow on-line as well as off-line recovery data transfer: on-line is necessary for seamless system management (e.g. automated set up of many computers, server-initiated key changes), off-line is useful for cases when the company network is not available (or difficult to set up, e.g. during installation or recovery), or to allow recovery for remote employees (by sending recovery data over e-mail and the password necessary to use it over phone).

Off-line mode can also be used without any management server, allowing e.g. individuals to print the recovery data and store it in a safe; this will increase the number of potential users and contributors in the Linux community.

Client functionality

Using the passphrase, get the data encryption key, create an escrow packet and send it to the server or store it in a file.
Optionally generate and add a random volume passphrase, and store it in the escrow packet as well.
A combined "create a volume and escrow packet" operation is not really necessary - a script that supplies the volume passphrase twice is easy to write.
Extract the data encryption key from the packet (asking for a packet passphrase if necessary), use it to add/replace a volume passphrase.
A variant of this generates and sets up a random volume passphrase, and stores it in another escrow packet.
Extract the data encryption key from the packet (asking for a packet passphrase if necessary), use it to set up or mount the volume. This is intended to replace the usual low-level tools that ask for the volume passphrase, e.g. cryptsetup; the volume would be mounted using mount as usual.
(Requires underlying volume support for on-line re-encryption; this is planned for LVM.) Generate a new data encryption key, create an escrow packet for it, then start on-line re-encryption. If this operation is initiated by a management server, the server can then poll for re-encryption completion.
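The optional random volume passphrase mentioned in the list above could be generated along these lines. This is a minimal sketch; the alphabet, the default length and the grouping with dashes are assumptions made here for readability (e.g. when dictating the passphrase over the phone), not something the notes prescribe.

```python
import math
import secrets
import string

# Assumed alphabet: 36 symbols that are easy to read out loud.
ALPHABET = string.ascii_lowercase + string.digits

def random_passphrase(length: int = 20, group: int = 5) -> str:
    """Return a random passphrase, split into dash-separated groups."""
    chars = [secrets.choice(ALPHABET) for _ in range(length)]
    groups = ["".join(chars[i:i + group]) for i in range(0, length, group)]
    return "-".join(groups)

def entropy_bits(length: int = 20) -> float:
    """Entropy of a passphrase of the given length over ALPHABET, in bits."""
    return length * math.log2(len(ALPHABET))

passphrase = random_passphrase()
```

With these defaults the passphrase carries a little over 100 bits of entropy; the dashes are only a reading aid and add nothing to it.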

Server functionality

The internal storage should be encrypted (using a server's master key).
Must support both on-line (client connection) and off-line (upload by administrator) operation.
On-line clients should have the ability to mark an escrow packet for a volume "obsolete" if a newer packet is stored on the server.
Limit the number of packets per client to, say, 1000, to avoid a denial of service attack.
Eventually we can support key splitting (storing parts of the volume data encryption keys on separate servers, requiring N of the servers to supply their part to recover the volume data encryption key); an attacker would have to compromise the master keys of N servers in order to access any volume data encryption key.[1][2]
The returned escrow packet must be encrypted (using a one-time password, or the client's machine private key). A one-time password mechanism must be supported, because the machine private key is not available when attempting to recover the root partition.
Must support off-line operation (make a packet available for download). On-line operation (client asking for recovery data directly) is probably not necessary.
The operation should require manual administrator intervention (if the recovery is not initiated by an administrator, it should require administrator approval). A possibility is to encrypt escrow packets with a public master key when storing them, and to ask for the passphrase of the private master key when reading them and preparing them for recovery: this means that a compromise of the server does not compromise any of the escrowed keys until the private master key passphrase is captured by an attacker.
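The key splitting idea from the list above, in its simplest form where all N servers must contribute their part, can be sketched with XOR shares. This is an assumption about one possible scheme, not necessarily the one the cited references describe; a threshold scheme (any K of N parts) would need something like Shamir's secret sharing instead. The function names are hypothetical.

```python
import secrets
from functools import reduce
from typing import List

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def split_key(key: bytes, n: int) -> List[bytes]:
    """Split key into n shares; all n are required to reconstruct it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are purely random ...
    shares = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    # ... and the last one is the key XORed with all of them.
    last = reduce(xor_bytes, shares, key)
    return shares + [last]

def combine_shares(shares: List[bytes]) -> bytes:
    """Reconstruct the key by XORing all shares together."""
    return reduce(xor_bytes, shares)

key = secrets.token_bytes(32)
shares = split_key(key, 3)   # one share per server
```

Any subset of fewer than N shares is statistically independent of the key, which matches the property that an attacker has to compromise all N servers.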

Handling of use cases

See Disk_encryption_key_escrow_use_cases.

Implementation details

The escrow packet format is a collection of name-value pairs: machine identification (host name, perhaps machine certificate identifier), volume identification (UUID, label, perhaps a /dev/disk/by-id, /dev/disk/by-path link), volume encryption mechanism/type of stored key (LUKS, LVM, dm-crypt key, dm-crypt passphrase, LUKS passphrase), encryption parameters (necessary for e.g. raw dm-crypt that does not store them in the volume), and the data encryption key. There's little reason to favor any particular representation over another; the KMIP data format can be used (to help integrate the data into a larger key management system, assuming KMIP gains traction). It should be possible to store more than one "secret" in a single file (e.g. both a data encryption key and a LUKS passphrase): if this is not supported by KMIP directly, we can simply store two KMIP packets in the file.
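To make the name-value structure concrete, here is a minimal sketch of a packet file holding two secrets. JSON is used purely as a stand-in representation (the notes suggest KMIP), and the field names, host name and UUID are hypothetical. The output of dump_packets is plaintext and would still have to be encrypted before being stored or sent anywhere.

```python
import base64
import json
from typing import Dict, List, Optional

def make_packet(host_name: str, volume_uuid: str, mechanism: str,
                secret_type: str, secret: bytes,
                parameters: Optional[Dict[str, str]] = None) -> Dict[str, object]:
    """Build one escrow packet as a dictionary of name-value pairs."""
    return {
        "host_name": host_name,
        "volume_uuid": volume_uuid,
        "mechanism": mechanism,            # e.g. "LUKS" or "dm-crypt"
        "secret_type": secret_type,        # e.g. "data encryption key"
        "parameters": parameters or {},    # needed for e.g. raw dm-crypt
        "secret": base64.b64encode(secret).decode("ascii"),
    }

def dump_packets(packets: List[Dict[str, object]]) -> str:
    """Serialize several packets into one (still unencrypted) document."""
    return json.dumps(packets, indent=2, sort_keys=True)

def load_packets(text: str) -> List[Dict[str, object]]:
    """Parse the document and decode the secrets back to bytes."""
    packets = json.loads(text)
    for packet in packets:
        packet["secret"] = base64.b64decode(packet["secret"])
    return packets

uuid = "00000000-0000-0000-0000-000000000000"   # placeholder value
packets = [
    make_packet("host.example.com", uuid, "LUKS",
                "data encryption key", bytes(32)),
    make_packet("host.example.com", uuid, "LUKS",
                "passphrase", b"example passphrase"),
]
restored = load_packets(dump_packets(packets))
```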

dm-crypt can be used with a passphrase or by specifying the data encryption key directly. Escrowing the data encryption key should be supported; for user-friendliness, escrowing the passphrase should be supported as well: it's better to tell the user their passphrase than to tell them they'll have to use a 32-character key. On the other hand, the user's passphrase might be used in other contexts and storing it could be undesirable. In any case, the vast majority of newly created encrypted volumes will use LUKS (or perhaps LVM crypt) instead of raw dm-crypt, so dm-crypt support is not essential.

The escrow packets should never be stored in plaintext. If they are created to be stored at a server, they should be encrypted using the server's public key (using CMS[3]); support for recovery packets encrypted using a machine private key is possible, but probably not essential.

For operation without a server, or for recovery, the packets should be encrypted using a passphrase. CMS specifies a way to encrypt data using a password instead of a public-private key pair, but that is supported by neither NSS nor OpenSSL; a home-grown system (or gpg -c) can be used if adding support for password-based encryption to one of the crypto libraries proves infeasible.

(It seems that signing the escrow packet, either by the client when creating it or by the server when returning it, is not necessary: direct client access to the server would be authenticated by the client's machine certificate, which is equivalent to signing the packet, and fake recovery data can at worst lead to a failed attempt to mount the volume.)

See also Disk encryption key escrow in IPA.

Affected packages

The escrow packet for each volume would probably be stored on the volume, making it the responsibility of the %post script to use it and destroy it.

Action items

Can start now

A single KMIP library will be shared with other key management applications.

Depends on LVM encryption support

LVM programmatic interface, its capabilities and various key management modes are not defined yet.

Depends on system management architecture and infrastructure

Recommendations

Virtualization guest disk image encryption

The goal is to encrypt each disk image using a separate key, in order to control access to the information stored in the images even if they are all stored on a single storage device, if there are many possible guests (that do not correspond to accounts on the storage device), or if traffic on the connection between hosts (used here to mean "virtualization hosts"="nodes") and storage can be read by others.

Guest disk image encryption is not directly related to disk encryption key escrow—escrow deals with keys that are available inside the guest, not outside, and there is no need to recover guest image keys (as long as the management infrastructure is usable). The only caveat is that when backing up virtualization guest disk images, it will be important to back up the management server's key database as well.

Features

Unlike escrow, where users want to choose their own passphrases and change them at will, the encryption of disk images is completely managed by tools. Instead of always storing the data encryption key, it makes more sense to handle the secret that is easiest to use (e.g. use the qcow2 password, which is currently supported in qemu, instead of patching qemu to support manipulating the data encryption key directly). The data encryption key can additionally be stored, to help with data recovery if a header is corrupted. Unqualified "key" below refers to "the secret that is easiest to use".

Each volume has its own key: per-pool keys would not provide the required host isolation, and per-guest keys would make it difficult to mount a volume from a different guest (e.g. when sharing a cluster file system among guests).

The key is stored on the management server, and also on the hosts that currently contain guests that access this volume (storing the key on the hosts is necessary to support guest autostart). Hosts store the keys together with configuration of guests that use the keys, to ensure the keys are erased when guests are deleted or migrated away.
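The idea of keeping keys inside the guest configuration, so that they disappear together with it, can be modelled as follows. This is a toy in-memory sketch with hypothetical names (HostKeyStore, define_guest and so on); a real host would persist the data in encrypted form, and Python gives no guarantee that key bytes are wiped from memory.

```python
from typing import Dict, Optional

class HostKeyStore:
    """Toy model: volume keys live inside the guest configuration records."""

    def __init__(self) -> None:
        # guest name -> {volume id -> key}
        self._guests: Dict[str, Dict[str, bytes]] = {}

    def define_guest(self, guest: str, volume_keys: Dict[str, bytes]) -> None:
        """Store a guest configuration together with its volume keys."""
        self._guests[guest] = dict(volume_keys)

    def key_for(self, guest: str, volume_id: str) -> Optional[bytes]:
        """Look up the key used by a guest for a volume (e.g. on autostart)."""
        return self._guests.get(guest, {}).get(volume_id)

    def undefine_guest(self, guest: str) -> None:
        """Delete or migrate away a guest: its keys go with its configuration."""
        self._guests.pop(guest, None)

    def holds_key(self, volume_id: str) -> bool:
        """True if any guest still defined on this host uses the volume."""
        return any(volume_id in keys for keys in self._guests.values())

store = HostKeyStore()
store.define_guest("web1", {"vol-shared": b"k1", "vol-web1": b"k2"})
store.define_guest("web2", {"vol-shared": b"k1"})
store.undefine_guest("web1")
```

Because each guest configuration carries its own copy of the key, a volume shared between two guests stays usable until the last guest using it is removed from the host.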

Without a management server, each user account used for managing guests (e.g. running virt-manager or virt-install) has a separate key store.

Host functionality

The host does not store the key after creating the volume.
If possible, the host should recognize encrypted volumes, and refuse to use them if a key is not supplied (this might be difficult, e.g. without out-of-band information it is ambiguous whether an image that starts with a LUKS header should be treated as an encrypted volume, or as a raw volume that is encrypted inside the guest; dm-crypt does not even have a header).
Store the keys persistently, delete them when deleting the guest configuration.
Use the keys when starting the guest.
Can be done "off-line" by copying the volume, as long as the volume is not used during the process.
"On-line" re-encryption requires the currently unimplemented LVM crypt; it will probably make it impossible to migrate any guest that uses the volume to a different host during the process.
This can include converting between cleartext and encrypted volumes.

(The hosts should not provide a way to get the encryption key from the host over the network, to avoid the risk of escalating an unauthorized connection to a host into a disk image data compromise; perhaps extracting the key by users with local root access could be supported for disaster recovery.)

Management server functionality

All guest configuration stored on hosts must be updated afterwards.
Probably should allow re-encryption after migration (immediately or within a specified time interval) to prevent access to data from the old host.

Handling of use cases

See Virt_guest_disk_image_encryption_use_cases.

Implementation details

Possible encrypted image formats: dm-crypt, LUKS or the future LVM crypto (all work on volumes, but can be backed by a file as well), qcow2 (built-in AES encryption). LUKS support is probably not necessary - using raw dm-crypt and storing volume metadata in the virtualization management system works just as well.

The key packet must contain the encryption mechanism name (differentiating between "the key" and the data encryption key, if both are used), parameters (for dm-crypt or other formats that do not store the information) and the key. There's little reason to favor any particular representation over another; the KMIP data format can be used (to help integrate the data into a larger key management system, assuming KMIP gains traction).

Both hosts and the management server should take reasonable care to store the keys encrypted (e.g. using a master key to encrypt the key storage), but in this case the required ease of use and necessity to automatically start managed hosts and guests will probably require lower security (storing the master key on disk must be supported, along with asking for it on startup or perhaps using the TPM or a smart card).

Key packet transfer is assumed to be protected by using TLS to communicate between the management server and the guests; this, in turn, assumes per-host machine private keys. If possible, the packets should be transferred separately from the XML configuration (e.g. by defining more fields in the libvirt RPC protocol); otherwise it would be difficult or costly to securely wipe all memory used by the XML creating/decoding code to store the key packet.

Affected packages

Action items

Can start now

A single KMIP library will be shared with other key management applications.

Depends on LVM encryption support

Depends on system management architecture and infrastructure

Recommendations

General asymmetric key management

The goal is to simplify or automate certificate and private key setup in various applications.

Opinion: Easy setup of e.g. company-wide certificate authorities is definitely desirable. Other than the machine private key, I'm not sure how much value generic private key management has. If application/service-specific private keys are deployed on a large enough scale to warrant management software, the scale probably warrants configuration management/mass configuration software, which should be able to transfer private keys along with other configuration files, and the software problem is integration of configuration management with a CA rather than mass key deployment to applications.

Client operations

To a specific application, or "everywhere".
From a specific application, or "everywhere".

(If private key management is required:)

The key is generated and signed by the CA "on the server", no manual administrator work on the client (e.g. entering any passphrases) is necessary.
The private key can be stored unencrypted, protected by a password that is stored in the application configuration, or protected by a password that the application prompts for: the application-specific code must know which modes are supported by the application in question.
Difficult to do in general, because some application config files are rather complex; at worst we can specify expected key file names, overwrite the key and restart the application.
The operation can contain a certificate for the key being replaced, to avoid accidents.

Server operations

To a specific application, or "everywhere"; probably will apply on all machines in a "group"

(If private key management is required:)

Should generate new certificates or private keys some time before old certificates expire, to allow seamless transition.
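The "some time before" rule could be expressed as a simple lead-time check. This is a sketch only; the 30-day default is an arbitrary assumption, and needs_renewal is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

def needs_renewal(not_after: datetime, now: datetime,
                  lead_time: timedelta = timedelta(days=30)) -> bool:
    """True once 'now' is within lead_time of the certificate's expiry."""
    return now >= not_after - lead_time

# Example expiry date, chosen arbitrarily for illustration.
expiry = datetime(2030, 6, 30, tzinfo=timezone.utc)
```

A server would run such a check periodically over all certificates it manages and issue replacements for those that return True, leaving the old certificate valid during the overlap.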

Handling of use cases

For above-described operations on certificate authority chains:

Specifics of private key management depend on the way the CA and configuration management software are integrated or connected.

Implementation

All operations are initiated from the server; the client is assumed to be identified by some generic management application. If this provides a machine certificate, TLS can be used to protect keys in transit.

Standard transfer formats (DER PKCS#7 for certificates, PKCS#12 for certificate + private key) can be used for data transfer; additional client and service identification can be tracked by the server or provided for logging.

Implementation would consist of a simple framework (receiving/sending certificates, API) and application-specific plugins. The planned shared NSS database would be one of the plugins, presumably affecting all NSS applications.

If this mechanism is used to manage the client machine certificates, it needs to be forgiving of expired machine certificates (otherwise the client could be unable to refresh the machine certificate because the machine certificate needs refreshing)—perhaps expired machine private keys should be accepted if an immediate refresh is performed, as long as the expired private key was not revoked due to a suspected compromise. Similar concerns arise in connection with clients that are not running for a long time, especially with virtual machines.
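One way to read the proposed policy for expired machine certificates is the following decision function. It is a sketch of the suggestion above, not a settled design, and the parameter names are hypothetical.

```python
def accept_machine_certificate(expired: bool, revoked: bool,
                               requesting_refresh: bool) -> bool:
    """Decide whether the server should accept a client's machine certificate."""
    if revoked:
        # Revoked (e.g. after a suspected compromise): never accepted.
        return False
    if not expired:
        # The normal case: a valid, unrevoked certificate.
        return True
    # Expired but not revoked: tolerated only to perform an immediate refresh.
    return requesting_refresh
```

This keeps a long-suspended virtual machine able to obtain a new certificate, while an expired certificate cannot be used for anything else.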

FIXME: How could this integrate with the Red Hat Certificate System?

FIXME: The certificate/key management interface should cooperate, or at least not fight with, the generic configuration management platform.

Affected packages

Action items

Can start now

Depends on system management architecture and infrastructure

Recommendations

General symmetric key management

PKCS defines certificates and private keys, but there seems to be no established standard for symmetric key formats. It seems EKMI[1] is rather limited; it was set up to standardize the format used by a single open source (Java) project, and the author of the project has left the group. KMIP[7] is more general and people from various large corporations have contributed to the document, but its future is not quite clear.

Both proposed symmetric key management protocols focus on "smart applications" that use the server as a quite inactive key database: in addition to key storage, the major function of the database is to provide key usage policies and track key usage. The applications must already know who they are connecting with and which key (identified by a name or a unique ID) is necessary, and they must implement the key usage policies.

Opinion: It's not clear which RHEL applications require any central management for symmetric keys: keys used in RHEL servers are usually long-lived, or already have a key management protocol. A generic symmetric key management server for "smart applications" probably makes sense only as a service provided for an enterprise-wide application environment ("middleware infrastructure" - e.g. managing keys for communication between SOA components), not for RHN/IPA-like "management of clients".

"Management of clients"

For "management of clients" we can build something mostly similar to asymmetric key management, using a different format for key transfer (perhaps a KMIP subset).

Are there any significant users of symmetric cryptography in RHEL? I could find:

In most symmetric applications it is difficult to replace a key without breaking connections (even a perfectly timed key switch would probably require application restart).

"Smart applications"

KMIP quite directly implies a database schema, and specifies the server's operations on the data. This leaves implementing the server, providing client libraries, and providing a mechanism to define key usage by applications; the mechanism should be a component of a larger system for connecting the applications.

Recommendations