Protect messaging and streaming data in the cloud with "data key" encryption

The best approach for protecting data in message queues and data streams is to not put any sensitive data in the message. Some systems use a claim check model where the messages contain only resource identifiers that can be passed back to the originating system to retrieve the data. The claim check approach creates tighter coupling between the producer and its consumers. It puts an additional burden on the producer, which must be able to return the data associated with a claim for some period of time. Some systems have to build caching architectures to store the claimed data for retrieval, which adds complexity to the producer.
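As a rough illustration of the claim check model described above, the sketch below keeps payloads in an in-memory store on the producer side and puts only a claim ID in the message. `ClaimStore`, `put`, `redeem`, and the TTL value are hypothetical names picked for this example and do not come from any particular messaging product.

```python
import time
import uuid


class ClaimStore:
    """Hypothetical producer-side store that holds payloads until their claims expire."""

    def __init__(self, ttl_seconds: float = 300.0):
        self._ttl = ttl_seconds
        self._items = {}  # claim_id -> (expires_at, payload)

    def put(self, payload: bytes) -> str:
        """Store the payload and return the claim ID that goes into the message."""
        claim_id = str(uuid.uuid4())
        self._items[claim_id] = (time.monotonic() + self._ttl, payload)
        return claim_id

    def redeem(self, claim_id: str) -> bytes:
        """Return the payload for a claim, or raise KeyError if unknown or expired."""
        expires_at, payload = self._items[claim_id]
        if time.monotonic() > expires_at:
            del self._items[claim_id]
            raise KeyError(claim_id)
        return payload


store = ClaimStore(ttl_seconds=60)
# The message carries only the claim ID, never the sensitive data itself.
message = {"claim_id": store.put(b"account=1234;balance=99.10")}
# The consumer passes the claim ID back to the producer to get the data.
payload = store.redeem(message["claim_id"])
```

The store has to stay available for as long as any consumer may still redeem a claim, which is the extra producer burden mentioned above.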

Data / payload encryption is an alternative approach that can be used to protect data stored in messaging systems or on disk. Sensitive data is encrypted and put into the message payload. Producers and consumers only need to share access to the encryption and decryption keys. This is easy in cloud environments, which have services built just for this.

Standard disk encryption does not provide the same level of data security as payload encryption. Data on encrypted volumes can be read by anyone with access to the volume, with no additional work. Application machines often have volume access at the root level.

Asymmetric encryption vs symmetric encryption

Asymmetric encryption provides more security because of the algorithms used and because of the segregation of encryption and decryption keys. The downsides of asymmetric encryption are speed and the resulting payload size. Note: Asymmetric algorithms can be used for signing payloads or encrypting data.

Symmetric encryption uses the same key for encryption and decryption. It is unsuitable for digital signing because anyone with the decryption key could use that same key to create a false signature. Symmetric encryption is faster than asymmetric encryption, costs less in compute, and results in smaller payloads than common asymmetric algorithms.

Envelope Encryption

  1. Create a symmetric data key
  2. Encrypt the data with the symmetric data key
  3. Encrypt the data key with the asymmetric cloud key
  4. Send the encrypted data key and the encrypted data in the message
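The four steps above can be sketched locally as follows. This is a minimal illustration, assuming the third-party `cryptography` package is installed. A locally generated RSA key pair stands in for the cloud-managed asymmetric key, and AES-GCM and RSA-OAEP are choices made for this example, not algorithms named by any specific provider. In a real system, the wrap and unwrap steps would be calls to the cloud key service.

```python
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Assumption: this local key pair stands in for the asymmetric cloud key.
wrapping_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def encrypt_envelope(plaintext: bytes, public_key) -> dict:
    # Step 1: create a random symmetric data key.
    data_key = AESGCM.generate_key(bit_length=256)
    # Step 2: encrypt the data with the data key (the nonce must be unique per message).
    nonce = os.urandom(12)
    ciphertext = AESGCM(data_key).encrypt(nonce, plaintext, None)
    # Step 3: encrypt (wrap) the data key with the asymmetric key.
    encrypted_key = public_key.encrypt(data_key, OAEP)
    # Step 4: the encrypted data key travels with the encrypted data.
    return {"encrypted_key": encrypted_key, "nonce": nonce, "ciphertext": ciphertext}


def decrypt_envelope(envelope: dict, private_key) -> bytes:
    # Unwrap the data key first, then decrypt the payload with it.
    data_key = private_key.decrypt(envelope["encrypted_key"], OAEP)
    return AESGCM(data_key).decrypt(envelope["nonce"], envelope["ciphertext"], None)


envelope = encrypt_envelope(b"card=4111-xxxx", wrapping_key.public_key())
recovered = decrypt_envelope(envelope, wrapping_key)
```

Only the small data key passes through the asymmetric operation, so the payload itself can be any size.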

Approaches

Cloud provider vaults and encryption services provide highly secure asymmetric encryption support. You can use those services to directly encrypt your data or you can use those services to encrypt the keys that are used to encrypt your data.

Approach: Encrypt data with asymmetric cloud key
  Description
  • Encrypt/decrypt using the cloud asymmetric encryption API.
  • The APIs are usually limited to small amounts of data, such as 4 KB.
  Key Management
  • All keys are managed by the cloud provider key store.
  • Cloud providers charge per asymmetric encrypt/decrypt operation.

Approach: Envelope (data key) Encryption
  Description
  • Encrypt/decrypt the data via faster symmetric encryption using random data keys.
  • Encrypt/decrypt the data keys using the cloud asymmetric encryption API.
  Key Management
  • Asymmetric encryption keys are managed by the cloud provider key store.
  • Cloud providers charge per asymmetric encrypt/decrypt operation.
  • Symmetric data encryption keys are ephemeral and are not stored.
  • The encrypted data key must be sent with the payload as additional metadata.
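Because the encrypted data key has to travel with the payload, the message needs a small wrapper format. The sketch below shows one possible layout using JSON and base64. The field names (`key_id`, `encrypted_key`, `nonce`, `ciphertext`) and the functions `pack_message` and `unpack_message` are assumptions made for this example, not a provider-defined format.

```python
import base64
import json


def pack_message(key_id: str, encrypted_key: bytes, nonce: bytes, ciphertext: bytes) -> str:
    """Serialize the encrypted payload plus the metadata a consumer needs to decrypt it."""

    def b64(raw: bytes) -> str:
        return base64.b64encode(raw).decode("ascii")

    return json.dumps({
        "key_id": key_id,                     # which cloud key wrapped the data key
        "encrypted_key": b64(encrypted_key),  # the data key, encrypted, never in the clear
        "nonce": b64(nonce),
        "ciphertext": b64(ciphertext),
    })


def unpack_message(message: str) -> dict:
    """Reverse pack_message and return the raw bytes for each binary field."""
    fields = json.loads(message)
    return {
        "key_id": fields["key_id"],
        "encrypted_key": base64.b64decode(fields["encrypted_key"]),
        "nonce": base64.b64decode(fields["nonce"]),
        "ciphertext": base64.b64decode(fields["ciphertext"]),
    }


wire = pack_message("orders-key-v1", b"\x01\x02wrapped", b"\x00" * 12, b"\xffsecret-bytes")
parts = unpack_message(wire)
```

Including a key identifier lets consumers ask the key store for the right decryption key after a key rotation.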

Amazon and Azure SDKs support both approaches.

Envelope, aka data key, encryption

This diagram shows how data key encryption works.  Data keys can be cached and re-used in many situations.  This results in significant performance improvements and fewer cloud provider calls.

Cloud provider SDKs hide the bulk of this complexity.
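To make the caching idea concrete, here is a minimal sketch of a data key cache that reuses one key for a bounded number of messages or a bounded time. `DataKeyCache`, its limits, and `fake_generate_data_key` are hypothetical names for this example. A real SDK cache will differ in its details, and the stand-in generator returns random bytes where a real one would call the cloud provider.

```python
import os
import time
from typing import Callable, Optional, Tuple

DataKey = Tuple[bytes, bytes]  # (plaintext data key, encrypted data key)


class DataKeyCache:
    """Reuse one data key for a limited number of messages or a limited time."""

    def __init__(self, generate: Callable[[], DataKey],
                 max_messages: int = 100, max_age_seconds: float = 300.0):
        self._generate = generate
        self._max_messages = max_messages
        self._max_age = max_age_seconds
        self._key: Optional[DataKey] = None
        self._uses = 0
        self._created = 0.0

    def get(self) -> DataKey:
        now = time.monotonic()
        stale = (
            self._key is None
            or self._uses >= self._max_messages
            or now - self._created > self._max_age
        )
        if stale:
            # This is the only step that would reach the cloud provider.
            self._key = self._generate()
            self._uses = 0
            self._created = now
        self._uses += 1
        return self._key


calls = {"count": 0}


def fake_generate_data_key() -> DataKey:
    """Stand-in for a cloud call; both values are just random bytes here."""
    calls["count"] += 1
    return os.urandom(32), os.urandom(256)


cache = DataKeyCache(fake_generate_data_key, max_messages=100)
keys = [cache.get() for _ in range(250)]
print(calls["count"])  # 3 key generations for 250 messages
```

The limits are a trade-off: higher reuse means fewer provider calls, but more messages are exposed if a single data key leaks.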

Single pass encryption

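Single pass encryption is simpler. The payload is encrypted and decrypted directly with the cloud asymmetric encryption API, and there is no separate data key to manage.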

SDK notes

The Amazon SDK supports data key caching. See the SDK FAQ and the KMS FAQ.
The Azure SDK supports one-time-use symmetric data key creation for every message sent.

Advantages and Disadvantages

Let us look at some of the advantages of both approaches.
Approach: Encrypt data with asymmetric cloud key
  When to use
  • The data is very sensitive and must stay encrypted for long periods of time.
  • Resulting data size is unimportant.
  • Encryption speed is unimportant.
  • Volume is low enough that there are neither rate limit nor cost concerns.
  When to not use
  • Ephemeral data.
  • Data blobs that are larger than supported by the cloud provider APIs.
  • Encryption/decryption time is important, as in streaming or messaging applications.
  • You will overrun cloud API rate limits.

Approach: Envelope (data key) Encryption
  When to use
  • Data is only sensitive for a short time period.
  • High volume systems where cloud API rate limits may be hit or where cost is a concern.
  • There is concern about the size of the data after encryption.
  When to not use
  • Data is highly sensitive and will be stored for long periods of time.
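The guidance above can be condensed into a rough decision helper. This is only a sketch of the reasoning, not a rule from any provider. `choose_approach` is a hypothetical function, the 4096 byte limit echoes the 4K figure mentioned earlier, and the default rate limit is a placeholder that should be replaced with your provider's real quota.

```python
def choose_approach(payload_bytes: int,
                    messages_per_second: float,
                    long_lived_and_highly_sensitive: bool,
                    api_max_bytes: int = 4096,
                    api_max_requests_per_second: float = 100.0) -> str:
    """Return "direct" for single pass asymmetric encryption, else "envelope"."""
    fits_api = (
        payload_bytes <= api_max_bytes
        and messages_per_second <= api_max_requests_per_second
    )
    # Direct encryption is only worth its cost for long-lived, highly sensitive
    # data, and only when the payload and volume stay inside the API limits.
    if long_lived_and_highly_sensitive and fits_api:
        return "direct"
    # Everything else, including data that cannot fit the API, uses data keys.
    return "envelope"


small_archive_record = choose_approach(512, 1, True)      # "direct"
busy_event_stream = choose_approach(512, 5000, False)     # "envelope"
large_sensitive_blob = choose_approach(1_000_000, 1, True)  # "envelope"
```

Note the third case: a blob too large for the API has to use the data key approach even though it is long-lived and sensitive, since direct encryption is not possible at that size.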

Semi-ephemeral data should most likely be encrypted using the data key approach.

Change Log

Created 2020 Feb 16