Here’s a common requirement:
We want to transfer a file containing sensitive data to a partner; they want us to put the data in their S3 bucket. How can we do this securely?
Now you might start by putting controls around the S3 bucket itself: make sure it’s properly locked down, has audit logs and so on. But there are a number of issues with this. In particular, S3 bucket permissions are easy to get wrong, and there’s a long history of data leaks caused by misconfigurations. Even if the setup is correct today, it may be wrong tomorrow. And since the bucket isn’t owned by “me”, I can’t verify the setup. So I’m transferring data into an untrusted environment.
So use encryption
Since I’m responsible for the data until it reaches the partner, I look at encrypting it, instead. Now even if the bucket is open by mistake, the contents can’t be read.
The obvious method of encryption would be to use AES. This has a downside, though. It’s symmetric encryption and so the sender and receiver need to have a ‘shared secret’ (the encryption key). This can be done, but it introduces complexity.
So let’s look at asymmetric encryption: RSA. This just means the sender needs to know the recipient’s public key. So far so good. But RSA is quite slow and CPU intensive, and it can only directly encrypt a message smaller than the key size. When sending lots of data this can have a performance impact.
Hybrid encryption
We can steal a technique used in various places.
The underlying idea is that the data is AES256 encrypted using a unique one-time-use key, and then the AES key is encrypted with a recipient-generated RSA public key. The encrypted data and the encrypted AES key are placed in the bucket. The recipient can then use their private key to decrypt the AES key, and then use that to decrypt the data.
The steps generally are:
- The recipient creates an RSA key pair. This key can last a long time (e.g. 1 year)
- The sender generates a random AES256 key. This key is only used once.
- The data is encrypted with the AES key
- The AES key is encrypted with the RSA public key
- The encrypted data and the encrypted key are sent to the recipient (e.g. via S3)
- The recipient uses their RSA private key to decrypt the AES key
- The recipient uses the AES key to decrypt the data
- The recipient deletes the encrypted data and encrypted AES key.
The benefits of this process:
- No shared secret is required. The sender needs only know the public key of the recipient
- Each data object is encrypted with a different key
- The large data objects are encrypted and decrypted with the fast AES algorithm; only the key itself uses the slower RSA algorithm.
- If permissions are set incorrectly (e.g. on the S3 bucket) then the data is still protected even if an attacker obtains the encrypted files.
If the same data needs to be sent to multiple recipients then this process is easily extended; for each recipient use their public key and store multiple copies of the encrypted AES key, one copy for each recipient.
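A minimal sketch of the multi-recipient case. The recipient names (alice, bob) and file names are hypothetical, the key pairs are generated here only so the script runs standalone, and `openssl pkeyutl` is used as the wrapping command (OpenSSL 3.x recommends it over the deprecated `rsautl`):

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in key pairs so the sketch is self-contained; real recipients
# would generate these themselves and hand over only the public half.
for name in alice bob; do
    openssl genrsa -out "${name}_private.pem" 2048 2>/dev/null
    openssl rsa -in "${name}_private.pem" -pubout -out "${name}_public.pem" 2>/dev/null
done

# One random AES-256 key for this data object
aes_key=$(openssl rand -hex 32)

# Wrap the same AES key once per recipient public key
for name in alice bob; do
    echo "$aes_key" | openssl pkeyutl -encrypt -pubin \
        -inkey "${name}_public.pem" -out "aeskey.${name}.enc"
done

# Each recipient unwraps only their own copy with their own private key
alice_key=$(openssl pkeyutl -decrypt -inkey alice_private.pem -in aeskey.alice.enc)
bob_key=$(openssl pkeyutl -decrypt -inkey bob_private.pem -in aeskey.bob.enc)
```

The data itself is still encrypted only once; only the small wrapped-key file is repeated per recipient.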
Requirements
- The RSA key must be at least 2048 bits in size (giving roughly 112 bits of strength). A 3072-bit key gives roughly 128 bits of strength; 4096 bits is recommended and gives a little more.
- If a “Key Derivation Function” (KDF) is used to generate the key then the password used to generate that key must be at least 45 characters long (assuming a character set of 64 characters, i.e. 6 bits of entropy per character, so 45 × 6 = 270 bits, enough to cover a 256-bit key).
- If a “ZIP” tool is used to perform the encryption then it must support AES256 mode (e.g. SmartCrypt). Generic “zip” encryption is weak and must not be used.
- We must be sure the public key used belongs to the recipient; if we use a key supplied by an attacker then the attacker will be able to decrypt the data
Typically the recipient can publish the key in a trusted location (e.g. on a TLS protected website under their control). This has the advantage that the recipient can replace their key and we can retrieve the new version automatically.
The security of the solution is dependent on using the correct public key.
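One way to reduce the risk of using the wrong key is to compare a fingerprint of the retrieved public key with a value the recipient confirms over a separate channel. A minimal sketch, assuming SHA-256 over the DER-encoded key as the fingerprint format (a convention chosen for this example, not something every recipient will use). The key pair is generated only so the script runs standalone, and `expected_fp` stands in for the out-of-band value:

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the key the recipient would publish
openssl genrsa -out private.pem 2048 2>/dev/null
openssl rsa -in private.pem -pubout -out public.pem 2>/dev/null

# Hypothetical helper: SHA-256 of the DER-encoded public key, as hex
fingerprint() {
    openssl pkey -pubin -in "$1" -outform DER \
        | openssl dgst -sha256 | awk '{print $NF}'
}

# Hypothetical: the value the recipient reads out over a separate channel.
# Computed locally here only so the sketch has something to compare against.
expected_fp=$(fingerprint public.pem)

# Sender side: refuse to use the key unless the fingerprint matches
actual_fp=$(fingerprint public.pem)
if [ "$actual_fp" = "$expected_fp" ]; then
    key_status=trusted
else
    key_status=mismatch
fi
echo "public key fingerprint check: $key_status"
```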
Example using Unix native tools
In Unix, the openssl command does most of the hard work for us:
Generate RSA key:
The recipient generates the RSA key and extracts the public key. This version has no password on the private key, so protection of the private key is essential. This only needs to be done once every year or two, on the recipient’s schedule.
$ openssl genrsa -out private.pem 4096
Generating RSA private key, 4096 bit long modulus
...++
............................................................++
e is 65537 (0x10001)
$ openssl rsa -in private.pem -outform PEM -pubout -out public.pem
writing RSA key
$ cat public.pem
-----BEGIN PUBLIC KEY-----
MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAtIvkiOFgQfGYa2SIZCPO
dMg8qju6BRgpBAf7el8nr1D/0dG03CusYUzbd9j6/F07CriZpfO5XQE/0Bza8dyf
abZhXtxe8Sa9SUxaOzfyrXlBe5zvO+DSbQ2L326Nw6ScptsNM0yeDaMgfxp0/Kcx
3fW4U6s1RJ6D7B/XHJ6FZiIX4e4AKEoOWnJnNCKET9TsYn64TaL0zop3su5QzSbL
B70mrIJhme+9EVf91CdIU+efa4KF+bMiM8Kqgpb7UauI+5mWJN66JJ82i6xMz2hl
OOIysBZRJCG7UoURblQqgYMtBEhpX9DPZmAvfOWkqO7kouSLYT5S4ZRnys47Fi6K
Sj64XtMB63/HMCq0HXb6UXxm98mICA+vtVGVhJLt9nq7bmABaa+jluI27k62/hUI
WmpREiSt1U80gmU2c17PhQknVl2iioSpnO7p12U/xWq9oFJbnQ3OKgL1jxzQsfVD
U/wAnYiZiLm8Oro1yMd832LBBC01OcMG1v93FZ1g5UUiCecYhqTERRkkhYEltIML
01JDVxEJeWHrtzUdWNKV2pFBGedWuB0/hpJsub+Xcp4sycY4P8mSdFgL99uL1nJc
HsXDEZTS+DoRSo6x+Nf6ytSl7ShESERrubmlvbWrgUQTb/vzJIhwr/Ov1DvOjanU
kjBXt/d7eYhmeMgrHeB6itcCAwEAAQ==
-----END PUBLIC KEY-----
The encryption process:
Generate a random 32-byte (256-bit) key in hex format:
$ aes_key=$(openssl rand -hex 32)
Encrypt our data file with this key. For AES256, GCM mode is preferred. However, my copy of openssl doesn’t support that, so we’re using CBC mode. This is OK. Do not use ECB!
$ openssl enc -aes-256-cbc -K "$aes_key" -iv 00000000000000000000000000000000 -in original_data -out encrypted_data
In this case we can use a fixed Initialization Vector (IV) of all zeros because the AES encryption key itself is only to be used once. If we were using the key multiple times then we would need a random IV for each encryption.
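If a key ever were reused, each message would need a fresh random IV, and that IV would have to travel with the ciphertext (it does not need to be secret). A minimal sketch of that variant; the `encrypted_data.iv` file name is hypothetical and the sample input is created only so the commands run standalone:

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)
cd "$workdir"
printf 'sample sensitive data\n' > original_data

aes_key=$(openssl rand -hex 32)   # 32 bytes = 256-bit key
iv=$(openssl rand -hex 16)        # 16 bytes = one AES block

openssl enc -aes-256-cbc -K "$aes_key" -iv "$iv" \
    -in original_data -out encrypted_data

# The IV is not secret; store it alongside the ciphertext
printf '%s\n' "$iv" > encrypted_data.iv

# Recipient side: read the IV back and decrypt with the same key
iv_in=$(cat encrypted_data.iv)
openssl enc -d -aes-256-cbc -K "$aes_key" -iv "$iv_in" \
    -in encrypted_data -out results

cmp original_data results && echo "round trip OK"
```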
Encrypt the AES key with the public key
$ echo "$aes_key" | openssl rsautl -encrypt -pubin -inkey public.pem -out aeskey.enc
The files aeskey.enc and encrypted_data can now be sent to the recipient (e.g. by placing them in their S3 bucket).
To decrypt the data
The recipient now needs to decrypt the data, first by recovering the AES key. Note that they use the private key to do this:
$ aes_key=$(openssl rsautl -decrypt -inkey private.pem -in aeskey.enc)
Now they can decrypt the data file:
$ openssl enc -d -aes-256-cbc -K "$aes_key" -iv 00000000000000000000000000000000 -in encrypted_data -out results
The file results matches original_data.
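The steps above can be collected into one script to check the round trip end to end. This is a sketch under a few assumptions: it uses `openssl pkeyutl` in place of `rsautl` (which OpenSSL 3.x marks as deprecated), a 2048-bit key to keep the demo fast, and a sample input file created on the spot:

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)
cd "$workdir"
printf 'account,balance\n1001,250.00\n' > original_data

# Recipient: generate the key pair and extract the public key
openssl genrsa -out private.pem 2048 2>/dev/null
openssl rsa -in private.pem -pubout -out public.pem 2>/dev/null

# Sender: one-time AES key, encrypt the data, wrap the key
aes_key=$(openssl rand -hex 32)
zero_iv=00000000000000000000000000000000   # 16 zero bytes; key is single-use
openssl enc -aes-256-cbc -K "$aes_key" -iv "$zero_iv" \
    -in original_data -out encrypted_data
echo "$aes_key" | openssl pkeyutl -encrypt -pubin -inkey public.pem -out aeskey.enc
unset aes_key   # the sender no longer needs the plaintext key

# Recipient: unwrap the AES key, then decrypt the data
recovered_key=$(openssl pkeyutl -decrypt -inkey private.pem -in aeskey.enc)
openssl enc -d -aes-256-cbc -K "$recovered_key" -iv "$zero_iv" \
    -in encrypted_data -out results

# Confirm the decrypted file is byte-for-byte identical to the input
if cmp -s original_data results; then round_trip=ok; else round_trip=failed; fi
echo "round trip: $round_trip"
```

Only encrypted_data and aeskey.enc would ever leave the sender; private.pem stays with the recipient.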
Conclusion
This wrapping of a random AES key within an RSA key is a pretty common technique for transmitting data.
In this scenario we’re not providing any checksum or validation that the data has been sent by us; we’re just ensuring that the data contents are kept confidential. This scheme could easily be extended by providing a checksum file signed with the sender’s private key. The recipient could use the sender’s public key to verify that the checksum was properly signed, and that the checksum matches the unencrypted data file’s content.
However, the recipient may be content in knowing that only the permitted sender is allowed to write to the S3 bucket, and consider this additional step unnecessary. After all, they can see the bucket configuration!
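Should the recipient want that extra assurance, the signed-checksum extension could look roughly like this. The sketch uses `openssl dgst -sign`, which hashes the file and signs the digest in one step rather than producing a separate checksum file. The sender key file names are hypothetical, and the keys and sample data are generated only so the script runs standalone:

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)
cd "$workdir"
printf 'sample sensitive data\n' > original_data

# Stand-in for the sender's signing key pair (hypothetical file names)
openssl genrsa -out sender_private.pem 2048 2>/dev/null
openssl rsa -in sender_private.pem -pubout -out sender_public.pem 2>/dev/null

# Sender: sign the SHA-256 digest of the plaintext file
openssl dgst -sha256 -sign sender_private.pem \
    -out original_data.sig original_data

# Recipient: after decrypting, verify the signature with the sender's public key
if openssl dgst -sha256 -verify sender_public.pem \
        -signature original_data.sig original_data >/dev/null; then
    sig_status=verified
else
    sig_status=rejected
fi

# A modified file must not verify against the same signature
printf 'tampered data\n' > tampered_data
if openssl dgst -sha256 -verify sender_public.pem \
        -signature original_data.sig tampered_data >/dev/null 2>&1; then
    tamper_status=verified
else
    tamper_status=rejected
fi
echo "original: $sig_status, tampered: $tamper_status"
```

As with the recipient's encryption key, this only helps if the recipient is sure the sender's public key is genuine.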