Embedding secret data into Docker images

Posted on 2019-07-20 by Oleg Grenrus engineering

In the Multi-stage docker build of Haskell webapp blog post I briefly mentioned data-files. They are problematic. A simpler way is to use e.g. file-embed-lzma or similar functionality to embed data into the final binary.

You can also embed secret data if you first encrypt it. This would reduce the pain when dealing with (large) secrets. I personally favor configuration (of running Docker containers) through environment variables. Injecting extra data into containers is inelegant: that's another way to "configure" a running container, when one would be enough.

In this blog post, I'll show that dealing with encrypted data in Haskell is not too complicated. The code is in the same repository as the previous post. This post is based on Tutorial: AES Encryption and Decryption with OpenSSL, but is updated and adapted for Haskell.

#Encrypting: OpenSSL Command Line

To encrypt a plaintext using AES with OpenSSL, the enc command is used. The following command will prompt you for a password, encrypt a file called plaintext.txt and Base64 encode the output. The output will be written to encrypted.txt.

openssl enc -aes-256-cbc -salt -pbkdf2 -iter 100000 -base64 -md sha1 -in plaintext.txt -out encrypted.txt

This will result in a different output each time it is run. This is because a different (random) salt is used. The Salt is written as part of the output, and we will read it back in the next section. I used HaskellCurry as a password, and placed an encrypted file in the repository.

Note that we use the -pbkdf2 flag. It's available since OpenSSL 1.1.1, which is available in Ubuntu 18.04 at the time of writing. Update your systems! We use 100000 iterations.

We choose the SHA1 digest because pkcs5_pbkdf2_hmac_sha1 exists directly in HsOpenSSL. We will use it to derive the key and IV from a password in Haskell. Alternatively, you could use the -p flag, so that openssl prints the Key and IV it used, and provide these to the running service.

#Decrypting: OpenSSL Command Line

To decrypt the file on the command line, we'll use the -d option:

openssl enc -aes-256-cbc -salt -pbkdf2 -iter 100000 -base64 -md sha1 -d -in encrypted.txt

This command is useful to check "what's there". Next, the Haskell version.

#Decrypting: Haskell

To decrypt the output of an AES encryption (aes-256-cbc) we will use the HsOpenSSL library. Unlike on the command line, each step must be performed explicitly. Luckily, it's a lot nicer than using C. There are 6 steps:

  1. Embed a file
  2. Decode Base64
  3. Extract the salt
  4. Get a password
  5. Compute the key and initialization vector
  6. Decrypt the ciphertext

#Embed a file

To embed the file we use Template Haskell: embedByteString from the file-embed-lzma library.

{-# LANGUAGE TemplateHaskell #-}

import Data.ByteString (ByteString)
import FileEmbedLzma (embedByteString)

encrypted :: ByteString
encrypted = $(embedByteString "encrypted.txt")

#Decode Base64

Decoding Base64 is a one-liner in Haskell. We use decodeLenient because we are quite sure the input is valid.

import Data.ByteString.Base64 (decodeLenient)

encrypted' :: ByteString
encrypted' = decodeLenient encrypted

Note: HsOpenSSL can also handle Base64, but doesn't seem to provide a lenient variant. HsOpenSSL throws exceptions on errors.
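
One reason lenient decoding is convenient here: without the -A flag, openssl enc -base64 wraps its output into lines, and the strict decode function from base64-bytestring rejects the newline characters. If you would rather fail loudly on a corrupted file, a minimal sketch is to drop whitespace first and then decode strictly. The name decodeStrict is hypothetical and is not part of the repository.

```haskell
import Data.ByteString        (ByteString)
import Data.ByteString.Base64 (decode)
import Data.Char              (isSpace)

import qualified Data.ByteString.Char8 as BS8

-- | Hypothetical helper (an assumption, not from the post's repository):
-- remove the line breaks added by @openssl enc -base64@, then decode
-- strictly, so that any other invalid character is reported as 'Left'.
decodeStrict :: ByteString -> Either String ByteString
decodeStrict = decode . BS8.filter (not . isSpace)
```

With this variant a damaged encrypted.txt is reported as a Left value at start-up instead of being silently truncated or skipped over.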

#Extract the salt

Once we have decoded the ciphertext, we can read the salt. The salt is identified by the 8 byte header (Salted__), followed by the 8 byte salt. We start by ensuring the header exists, and then we extract the following 8 bytes:

import Control.Monad (when)
import qualified Data.ByteString as BS

-- The "Salted__" literal below is a ByteString: OverloadedStrings must be enabled.
extract
    :: ByteString     -- ^ password
    -> ByteString     -- ^ encrypted data
    -> IO ByteString  -- ^ decrypted data
extract password bs0 = do
    when (BS.length bs0 < 16) $ fail "Too small input"

    let (magic, bs1) = BS.splitAt 8 bs0
        (salt,  enc) = BS.splitAt 8 bs1

    when (magic /= "Salted__") $ fail "No Salted__ header"

    ...
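
The header check itself needs no IO, so it can also be written as a pure function, which makes it easy to try out in GHCi. The following is only a sketch under that assumption; splitSalted is a hypothetical name, and the error messages are copied from the fragment above.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.ByteString (ByteString)

import qualified Data.ByteString as BS

-- | Hypothetical pure variant of the header check: returns the
-- 8 byte salt and the remaining ciphertext, or an error message.
splitSalted :: ByteString -> Either String (ByteString, ByteString)
splitSalted bs0
    | BS.length bs0 < 16  = Left "Too small input"
    | magic /= "Salted__" = Left "No Salted__ header"
    | otherwise           = Right (salt, enc)
  where
    (magic, bs1) = BS.splitAt 8 bs0  -- first 8 bytes: the magic header
    (salt,  enc) = BS.splitAt 8 bs1  -- next 8 bytes: the salt
```

For example, splitSalted "Salted__12345678payload" evaluates to Right ("12345678", "payload").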

#Get a password

You probably already have some setup to extract configuration from environment variables. The following is a very simple way, which is enough for us.

We use the unix package, and System.Posix.Env.ByteString.getEnv to get the environment variable as a ByteString directly. The program will run in Docker on Linux, so depending on unix is not a problem.

{-# LANGUAGE OverloadedStrings #-}

import System.Posix.Env.ByteString (getEnv)
import OpenSSL (withOpenSSL)

main :: IO ()
main = withOpenSSL $ do
    password <- getEnv "PASSWORD" >>= maybe (fail "PASSWORD not set") return
    ... 

We also initialize the OpenSSL library using withOpenSSL.

#Compute the key and initialization vector

Once we have extracted the salt, we can use the salt and password to generate the Key and Initialization Vector (IV). To derive them from the password we use the pkcs5_pbkdf2_hmac_sha1 function. PBKDF2 (Password-Based Key Derivation Function 2) is a key derivation function. Like openssl, we derive both key and IV in a single call: we ask for 48 bytes, of which the first 32 are the AES-256 key and the remaining 16 are the IV (one AES block).

import OpenSSL.EVP.Digest (pkcs5_pbkdf2_hmac_sha1)

iters :: Int
iters = 100000

    ...
    let (key, iv) = BS.splitAt 32
                  $ pkcs5_pbkdf2_hmac_sha1 password salt iters 48
    ...

#Decrypt the ciphertext

With the Key and IV computed, and the ciphertext decoded from Base64, we are now ready to decrypt the message.

import OpenSSL.EVP.Cipher (getCipherByName, CryptoMode(Decrypt), cipherBS)

    ...
    cipher <- getCipherByName "aes-256-cbc"
              >>= maybe (fail "no cipher") return
    plain <- cipherBS cipher key iv Decrypt enc
    ...
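
Putting the fragments of the last sections together gives roughly the following. This is a sketch assembled from the snippets in this post, not a copy of the code in the repository: the helper decryptWithEnv is a hypothetical name, and it takes the embedded Base64 data as an argument so that the block does not depend on encrypted.txt being present.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Monad               (when)
import Data.ByteString             (ByteString)
import Data.ByteString.Base64      (decodeLenient)
import OpenSSL                     (withOpenSSL)
import OpenSSL.EVP.Cipher          (CryptoMode (..), cipherBS, getCipherByName)
import OpenSSL.EVP.Digest          (pkcs5_pbkdf2_hmac_sha1)
import System.Posix.Env.ByteString (getEnv)

import qualified Data.ByteString as BS

-- | Must match the -iter value given to openssl enc.
iters :: Int
iters = 100000

-- | Steps 3, 5 and 6: split off the salt, derive key and IV, decrypt.
extract
    :: ByteString     -- ^ password
    -> ByteString     -- ^ encrypted data (already Base64-decoded)
    -> IO ByteString  -- ^ decrypted data
extract password bs0 = do
    when (BS.length bs0 < 16) $ fail "Too small input"

    let (magic, bs1) = BS.splitAt 8 bs0
        (salt,  enc) = BS.splitAt 8 bs1

    when (magic /= "Salted__") $ fail "No Salted__ header"

    -- 48 bytes of PBKDF2 output: 32 byte key followed by 16 byte IV
    let (key, iv) = BS.splitAt 32
                  $ pkcs5_pbkdf2_hmac_sha1 password salt iters 48

    cipher <- getCipherByName "aes-256-cbc"
              >>= maybe (fail "no cipher") return
    cipherBS cipher key iv Decrypt enc

-- | Hypothetical wrapper for steps 2 and 4: decode Base64 and read
-- the password from the PASSWORD environment variable.
decryptWithEnv :: ByteString -> IO ByteString
decryptWithEnv embedded = withOpenSSL $ do
    password <- getEnv "PASSWORD" >>= maybe (fail "PASSWORD not set") return
    extract password (decodeLenient embedded)
```

Calling decryptWithEnv encrypted from main, where encrypted is the value embedded in the "Embed a file" section, would then yield the plaintext when the container is started with the right PASSWORD.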

#Conclusion

In this post we embedded an encrypted file into a Haskell application, which is then decrypted at run time. The complete copy of the code is in the same repository, and the changes made for this post are visible in a pull request.

