Self-Host Hashicorp Vault Secrets Server with Docker
Posted on May 16, 2024
Recently, I have been evaluating Hashicorp’s Vault Server and set it up on several machines in a simple setup. Since the instructions on the Internet are somewhat scattered, I document my approach in the hope that it may help others.
Environment
I will not be using Kubernetes or Docker Swarm (in both, you can solve this quite elegantly with appropriate container placement, provided you master the storage challenge). The reason we use separate Docker instances is historical to the project. The advantage is that I can document the individual steps quite well 😇
Our setup contains three nodes/servers. Each runs Docker and they are connected via VLAN:
- app01, IP in VLAN: 192.168.100.111
- app02, IP in VLAN: 192.168.100.112
- app03, IP in VLAN: 192.168.100.113
The servers run Ubuntu 22.04. Those who prefer not to use Docker can also install Vault as a service - the process is not too difficult, you just need to adjust the commands below accordingly. The version of Vault used is 1.16.
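The commands below pull the Vault image without a tag, which resolves to latest. If you want to stay on a defined version, you can pin the tag on each node, e.g.:
docker pull docker.io/hashicorp/vault:1.16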
Creating Certificates
First, we create the certificates for encrypted communication with Vault (and between the instances). To do this, we set up a Certificate Authority (CA) and certificates for each Vault node. On any server, we execute the following:
sudo mkdir /etc/vault
cd /etc/vault
sudo tee extfile.cnf << EOF
[ req ]
distinguished_name = req_distinguished_name
x509_extensions = v3_ca
prompt = no
[ req_distinguished_name ]
C = DE
ST = MyState
L = MyLocation
O = MyPlace
CN = RootCA
[ v3_ca ]
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer
basicConstraints = critical,CA:true
keyUsage = critical, digitalSignature, cRLSign, keyCertSign
EOF
sudo tee extfile01.cnf << EOF
[ req ]
distinguished_name = req_distinguished_name
req_extensions = req_ext
prompt = no
[ req_distinguished_name ]
C = DE
ST = MyState
L = MyLocation
O = MyPlace
CN = app01
[req_ext]
subjectAltName = @alt_names
[alt_names]
IP.1 = 192.168.100.111
IP.2 = 127.0.0.1
DNS.1 = app01
DNS.2 = vault
EOF
sudo tee extfile02.cnf << EOF
[ req ]
distinguished_name = req_distinguished_name
req_extensions = req_ext
prompt = no
[ req_distinguished_name ]
C = DE
ST = MyState
L = MyLocation
O = MyPlace
CN = app02
[req_ext]
subjectAltName = @alt_names
[alt_names]
IP.1 = 192.168.100.112
IP.2 = 127.0.0.1
DNS.1 = app02
DNS.2 = vault
EOF
sudo tee extfile03.cnf << EOF
[ req ]
distinguished_name = req_distinguished_name
req_extensions = req_ext
prompt = no
[ req_distinguished_name ]
C = DE
ST = MyState
L = MyLocation
O = MyPlace
CN = app03
[req_ext]
subjectAltName = @alt_names
[alt_names]
IP.1 = 192.168.100.113
IP.2 = 127.0.0.1
DNS.1 = app03
DNS.2 = vault
EOF
This creates the extension files for OpenSSL. They contain the extensions Vault requires in its certificates; without them, Vault refuses to communicate with the other instances. It is important to list all DNS names and IPs under which each Vault instance can be reached. If you want to access Vault via additional IPs or DNS names, the lists must be extended accordingly!
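For example, if app01 should additionally be reachable under a public name, the [alt_names] list can be extended before the certificates are created (placeholder values):
sudo tee -a extfile01.cnf << EOF
IP.3 = 203.0.113.10
DNS.3 = vault.example.com
EOF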
Now we can create the actual certificates for the servers:
cd /etc/vault
# CA
sudo openssl genrsa -out ca.key 4096
sudo openssl req -new -x509 -days 3650 -key ca.key -out ca.crt -config extfile.cnf
# Host 01
sudo openssl genrsa -out vault01.key 4096
sudo openssl req -new -key vault01.key -out vault01.csr -config extfile01.cnf
sudo openssl x509 -req -days 3650 -in vault01.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out vault01.crt -extensions req_ext -extfile extfile01.cnf
# Host 02
sudo openssl genrsa -out vault02.key 4096
sudo openssl req -new -key vault02.key -out vault02.csr -config extfile02.cnf
sudo openssl x509 -req -days 3650 -in vault02.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out vault02.crt -extensions req_ext -extfile extfile02.cnf
# Host 03
sudo openssl genrsa -out vault03.key 4096
sudo openssl req -new -key vault03.key -out vault03.csr -config extfile03.cnf
sudo openssl x509 -req -days 3650 -in vault03.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out vault03.crt -extensions req_ext -extfile extfile03.cnf
So, we first create the CA certificate and then a key and certificate for each instance. The certificates are valid for about 10 years. In a production environment, it is advisable to use certificates with a shorter validity period, or to set up a dedicated authority for issuing such certificates, such as step-ca.
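You can double-check the validity period of a certificate at any time:
openssl x509 -noout -dates -in vault01.crt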
Finally, we test whether the certificates also contain the necessary extensions:
openssl x509 -noout -text -in vault01.crt | grep -A 1 "Subject Alternative Name"
openssl x509 -noout -text -in vault02.crt | grep -A 1 "Subject Alternative Name"
openssl x509 -noout -text -in vault03.crt | grep -A 1 "Subject Alternative Name"
All the necessary IPs and DNS names should be listed in the output.
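For vault01.crt, the output should look roughly like this (the exact formatting varies between OpenSSL versions):
X509v3 Subject Alternative Name:
    IP Address:192.168.100.111, IP Address:127.0.0.1, DNS:app01, DNS:vault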
Important: In the end, all created files must be copied to the other servers, e.g., via rsync or scp. At the end of this step, I assume the certificates are present on all servers in the /etc/vault folder.
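A minimal sketch for distributing the files from app01, assuming root SSH access between the nodes:
for host in app02 app03; do
  sudo rsync -a /etc/vault/ root@"$host":/etc/vault/
done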
Also important: The services must be able to communicate with each other. For this, the TCP ports 8200 and 8201 must be open in the VLAN. Here are example commands for UFW:
sudo ufw allow proto tcp from 192.168.100.0/24 to any port 8200
sudo ufw allow proto tcp from 192.168.100.0/24 to any port 8201
Auto-Unsealing with AWS KMS
If you want to run Vault in a production system, you want it to unseal automatically on restarts. There are several possibilities to implement this:
- Another Vault instance (Transit seal)
- Using an unsealing service like vault-unseal
- KMS via a cloud provider
I have experimented a bit and ultimately ended up with AWS KMS. It is quite affordable at 1 USD per month and is easy to set up (if you know where to click in AWS 😱). The other solutions are feasible but more complex:
- A Transit Seal instance requires another node with appropriate certificates/access. This node also needs to be unsealed for our service to start, which presents its own challenges.
- vault-unseal runs smoothly and is easy to set up. However, due to polling, the Vault cluster takes some time to start after a restart (polling is set to 30 seconds by default). This made cold starts of all nodes a bit bumpy and could potentially block other services dependent on Vault if multiple nodes go down.
Here’s a brief guide for AWS:
- We need an AWS account that can create and manage users as well as KMS keys (yes, a management account works too).
- In the IAM management console, create a group “Vault” with the following permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VaultKMSUnseal",
      "Effect": "Allow",
      "Action": [
        "kms:Decrypt",
        "kms:Encrypt",
        "kms:DescribeKey"
      ],
      "Resource": "*"
    }
  ]
}
- Then create a user and add it to this group.
- In the user’s view, create an access key (note down the key ID and secret).
- Now create a new KMS key in the KMS management console.
- Pay attention to the correct region!
- Select “Customer managed key”.
- The key is symmetric.
- The key user (not admin!) is the user created above.
- Note down the key ID!
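If you prefer the command line to the console, the key itself can also be created with the AWS CLI - a sketch, assuming a configured profile with sufficient permissions (group, policy, and user are still set up in IAM as described above):
aws kms create-key --description "vault-auto-unseal" --region eu-central-1
aws kms create-alias --alias-name alias/vault-unseal --target-key-id <key id from the output> --region eu-central-1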
This gives us four pieces of data:
- User’s access key ID = access_key
- User’s access key secret = secret_key
- KMS key ID = kms_key_id
- AWS region of the key = kms_region
Starting Vault on the individual Servers
app01
We start on app01 - the procedure is similar on the other machines. We need the key data from AWS and execute the following commands:
access_key=123
secret_key=secret
kms_key_id=key
kms_region=eu-central-1
cd /etc/vault
# Config
sudo tee vault.hcl << EOF
cluster_addr = "https://192.168.100.111:8201"
api_addr = "https://0.0.0.0:8200"
storage "raft" {
path = "/vault/data"
retry_join {
leader_api_addr = "https://app02:8200"
leader_ca_cert_file = "/vault/config/ca.crt"
leader_client_cert_file = "/vault/config/vault02.crt"
leader_client_key_file = "/vault/config/vault02.key"
}
retry_join {
leader_api_addr = "https://app03:8200"
leader_ca_cert_file = "/vault/config/ca.crt"
leader_client_cert_file = "/vault/config/vault03.crt"
leader_client_key_file = "/vault/config/vault03.key"
}
}
listener "tcp" {
address = "0.0.0.0:8200"
tls_cert_file = "/vault/config/vault01.crt"
tls_key_file = "/vault/config/vault01.key"
}
seal "awskms" {
region = "${kms_region}"
access_key = "${access_key}"
secret_key = "${secret_key}"
kms_key_id = "${kms_key_id}"
}
EOF
sudo chmod 600 vault.hcl
# Network
docker network create --driver=bridge --subnet=192.168.128.0/24 app
# Volumes
docker volume create vault
docker volume create vault_log
# Access
docker run --rm -v /etc/vault:/data:rw docker.io/hashicorp/vault chown vault:vault /data
docker run --rm -v vault:/data:rw docker.io/hashicorp/vault chown vault:vault /data
docker run --rm -v vault_log:/data:rw docker.io/hashicorp/vault chown vault:vault /data
# Service
docker run -d --restart unless-stopped --network=app --name vault --cap-add IPC_LOCK -p 192.168.100.111:8200:8200 \
-p 192.168.100.111:8201:8201 --add-host app01:192.168.100.111 --add-host app02:192.168.100.112 \
--add-host app03:192.168.100.113 -v /etc/vault:/vault/config:ro -v vault:/vault/data -v vault_log:/vault/logs \
docker.io/hashicorp/vault server
So, what is happening here?
- We create a vault.hcl file with the corresponding access data (perhaps delete your shell history afterwards).
  - Storage is raft with hints for the possible other nodes - we leave our own node ID unset, as I have had better experiences with this approach.
  - The certificates of the other services are used as client certificates by our own service. I'm not 100% happy with this - in a production environment, it would certainly be advisable to use dedicated client certificates. However, in my test setup, I didn't get around to testing this. If anyone has information on this, please let me know!
  - The service listens on port 8200 and uses its own certificates.
- Since the file contains sensitive data, it should be secured.
- For Vault and the services that want to use Vault locally, it is useful to set up a separate Docker network. I also like to give such networks a known subnet, which is practical for later access restrictions in Vault, as the network is the same on all nodes.
- We also create two volumes for the raft data and the log.
- Now we need to set the permissions correctly.
  - In the local /etc/vault directory, the Docker service must have read permissions. If the first command doesn't work, adjust the permissions accordingly (on Ubuntu, this concerned the user _apt and the group with ID 1000; see the sketch after this list). In a live environment, I would separate the certificates from the configuration, which is certainly cleaner.
- Finally, we start the Docker service:
  - We start the service in the background with a restart policy.
  - The network is the one created above (optional).
  - The name is vault (important: the service's own DNS name, i.e. its name in the Docker network, must match the certificate).
  - IPC_LOCK must be set for Vault.
  - We listen on the local VLAN at ports 8200 and 8201 - adjust this to your own network. It is advisable to specify IPs here, as in certain setups Docker tends to bypass the host firewall through bridging. So be careful at this point!
  - We specify the other hosts by their names.
  - Volumes: the configuration folder should be read-only, the others writable.
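If the chown helper containers above fail, the permissions can also be adjusted by hand - a sketch, assuming the group ID 1000 mentioned above actually matches the vault group of the image on your system:
sudo chgrp -R 1000 /etc/vault
sudo chmod -R g+rX /etc/vault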
This should get the server running. You can check the log to see if everything is okay:
docker logs vault
The log will currently contain a lot of error messages, which is okay and normal. First, we set up the other nodes.
app02
Commands only:
access_key=123
secret_key=secret
kms_key_id=key
kms_region=eu-central-1
cd /etc/vault
sudo tee vault.hcl << EOF
cluster_addr = "https://192.168.100.112:8201"
api_addr = "https://0.0.0.0:8200"
storage "raft" {
path = "/vault/data"
retry_join {
leader_api_addr = "https://app01:8200"
leader_ca_cert_file = "/vault/config/ca.crt"
leader_client_cert_file = "/vault/config/vault01.crt"
leader_client_key_file = "/vault/config/vault01.key"
}
retry_join {
leader_api_addr = "https://app03:8200"
leader_ca_cert_file = "/vault/config/ca.crt"
leader_client_cert_file = "/vault/config/vault03.crt"
leader_client_key_file = "/vault/config/vault03.key"
}
}
listener "tcp" {
address = "0.0.0.0:8200"
tls_cert_file = "/vault/config/vault02.crt"
tls_key_file = "/vault/config/vault02.key"
}
seal "awskms" {
region = "${kms_region}"
access_key = "${access_key}"
secret_key = "${secret_key}"
kms_key_id = "${kms_key_id}"
}
EOF
sudo chmod 600 vault.hcl
# Network
docker network create --driver=bridge --subnet=192.168.128.0/24 app
# Volumes
docker volume create vault
docker volume create vault_log
# Access
docker run --rm -v /etc/vault:/data:rw docker.io/hashicorp/vault chown vault:vault /data
docker run --rm -v vault:/data:rw docker.io/hashicorp/vault chown vault:vault /data
docker run --rm -v vault_log:/data:rw docker.io/hashicorp/vault chown vault:vault /data
# Service
docker run -d --restart unless-stopped --network=app --name vault --cap-add IPC_LOCK -p 192.168.100.112:8200:8200 \
-p 192.168.100.112:8201:8201 --add-host app01:192.168.100.111 --add-host app02:192.168.100.112 \
--add-host app03:192.168.100.113 -v /etc/vault:/vault/config:ro -v vault:/vault/data -v vault_log:/vault/logs \
docker.io/hashicorp/vault server
app03
Commands only:
access_key=123
secret_key=secret
kms_key_id=key
kms_region=eu-central-1
cd /etc/vault
sudo tee vault.hcl << EOF
cluster_addr = "https://192.168.100.113:8201"
api_addr = "https://0.0.0.0:8200"
storage "raft" {
path = "/vault/data"
retry_join {
leader_api_addr = "https://app01:8200"
leader_ca_cert_file = "/vault/config/ca.crt"
leader_client_cert_file = "/vault/config/vault01.crt"
leader_client_key_file = "/vault/config/vault01.key"
}
retry_join {
leader_api_addr = "https://app02:8200"
leader_ca_cert_file = "/vault/config/ca.crt"
leader_client_cert_file = "/vault/config/vault02.crt"
leader_client_key_file = "/vault/config/vault02.key"
}
}
listener "tcp" {
address = "0.0.0.0:8200"
tls_cert_file = "/vault/config/vault03.crt"
tls_key_file = "/vault/config/vault03.key"
}
seal "awskms" {
region = "${kms_region}"
access_key = "${access_key}"
secret_key = "${secret_key}"
kms_key_id = "${kms_key_id}"
}
EOF
sudo chmod 600 vault.hcl
# Network
docker network create --driver=bridge --subnet=192.168.128.0/24 app
# Volumes
docker volume create vault
docker volume create vault_log
# Access
docker run --rm -v /etc/vault:/data:rw docker.io/hashicorp/vault chown vault:vault /data
docker run --rm -v vault:/data:rw docker.io/hashicorp/vault chown vault:vault /data
docker run --rm -v vault_log:/data:rw docker.io/hashicorp/vault chown vault:vault /data
# Service
docker run -d --restart unless-stopped --network=app --name vault --cap-add IPC_LOCK -p 192.168.100.113:8200:8200 \
-p 192.168.100.113:8201:8201 --add-host app01:192.168.100.111 --add-host app02:192.168.100.112 \
--add-host app03:192.168.100.113 -v /etc/vault:/vault/config:ro -v vault:/vault/data -v vault_log:/vault/logs \
docker.io/hashicorp/vault server
Initialize Vault
Now that Vault is running on all nodes, we can initialize the service. To do this, we run a temporary container on any server - this has the advantage that nothing remains in the user's history after the work is completed.
docker run --rm -ti --network=app -v /etc/vault:/vault/config:ro -e VAULT_ADDR=https://vault:8200 \
-e VAULT_CACERT=/vault/config/ca.crt -P docker.io/hashicorp/vault ash
The container is temporary and interactive and runs in the same network as our Vault service. We also mount the configuration folder and set two environment variables for the vault command below. The -P is important so that we do not block the port range published above. The image uses the ash shell.
In the shell, we initialize the cluster:
vault operator init
The keys and the initial root token should be stored in a secure location!
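If you want to capture the output programmatically instead of copying it from the terminal, the init command also supports JSON output - a sketch (the file contains the recovery keys and the root token, so store it securely and delete it afterwards):
vault operator init -format=json > vault-init.json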
With that, Vault is ready for use and should remain accessible even after restarting individual containers. To test this, you can check the status (in the temporary container):
vault status
You can restart Vault on a host (docker restart vault) and should see that it is unsealed again a short while later.
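A quick way to verify this from the host itself - a sketch using the container name and paths from above; vault status should show Sealed false once auto-unsealing has finished:
docker restart vault
sleep 10
docker exec -e VAULT_ADDR=https://127.0.0.1:8200 -e VAULT_CACERT=/vault/config/ca.crt vault vault status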
Test: Access and Go Snippets
Here is a small dive into how Go can call your Vault servers.
Putting Data into Vault
We store some data in Vault for this purpose. Once again, we log in to any node and create a temporary container for Vault administration.
docker run --rm -ti --network=app -v /etc/vault:/vault/config:ro -e VAULT_ADDR=https://vault:8200 \
-e VAULT_CACERT=/vault/config/ca.crt -P docker.io/hashicorp/vault ash
In the container, we need to set the root token that we obtained during initialization:
export VAULT_TOKEN=MyToken
To test, we first create a secret that our program should retrieve:
vault secrets enable -version=2 -path=app -description="Application secrets" kv
vault kv put -mount=app apiKey key=PaipCijvonEtdysilgEirlOwUbHahachdyazVopejEnerekBiOmukvauWigbimVi
The application requires access, which we create using AppRole:
vault auth enable approle
echo 'path "app/data/apiKey" {
capabilities = ["read"]
}' | vault policy write myapp -
vault write auth/approle/role/myapp token_ttl=1h token_max_ttl=8h secret_id_ttl=0 token_policies="myapp"
# read data
vault read auth/approle/role/myapp/role-id
# create secret id
vault write -force auth/approle/role/myapp/secret-id
We receive two UUIDs in return, namely the role ID and the secret ID. Both are required in the application. In a live environment, the role can be further restricted by setting IP ranges from which the application may access it (see the sketch after the login test below) - I have omitted this here to facilitate testing. Let's try the login right away:
vault write auth/approle/login role_id="123" secret_id="456"
export VAULT_TOKEN=AppRoleToken
vault kv get -mount=app apiKey
We first log in with the role ID and the secret. We receive a token in return, which we then set as an environment variable (thus overriding the root token). Essentially, we assume the role of the service. In this role, we can attempt to read the API key, which should hopefully work.
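As mentioned above, in a live environment the role can additionally be bound to IP ranges. With the Docker subnet created earlier, that could look like this - a sketch; check the AppRole documentation for the exact semantics of both parameters:
vault write auth/approle/role/myapp token_ttl=1h token_max_ttl=8h secret_id_ttl=0 \
  token_policies="myapp" secret_id_bound_cidrs="192.168.128.0/24" token_bound_cidrs="192.168.128.0/24"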
Go Program
Here is a small code snippet to use. A good part of it is copied from the token renewal example and packaged into a small library.
You must set the following environment variables:
- VAULT_ADDR: address of the Vault server (default: https://vault:8200)
- VAULT_CACERT: file path to the CA certificate (default: /etc/vault/ca.crt)
- APPROLE_ROLE_ID: role ID (from above)
- APPROLE_SECRET_ID: role secret (from above)
The last two variables must not be empty!
The library looks like this:
// vault.go
package main

import (
	"cmp"
	"context"
	"fmt"
	"os"

	vault "github.com/hashicorp/vault/api"
	"github.com/hashicorp/vault/api/auth/approle"
	"github.com/rs/zerolog/log"
)

// VaultClient is the global vault client
var VaultClient *vault.Client

// InitVault initializes the vault client
func InitVault() {
	vaultAddress := cmp.Or(os.Getenv("VAULT_ADDR"), "https://vault:8200")
	vaultCAFile := cmp.Or(os.Getenv("VAULT_CACERT"), "/etc/vault/ca.crt")

	// define config
	config := vault.DefaultConfig() // modify for more granular configuration
	config.Address = vaultAddress
	if err := config.ConfigureTLS(&vault.TLSConfig{
		CACert: vaultCAFile, // CACert expects a file path; CAPath would expect a directory of certificates
	}); err != nil {
		log.Fatal().Str("VAULT_ADDR", vaultAddress).Str("VAULT_CACERT", vaultCAFile).Err(err).Msg("Failed to configure Vault TLS")
	}

	// create client
	client, err := vault.NewClient(config)
	if err != nil {
		log.Fatal().Str("VAULT_ADDR", vaultAddress).Str("VAULT_CACERT", vaultCAFile).Err(err).Msg("Failed to create Vault client")
	}

	ctx, cancelContextFunc := context.WithCancel(context.Background())
	defer cancelContextFunc()

	// copy to global variable
	VaultClient = client

	// initial login
	authInfo, err := vaultLogin(ctx)
	if err != nil {
		log.Fatal().Str("VAULT_ADDR", vaultAddress).Str("VAULT_CACERT", vaultCAFile).Err(err).Msg("Failed to login to Vault")
	}

	// start the lease-renewal goroutine in the background
	go vaultStartRenewLeases(authInfo)

	// everything ok, log success
	log.Info().Str("VAULT_ADDR", vaultAddress).Msg("Vault successfully connected and initial token created.")
}

func vaultLogin(ctx context.Context) (*vault.Secret, error) {
	// Get environment variables for Vault
	vaultAppRoleId := os.Getenv("APPROLE_ROLE_ID")
	if vaultAppRoleId == "" {
		log.Fatal().Msg("Error: Vault App Role not set.")
	}

	// initial login with AppRole
	appRoleAuth, err := approle.NewAppRoleAuth(vaultAppRoleId, &approle.SecretID{
		FromEnv: "APPROLE_SECRET_ID",
	})
	// TODO: we might want to create ResponseWrapping somehow
	// ref: https://www.vaultproject.io/docs/concepts/response-wrapping
	// ref: https://learn.hashicorp.com/tutorials/vault/secure-introduction?in=vault/app-integration#trusted-orchestrator
	// ref: https://learn.hashicorp.com/tutorials/vault/approle-best-practices?in=vault/auth-methods#secretid-delivery-best-practices
	// and example in: https://github.com/hashicorp/hello-vault-go/blob/main/sample-app/vault.go
	if err != nil {
		return nil, err
	}

	return VaultClient.Auth().Login(ctx, appRoleAuth)
}

func vaultStartRenewLeases(authToken *vault.Secret) {
	ctx, cancelContextFunc := context.WithCancel(context.Background())
	defer cancelContextFunc()

	log.Info().Msg("Starting lease renewal service.")
	defer log.Info().Msg("Stopping lease renewal service.")

	currentAuthToken := authToken
	for {
		renewed, err := renewLeases(ctx, currentAuthToken)
		if err != nil {
			log.Fatal().Err(err).Msg("Failed to renew leases")
		}

		if renewed&exitRequested != 0 {
			return
		}

		if renewed&expiringAuthToken != 0 {
			log.Printf("auth token: can no longer be renewed; will log in again")
			authToken, err := vaultLogin(ctx)
			if err != nil {
				log.Fatal().Err(err).Msg("Failed to login to Vault")
			}
			currentAuthToken = authToken
		}
	}
}

// renewResult is a bitmask which could contain one or more of the values below
type renewResult uint8

const (
	renewError renewResult = 1 << iota
	exitRequested
	expiringAuthToken // will be revoked soon
)

func renewLeases(ctx context.Context, authToken *vault.Secret) (renewResult, error) {
	log.Info().Msg("Starting lease renewal.")

	// auth token
	authTokenWatcher, err := VaultClient.NewLifetimeWatcher(&vault.LifetimeWatcherInput{
		Secret: authToken,
	})
	if err != nil {
		return renewError, fmt.Errorf("unable to initialize auth token lifetime watcher: %w", err)
	}

	go authTokenWatcher.Start()
	defer authTokenWatcher.Stop()

	// monitor events from all watchers
	for {
		select {
		case <-ctx.Done():
			return exitRequested, nil

		// DoneCh will return if renewal fails, or if the remaining lease
		// duration is under a built-in threshold and either renewing is not
		// extending it or renewing is disabled. In both cases, the caller
		// should attempt a re-read of the secret. Clients should check the
		// return value of the channel to see if renewal was successful.
		case err := <-authTokenWatcher.DoneCh():
			// Leases created by a token get revoked when the token is revoked.
			return expiringAuthToken, err

		// RenewCh is a channel that receives a message when a successful
		// renewal takes place and includes metadata about the renewal.
		case info := <-authTokenWatcher.RenewCh():
			log.Printf("auth token: successfully renewed; remaining duration: %ds", info.Secret.Auth.LeaseDuration)

			//case info := <-databaseCredentialsWatcher.RenewCh():
			//	log.Printf("database credentials: successfully renewed; remaining lease duration: %ds", info.Secret.LeaseDuration)
		}
	}
}
The library can be used in a long-running service and automatically renews the access token in the background (as a Go routine).
A small test program:
// main.go
package main

import (
	"context"
	"fmt"
)

func main() {
	InitVault()

	secret, err := VaultClient.KVv2("app").Get(context.Background(), "apiKey")
	if err != nil {
		panic("Failed to get token key from Vault")
	}

	fmt.Printf("API-Key: %s\n", secret.Data["key"])
}
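To build and run the two files, something like the following should work - a sketch; the module name is arbitrary, and VAULT_ADDR must point to a name that is resolvable from where you run it and covered by the certificate SANs:
go mod init vaulttest
go get github.com/hashicorp/vault/api github.com/hashicorp/vault/api/auth/approle github.com/rs/zerolog
export VAULT_ADDR=https://app01:8200
export VAULT_CACERT=/etc/vault/ca.crt
export APPROLE_ROLE_ID=<role id from above>
export APPROLE_SECRET_ID=<secret id from above>
go run .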
In a service, you can always access the global variable VaultClient. It will always contain a valid session (a token).
Title Image: Immeuble du Crédit Lyonnais - used under the terms of CC BY-SA 3.0.