Consul Snapshot Agent with Dynamic ACL Tokens via Vault Agent

Introduction

The ability to automatically revoke or rotate credentials is something that should be sought after by any operations team. HashiCorp Vault provides the ability to generate dynamic credentials for supported systems and handle their lifecycle. While this sounds great, in order to achieve true automation and a hands-off experience, an additional tool is needed. This is where Vault Agent comes in, becoming responsible for authenticating and handling the automatic renewal of dynamic credentials.

Consul Snapshot Agent is another handy agent tool from HashiCorp, one that automates taking snapshots of the data in Consul. Of course, this agent needs a way to authenticate to Consul (assuming ACLs are enabled), and operators will want to do this in the most dynamic way possible.

With all of that said, one approach is to deploy the Vault Agent on the same machine that the Consul Snapshot Agent will run on. The job of the Vault Agent is to provide a dynamic, always-valid Consul ACL token to the snapshot agent so that it always has the correct permissions to take a snapshot. By controlling everything through Vault, operators can see exactly who generated a secret and when, and they stay in full control of how long that secret is valid. In the event a compromise is suspected, the secret can be immediately revoked.

In this blog, I am going to go through the exercise of setting up the Consul Secrets Engine to provide dynamic ACL tokens and then leverage the Vault Agent to serve those dynamic ACL tokens to my Consul Snapshot Agent.

Prerequisites

To follow this blog, you will need:

  • Access to a Consul cluster (or server) with ACLs enabled
  • Access to a Vault cluster (or server) with the permissions to create and manage Secrets Engines
  • Access to a server that will run the Consul Snapshot Agent and the Vault Agent

Configuring the Consul Secrets Engine

To start, I’m going to authenticate to Vault and enable the Consul Secrets Engine. Once enabled, I can configure the Secrets Engine with my global-management Consul token so that Vault can authenticate and have sufficient permissions to create additional tokens in Consul.

# Authenticate to Vault
$ vault login -method=userpass username=jacobm

# Enable the Consul Secrets Engine
$ vault secrets enable consul
Success! Enabled the consul secrets engine at: consul/

With the Consul address and global-management token handy, I then configure the newly enabled Secrets Engine so that Vault can reach my Consul cluster.

# Configure the Secrets Engine
$ vault write consul/config/access address=https://consul.blizzard.lab:8501 \
  token=0d3cc34b-ac01-a982-435c-9330531d5f39
Success! Data written to: consul/config/access

Now that the Secrets Engine is configured, I’m going to create a role that will map to a policy pre-defined in Consul. In this case, I am going to create a role called snapshot-agent that will map to my snapshot-agent policy in Consul.

# Create the snapshot-agent role attached to the snapshot-agent Consul policy
$ vault write consul/roles/snapshot-agent policies=snapshot-agent
Success! Data written to: consul/roles/snapshot-agent
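
The snapshot-agent policy has to exist in Consul before Vault can attach tokens to it, and its rules are not shown in this post. The following is only a sketch of what it might contain, based on the permissions the Consul snapshot agent documentation lists. The file name snapshot-agent-policy.hcl and the helper create_snapshot_policy are hypothetical, and the key and service names assume the consul-snapshot/lock and consul-snapshot values used in the snapshot agent configuration.

```shell
# Write the ACL rules to a local file (the file name is an assumption).
cat > snapshot-agent-policy.hcl <<'EOF'
# Snapshots include ACL data, so the agent needs ACL write access
acl = "write"

# Leader-election lock shared by all snapshot agents
key "consul-snapshot/lock" {
  policy = "write"
}

# Sessions back the lock above
session_prefix "" {
  policy = "write"
}

# The agent registers itself as a service in the catalog
service "consul-snapshot" {
  policy = "write"
}
EOF

# Hypothetical helper: load the rules into Consul as the snapshot-agent policy.
# Assumes CONSUL_HTTP_ADDR and a token allowed to manage ACLs are set in the
# environment. Defining the function does not contact Consul.
create_snapshot_policy() {
  consul acl policy create -name snapshot-agent \
    -description "Consul Snapshot Agent" \
    -rules @snapshot-agent-policy.hcl
}
```

Calling create_snapshot_policy once against the cluster should be enough; the Vault role then refers to the policy by name.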

One last piece is to tune the Secrets Engine so that Vault will issue a short-lived Consul token.

# Tune the Secrets Engine to give out 1h tokens
$ vault secrets tune -default-lease-ttl=1h consul/
Success! Tuned the secrets engine at: consul/

Finally, I’ll test the Secrets Engine by getting Vault to generate a Consul token to ensure everything is working as expected.

# Generate a Consul token
$ vault read consul/creds/snapshot-agent
Key                Value
---                -----
...
lease_duration     1h
lease_renewable    true
accessor           7dbfd151-26b8-4037-fa9a-f3c370585ae4
...
token              a8929b12-8fe2-c437-105b-d18d056ca53f

# Validate the token was attached to the policy in Consul
$ consul acl token read -id 7dbfd151-26b8-4037-fa9a-f3c370585ae4 \
  -token 0d3cc34b-ac01-a982-435c-9330531d5f39
AccessorID:       7dbfd151-26b8-4037-fa9a-f3c370585ae4
SecretID:         a8929b12-8fe2-c437-105b-d18d056ca53f
Namespace:        default
Description:      Vault snapshot-agent userpass-jacobm 1606779308288887959
...
Policies:
   8b9368cb-86f2-bb0f-db14-87ef7bbd252a - snapshot-agent

Everything looks good. I am able to successfully generate Consul tokens through Vault, and they are attached to the snapshot-agent ACL policy in Consul. Now, I’ll move on to configuring the Vault Agent.

Configuring the Vault Agent

I’m going to be configuring both the Vault Agent and Consul Snapshot Agent on a separate host from my Vault and Consul clusters. I would recommend this approach as well. That way, if you ever lose your Consul nodes, your snapshots are still living on a separate host.

I’m going to create a file called vault-agent.hcl that will hold the configuration for my Vault Agent. This file tells the agent where to find my Vault cluster, how to authenticate to it, which secret to retrieve, and where to store it. The Vault Agent is responsible for authenticating to Vault, getting a token, and then retrieving the desired secret. Since the desired secret in this case is a dynamic secret with a lease attached to it, the Vault Agent will also renew that lease before it expires. According to the Vault Agent template documentation, a renewable secret is renewed once roughly two-thirds of its lease duration has elapsed.

For authentication, I’m leveraging AppRole and providing the RoleID and SecretID via local file paths. This AppRole is tied to a Vault policy that only permits reads at the consul/creds/snapshot-agent path in Vault, and its SecretID is CIDR-bound so that it can only be used from this machine. On top of this, those files are owned by the user running the Vault Agent and are only readable by that user. When working with AppRole, it is important to keep your SecretID as secure as possible and only pair it with the RoleID when absolutely necessary.
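
The AppRole setup itself is not shown in this post, so here is a rough sketch of how it could be done. It assumes the AppRole auth method is already enabled at auth/approle and that the AppRole and Vault policy are both named snapshot-agent. Those names, the policy file name, and the helper setup_approle are my assumptions for illustration, and the CIDR passed in is a placeholder for the snapshot host's address.

```shell
# Vault policy granting read on the dynamic-credential path only
# (the file name is an assumption).
cat > snapshot-agent-vault-policy.hcl <<'EOF'
path "consul/creds/snapshot-agent" {
  capabilities = ["read"]
}
EOF

# Hypothetical helper. Assumes VAULT_ADDR and an admin token are set in the
# environment. The body runs in a subshell so the umask change stays local.
setup_approle() (
  cidr="$1"   # address of the snapshot host as a CIDR block (placeholder)

  vault policy write snapshot-agent snapshot-agent-vault-policy.hcl

  # Tie the role to the policy and restrict where its SecretIDs may be used.
  vault write auth/approle/role/snapshot-agent \
    token_policies=snapshot-agent \
    secret_id_bound_cidrs="$cidr"

  # Create the credential files readable only by the user running this.
  umask 077
  vault read -field=role_id auth/approle/role/snapshot-agent/role-id \
    > /etc/vault.d/role_id
  vault write -f -field=secret_id auth/approle/role/snapshot-agent/secret-id \
    > /etc/vault.d/secret_id
)
```

Running setup_approle with the host's address as a /32 would leave role_id and secret_id in /etc/vault.d, which is where the agent configuration expects them.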

The final section tells the Vault Agent to store the secret it retrieves in /etc/consul.d/snapshot-agent/acl-token. That is where the Consul Snapshot Agent will look for its ACL token when it runs. The contents setting tells the agent to read the Consul ACL token from consul/creds/snapshot-agent, which is the same path I used earlier to test the Secrets Engine.

I’m saving this file at /etc/vault.d/vault-agent.hcl.

pid_file = "./pidfile"

vault {
  address = "https://vault.blizzard.lab:8200"
}

auto_auth {
  method "approle" {
    mount_path = "auth/approle"
    config = {
      role_id_file_path                   = "/etc/vault.d/role_id"
      secret_id_file_path                 = "/etc/vault.d/secret_id"
      remove_secret_id_file_after_reading = false
    }
  }
}

template {
  destination = "/etc/consul.d/snapshot-agent/acl-token"
  contents    = "{{ with secret \"consul/creds/snapshot-agent\" }}{{ .Data.token }}{{ end }}"
}

Now that I have created that file on my system, my Vault Agent is ready to go. To run it, I can use the vault agent command, but since I want this agent to run continuously, I’m going to run it as a service unit on the system. To do this, I’m going to create the service file below and save it at /etc/systemd/system/vault-agent.service.

[Unit]
Description="HashiCorp Vault Agent"
Documentation=https://www.vaultproject.io/docs/agent
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/etc/vault.d/vault-agent.hcl

[Service]
Type=simple
User=root
Group=root
ExecStart=/opt/vault/bin/vault agent -config=/etc/vault.d/vault-agent.hcl
KillMode=process
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

Once both files are in place, I can run the service unit which will start up the Vault Agent and keep it running.

# Start the Vault Agent
$ systemctl start vault-agent
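
Before moving on, it can be worth confirming that the agent actually rendered a token. The helper below is a small sketch of such a check; check_token_file is a hypothetical name, and the UUID pattern assumes Consul tokens keep the format shown in the earlier output.

```shell
# Hypothetical helper: succeed only if the file is readable, non-empty,
# and its first line looks like a UUID-style Consul token.
check_token_file() {
  file="$1"
  if [ ! -r "$file" ] || [ ! -s "$file" ]; then
    echo "token file missing, unreadable, or empty: $file" >&2
    return 1
  fi
  if ! head -n 1 "$file" | grep -Eq '^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'; then
    echo "unexpected token format in: $file" >&2
    return 1
  fi
  echo "token file looks valid: $file"
}
```

For example, check_token_file /etc/consul.d/snapshot-agent/acl-token checks the destination from the template section. Running it as the user the snapshot agent will run as also confirms that user can read the file.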

Configuring Consul Snapshot Agent

The last piece is to configure the Consul Snapshot Agent to take snapshots of the Consul cluster and do so using the dynamic ACL token that Vault has retrieved.

To begin, I’m going to create a JSON configuration file that will hold all the configuration needed to run the agent. I have removed some of the TLS certificate values, but the main configuration is shown below. I’m going to save this file at /etc/consul.d/snapshot-agent/config.json.

{
  "snapshot_agent": {
    "http_addr": "https://consul.blizzard.lab:8501",
    "datacenter": "blizzard",
    ...
    "log": {
      "level": "INFO",
      "enable_syslog": false,
      "syslog_facility": "LOCAL0"
    },
    "snapshot": {
      "interval": "30m",
      "retain": 20,
      "stale": false,
      "service": "consul-snapshot",
      "deregister_after": "72h",
      "lock_key": "consul-snapshot/lock",
      "max_failures": 3,
      "local_scratch_path": ""
    },
    "local_storage": {
      "path": "/etc/consul.d/snapshots/"
    }
  }
}

At a high level, I told the Consul Snapshot Agent where my Consul cluster is, that I want a snapshot taken every 30 minutes, and that only the last 20 snapshots should be retained.
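
To make the retention setting concrete, the arithmetic below works out how much history those two values keep on disk. It is just a quick illustration using the interval and retain values from the configuration; the variable names are my own.

```shell
interval_minutes=30   # "interval": "30m"
retain=20             # "retain": 20

# Total time span covered by the retained snapshots
window_minutes=$(( interval_minutes * retain ))
retention_window="$(( window_minutes / 60 ))h$(( window_minutes % 60 ))m"

echo "Retained snapshots cover roughly ${retention_window}"
```

With these settings the window comes out to about 10 hours, so raising retain or lengthening interval is the way to keep older restore points around.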

Once that file is saved, the last step is to create a service unit, as I did for the Vault Agent. This service unit is where I specify the path to the token file that the Vault Agent writes, /etc/consul.d/snapshot-agent/acl-token. The service unit can be saved at /etc/systemd/system/consul-snapshot.service.

[Unit]
Description="HashiCorp Consul Snapshot Agent"
Documentation=https://www.consul.io/commands/snapshot/agent
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/etc/consul.d/snapshot-agent/config.json

[Service]
Type=simple
User=consul
Group=hashicorp
ExecStart=/opt/consul/bin/consul snapshot agent -token-file /etc/consul.d/snapshot-agent/acl-token -config-file /etc/consul.d/snapshot-agent/config.json
KillMode=process
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

Once that is saved, I can kick off the service unit and the Consul Snapshot Agent should be able to read the dynamically generated ACL token Vault provided and begin taking snapshots of my Consul cluster.

# Start the Consul Snapshot Agent
$ systemctl start consul-snapshot
==> Consul snapshot agent running!
             Version: 1.9.0+ent-rc1
          Datacenter: "blizzard"
            Interval: "30m0s"
              Retain: 20
              ...

Everything looks good! The agent is able to authenticate and has the necessary permissions to take snapshots and write to the Consul Key/Value storage.

Conclusion

Wrapping up, I was able to configure the Consul Secrets Engine in Vault to generate dynamic Consul ACL tokens and distribute them to a system via the Vault Agent. The Consul Snapshot Agent was then able to use that token to authenticate to Consul and begin snapshotting Consul data on a 30-minute interval. By doing this, operators of these clusters do not have to worry about ACL tokens going stale or contributing to secret sprawl, as they can now tightly control the access and lifetime of the token, all through Vault. In the event the token needs to be revoked or rotated, it can easily be done through Vault. Dynamic secrets in Vault can be super powerful, especially when used with Vault Agent. I encourage you to try the exercise above or modify it a bit with a different Secrets Engine like Database or Cloud. Of course, if there are any questions, let me know in the comments below!

Arctiq Team

We service innovation.