Audit Vault with Elasticsearch for incident response
Challenge
As a Vault operator or security practitioner, you need to respond to common incidents which can arise in the operation of a Vault cluster.
Critical Vault-specific incident types can include, but are not limited to, the following:
- User access
- Authentication failure
- Compromised client host
- Exposed credentials
Synthesizing information around these types of incidents and responding to them appropriately within a tight timeline is critical to reducing production impact.
Solution
You can use Vault Audit Devices, and send logs from them to a Security Information and Event Management (SIEM) tool for aggregation, inspection, and alerting capabilities. This solution provides timely information for incident response workflows.
Elasticsearch with Kibana and the Elastic Agent are an example of available open source solutions for aggregating and searching Vault audit device logs. The scenario in this tutorial will use these technologies as a reference to help inform you of what is possible.
Scenario introduction
You will use a terminal and command line interfaces for Vault and Terraform along with the Terraform Docker and Vault Providers to complete this scenario.
You will deploy and configure Elasticsearch, Kibana, and Elastic Agent containers, and use the Elasticsearch web UI for some configuration steps.
You will then deploy and configure a Vault development mode server container with 2 audit devices:
- A socket-based audit device that sends audit device logs to the Elastic Agent for consumption by Elasticsearch.
- A file-based audit device for terminal session use.
Once the environment is established, you will then deploy some configuration to Vault with the Terraform Vault Provider that simulates common incidents in the audit device logs.
Finally, you will use a combination of the Kibana UI and Kibana Query Language (KQL) queries, or a terminal session with jq, to identify the incidents in the audit device logs.
Audit device filters
Starting in Vault 1.16.0, you can enable audit devices with a filter option that Vault uses to evaluate audit entries and determine whether it writes them to the log. You should determine whether your own audit devices are filtered and make any changes needed to expose the log fields that you need to monitor for your use case.
The documentation covers Vault filtering concepts, filtering audit entries, and how to enable audit filters.
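One way to check whether your existing audit devices are filtered is to inspect the options that Vault reports for each device. The following is a minimal sketch, not a definitive method. It assumes that vault audit list -format=json returns a JSON object keyed by device path, and that a filter, when one is set, appears under that device's options.filter key. The audit_filters helper name is hypothetical.

```shell
# Hypothetical helper: print each audit device path and its filter, if any.
# Reads the output of `vault audit list -format=json` on standard input.
# Assumption: a configured filter is reported under .options.filter.
audit_filters() {
  jq -r 'to_entries[]
         | [.key, (.value.options.filter // "(no filter)")]
         | @tsv'
}

# Usage (requires a running Vault and an authenticated session):
#   vault audit list -format=json | audit_filters
```

A device that prints (no filter) writes every audit entry, which is what the queries in this tutorial expect.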
Prerequisites
- Vault binary installed in your system PATH.
- Terraform CLI version 1.2.0 or newer binary installed in your system PATH.
- Docker installed.
- This scenario requires at least 4GB of memory allocated to Docker.
- Git binary installed in your system PATH.
- jq binary installed in your system PATH for querying the file audit device log.
- Learn lab Terraform configuration repository.
- A web browser for accessing the Kibana UI.
- Internet access from the host computer.
Docker resource tip
You need to configure Docker with access to at least 4GB of RAM and 2 CPUs to complete this hands-on lab scenario.
Prepare scenario environment
This scenario environment is based mostly on Docker containers, which run on your local host. The goal of this section is to establish a working directory and clone the Terraform configuration repository.
For ease of cleanup later, create a temporary directory named learn-vault-lab that will contain all required configuration for the scenario, and export its path value as the environment variable HC_LEARN_LAB.
$ mkdir /tmp/learn-vault-lab && export HC_LEARN_LAB="/tmp/learn-vault-lab"
Terraform configuration repository
You can find the Terraform configuration that you will use for this scenario in the learn-vault-audit-elasticsearch GitHub repository.
Clone the repository into the scenario directory.
$ git clone \
    https://github.com/hashicorp-education/learn-vault-audit-elasticsearch.git \
    "${HC_LEARN_LAB}"/learn-vault-audit-elasticsearch
Change into the repository directory.
$ cd "${HC_LEARN_LAB}"/learn-vault-audit-elasticsearch
Examine the contents.
$ ls -1R
1-elasticsearch-kibana
2-fleet-agent-bootstrap
3-enroll-elastic-agent
4-vault
5-example-workflows
LICENSE
README.md

./1-elasticsearch-kibana:
main.tf
versions.tf

./2-fleet-agent-bootstrap:
cert
main.tf
versions.tf

./2-fleet-agent-bootstrap/cert:
README.md

./3-enroll-elastic-agent:
cert
main.tf
versions.tf

./3-enroll-elastic-agent/cert:
README.md

./4-vault:
main.tf
versions.tf

./5-example-workflows:
main.tf
versions.tf
The repository consists of Terraform configuration that you'll apply to complete the hands-on scenario.
Run Elasticsearch & Kibana containers
The goal of this section is for you to use Terraform configuration with the Docker provider to deploy Elasticsearch and Kibana Docker containers, then verify that they are available. You'll also gather information for use in configuring Elasticsearch Vault integration in Kibana.
Change into the 1-elasticsearch-kibana directory.
$ cd 1-elasticsearch-kibana
Initialize the Terraform workspace.
$ terraform init
Apply the Terraform configuration to deploy the containers. For convenience in this scenario, use the -auto-approve flag to skip confirmation.
$ terraform apply -auto-approve
Tip
If it returns an error, re-run the terraform apply command.
The successful output from this command should end with a line resembling this example:
Apply complete! Resources: 6 added, 0 changed, 0 destroyed.
Check the container status.
$ docker ps -f name=learn_lab --format "table {{.Names}}\t{{.Status}}"
Example output:
NAMES                     STATUS
learn_lab_elasticsearch   Up 45 seconds (unhealthy)
learn_lab_kibana          Up 45 seconds (healthy)
Note
It is expected that the learn_lab_elasticsearch container health status is starting or unhealthy until you configure it in the following steps.
With the kibana container in healthy status, you can get the URL for accessing Kibana.
$ docker logs learn_lab_kibana | grep 'Go to' | cut -d ' ' -f3
The Kibana URL is in the output:
http://0.0.0.0:5601/?code=378441
Note
This URL value is unique to your Kibana deployment, so keep it handy for use in the following steps.
Generate a new Kibana enrollment token.
$ docker exec learn_lab_elasticsearch \
    /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token \
    -s kibana
This results in a Kibana enrollment Token value.
eyJ2ZXIiOiI4LjMuMyIsImFkciI6WyIxMC40Mi40Mi4xMDA6OTIwMCJdLCJmZ3IiOiI5ZmNlYTE0ZDA5ODM2MTg2NzIzOGFkMDdlYmU1NDJhMWE4NjM3MzEwNTUyNTE3ZDI4MDQyYzI1ZDcxOTVlODAzIiwia2V5IjoiR1lBY2FvSUJJc05aVzgwTEI1RDg6SkZXM0FjWURSaXFMem5kQzlfN2RlUSJ9
Keep this value handy somewhere from which you can copy and paste it in the next step.
Access Kibana and configure Vault integration
In this section, your goal is to access Kibana and configure Elasticsearch with the Vault Integration.
Use a web browser to open the URL that you retrieved from the Kibana container logs.
Note
This URL is dynamically generated, so you need to use your actual URL value that you captured earlier, not the example value.
Open the URL, where you are greeted by the Configure Elastic to get started dialog.
Paste your Kibana Enrollment Token value into the Enrollment token text field and click Configure Elastic.
A series of configuration steps are automatically completed. When the configuration is finished, you are presented with a Welcome to Elastic dialog where you can authenticate.
Enter elastic in the Username text input field.
Enter 2learnVault into the Password text input field.
Click Log in.
You are greeted by a Welcome to Elastic dialog.
Click Add integrations.
Begin to enter HashiCorp into the search text input; you should notice that the HashiCorp Vault integration appears.
Click the HashiCorp Vault integration.
You are now on the HashiCorp Vault integration page.
Click Add HashiCorp Vault to add the integration.
Configure Vault Elasticsearch integration
Configure the integration as follows.
Click Logs from file to switch off gathering audit device information from a file as you will be using the network socket with Elasticsearch and manually inspecting the file audit device for this scenario.
Click Logs from TCP socket to switch the setting on.
Click Change defaults.
Enter 0.0.0.0 into the Listen Address text input to listen on all interfaces so that the Elastic Agent container can later connect to the integration.
As this tutorial is focused on audit device logs, click Metrics to switch off the gathering of telemetry metrics from Vault.
Enter agent-policy-vault into the New agent policy name text input.
Click Save and continue.
After a moment, a HashiCorp Vault integration added dialog appears where you can configure the Elastic Agent.
Configure Elastic Agent for Vault server
Click Add Elastic Agent to your hosts.
You will use a Fleet server to manage the Elastic Agent, so in the Add agent dialog, click Add Fleet Server.
In the Add a Fleet Server dialog, specify the Elasticsearch container IP address for the Fleet Server host value: https://10.42.42.100:8220
Click Generate Fleet Server policy. After a short delay, you should notice the message Fleet Server policy created.
You can skip over Add a Fleet Server step 2 in the UI view as you will be using a container instead of the options shown there.
Scroll down in this UI view until you arrive at step 3, Confirm connection. The Fleet server is now waiting for the Elastic Agent to connect. Continue with the following steps to establish a connection to the Fleet server in the terminal.
In your terminal session, generate a new Fleet Server service token, and set it as the value of the TF_VAR_fleet_server_service_token environment variable.
$ export TF_VAR_fleet_server_service_token="$(curl \
    -k \
    -u 'elastic:2learnVault' \
    -s \
    -X POST http://127.0.0.1:5601/api/fleet/service-tokens \
    --header 'kbn-xsrf: true' \
    | jq -r .value)"
Confirm that the value is correctly set:
$ echo $TF_VAR_fleet_server_service_token
Successful example output:
AAEAAWVsYXN0aWMvZmxlZXQtc2VydmVyL3Rva2VuLTE2NjEzNDgyODMxNzE6WFlTMUFtSjJUMEdrMEo2Qk14VzJjdw
Once the Fleet Server policy is created and you have exported the environment variable, you are ready to bootstrap the Elastic Agent Container with Fleet.
Bootstrap Elastic Agent with Fleet
The Elastic Agent container needs to establish a connection to the Fleet server before the agent can be successfully enrolled. Your goal for this section is to start an Elastic Agent container, and bootstrap this connection with the Fleet server.
Before starting the Elastic Agent container, access your terminal session, and change into the directory containing the correct Terraform configuration.
$ cd ../2-fleet-agent-bootstrap
Initialize the Terraform workspace.
$ terraform init
Run the container by applying the Terraform configuration.
$ terraform apply -auto-approve
Successful output example:
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
You can verify that the Elastic Agent successfully connects to Elasticsearch in the Kibana web UI. After about 10 seconds, you should notice a green check dynamically appear beside Fleet server connected as shown in the screenshot.
With the Fleet Server added and agent connected, click Continue enrolling Elastic Agent.
Return to your terminal and destroy the Elastic Agent bootstrap container so that it can be restarted with updated configuration in the enrollment step.
$ terraform destroy -auto-approve
Successful output example:
Destroy complete! Resources: 2 destroyed.
Enroll Elastic Agent
Now you can enroll the Elastic Agent for use. This requires setting the enrollment token as an environment variable and starting the container again with another Terraform configuration.
In the Kibana UI under the Add Agent section, click Enroll in Fleet.
Scroll to the Install Elastic Agent on your host section and within the Linux Tar example tab, scroll horizontally in the text area until you can copy the value of enrollment-token as shown in the screenshot.
Return to your terminal session and export the enrollment token value as the TF_VAR_fleet_enrollment_token environment variable. Replace $FLEET_ENROLLMENT_TOKEN with your actual enrollment token value.
$ export TF_VAR_fleet_enrollment_token=$FLEET_ENROLLMENT_TOKEN
Change into the directory containing the Elastic Agent enrollment configuration.
$ cd ../3-enroll-elastic-agent
Initialize the Terraform workspace.
$ terraform init
Run the container by applying the Terraform configuration.
$ terraform apply -auto-approve
Successful output example:
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
You can verify that the Elastic Agent successfully enrolls to the Fleet server in the Kibana web UI, where you will notice Agent enrollment confirmed and Incoming data confirmed messages highlighted in green as shown in the screenshot.
Run Vault container
Your goal in this section is to start a Vault container that you will use for all Vault related actions in the scenario.
The Vault container you will deploy runs in dev mode with in-memory storage, and a preset initial root token value. This insecure configuration is for the simplicity of the lab scenario, and is not recommended for production use.
You will enable 2 audit devices: a socket audit device that Vault logs to for the Elastic Agent, and a file device you will use from the terminal.
Return to your terminal session, and change into the directory containing the Vault configuration.
$ cd ../4-vault
Initialize the Terraform workspace.
$ terraform init
Apply the configuration to run the Vault container.
$ terraform apply -auto-approve
Successful output example:
Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
Export a VAULT_ADDR environment variable to address the Vault container.
$ export VAULT_ADDR=http://127.0.0.1:8200
Log in with the root token value to authenticate with Vault.
Note
For the purpose of this tutorial, you can use the root token to work with Vault. However, it is recommended that root tokens are only used for just enough initial setup or in emergencies. As a best practice, use tokens with an appropriate set of policies based on your role in the organization.
$ vault login -no-print root
Enable the socket audit device, specifying the Elastic Agent container IP address and the TCP socket type.
$ vault audit enable socket \
    description="Socket audit device for Elastic Agent" \
    address=10.42.42.130:9007 \
    socket_type=tcp
Successful output example:
Success! Enabled the socket audit device at: socket/
Enable the file audit device, and specify the path as /vault/logs/vault-audit.log, which is a volume mapped to the log subdirectory of the present working directory.
$ vault audit enable file \
    description="File audit device" \
    file_path=/vault/logs/vault-audit.log
Successful output example:
Success! Enabled the file audit device at: file/
For the purposes of inspecting this file audit device log later, you will use a relative host path like ../4-vault/log/vault-audit.log.
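The file audit device writes one JSON object per line, so you can get a quick overview of the log with jq. The following is a minimal sketch that assumes the usual audit entry fields (time, type, request.operation, request.path, and request.remote_address); the audit_summary helper name is hypothetical.

```shell
# Hypothetical helper: print one tab-separated line per audit entry in a
# Vault file audit device log (one JSON object per line).
# Missing fields are printed as empty columns.
audit_summary() {
  jq -r '[.time, .type, .request.operation, .request.path,
          .request.remote_address] | @tsv' "$1"
}

# Usage with the relative path from this scenario:
#   audit_summary ../4-vault/log/vault-audit.log | tail -n 5
```

Each request to Vault normally produces a request entry and a response entry, so expect to see most paths twice.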
You are now ready to access the Elasticsearch Vault Integration in Kibana to familiarize yourself with it before some simple testing and executing the example incident workflows.
Access Vault integration dashboard
Your goal for this section is to confirm that Vault is sending information to Elasticsearch by following these steps.
In the Kibana UI, click the navigation menu icon.
Click Dashboard.
Click the [HashiCorp Vault] Audit Logs link.
You will notice the default dashboard for audit device logs from Vault. Note the number in the Requests tile is 2.
Return to your terminal and list the available secrets engines to generate 1 new request for Vault.
$ vault secrets list
In the Kibana UI, click Refresh, and you should notice that the Requests tile counter increases to 3. Vault is sending data to the Elastic Agent, which forwards the data to Elasticsearch for the visualization in Kibana.
Now that you have confirmed that Vault is sending audit device log entries via the socket to Elasticsearch, you can try the example workflows.
Execute example incidents
Your goal in this section is to apply the final Terraform configuration to enable resources and perform actions to generate the audit device log entries you will explore.
The configuration will enable AppRole and Username and Password auth methods along with a Key/Value secrets engine. The configuration also performs some extra actions such as failed authentication attempts.
Before executing the example scenarios with Terraform, access your terminal session, and change into the directory containing the correct configuration.
$ cd ../5-example-workflows
Initialize the Terraform workspace.
$ terraform init
Run the example scenarios by applying the Terraform configuration.
$ terraform apply -auto-approve
Successful output example:
Apply complete! Resources: 28 added, 0 changed, 0 destroyed.
Incident response examples
In the rest of this tutorial, you will focus on some examples of incidents and the workflows for identifying them in the audit device logs.
Note
The Elasticsearch Vault integration ingests raw audit device logs from Vault, but adds several extra fields which are helpful for building more complex queries. These fields are not present in raw Vault audit device output, so you will not find them present in the file based audit device logs.
User access
You will learn how to examine the audit device log in the Kibana UI with KQL or with jq in a terminal session for the following elements of user access:
- IP CIDR range of remote client
- Attempted access to a secret path
- Attempted access to an AppRole auth method role
IP CIDR range
Use the following steps to explore the audit device logs in search of Vault requests by remote client IP addresses within the CIDR 10.42.42.0/24.
Click the navigation hamburger menu and select Discover.
If you begin to type hashicorp into the Filter your data using KQL syntax text input, you'll notice a list of the audit device log parameters available for building search queries.
In the Filter your data using KQL syntax text input, enter hashicorp_vault.audit.request.remote_address : "10.42.42.0/24" to begin searching for log entries from the CIDR.
Click the calendar icon beside the query text input and select Today for the time range.
This will result in narrowing down the entries to just those from the CIDR as shown in the screenshot.
Try narrowing down the scope to a specific remote client IP address, 10.42.42.128.
In the same text input, replace the last query you used with hashicorp_vault.audit.request.remote_address : "10.42.42.128" instead.
Now there are just 2 entries returned. Click the Document area of the first entry and expand it by clicking the icon.
You can browse the full audit device log entry this way to find information such as the path in the request from this remote IP address, which is sys/mounts. Attempted access to this endpoint means that this remote IP address attempted to list all secrets engines. If you examine the rest of the audit log entry, you will discover that the request was successful and a list of the secrets engines was returned.
Tip
You can also use the clipboard icon to copy the entire audit device log entry to the system clipboard for use wherever you like.
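For the terminal workflow, a similar search can be sketched with jq against the file audit device log. This example assumes the remote client address is recorded in request.remote_address, and the helper names are hypothetical. Note that jq has no CIDR matching, so the first helper uses a string prefix, which only works for ranges that fall on an octet boundary such as the /24 in this scenario.

```shell
# Hypothetical helpers for a Vault file audit device log (JSON lines).

# Entries whose remote address starts with a prefix such as "10.42.42.".
# This is a string prefix match, not true CIDR matching.
requests_from_prefix() {
  jq -c --arg p "$2" \
    'select((.request.remote_address // "") | startswith($p))
     | {time: .time, type: .type, path: .request.path,
        addr: .request.remote_address}' "$1"
}

# Entries from one exact remote address.
requests_from_addr() {
  jq -c --arg a "$2" \
    'select(.request.remote_address == $a)
     | {time: .time, type: .type, path: .request.path,
        addr: .request.remote_address}' "$1"
}

# Usage:
#   requests_from_prefix ../4-vault/log/vault-audit.log "10.42.42."
#   requests_from_addr ../4-vault/log/vault-audit.log "10.42.42.128"
```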
Attempted access to secret path
In this example, you examine the audit device logs in relation to a specific API key secret at the path api-credentials/deployment-api-key. Find any attempts to access the secret which resulted in a "permission denied" error from Vault.
Use the following steps for the Kibana UI or CLI with jq to search the Vault requests for any entries which reference the specific secrets path and error message.
In the Filter your data using KQL syntax text input, enter hashicorp_vault.audit.request.path.text : "api-credentials/deployment-api-key" and hashicorp_vault.audit.error : "permission denied"
Click the Document area of the entry and expand it by clicking the icon; you can find all the details of this failed request, such as the remote IP from which it originated.
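From a terminal session, a comparable search of the file audit device log might look like the following sketch. It assumes the error message is stored in the top-level error field of each entry, and the denied_requests helper name is hypothetical. The helper matches on substrings for two reasons: Vault can wrap the error text in a longer message, and if the mount is a KV version 2 secrets engine, the logged request path contains a data/ segment between the mount and the secret name.

```shell
# Hypothetical helper: entries whose request path and error message both
# contain the given substrings.
denied_requests() {
  jq -c --arg path "$2" --arg err "$3" \
    'select(((.request.path // "") | contains($path))
            and ((.error // "") | contains($err)))
     | {time: .time, path: .request.path,
        addr: .request.remote_address, error: .error}' "$1"
}

# Usage:
#   denied_requests ../4-vault/log/vault-audit.log \
#     "deployment-api-key" "permission denied"
```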
Attempted access to AppRole auth method role
In this scenario, you need to examine the audit device logs for any attempted access to an AppRole auth method role named learn by a specific IP address, 10.42.42.24.
Use the following steps for the Kibana UI or CLI with jq to search the Vault requests for any entries which reference the specific role name and IP address.
In the Filter your data using KQL syntax text input, enter hashicorp_vault.audit.request.path.text : "learn" and hashicorp_vault.audit.request.remote_address : "10.42.42.24"
Click the Document area of the first entry and expand it by clicking the icon; you can find all the details of this request, such as it being a read request that was successful, as shown in the example.
Among the 2 entries is at least one where permission was denied to read the role. Your incident response workflow can take these details into account as it progresses.
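A terminal equivalent for the file audit device log could be sketched as follows. It assumes the role is addressed through a request path containing /role/ followed by the role name, as in auth/approle/role/learn, and that a failed request carries a non-empty top-level error field. The role_access_from helper name is hypothetical.

```shell
# Hypothetical helper: response entries from one remote address whose path
# mentions a given AppRole role name, with the outcome of each request.
# The substring match also returns roles whose names start with the same
# text, so review the path column in the output.
role_access_from() {
  jq -c --arg role "$2" --arg addr "$3" \
    'select(.type == "response"
            and .request.remote_address == $addr
            and ((.request.path // "") | contains("/role/" + $role)))
     | {time: .time, operation: .request.operation, path: .request.path,
        outcome: (if (.error // "") == "" then "ok" else "error" end)}' "$1"
}

# Usage:
#   role_access_from ../4-vault/log/vault-audit.log learn 10.42.42.24
```

Filtering on response entries avoids counting each request twice, since the request entry and the response entry share the same path.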
Authentication failure
Authentication failure in Vault arises from any auth method, but also from operations such as use of a wrapped token outside of its scoped parameters.
In these examples, you will locate instances of authentication failure in the audit device logs for both of these scenarios.
Username and password
In this example, you are responding to a reported incident where a user was not able to successfully authenticate with the username and password auth method at the "research" path (auth/userpass/login/research). The user reports that they encountered an error, "invalid username or password".
You need to locate the audit device log entry which corresponds to this incident to work on a response. In this particular example, you will also find an HMAC error field that needs to be compared with the plaintext error message.
To find log entries which correspond to the research username and password auth method and contain a non-empty error field, use the following query in the Filter your data using KQL syntax text input: hashicorp_vault.audit.request.path : "auth/userpass/login/research" and hashicorp_vault.audit.response.data : *
Click the Document area of the first entry and expand it by clicking the icon; you can find all the details of this request, including the HMAC version of the error message.
You can confirm that the error message field from the audit device log has the error "invalid username or password" by using the audit-hash API to compare the HMAC values. Pass the socket audit device name, "socket", as the path endpoint and the error plaintext as the value of the input parameter.
$ vault write /sys/audit-hash/socket input="invalid username or password"
Example output:
Key     Value
---     -----
hash    hmac-sha256:05807cf0931c8bf8f0fa25799044be0e9cf0f884d9c775e01940e7e09e45ec80
Notice that the value of hash returned is the same value as the error field from the log entry. You have located the correct corresponding log entry for this user's reported issue.
Wrapped token
The first sign of an authentication failure with a wrapped token arises from failure to unwrap the token as in this simple API example.
The attacker attempts unwrapping of an invalid token.
$ curl \
    --header "X-Vault-Token: $VAULT_TOKEN" \
    --request POST \
    --data "{ \"token\": \"INVALID_WRAPPED_TOKEN\" }" \
    $VAULT_ADDR/v1/sys/wrapping/unwrap
The result is an error in the following form:
{"errors":["wrapping token is not valid or does not exist"]}
This error message appears in the audit device entries, and you can search for it with KQL in the Kibana UI or with jq from a terminal session.
In the query text input, enter hashicorp_vault.audit.error : "wrapping token is not valid or does not exist"
The result is a single entry with the error message highlighted.
You can use the log details to discover the information you need for further incident response, such as revoking tokens and identifying remote client addresses involved.
Trigger an investigation at the first sign of any unwrap failure or lookup error, as in this example case.
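To see which clients triggered a given error in the file audit device log, you could group matching entries by remote address. This is a minimal sketch that assumes the error text is in the top-level error field; the errors_by_addr helper name is hypothetical.

```shell
# Hypothetical helper: count entries containing an error substring, grouped
# by remote address, with the most frequent address first.
# Counts are log entries, not distinct requests.
errors_by_addr() {
  jq -r --arg err "$2" \
    'select((.error // "") | contains($err)) | .request.remote_address' "$1" \
    | sort | uniq -c | sort -rn
}

# Usage:
#   errors_by_addr ../4-vault/log/vault-audit.log \
#     "wrapping token is not valid or does not exist"
```

An address with repeated unwrap failures is a reasonable place to start the investigation.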
Compromised host
As an example scenario, an IDS alerts that a certain server host with the IP address 10.42.42.199 was compromised by a malicious actor. In this scenario, a security operator might want to proactively revoke all secret leases associated with the compromised host IP address.
You can leverage the audit device log entries to revoke all secret leases associated with the IP address.
In the query text input, enter hashicorp_vault.audit.request.remote_address : 10.42.42.199 and hashicorp_vault.audit.response.secret.lease_id:*
The result is a single entry that contains a secret lease ID.
You can use the value from hashicorp_vault.audit.response.secret.lease_id with the vault command in a terminal session to revoke the associated secret lease by its prefix.
$ vault lease revoke -prefix postgres/creds/dev/vCaUIVYrOQYjqYHQCciiY7FF
All revocation operations queued successfully!
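When a host holds more than one lease, you could collect the lease IDs from the file audit device log and revoke them in one pass. The following sketch assumes lease IDs appear in plaintext under response.secret.lease_id; the leases_for_addr helper name is hypothetical.

```shell
# Hypothetical helper: unique lease IDs issued in responses to requests
# from a given remote address.
leases_for_addr() {
  jq -r --arg addr "$2" \
    'select(.request.remote_address == $addr)
     | (.response.secret.lease_id // "")
     | select(. != "")' "$1" | sort -u
}

# Usage: review the list first, then revoke each lease.
#   leases_for_addr ../4-vault/log/vault-audit.log 10.42.42.199
#   leases_for_addr ../4-vault/log/vault-audit.log 10.42.42.199 \
#     | xargs -n 1 vault lease revoke
```

Reviewing the list before piping it to vault lease revoke helps you avoid revoking leases that belong to a different workload behind the same address.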
Summary
You have learned how to integrate Vault with Elasticsearch and use Kibana and KQL queries to examine socket based audit device log data. You also learned about using the same searches with a terminal session and the jq utility with a file based audit device log. In addition, you learned tips and techniques for responding to incidents that can help you enhance your incident response workflows.
Clean up
From within the 5-example-workflows directory, destroy the related resources.
$ terraform destroy -auto-approve
Successful output example:
Destroy complete! Resources: 7 destroyed.
Change into the 4-vault directory.
$ cd ../4-vault
Destroy the Vault server container and related resources.
$ terraform destroy -auto-approve
Successful output example:
Destroy complete! Resources: 2 destroyed.
Change into the 3-enroll-elastic-agent directory.
$ cd ../3-enroll-elastic-agent
Destroy the Elastic Agent container and related resources.
$ terraform destroy -auto-approve
Successful output example:
Destroy complete! Resources: 2 destroyed.
Change into the 1-elasticsearch-kibana directory.
$ cd ../1-elasticsearch-kibana
Destroy the Elasticsearch and Kibana containers along with their related resources.
$ terraform destroy -auto-approve
Successful output example:
Destroy complete! Resources: 3 destroyed.
Change into your home directory.
$ cd
Remove the temporary lab directory.
$ rm -rf "${HC_LEARN_LAB}"
Next steps
You are encouraged to use the techniques described in this tutorial to further your own audit device log inspection and monitoring solution. For example, you can extend what you have learned here, and trigger alerts to detect security threats for efficient notification and handling during incident response.
You could also consider the advanced case of automated remediation for certain incidents based on interpretation of audit device log content.
Help and reference
- Socket audit device documentation
- File audit device documentation
- Enable audit device filter documentation
- Filter concepts documentation
- Filter audit entries documentation
- Audit filter API documentation
- Audit filter CLI documentation
- HashiCorp Vault integration with Elastic Agent
- Install Elasticsearch with Docker
- Install Kibana with Docker
- Run Elastic Agent in a container
- Kibana Query Language