Health
Terraform Cloud can perform automatic health assessments in a workspace to assess whether its real infrastructure matches the requirements defined in its Terraform configuration. Health assessments include the following types of evaluations:
- Drift detection determines whether your real-world infrastructure matches your Terraform configuration.
- Continuous validation determines whether custom conditions in the workspace’s configuration continue to pass after Terraform provisions the infrastructure.
When enabled, Terraform Cloud automatically runs a health assessment for your workspace about every 24 hours. Refer to Health Assessment Scheduling for details.
Permissions
Working with health assessments requires the following permissions:
- To view health status for a workspace, you need read access to that workspace.
- To change organization health settings, you must be an organization owner.
- To change a workspace’s health settings, you must be an administrator for that workspace.
Workspace Requirements
Workspaces require the following settings to receive health assessments:
- Terraform version 0.15.4+ for drift detection only
- Terraform version 1.3.0+ for drift detection and continuous validation
- Remote execution mode or Agent execution mode for Terraform runs
The latest Terraform run in the workspace must have been successful. If the most recent run ended in an errored, canceled, or discarded state, Terraform Cloud pauses health assessments until there is a successfully applied run.
The workspace must also have at least one run in which Terraform successfully applies a configuration. Terraform Cloud does not perform health assessments in workspaces with no real-world infrastructure.
Enable Health Assessments
You can enforce health assessments across all eligible workspaces in an organization from the organization settings. Enforcing health assessments at the organization level overrides workspace-level settings. You can only enable health assessments within a specific workspace when Terraform Cloud is not enforcing health assessments at the organization level.
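If you manage Terraform Cloud itself through Terraform, organization-level enforcement can likely be expressed in configuration as well. The following is a minimal sketch, assuming the hashicorp/tfe provider and its assessments_enforced argument on the tfe_organization resource. The organization name and email are hypothetical placeholders, so verify the argument against the provider documentation for your version.

```hcl
terraform {
  required_providers {
    tfe = {
      source = "hashicorp/tfe"
    }
  }
}

# Hypothetical organization; replace the name and email with your own.
resource "tfe_organization" "example" {
  name  = "example-org"
  email = "admin@example.com"

  # Assumed argument: enforce health assessments on all eligible workspaces,
  # overriding workspace-level settings.
  assessments_enforced = true
}
```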
To enable health assessments within a workspace:
- Verify that your workspace satisfies the requirements.
- Go to the workspace and click Settings > Health.
- Select Enable under Health Assessments.
- Click Save settings.
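The same workspace-level setting can probably be managed as code rather than through the UI. This sketch assumes the hashicorp/tfe provider and its assessments_enabled argument on tfe_workspace. The workspace name, organization name, and Terraform version are hypothetical values chosen only to satisfy the requirements listed above.

```hcl
terraform {
  required_providers {
    tfe = {
      source = "hashicorp/tfe"
    }
  }
}

# Hypothetical workspace in a hypothetical organization.
resource "tfe_workspace" "app" {
  name         = "example-app"
  organization = "example-org"

  # 1.3.0 or later is needed for both drift detection and continuous validation.
  terraform_version = "1.5.7"

  # Assumed argument: opt this workspace in to health assessments.
  # It has no effect if the organization already enforces assessments.
  assessments_enabled = true
}
```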
Health Assessment Scheduling
The timing of the first health assessment in the workspace depends on whether you enable health assessments during active Terraform runs:
- No active runs: The first health assessment starts a few minutes after you enable the feature.
- Active speculative plan: The first health assessment starts soon after that plan's completion.
- Other active runs: The first health assessment starts in about 24 hours.
After the first health assessment, Terraform Cloud starts a new health assessment if at least 24 hours have passed since the last assessment and there are no active runs in the workspace. Health assessments may take longer to complete when you enable health assessments in many workspaces at once or your workspace contains a complex configuration with many resources.
A health assessment never interrupts or interferes with runs. If you start a new run during a health assessment, Terraform Cloud cancels the current assessment and runs the next assessment in 24 hours. This behavior may prevent Terraform Cloud from performing health assessments in workspaces with frequent runs.
Terraform Cloud pauses health assessments if the latest run ended in an errored state. This behavior occurs for all run types, including plan-only runs and speculative plans. Once the workspace completes a successful run, Terraform Cloud restarts health assessments after 24 hours.
Terraform Enterprise administrators can modify their installation's assessment frequency and maximum number of concurrent assessments from the admin settings console.
Concurrency
If you enable health assessments on multiple workspaces, assessments may run concurrently. Health assessments do not affect your concurrency limit. Terraform Cloud also monitors and controls health assessment concurrency to avoid issues for large-scale deployments with thousands of workspaces. However, Terraform Cloud performs health assessments in batches, so health assessments may take longer to complete when you enable them in a large number of workspaces.
Notifications
Terraform Cloud sends notifications about health assessment results according to your workspace’s settings.
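As an illustration, a workspace notification for assessment results might be declared with the hashicorp/tfe provider. This is a sketch under assumptions: it relies on the tfe_notification_configuration resource and on the trigger names assessment:drifted, assessment:check_failure, and assessment:failed, which you should confirm in the provider documentation. The variable names and the Slack destination are hypothetical.

```hcl
terraform {
  required_providers {
    tfe = {
      source = "hashicorp/tfe"
    }
  }
}

# Hypothetical inputs: the ID of the workspace to watch and a Slack webhook URL.
variable "workspace_id" {
  type = string
}

variable "slack_webhook_url" {
  type      = string
  sensitive = true
}

resource "tfe_notification_configuration" "health" {
  name             = "health-assessment-alerts"
  enabled          = true
  destination_type = "slack"
  url              = var.slack_webhook_url
  workspace_id     = var.workspace_id

  # Assumed trigger names for drift, failed checks, and failed assessments.
  triggers = [
    "assessment:drifted",
    "assessment:check_failure",
    "assessment:failed",
  ]
}
```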
Workspace Health Status
On the organization's Workspaces page, Terraform Cloud displays a Health warning status for workspaces with infrastructure drift or failed continuous validation checks.
On the right of a workspace’s overview page, Terraform Cloud displays a Health bar that summarizes the results of the last health assessment.
- The Drift summary shows the total number of resources in the configuration and the number of resources that have drifted.
- The Checks summary shows the number of passed, failed, and unknown statuses for objects with continuous validation checks.
Drift Detection
Drift detection helps you identify situations where your actual infrastructure no longer matches the configuration defined in Terraform. This deviation is known as configuration drift. Configuration drift occurs when changes are made outside Terraform's regular process, leading to inconsistencies between the remote objects and your configured infrastructure.
For example, a teammate could create configuration drift by changing a storage bucket's settings directly in the cloud provider's console so that they conflict with the Terraform configuration. Drift detection detects these differences and recommends steps to address them.
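To make the storage bucket scenario concrete, consider the following sketch. It assumes the AWS provider, which the text above does not name, and a hypothetical bucket name. If someone suspends versioning on this bucket in the console, the real bucket no longer matches the configuration, and a health assessment should report the versioning resource as drifted.

```hcl
# Hypothetical bucket managed by Terraform.
resource "aws_s3_bucket" "assets" {
  bucket = "example-assets-bucket"
}

# The configuration requires versioning to be enabled.
# Changing this setting outside Terraform creates configuration drift.
resource "aws_s3_bucket_versioning" "assets" {
  bucket = aws_s3_bucket.assets.id

  versioning_configuration {
    status = "Enabled"
  }
}
```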
Configuration drift differs from state drift. Drift detection does not detect state drift.
Configuration drift happens when external changes affecting remote objects invalidate your infrastructure configuration. State drift occurs when external changes affecting remote objects do not invalidate your infrastructure configuration. Refer to Refresh-Only Mode to learn more about remediating state drift.
View Workspace Drift
To view the drift detection results from the latest health assessment, go to the workspace and click Health > Drift. If there is configuration drift, Terraform Cloud proposes the necessary changes to bring the infrastructure back in sync with its configuration.
Resolve Drift
You can use one of the following approaches to correct configuration drift:
- Overwrite drift: If you do not want the drift's changes, queue a new plan and apply the changes to revert your real-world infrastructure to match your Terraform configuration.
- Update Terraform configuration: If you want the drift's changes, modify your Terraform configuration to include the changes and push a new configuration version. This prevents Terraform from reverting the drift during the next apply. Refer to the Manage Resource Drift tutorial for a detailed example.
Continuous Validation
Continuous validation regularly verifies whether your configuration’s custom assertions continue to pass, validating your infrastructure. For example, you can monitor whether your website returns an expected status code, or whether an API gateway certificate is valid. Identifying failed assertions helps you resolve the failure and prevent errors during your next Terraform operation.
Continuous validation evaluates preconditions, postconditions, and check blocks as part of an assessment, but we recommend using check blocks for post-apply monitoring. Use check blocks to create custom rules to validate your infrastructure's resources, data sources, and outputs.
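For comparison with check blocks, preconditions and postconditions live inside a resource's lifecycle block. The following is a minimal sketch, assuming the AWS provider; the variable, resource name, and instance type are hypothetical and chosen only to show where each condition goes.

```hcl
# Hypothetical input variable for the image to launch.
variable "ami_id" {
  type = string
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  lifecycle {
    # Checked before Terraform creates or updates the instance.
    precondition {
      condition     = length(var.ami_id) > 4 && substr(var.ami_id, 0, 4) == "ami-"
      error_message = "The ami_id value must be an AMI ID that starts with \"ami-\"."
    }

    # Checked against the instance's resulting attributes.
    postcondition {
      condition     = self.instance_state == "running"
      error_message = "The instance is not in the running state."
    }
  }
}
```

Unlike a failed check block assertion, a failed precondition or postcondition blocks the operation during a normal run, which is one reason check blocks are the better fit for post-apply monitoring.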
Preventing false positives
Health assessments create a speculative plan to access the current state of your infrastructure. Terraform evaluates any check blocks in your configuration as the last step of creating the speculative plan. If your configuration relies on data sources and the values queried by a data source change between the time of your last run and the assessment, the speculative plan will include those changes. Terraform Cloud will not modify your infrastructure as part of an assessment, but it can use those updated values to evaluate checks. This may produce false positive alerts because your infrastructure has not yet changed.
To ensure your checks evaluate against the current state of your infrastructure instead of a possible future change, use nested data sources that query your actual resource configuration, rather than a computed latest value. Refer to the AMI image scenario below for an example.
Example use cases
Review the provider documentation for check block examples with AWS, Azure, and GCP.
Monitoring the health of a provisioned website
The following example uses the HTTP Terraform provider and a scoped data source within a check block to assert the Terraform website returns a 200 status code, indicating it is healthy.

check "health_check" {
  data "http" "terraform_io" {
    url = "https://www.terraform.io"
  }

  assert {
    condition     = data.http.terraform_io.status_code == 200
    error_message = "${data.http.terraform_io.url} returned an unhealthy status code"
  }
}

Continuous validation alerts you if the website returns any status code besides 200 while Terraform evaluates this assertion. You can also find failures in your workspace's Continuous Validation Results page. You can configure continuous validation alerts in your workspace's notification settings.
Monitoring certificate expiration
Vault lets you secure, store, and tightly control access to tokens, passwords, certificates, encryption keys, and other sensitive data. The following example uses a check block to monitor for the expiration of a Vault certificate.

resource "vault_pki_secret_backend_cert" "app" {
  backend     = vault_mount.intermediate.path
  name        = vault_pki_secret_backend_role.test.name
  common_name = "app.my.domain"
}

check "certificate_valid" {
  assert {
    condition     = !vault_pki_secret_backend_cert.app.renew_pending
    error_message = "Vault cert is ready to renew."
  }
}

Asserting up-to-date AMIs for compute instances
HCP Packer stores metadata about your Packer images. The following example check fails when there is a newer AMI version available.

data "hcp_packer_image" "hashiapp_image" {
  bucket_name    = "hashiapp"
  channel        = "latest"
  cloud_provider = "aws"
  region         = "us-west-2"
}

resource "aws_instance" "hashiapp" {
  ami                         = data.hcp_packer_image.hashiapp_image.cloud_image_id
  instance_type               = var.instance_type
  associate_public_ip_address = true
  subnet_id                   = aws_subnet.hashiapp.id
  vpc_security_group_ids      = [aws_security_group.hashiapp.id]
  key_name                    = aws_key_pair.generated_key.key_name

  tags = {
    Name = "hashiapp"
  }
}

check "ami_version_check" {
  # Nested data source that queries the instance as it currently exists.
  data "aws_instance" "hashiapp_current" {
    instance_tags = {
      Name = "hashiapp"
    }
  }

  assert {
    # Compare the running instance's AMI, not the planned value, to the latest image.
    condition     = data.aws_instance.hashiapp_current.ami == data.hcp_packer_image.hashiapp_image.cloud_image_id
    error_message = "Must use the latest available AMI, ${data.hcp_packer_image.hashiapp_image.cloud_image_id}."
  }
}

View continuous validation results
To view the continuous validation results from the latest health assessment, go to the workspace and click Health > Continuous validation.
The page shows all of the resources, outputs, and data sources with custom assertions that Terraform Cloud evaluated. Next to each object, Terraform Cloud reports whether the assertion passed or failed. If one or more assertions fail, Terraform Cloud displays the error messages for each assertion.
The health assessment page displays each assertion by its named value. A check block's named value combines the prefix check with its configuration name. For example, a check block named health_check appears as check.health_check.
If your configuration contains multiple preconditions and postconditions within a single resource, output, or data source, Terraform Cloud will not show the results of individual conditions unless they fail. If all custom conditions on the object pass, Terraform Cloud reports that the entire check passed. The assessment results display the results of any preconditions and postconditions alongside the results of any assertions from check blocks, identified by the named values of their parent block.