Introduction

Terraform automates the provisioning of your cloud infrastructure. To do this, it authenticates with cloud providers (and other providers) to deploy resources and perform the planned actions. The credentials Terraform uses for authentication are sensitive and should always be kept secret, since they unlock access to your services. API keys and passwords for database users are examples of such sensitive data.

If a malicious third party were to acquire the sensitive information, they would be able to breach the security systems by presenting themselves as a known trusted user. In turn, they would be able to modify, delete, and replace the resources and services that are available under the scope of the obtained keys. To prevent this from happening, it is essential to properly secure your project and safeguard its state file, which stores all the project secrets.

By default, Terraform stores the state file locally as unencrypted JSON, allowing anyone with access to the project files to read the secrets. One solution is to restrict access to the files on disk; another is to store the state remotely in a backend that encrypts the data automatically, which this tutorial demonstrates with DigitalOcean Spaces.

In this tutorial, you’ll hide sensitive data in outputs during execution and store your state in DigitalOcean Spaces, a cloud object storage service that encrypts data at rest. You’ll also use tfmask, an open source program written in Go that dynamically censors values in the Terraform execution log output.

Prerequisites

  • A DigitalOcean Personal Access Token, which you can create via the DigitalOcean control panel.
  • Terraform installed on your local machine and a project set up with the DigitalOcean provider.
  • A DigitalOcean Space with API keys (access and secret).

Note: This tutorial has specifically been tested with Terraform 0.13.

Marking Outputs as Sensitive

In this step, you’ll hide outputs in code by setting their sensitive parameter to true. This is useful when secret values are part of the Terraform output that you’re storing indefinitely, or you need to share the output logs beyond your team for analysis.

Assuming you are in the terraform-sensitive directory, which you created as part of the prerequisites, you’ll define a Droplet and an output showing its IP address. You’ll store it in a file named droplets.tf, so create and open it for editing by running:

Add the following lines to terraform-sensitive/droplets.tf:

This code will deploy a Droplet called web-1 in the fra1 region, running Ubuntu 18.04 on 1GB RAM and one CPU core. Here you’ve given the droplet_ip_address output a value and you’ll receive this in the Terraform log.
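The configuration described above could be written as in the following sketch, which creates the file non-interactively with a shell heredoc instead of a text editor. The resource label web and the slugs ubuntu-18-04-x64 and s-1vcpu-1gb are assumptions chosen to match the description (Ubuntu 18.04, one CPU core, 1GB RAM); adjust them to your project.

```shell
# Write droplets.tf in the current directory (assumed to be terraform-sensitive).
# The resource label "web" and the image/size slugs are assumptions.
cat > droplets.tf <<'EOF'
resource "digitalocean_droplet" "web" {
  image  = "ubuntu-18-04-x64"
  name   = "web-1"
  region = "fra1"
  size   = "s-1vcpu-1gb"
}

output "droplet_ip_address" {
  value = digitalocean_droplet.web.ipv4_address
}
EOF
```

The output block exposes the ipv4_address attribute of the Droplet under the name droplet_ip_address.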

To deploy this Droplet, execute the code by running the following command:
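A typical invocation is sketched below. It assumes, for illustration, that the project from the prerequisites declares a do_token variable and that your Personal Access Token is exported in an environment variable named DO_PAT; substitute the names your project actually uses.

```shell
# Assumption: provider.tf declares variable "do_token" and DO_PAT holds your token.
terraform apply -var "do_token=${DO_PAT}"
```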

Terraform will show the actions it is going to take:

Enter yes when prompted. You’ll receive the following output:

You will find that the IP address is in the output. If you’re sharing this output with others, or in case it will be publicly available because of automated deployment processes, it’s important to take actions to hide this data in the output.

To censor it, you’ll need to set the sensitive attribute of the droplet_ip_address output to true.

Open droplets.tf for editing:

Add the line sensitive = true to the droplet_ip_address output in terraform-sensitive/droplets.tf:
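After the edit, the file might read as in the following sketch, which rewrites the whole file with a shell heredoc. The resource label web and the image and size slugs are assumptions matching the Droplet described earlier; the sensitive line is the only addition.

```shell
# Rewrite droplets.tf so that the output is marked as sensitive.
# The resource label "web" and the image/size slugs are assumptions.
cat > droplets.tf <<'EOF'
resource "digitalocean_droplet" "web" {
  image  = "ubuntu-18-04-x64"
  name   = "web-1"
  region = "fra1"
  size   = "s-1vcpu-1gb"
}

output "droplet_ip_address" {
  value     = digitalocean_droplet.web.ipv4_address
  sensitive = true
}
EOF
```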

Save and close the file when you’re done.

Apply the project again by running:

The output will be:

You’ve now explicitly censored the value of the output, which is the IP address. Censoring outputs is useful when the Terraform logs would be in a public space, or when you want the values to remain hidden without deleting the outputs from the code. You’ll also want to censor outputs that contain passwords and API tokens, as they are sensitive information as well.

You’ve now hidden the values of the defined outputs by marking them as sensitive. In the next step, you’ll configure Terraform to store your project’s state in the encrypted cloud, instead of locally.

Storing State in an Encrypted Remote Backend

The state file stores all information about your deployed infrastructure, including its internal relationships and secrets. By default, it’s stored in plaintext, locally on the disk. Storing it remotely, in the cloud, provides a higher level of security. If the cloud storage service supports encryption at rest, it will store the state file in an encrypted state at all times, so that potential attackers won’t be able to gather information from it. Storing the state file encrypted remotely is different from marking outputs as sensitive: it secures all secrets in the cloud, but it changes only how Terraform stores data, not what it displays.

You’ll now configure your project to store the state file in a DigitalOcean Space. As a result it will be encrypted at rest and protected with TLS in transit.

By default, the Terraform state file is called terraform.tfstate and is located in the root of every initialized directory. You can view its contents by running:

The contents of the file will be similar to this:

The state file contains all the resources you’ve deployed, as well as all outputs and their computed values. Gaining access to this file is enough to compromise the entire deployed infrastructure. To prevent that from happening, you can store it encrypted in the cloud.

Terraform supports multiple backends, which are storage and retrieval mechanisms for the state. Examples are local for local storage, pg for the Postgres database, and s3 for S3-compatible storage, which you’ll use to connect to your Space.

The backend configuration is specified under the main terraform block, which is currently in provider.tf. Open it for editing by running:

Add the following lines to terraform-sensitive/provider.tf:

The s3 backend block first specifies the key, which is the location of the Terraform state file on the Space. Passing in state/terraform.tfstate means that you will store it as terraform.tfstate under the state directory.

The endpoint parameter tells Terraform where the Space is located and bucket defines the exact Space to connect to. The skip_region_validation and skip_credentials_validation parameters disable validations that are not applicable to DigitalOcean Spaces. Note that region must still be set to a conforming value (such as us-west-1); the s3 backend requires it, but the value has no bearing on Spaces.
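The parameters described above might be combined as in the following sketch. To avoid overwriting the existing provider.tf, it writes the block to a separate, hypothetical backend.tf file; Terraform reads all .tf files in the directory, so use either this file or an equivalent edit to provider.tf, not both. The values your_space_name and https://your_spaces_endpoint are placeholders.

```shell
# Placeholders: replace your_space_name and the endpoint with your Space's values.
# backend.tf is a hypothetical file name used here instead of editing provider.tf.
cat > backend.tf <<'EOF'
terraform {
  backend "s3" {
    key      = "state/terraform.tfstate"
    bucket   = "your_space_name"
    region   = "us-west-1"
    endpoint = "https://your_spaces_endpoint"

    skip_region_validation      = true
    skip_credentials_validation = true
  }
}
EOF
```

The access and secret keys are deliberately left out of the file so that they are not committed alongside the code.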

Remember to put in your bucket name and the Spaces endpoint, including the region, which you can find in the Settings tab of your Space. When you are done customizing the endpoint, save and close the file.

Next, put the access and secret keys for your Space in environment variables, so you’ll be able to reference them later. Run the following commands, replacing the highlighted placeholders with your key values:
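For example, the keys could be exported as follows. The variable names SPACE_ACCESS_KEY and SPACE_SECRET_KEY are assumptions made for this sketch, and the values are placeholders.

```shell
# Placeholder values: replace them with the access and secret keys of your Space.
export SPACE_ACCESS_KEY="your_space_access_key"
export SPACE_SECRET_KEY="your_space_secret_key"
```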

Then, configure Terraform to use the Space as its backend by running:
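The initialization could look like the sketch below. It assumes the keys are available in environment variables named SPACE_ACCESS_KEY and SPACE_SECRET_KEY (hypothetical names) and passes them as the access_key and secret_key parameters of the s3 backend.

```shell
# Assumption: SPACE_ACCESS_KEY and SPACE_SECRET_KEY are set in the environment.
terraform init \
  -backend-config="access_key=${SPACE_ACCESS_KEY}" \
  -backend-config="secret_key=${SPACE_SECRET_KEY}"
```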

The -backend-config argument provides a way to set backend parameters at runtime, which you are using here to set the Space keys. You’ll be asked if you wish to copy the existing state to the cloud, or start anew:

Enter yes when prompted. The rest of the output will be the following:

Your project will now store its state in your Space. If you receive an error, double-check that you’ve provided the correct keys, endpoint, and bucket name.

The local state file has been emptied, which you can check by showing its contents:

There will be no output, as expected.

You can try modifying the Droplet definition and applying it to check that the state is still being correctly managed.

Open droplets.tf for editing:

Modify the highlighted lines in terraform-sensitive/droplets.tf:

Save and close the file, then apply the project by running:

You will receive the following output:

Enter yes when prompted, and Terraform will apply the new configuration to the existing Droplet, meaning that it’s correctly communicating with the Space its state is stored on:

You’ve configured the s3 backend for your project, so that you’re storing the state encrypted in the cloud, in a DigitalOcean Space. In the next step, you’ll use tfmask, a tool that will dynamically censor all sensitive outputs and information in Terraform logs.

Using tfmask in CI/CD Environments

In this section, you’ll download tfmask and use it to dynamically censor sensitive data from the whole output log Terraform generates when executing a command. It will censor the variables and parameters whose names are matched by a regular expression that you provide.

Dynamically matching their names is possible when they follow a pattern (for example, contain the word password or secret). The advantage of using tfmask over only marking the outputs as sensitive is that it also censors matched parts of the resource declarations that Terraform prints out while executing. It’s imperative that you hide them when the execution logs may be public, as is often the case in automated CI/CD environments.

Compiled binaries of tfmask are available at its releases page on GitHub. For Linux, run the following command to download it:

Mark it as executable by running:

tfmask works on the outputs of terraform plan and terraform apply by masking the values of all variables whose names are matched by a regular expression that you specify. You supply the regular expression and the character that replaces the actual values using the environment variables TFMASK_VALUES_REGEX and TFMASK_CHAR, respectively.

You’ll now use tfmask to censor the name and ipv4_address of the Droplet that Terraform would deploy. First, you’ll need to set the mentioned environment variables by running:
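The variables could be set as in the following sketch. Both the masking character and the exact pattern are assumptions: the pattern is case insensitive and matches any string in which ipv4_address or name appears directly after a non-letter character, which is broader than a strict prefix match.

```shell
# Assumed values: "*" as the masking character and a case-insensitive pattern
# that targets attributes containing ipv4_address or name.
export TFMASK_CHAR="*"
export TFMASK_VALUES_REGEX="(?i)^.*[^a-zA-Z](ipv4_address|name).*$"
```

Single quotes would work equally well here; the double quotes do not expand anything because the trailing $ is followed by the closing quote.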

This regular expression will match all strings starting with ipv4_address or name, and is not case sensitive.

To make Terraform plan an action for your Droplet, modify its definition:

Modify the Droplet’s name in terraform-sensitive/droplets.tf:

Save and close the file.

Because you’ve changed an attribute of the Droplet, Terraform will show its full definition in its output. Plan the configuration, but pipe it to tfmask to censor variables according to the regular expression:
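A minimal sketch of the piped command follows. It assumes that tfmask is on your PATH, that TFMASK_CHAR and TFMASK_VALUES_REGEX are already exported, and that the project uses a do_token variable fed from a DO_PAT environment variable (hypothetical names).

```shell
# Assumption: tfmask is installed and the TFMASK_* variables are exported.
# The do_token variable and DO_PAT environment variable are illustrative names.
terraform plan -var "do_token=${DO_PAT}" | tfmask
```

Because tfmask only filters the text stream, the plan itself is unaffected; only what reaches the log is censored.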

You’ll receive output similar to the following:

Note that tfmask has censored the values for name, ipv4_address, and ipv4_address_private using the character you specified in the TFMASK_CHAR environment variable, because they match the regular expression.

This method of censoring values in the Terraform logs is very useful for CI/CD, where the logs may be publicly available. The benefit of tfmask is that you have full control over which variables to censor (using the regular expression). You can also specify keywords that do not exist in your project yet, but that you anticipate using in the future.

You can destroy the deployed resources by running the following command and entering yes when prompted:

Conclusion

In this article, you’ve worked with a couple of ways to hide and secure sensitive data in your Terraform project. The first measure, using sensitive to hide values from the outputs, is useful when only logs are accessible, but the values themselves stay present in the state stored on disk.

To remedy that, you can opt to store the state file remotely, which you’ve achieved with DigitalOcean Spaces. This allows you to make use of encryption at rest. You also used tfmask, a tool that censors the values of variables matched by a regular expression during terraform plan and terraform apply.

You can also check out HashiCorp Vault to store secrets and other sensitive data. It can be integrated with Terraform to inject secrets in resource definitions, so you’ll be able to connect your project with your existing Vault workflow.