The aim of this workshop is to learn what Terraform is, to understand the basic concepts needed to deploy a simple project, and to acquire the skills to deploy more complex projects.
Note: Terraform binaries, documentation and additional information are available on the Terraform website https://terraform.io
Terraform is a tool to manage Infrastructure as Code (IaC)1. It is used to create, manage and update infrastructure resources by writing simple text files that follow the Terraform configuration language.
Terraform is an open-source tool written in Go, designed for writing Infrastructure as Code and deploying the resources defined in that code. The code must be written in the declarative HashiCorp Configuration Language (HCL).
Terraform supports multiple providers to manage/deploy our infrastructure. In this workshop we will focus on the AWS provider. However, there is a full list of supported providers on the Terraform documentation page. It is worth mentioning providers for other important cloud platforms and tools, such as Azure, Google Cloud Platform, OpenStack, Docker or Kubernetes.
1 Infrastructure as Code is the process of provisioning and managing cloud resources by writing template files that humans can easily read and machines can process.
Features:
AWS CloudFormation provides a common language for you to model and provision AWS and third party application resources in your cloud environment. AWS CloudFormation allows you to use programming languages or a simple text file to model and provision, in an automated and secure manner, all the resources needed for your applications across all regions and accounts. This gives you a single source of truth for your AWS and third party resources.1
1 source: https://aws.amazon.com/cloudformation/
To be able to proceed to the following steps, we need a set of tools to be installed. Some of them are mandatory for developing and deploying with Terraform; other tools are optional but recommended to facilitate our work.
- pip: the Python package installer. To get detailed instructions on how to do this, follow the AWS Command Line Interface page. For instance, on a Unix OS, you need Python 3 and pip installed. Then, execute:
$ pip install awscli
$ pip install aws-mfa
- terraform: the Terraform binary itself.
- a Terraform plugin (optional), available in the extensions/plugins tab/section of your IDE.

First of all, we need to set up our environment to later be able to deploy our resources. To do so, we must follow the AWS documentation. We describe the procedure below anyway.
In the IAM section of the AWS Console, go to Users and select the user we want to use for the deployment. Select Security credentials and Create access key (if not previously created). Copy the access key ID and secret access key provided.
If you do not find the ~/.aws/credentials file in the cases below, type:
$ aws configure
and set the default region (e.g. eu-west-1).
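As a reference, a typical aws configure session looks like this (the values shown are placeholders):

```
$ aws configure
AWS Access Key ID [None]: YOURACCESSKEYID
AWS Secret Access Key [None]: YOURSECRETACCESSKEY
Default region name [None]: eu-west-1
Default output format [None]: json
```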
If you are not using MFA and simply use a single account for users and services (which is not recommended), just set the values copied above in a ~/.aws/credentials file:
[default]
aws_access_key_id = YOURACCESSKEYID
aws_secret_access_key = YOURSECRETACCESSKEY
If you use MFA, a user account for the IAM users and a service account for the services (where you want to deploy your services), as shown in the image below…
… you must follow this configuration:
$ pip install aws-mfa
[profile default]
...
[profile terraform-long-term]
region = eu-west-1
output = json
role_arn = arn:aws:iam::123456789012:role/role_to_be_assumed
source_profile = default
Note that the source_profile is the profile that gives us permissions to perform the assume-role operation. Therefore, we need to define the credentials for it in the ~/.aws/credentials file:
[terraform-long-term]
aws_access_key_id = YOURACCESSKEYID
aws_secret_access_key = YOURSECRETACCESSKEY
aws_mfa_device = arn:aws:iam::123456789012:mfa/iam_user
Then, run:
$ aws-mfa --profile terraform
This will place the right credentials in the ~/.aws/credentials file. However, note that these credentials will expire and we will need to execute the command above again once they have expired.
Task: Set up your own environment.
A Terraform project might consist of the following templates:
main.tf: It is the main template. This is the template which is called, for instance, when a terraform apply or terraform destroy is executed. It contains the resource definitions and data sources. If modules are defined, it can invoke those modules.
variables.tf: It contains the variable definitions and, optionally, the default values. These variables will be called from the main file and are global to the folder/module1.
terraform.tfvars: It contains the values for the variables defined in variables.tf.
outputs.tf: It contains the outputs variables of the project/module (if any).
terraform.tfstate: This is a generated file that contains the state of the deployment. Even though it is generated, its existence is mandatory once we have done the deployment. If it is removed or lost, Terraform will consider that nothing has been deployed and, hence, we would lose the “connection” with the previously deployed resources.
MAIN: From the list of files above, the only one which is mandatory is main.tf.
VARIABLES: If the terraform.tfvars file is not created or some variable values (without default values) are not defined there, the missing values will be requested later from the command line.
OUTPUTS: The outputs.tf file will contain only the output variables we want to expose. The names of those variables can be different from the names of the resources created.
MODULES: A project might contain one or more modules placed in subdirectories. The file structure of a module is identical to the root folder; consequently, the only mandatory file is main.tf. Additionally, the name of the module will be the name of the directory where those files are contained.
1 As we will see, we can also define local variables. However, these local variables will be defined in the main.tf file.
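Putting the files above together, a minimal project layout might look like this (the bastion module folder is just an illustrative example):

```
project/
├── main.tf
├── variables.tf
├── terraform.tfvars
├── outputs.tf
├── terraform.tfstate      (generated after the first apply)
└── bastion/               (module "bastion")
    ├── main.tf
    ├── variables.tf
    └── outputs.tf
```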
Task: Create a project with the basic directory structure and initialize it with Terraform (use terraform init).
Note: You do not need to create the module folder.
First of all, if you are using a single account for IAM users and services, and you are not using MFA, you just need to set the following configuration in the main.tf (this is the simplest case, but not recommended):
provider "aws" {
version = "~> 2.40"
region = "<region>"
profile = "<profile>"
}
In this case, if no profile is specified, the default profile will be used. However, this profile might not be defined or, even if it is, we might want to use a different one.

In AWS it is recommended to use an account for users separate from the account where the services will be deployed. Additionally, it is recommended to use MFA; for that, follow the AWS Set up section. To use a service account by having a user assume a role, you need to set the following configuration:
provider "aws" {
version = "~> 2.40"
region = "<region>"
profile = "<profile>"
assume_role {
role_arn = "<role_to_be_assumed>"
}
}
In this case, role_to_be_assumed is a role defined in the service account that grants the user permissions to access the services in that account.
The first step before deployment is to initialize the project. This will download the provider plugin and the modules used. To do this, just type:
$ terraform init
You will need to do this each time you add a module (either local or external).
You can play with different commands to check that your code is prepared and ready for deployment:
$ terraform fmt
You need to do this in all your folders (root directory and modules). Otherwise, the formatting check and fix will only apply to the current directory.
$ terraform plan
In case of success, it will indicate the changes that will be performed during the deployment.
In case of an update of your infrastructure, it will show which resources will be created, which ones will be removed and which ones will be updated (following the current state in the terraform.tfstate file).
Now that you have checked that everything is OK, you are ready for the deployment.
Remember to run the aws-mfa tool as indicated in a previous section to set the updated credentials (if applicable). Then:
$ terraform apply
You will be asked whether you want to actually deploy the infrastructure. Type yes.
To skip the human validation and confirm that you want to (un-)deploy your infrastructure, you can add the -auto-approve option to terraform apply.
Task: Try these commands with your (empty) project.
To undeploy your code, just type:
$ terraform destroy
The destruction of services will be based on the terraform.tfstate file. If you remove that file, Terraform will not know which services it needs to destroy and, consequently, nothing will be done.
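Since the whole destroy operation depends on terraform.tfstate, a simple precaution (purely illustrative, not an official Terraform mechanism) is to back up the state file before destructive operations:

```shell
# Hypothetical safeguard: back up the state file (if present) before a
# destructive command such as "terraform destroy".
if [ -f terraform.tfstate ]; then
  cp terraform.tfstate "terraform.tfstate.backup-$(date +%Y%m%d)"
fi
```

In production setups, a remote state backend with versioning (e.g. an S3 bucket) is the usual way to protect the state.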
These are the most useful Terraform options:
You can find all the options by typing terraform -h.
The definition of variables should be done in the variables.tf file.
A variable must/can contain:
type: The type of the variable. The supported types are string, number and bool, as well as the complex types list, set, map, object and tuple.
default value: It is specified with the default clause. It provides a default value if one is not specified anywhere else.
description: A description for the variable, i.e. the usage of the variable
validation: A rule to validate the value assigned to a variable. It contains a condition, e.g. can(regex("^sg-", var.name)), and an error message.
Note: Only the type clause is mandatory in the variable definition.
variable "<var_name>" {
type = <type>
description = "<Usage of the variable>"
validation {
condition = <condition the variable value must satisfy>
error_message = "<Message to display>"
}
}
Note: Variable validation is in experimental phase. We have to enable this experimental feature in our module or main project:
terraform {
experiments = [variable_validation]
}
Example:
variable "sg_bastion_host_name" {
type = string
description = "Security group for the bastion host"
default = "bastion_sg_name"
validation {
condition = !can(regex("^sg", var.sg_bastion_host_name))
error_message = "The security group cannot start with the prefix \"sg\"."
}
}
A variable can be accessed from the same directory where the variables.tf file is placed, including the variables.tf file itself. A variable is accessed by using the var.<var_name> syntax1. For instance, for a variable named tags defined in variables.tf, we would access it in the following way:
resource "aws_instance" "bastion_host" {
...
tags = var.tags
}
The values of the variables specified in variables.tf can be:
The second one is the most recommended one because:
In general, we use the terraform.tfvars file to define the values of variables. However, we can also use these other names:
- terraform.tfvars.json
- any files with names ending in .auto.tfvars or .auto.tfvars.json
The values for the variables in that file can be specified by setting name = value. For instance:
security_group_name = "bastion_name_sg"
Values can also be set as environment variables by using the TF_VAR_ prefix. For instance:
$ export TF_VAR_security_group_name=bastion_sg_name
Finally, values can be passed with the -var option when calling the terraform command, e.g.:
$ terraform apply -var="security_group_name=bastion_sg_name"
We can define local variables in the main.tf file if we do not want to define some values globally. This can be done in the locals section.
locals {
mylocalvar = "this is a local variable value"
tags = {
Name = "Value"
}
}
To use these local variables, just refer to them by using the local. prefix. For instance:
resource "aws_instance" "bastion_host" {
...
tags = local.tags
}
Task: Create the following set of variables:
- a set of variables in the variables.tf file with a default value
- a set of variables in the variables.tf file without a default value
- for some of the variables defined above, add values in a terraform.tfvars file
- create a local variable in the main.tf file
1 In old versions of Terraform (<= 0.11), variables were referenced within a ${} block, e.g. ${var.variable_name}.
Output variables are specified in a file named outputs.tf. The structure of an output variable is as follows:
output "<var_name>" {
value = <value_reference>
description = "<description of the output variables>"
depends_on = [
<dependent_resource>
]
}
From the parameters of an output variable, only the value is mandatory.
For instance, we can define an output as:
output "bastion_ip" {
value = aws_instance.bastion.public_ip
description = "Public IP of the bastion host"
}
Output variables can be accessed once the infrastructure has been deployed by using the output option and, optionally, the variable name. For instance:
$ terraform output bastion_ip
50.17.232.209
We can access output values from child modules by referring to them with the syntax module.<module_name>.<output_variable>. For instance:
elb_endpoint_url = module.elb.elb_endpoint_url
A resource is an object in the infrastructure. Each resource block is equivalent to one or more objects in the infrastructure.
The resource definition structure is as follows:
resource "<resource_type>" "<resource_name>" {
<parameter> = <parameter_value>
}
The resource_name together with the resource_type identifies a resource. In other words, a resource is an instance of a resource type. It is equivalent to Java, where we have classes and objects.
The list of Terraform resource types is available in the Terraform documentation.
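As a concrete sketch (names and values are illustrative, following the bastion-host theme of this workshop), a security group resource could be defined as:

```
resource "aws_security_group" "bastion_sg" {
  name        = "bastion_sg_name"
  description = "Security group for the bastion host"

  # Allow inbound SSH from anywhere (for illustration only)
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```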
For each resource type, Terraform usually gives us an example of how we can build it with some of the options1:
The defined resources are deployed in parallel unless some dependency between them is set. Implicit dependencies are created when a resource uses the value of another resource. For instance, in the image below, the aws_elb_listener has a dependency on aws_elb_target_group because it needs the value of the target group to create the listener.
Sometimes, we need to create resources conditionally, depending on specific parameter values. Conditions are specified through counters within resources:
count = (<condition> ? 1 : 0)
A condition can be anything, such as the length of a variable being non-zero, the value of a variable being equal to some value, etc. Following the structure above, if the condition is satisfied, the resource will be created. Otherwise, the resource creation will be skipped. However, we can invert this behavior by either negating the condition or by setting:
count = (<not_condition> ? 1 : 0)
As an example of conditional resource creation, see the image above (the aws_elb_listener is created only if var.elb_cert_id is set).
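As an illustrative sketch (the resource names and the use_elastic_ip variable are assumptions, not taken from the workshop code), a conditional resource could look like this:

```
# Hypothetical example: allocate an Elastic IP for the bastion host
# only when var.use_elastic_ip is true.
resource "aws_eip" "bastion_eip" {
  count    = (var.use_elastic_ip ? 1 : 0)
  instance = aws_instance.bastion_host.id
  vpc      = true
}
```

Note that a resource created with count must be referenced with an index, e.g. aws_eip.bastion_eip[0].public_ip.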
Task: Create the following infrastructure:
Install the httpd service, create a sample HTML page and try to access it (check the security groups if it does not work). We will see in section [data sources](#Adding external files) how to add these actions to automate the user_data.

Clue: To help you with this, find these resources implemented in the https://github.com/ronaldtf/aws-misp project (search in the vpc and elb modules).
1 source: https://www.terraform.io/docs/providers/aws/d/s3_bucket.html
A module is a folder with resources that pursue the same goal. For instance, we can create a module for a bastion host (which will contain the EC2 instance, its security group, etc.), a module for an ELB (with the target groups, security groups, etc.).
Apart from the modules you define, Terraform community has a set of modules for AWS which are publicly accessible. Find them here.
A module follows the same structure as the root of the project, as specified above. It will have inputs, a main file and outputs (as well as variable values, if needed). It follows exactly the same rules as the root folder of the project, but it lives in a subfolder within it.
Treat a module as an isolated subproject within the main Terraform project. The module is isolated from the root project, even though the root project has access to the module output variables.
Assume that we have created a module named bastion (therefore, in a folder with that name) with a set of input variables:
[variables.tf]
variable "bastion_name" {
type = string
}
variable "bastion_subnet_id" {
type = string
}
and a set of output variables:
[outputs.tf]
output "bastion_public_ip" {
value = aws_instance.bastion.public_ip
}
output "bastion_sg" {
value = aws_security_group.bastion_sg.id
}
The provider is implicit in a module and is inherited from the root project. However, nothing forbids specifying a provider in the module itself, whether it is the same as or different from the root project's.
To call such module from the root project, in the main.tf
file, just write:
module "bastion" {
source = "./bastion"
bastion_name = var.name
bastion_subnet_id = var.bastion_subnet_id
}
(in this case, we have defined the variable values in the variables.tf file)
The source part is mandatory when calling a module. It tells Terraform where to find the code for the module.
To refer to the output values from the module, we specify module.<module_name>.<output_variable>.
For instance, we can use it as an output of the root project:
[outputs.tf]
output "bastion_ip" {
value = module.bastion.bastion_public_ip
description = "Bastion host public IP"
}
Task: Create modules for the example in the previous section and add a bastion host.
Data sources are mechanisms to retrieve data which, in general, is not part of our infrastructure but needs to be used in our Terraform templates.
Examples of data sources are:
The structure of a data source definition is:
data "<datasource_type>" "<datasource_name>" {
<parameter> = <parameter_value>
}
Let’s look at an example. Imagine we need to retrieve the most recent AMI ID for an Ubuntu distro, with the HVM virtualization type and EBS as the storage volume, and we know that the owner is 099720109477. To do so, we need to create a data source with the following configuration:
data "aws_ami" "ubuntu_ami" {
filter {
name = "name"
values = ["ubuntu*"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
filter {
name = "root-device-type"
values = ["ebs"]
}
filter {
name = "state"
values = ["available"]
}
owners = ["099720109477"]
most_recent = true
}
Notes:
- With most_recent = true we only retrieve the latest matching AMI.
- The retrieved AMI ID can be referenced as data.aws_ami.ubuntu_ami.id.
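The retrieved value can then be consumed from a resource. A minimal sketch (the instance type and resource name are illustrative):

```
resource "aws_instance" "bastion_host" {
  ami           = data.aws_ami.ubuntu_ami.id
  instance_type = "t2.micro"
}
```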
Task 1: Retrieve the availability zones for a given region.
Task 2: Find a way to get the public IP of your host with data sources.
One of the advantages of Terraform over CloudFormation is that it allows invoking external scripts. In general, those scripts are suffixed with the .tpl extension. A typical example is to use those scripts as the user_data of an EC2 instance. For instance, we can define the following file…
[user_data.sh.tpl]
#! /bin/bash
yes | sudo apt-get update
yes | sudo apt-get upgrade
yes | sudo apt-get install squid
… and add it as a data source:
data "template_file" "user_data" {
template = file("${path.module}/user_data.sh.tpl")
}
We can also pass parameters to a script by specifying the vars clause. For instance (in this case, var.elb_url indicates that it is a variable that comes from the variables.tf file):
data "template_file" "user_data" {
template = file("${path.module}/user_data.sh.tpl")
vars = {
elb = var.elb_url
}
}
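A variable passed through vars is referenced inside the template with the ${...} syntax, and the rendered result is consumed through the rendered attribute of the data source. A sketch under those assumptions (the file content and resource names are illustrative):

```
[user_data.sh.tpl]
#! /bin/bash
echo "ELB endpoint: ${elb}" >> /etc/motd

[main.tf]
resource "aws_instance" "bastion_host" {
  ...
  user_data = data.template_file.user_data.rendered
}
```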
We can set the following environment variables for debugging:
Task: Set some of these variables, especially TF_LOG and TF_LOG_PATH, and analyze the traces.
Now that you know…
… it’s time to put your hands on new projects and start new challenges!