Why I Started Using and Recommending Terraform

When one of our Fortune 500 clients asked Will Rubel why they should use Terraform over other similar solutions, he wrote this article to answer their question.

Having used AWS Config, AWS CodePipeline, and AWS CloudFormation for many years, I had everything automated and working smoothly, and did not see the need for tools external to the AWS environment.

I looked at HashiCorp’s Terraform and saw it merely as a tool to automate deployments and track the state of resources. And I was already doing this with AWS CloudFormation and AWS Config. 

My argument to those pushing for Terraform: Why would I need to utilize Terraform?

Eventually, I had a situation that caused an outage. We had pushed an update to one of our critical CloudFormation infrastructure stacks, and the stack entered a failed state. It could not be rolled back or deleted. This led to an AWS support call, and the stack was destroyed manually by AWS.

Unfortunately, other stacks had a dependency on the failed stack, and the failure required the complete production infrastructure to be destroyed and rebuilt.

After we fixed the problem, I thought this was a rare anomaly that probably would not happen again. Six months later, I had another failed stack. Fortunately, this stack was not a critical piece of infrastructure, and it could be redeployed.

After this had occurred several times over the course of several years, I concluded it was a bug in the AWS CloudFormation deployment and update process, and began to research other options.

I began to look more seriously at Terraform again. 

Could using Terraform for deployments prevent this situation? 

The answer is yes.

Terraform uses templates to deploy resources, but the resources are not coupled together the way they are in a CloudFormation stack. Instead, Terraform tracks each resource individually, so a resource can be updated, or a failed change corrected and re-applied, without the whole deployment becoming stuck in a failed state.

As I dug deeper into Terraform, I also discovered many additional features which AWS did not or could not provide because of the way the AWS ecosystem was built.

Some of those additional features include:

  • Test suite to allow security and compliance testing of templates prior to deployment.
  • Local AWS cloud environments which allow the deployment of AWS resources locally to test templates and reduce costs and the number of unnecessary deployments.
  • Ability to leverage and integrate features from other cloud platforms, such as Azure Active Directory, into AWS deployments.
  • Ability to build your own Terraform modules to standardize organizational deployments. This allows all deployments to be exactly the same, with the same tags, security, etc.
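To make the first and last points concrete, here is a minimal sketch of a pre-deployment compliance check, written in Python. It is not a tool named in this article; it only assumes you have exported a saved plan to JSON with `terraform show -json` and that your organization requires certain tags. The tag names, the `find_untagged` helper, and the sample plan are hypothetical and exist only for illustration.

```python
import json

# Hypothetical organizational standard: every taggable resource needs these tags.
REQUIRED_TAGS = {"Owner", "Environment"}


def find_untagged(plan: dict, required=REQUIRED_TAGS) -> dict:
    """Return {resource address: sorted missing tag keys} for planned resources."""
    problems = {}
    for rc in plan.get("resource_changes", []):
        change = rc.get("change") or {}
        after = change.get("after")
        if after is None:
            # The resource is being deleted, so there is nothing to check.
            continue
        if "tags" not in after:
            # Assumption: a resource type without a "tags" attribute is not taggable.
            continue
        tags = after.get("tags") or {}
        missing = sorted(required - set(tags))
        if missing:
            problems[rc["address"]] = missing
    return problems


# Trimmed, hand-written sample that mimics the shape of a JSON plan.
SAMPLE_PLAN = json.loads("""
{
  "resource_changes": [
    {
      "address": "aws_s3_bucket.logs",
      "type": "aws_s3_bucket",
      "change": {"actions": ["create"],
                 "after": {"bucket": "example-logs", "tags": {"Owner": "platform"}}}
    },
    {
      "address": "aws_instance.web",
      "type": "aws_instance",
      "change": {"actions": ["create"],
                 "after": {"tags": {"Owner": "web", "Environment": "prod"}}}
    },
    {
      "address": "aws_s3_bucket.old",
      "type": "aws_s3_bucket",
      "change": {"actions": ["delete"], "after": null}
    }
  ]
}
""")

if __name__ == "__main__":
    for address, missing in find_untagged(SAMPLE_PLAN).items():
        print(f"{address} is missing tags: {', '.join(missing)}")
```

In practice the plan would come from something like `terraform plan -out=tfplan` followed by `terraform show -json tfplan > plan.json`, and a check like this could run in the pipeline before `terraform apply`, so a non-compliant template never reaches an AWS account.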

At this point, I was totally sold on using Terraform and began developing my own modules. Unfortunately, many cloud engineers do not see the same benefits and they make the same arguments I previously made.

  • Everything already works for us, and we don’t need to add another layer of complexity.
  • We don’t need to test our infrastructure code; only developers do that.
  • We can ensure compliance by using AWS Config.
  • Why do you need to test locally, when it is very inexpensive to just deploy to an actual AWS environment?
  • We don’t need modules and libraries for infrastructure code.
  • We only use AWS, and do not use any other cloud environments.

While I can empathize with their arguments, eventually they will experience multiple failed CloudFormation stacks and research ways to prevent it from happening again.

While they may not come to the same conclusion I did, I would hope they will look at Terraform as a potential solution.

Automate and scale business outcomes

Trility teams leverage Terraform and other HashiCorp solutions to help companies run securely and reliably in the cloud – getting it done in days, not the weeks and months historically required by infrastructure build-outs.