Unexpected benefits of using Terraform for small installations

December 10, 2018

When we think about tools like HashiCorp’s Terraform, we usually think about orchestrating and managing large, scalable infrastructure. And many of the examples out there make the same assumptions.

But what about reasonably small installations? Setups where you really only have a handful of servers? Would you still use Terraform for such a setup? After all, it could be argued that pointing and clicking in your cloud management interface is going to be quicker than writing and testing infrastructure as code.

As a matter of discipline (and experiment), we’ve now done quite a number of small installations with Terraform, and we’ve been pleasantly surprised at how well it scales down to such small setups. Based on what we’ve gained, I doubt that we’ll ever do another manual setup for production-quality infrastructure, regardless of how simple it is.

Baseline comparison: point-and-click infrastructure

Before we talk about the benefits of using Terraform for small installations, what are we comparing against?

We’re comparing against simply signing up for an Amazon Web Services (AWS) account, and then using AWS’ web console to manually manage your VPC, security groups, users, and EC2 instances. This is the way that everyone starts out with AWS, and this is the way you’re probably still doing things if you’re not using an infrastructure-as-code tool like Terraform. It’s the simplest way to get started and “see” the resources that you’re manipulating.

But just like typing code on an interactive terminal (as opposed to putting your code in a script), you start to hit the inherent disadvantages of the point-and-click approach very quickly.

Here are several very real benefits that we’ve realized when using Terraform even for small installations:

Benefit #1: Space for useful documentation

Honestly, the simplest, dumbest, and most useful benefit of simple infrastructure as code is the ability to put a block of comments right next to your declaration.

We all know that nothing beats an insightful comment block right next to a tricky set of code. Left and right, we found many places in our infrastructure code that benefited from commenting.

Back when we were creating “simple” infrastructure in a web console, we would be in the mindset that these things were straightforward and unworthy of comment. As soon as we moved it into code, our developer instincts kicked in, and we’d tell the complete story for the next maintainer.

It’s true that you could simply write up a separate document that describes what was done in the web console. And we often used to document our decisions that way. But whenever documentation is separated from implementation, it’s likely to be overlooked at a critical moment, or go out of date.
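As a rough illustration of the kind of comment we mean, here is a minimal sketch in the HCL syntax of Terraform 0.12 and later. The resource name, the `vpc_id` variable, the VPN scenario, and the address range are all hypothetical, chosen only to show a comment block sitting next to the declaration it explains.

```hcl
# Hypothetical input: the ID of the VPC that holds the staging environment.
variable "vpc_id" {
  type = string
}

# Why this exists: the office VPN is the only network allowed to reach SSH.
# We deliberately do NOT open port 22 to 0.0.0.0/0, even temporarily.
# If the VPN's address range changes, update the CIDR below; nothing else
# in this file depends on it.
resource "aws_security_group" "staging_ssh" {
  name        = "staging-ssh"
  description = "SSH access to staging from the office VPN only"
  vpc_id      = var.vpc_id

  ingress {
    description = "SSH from office VPN"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["203.0.113.0/24"] # placeholder range reserved for documentation
  }
}
```

The comment answers the question a web console never can: not what the rule is, but why it was written that way.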

Benefit #2: History and accountability

Having an annotated version history of your infrastructure helps tremendously when you encounter something that you didn’t expect. We’ve found that infrastructure as code is littered with key settings and values, where a single commit may change only a couple of characters but have a dramatic impact.

If we imagine our future selves looking at our infrastructure several months from now, one of the most common questions we’d ask is, “I can see what the code says, but how did it get this way?” There’s nothing like a version history to help jog your memory, or reveal a colleague’s addition that you were never aware of.

Benefit #3: Compare actual vs. expected state of your infrastructure

Once your infrastructure is declared as code and committed to a repository, the `terraform plan` command becomes a powerful tool for comparing what you think the infrastructure is with what it actually is.

Certainly you can run this command manually to make sure that the code that you’re looking at accurately describes what’s out there in your cloud account. But it also opens the door for a very useful continuous integration pipeline. By running `terraform plan` periodically, or before you attempt to make any changes, your team can be notified about a change in actual state that isn’t in the code. This could have happened because someone manually changed something via the web console (and didn’t record it in the Terraform code and state) or because some action on the cloud provider’s side caused the state to diverge.

Either way, your Terraform state serves as a very useful snapshot of your assumptions about the state of the system, one that can be used to detect any divergence that might have happened since the last time you ran `terraform apply`.

Benefit #4: Continuous Integration

Taking it a step further, now that we have history and state, we can start to see the benefits of having a simple automated workflow for pushing out changes.

For simple setups, you’re not going to be at the level where you need disciplined workflows like Terraform Enterprise. But even a rudimentary build process can provide efficiency and consistency in how your team coordinates code changes, state changes, and staged deployments of infrastructure.

At the very least, consider a syntax/sanity check on your Terraform code after each commit (for example, `terraform validate`) and a read-only `terraform plan` to ensure that the code that’s committed to your main branch matches the actual state of your system.

Then, as your system gradually gets more sophisticated, you’ll already have a pipeline to leverage for automating more steps.

Benefit #5: A codified representation that matches what’s in your mind

It only takes a few resources before your web console’s representation of your system starts diverging from the logical diagram that’s in your mind.

A simple system may consist of:

  • A staging environment, which contains a single application server, an S3 bucket, a database server, and some security groups
  • A production environment, which largely mirrors the staging environment

In our minds, we naturally think of the system with this primary division — the staging environment on one side and the production environment on the other side.

But when we go to the web console, that’s not how things are organized. Instead, all the EC2 instances are listed together in one place, all the S3 buckets are listed in another place, etc. If we want to “see” any significant piece of the staging environment, we have to bounce around to different areas of the console and ignore the things we see that aren’t part of the staging environment.

Imagine if you were in charge of organizing your team’s documents. You’d probably maintain one folder for each project. Each folder would contain all the documents, images, and spreadsheets associated with that project. What if you came in and someone had reorganized all the documents by type instead? They made new folders and put all the spreadsheets across all projects in one folder, all the text documents in another folder, and all the images in yet another folder. If you wanted to work on a project, you’d need to bounce between all three folders and you’d have to ignore the files that didn’t pertain to your project.

That’s more or less how most cloud providers’ web consoles are organized. In retrospect, it’s insane that we’d ever go there to get a picture of the state of our system.

Instead, we can organize our code according to how we think of the system. If we feel that our security group declarations are tightly coupled to our instances, then we can put them right next to each other in the same file.
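A minimal sketch of that idea, assuming a hypothetical `staging.tf` file and Terraform 0.12 or later syntax. The variable names, ports, and instance type are illustrative assumptions rather than details from any of our real systems.

```hcl
# staging.tf (hypothetical file): everything about the staging application
# server lives here, so the file reads the way we picture the environment.

variable "vpc_id" {
  type = string
}

variable "subnet_id" {
  type = string
}

variable "app_ami_id" {
  type = string
}

# The firewall rules for the staging application server...
resource "aws_security_group" "staging_app" {
  name        = "staging-app"
  description = "Web traffic to the staging application server"
  vpc_id      = var.vpc_id

  ingress {
    description = "HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "Allow all outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# ...sit directly above the server they protect.
resource "aws_instance" "staging_app" {
  ami                    = var.app_ami_id
  instance_type          = "t3.micro"
  subnet_id              = var.subnet_id
  vpc_security_group_ids = [aws_security_group.staging_app.id]

  tags = {
    Name        = "staging-app"
    Environment = "staging"
  }
}
```

Terraform loads every `.tf` file in a directory as one configuration, so how resources are split across files is purely a matter of what reads best for your team.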

Benefit #6: Modularization for clarity and flexibility

Even simple setups benefit from breaking coherent sections into modules.

Simple modules are easy to share across projects. Chances are you’ll be managing another small project sometime in the future. The slight abstraction that a module provides makes it easy to reuse your infrastructure code in the new project.

And early modularization provides a path to scale later. It’s much harder to retrofit a module, particularly in Terraform due to how state is stored. Even if you’re planning on keeping things simple for the foreseeable future, you’re always only one grant or initiative away from being told that it’s a whole new world. It’s best to be ready for success.
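To make this concrete, here is a sketch of a small local module that is used once per environment. The module path, its input variables, and the instance types are hypothetical, and the three files are shown together in one block only for brevity; in a real repository each would be a separate file.

```hcl
# --- modules/environment/variables.tf (hypothetical module) ---

variable "environment" {
  description = "Environment name, used in resource names and tags"
  type        = string
}

variable "instance_type" {
  description = "EC2 instance type for the application server"
  type        = string
}

variable "ami_id" {
  description = "AMI to launch the application server from"
  type        = string
}

# --- modules/environment/main.tf ---

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = var.instance_type

  tags = {
    Name        = "${var.environment}-app"
    Environment = var.environment
  }
}

# --- main.tf (root configuration) ---

variable "ami_id" {
  type = string
}

module "staging" {
  source        = "./modules/environment"
  environment   = "staging"
  instance_type = "t3.micro"
  ami_id        = var.ami_id
}

module "production" {
  source        = "./modules/environment"
  environment   = "production"
  instance_type = "t3.small"
  ami_id        = var.ami_id
}
```

The retrofit problem comes from resource addresses. A resource that starts life at the top level as `aws_instance.app` becomes `module.staging.aws_instance.app` once it moves into a module, and Terraform will plan to destroy and recreate it unless the corresponding state entry is moved as well, typically with `terraform state mv`.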

Benefit #7: You’ll create more robust systems

One of the most surprising benefits of using code for small installations is that we actually wound up with a more secure, more complete system than if we’d created it manually.

When you express your infrastructure as code, you can record certain concepts and details and then file them away mentally so that you can think at a higher level. This abstraction frees your mind to handle additional complexity.

For example, we found ourselves easily adding another security group layer that made the system a little more secure and a little more flexible. We didn’t do it when we were managing the system manually because it felt like overkill, and it was cumbersome to double check it in the web interface. In code, it was straightforward to create and manage, and its value quickly became apparent.
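We won’t reproduce our exact setup here, but one common form of layering, shown purely as an assumed example, is to let the database accept connections only from members of the application servers’ security group rather than from an address range. The names and the PostgreSQL port below are hypothetical.

```hcl
variable "vpc_id" {
  type = string
}

# Attached to the application servers. It carries no rules of its own in
# this sketch; it acts as a membership badge that other groups can refer to.
resource "aws_security_group" "app" {
  name        = "app"
  description = "Identifies application servers"
  vpc_id      = var.vpc_id
}

# Attached to the database. Only traffic from members of the "app" group is
# allowed in, so adding or replacing an application server never requires
# touching the database rules.
resource "aws_security_group" "db" {
  name        = "db"
  description = "Database access from application servers only"
  vpc_id      = var.vpc_id

  ingress {
    description     = "PostgreSQL from application servers"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
  }
}
```

In the web console, the relationship between these two groups is an opaque ID buried in a rule. In code, it is a named reference that a reviewer can follow at a glance.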

Starting early and simple with infrastructure code also allows you to get your legs underneath you before tackling more complex installations. It lets your team establish their coding style and workflow. Find a small pilot project and commit to managing it the same way that you envision handling larger projects. You’ll be glad to have the dry run.

An end-to-end example

Terraform concepts can be difficult to translate into practice because the tool doesn’t dictate how you should organize your code. It’s the opposite of opinionated.

Likewise, there are very few true Terraform frameworks out there. The nature and limitations of Terraform make it possible to abstract concepts in your own code, but the mechanisms just aren’t there to create elegant abstractions of abstractions. Instead, the community tends to rely on code generation and examples.

Over time, we’ve developed a series of patterns and practices that we’ve found to be extremely efficient and useful for creating and managing small- to medium-sized systems. We generalized these patterns from several small, successful production systems that we developed in the past year or so.

This infrastructure code produces a useful, production-quality base installation that can be used for a number of different purposes. It saves considerable time compared to constructing this infrastructure from scratch. And it establishes a number of very helpful coding conventions in the process.

We’ve coupled this code with extensive inline documentation that explains exactly how and why each piece functions as it does, and how to go through the steps of implementing it yourself. When you’re done, you’ll “own” every piece of the resulting code and infrastructure. There’s no magic and no code generation.

Soon, we’ll be releasing these patterns as open source software, so that other teams can benefit from them and build upon them. We call the project Aloha. Sign up for our mailing list in the sidebar if you’d like to be notified when we release it to the public and publish some demonstrations.

We’re obviously biased, but we think that Aloha is the perfect way to break into full-stack production integration of infrastructure as code, particularly for small- to medium-sized installations. At the very least, it provides an open-source repository of patterns and practices that we hope other teams will find useful.

Space to refine your craft

When we create a useful snippet of small procedural code, even though we technically could just type it in on an interactive terminal, we tend to put it in a well-named, well-organized script and annotate it instead. It’s part of being a good team member and a good craftsperson.

Tools like Terraform allow us to treat our infrastructure processes as first-class citizens that are worthy of the same craft as our application code. Create small, well-written, well-documented Terraform repositories for all of your infrastructure, regardless of how small or simple they seem at first. Your teammates (and your future self) will thank you.
