Building an IaC security and governance program step-by-step

Infrastructure as code (IaC) is undoubtedly changing how engineers approach the cloud. IaC, when coupled with dev tools and automation, opens up new avenues for infrastructure performance, scalability, and now security.

By embedding IaC security and compliance controls into your version control systems and CI/CD pipelines, you can start identifying and fixing errors earlier. To do so without disrupting your developers, you need a strategy for where and how to enforce security controls at each stage, from early experimentation to ongoing governance.

This post will share a technical process for approaching and building an internal IaC security strategy.

What are your infrastructure security goals?

The key to implementing the most sustainable infrastructure security program lies in leveraging the tools and workflows developers already use. For many, that may be easier said than done because security tooling traditionally lacks the access and context necessary to play nice with dev workflows. The IaC security approach is fundamentally different from the more reactive cloud security monitoring and alerting that happen independent of developer workflows.

As you start paving the road ahead for continuous infrastructure security processes, start by defining your ideal state. That way, as you go through the process, you’ll have an idea as to how and where you want to enforce different security and compliance controls.

What policies do you want to enforce, and how?

Your infrastructure security program is highly dependent on your organization’s security priorities and maturity, and you should read this guide with your unique needs in mind.

If you want to improve your overall cloud security posture, you may want to look for ways to apply common security policy frameworks such as the Center for Internet Security (CIS) Benchmarks. Several open source tools, such as the Checkov scanner or the Open Policy Agent (OPA) policy engine, can enforce policy-as-code at the IaC level.

They’re usually available as standalone containers designed to accept a file, directory, repository, branch, etc., and to output scan results in a machine-readable format such as JSON or JUnit XML. The most significant advantage of an open source scanner is that it pre-packages valuable security testing logic that your developers won’t have to learn.
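Machine-readable output is what makes scan results easy to post-process. The following is a minimal sketch of grouping failed checks by file. The report layout and the field names (results, failed_checks, check_id, resource, file_path) are assumptions made for illustration, so check your scanner's documentation for its real schema.

```python
import json

# Hypothetical scanner report; real tools define their own field names.
SAMPLE_REPORT = """
{
  "results": {
    "failed_checks": [
      {"check_id": "EXAMPLE_001", "resource": "aws_s3_bucket.logs", "file_path": "/storage.tf"},
      {"check_id": "EXAMPLE_002", "resource": "aws_security_group.web", "file_path": "/network.tf"},
      {"check_id": "EXAMPLE_001", "resource": "aws_s3_bucket.assets", "file_path": "/storage.tf"}
    ]
  }
}
"""


def failed_checks_by_file(report_text: str) -> dict[str, list[str]]:
    """Group failed checks by the file that triggered them."""
    report = json.loads(report_text)
    grouped: dict[str, list[str]] = {}
    for finding in report.get("results", {}).get("failed_checks", []):
        summary = f'{finding["check_id"]} on {finding["resource"]}'
        grouped.setdefault(finding["file_path"], []).append(summary)
    return grouped


if __name__ == "__main__":
    for path, findings in failed_checks_by_file(SAMPLE_REPORT).items():
        print(path)
        for summary in findings:
            print("  -", summary)
```

Grouping by file is only one option. The same loop could just as easily group by check ID or by owning team.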

The alternative to a scanner is a full-featured security product, preferably one with exposed APIs and webhook integration sources. Most developer-first security solutions have done this part of the heavy lifting, so you should be getting some of this out-of-the-box.

If you need to enforce external controls such as industry benchmarks or regulatory compliance requirements, you’ll need a more structured approach that maps individual checks to desired outcomes. Many cloud providers include robust compliance reporting for standard benchmarks such as HIPAA, NIST, PCI-DSS, etc. Make sure your policy enforcement approach enables you to build your own policies so you can adopt those controls as well.
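One way to picture mapping individual checks to desired outcomes is a simple lookup table. The sketch below assumes hypothetical check IDs and placeholder control labels (EXTERNAL-BENCHMARK-A, INTERNAL-POLICY-1). None of them refer to real benchmark requirement numbers.

```python
# Hypothetical mapping from scanner check IDs to the controls they support.
CONTROL_MAP = {
    "EXAMPLE_001": ["EXTERNAL-BENCHMARK-A", "INTERNAL-POLICY-1"],
    "EXAMPLE_002": ["EXTERNAL-BENCHMARK-B"],
}


def controls_at_risk(failed_check_ids):
    """Return each affected control with the failed checks that map to it.

    Checks without a mapping are collected under "UNMAPPED" so that gaps
    in the mapping stay visible instead of being silently dropped.
    """
    at_risk = {}
    for check_id in failed_check_ids:
        for control in CONTROL_MAP.get(check_id, ["UNMAPPED"]):
            at_risk.setdefault(control, set()).add(check_id)
    return at_risk


if __name__ == "__main__":
    for control, checks in sorted(controls_at_risk(["EXAMPLE_001", "EXAMPLE_999"]).items()):
        print(control, sorted(checks))
```

Custom policies you write yourself would get their own entries in the same table, which keeps external and internal controls reportable in one place.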

Where should you enforce infrastructure policies?

Each of your teams likely has its own frameworks and processes in place, and it’s important to have a full lay of the land before diving in. Start with a simple mapping exercise and list all the configuration frameworks in use, from CloudFormation or Azure Resource Manager (ARM) to Terraform or Pulumi.

Once you know what frameworks exist internally, the next step is to determine the best place to scan them for infrastructure security errors. With many frameworks in place, building out new processes from scratch would be an uphill battle. Depending on your organization’s DevOps maturity, you’ll likely find an existing foundation on which you can build.

Understand your teams’ end-to-end code review process—from committing new infrastructure changes to CI/CD build pipelines and deployments. Your team likely has some automated tests—unit, acceptance, integration, etc.—in place to ensure updates work as intended and don’t introduce bugs. Use those processes as an example for inserting security scanning. To better understand those processes, you might want to ask questions such as:

  • What triggers a CI/CD build?
  • Do your current automated quality assurance efforts run continuously on every new pull request and commit?
  • Do more time-consuming tests such as integration tests run on less-than-continuous intervals such as nightly or only on branches following specific naming conventions?
  • Are developers testing their work locally before committing?
  • How many CI/CD pipelines are in use?

How and where you implement your security scanning will depend on those processes and the tools used to do them. Based on your current setup, you’ll generally want to take one of two approaches to IaC security—leveraging CI/CD processes or the built-in functionality from your version control system (VCS).

Approach #1: CI/CD pipeline

Building security controls as part of CI/CD pipelines is probably the easier of the two approaches because modern CI/CD tools have completely abstracted away the complexity behind orchestrating test logic.

In many cases, scanning in CI pipelines is the fastest way to implement an IaC security and governance program. If a modern CI tool is already in place, you can probably set it up in less than 10 minutes. If you go down this path, think about what happens to the scan results afterward, and spend time dry-running the scan with the team that will be in charge of handling its output.

For teams operating in a centralized infrastructure operation model, this is an optimal solution. Expertise and control can remain in the hands of a small few, and the continuous enhancements to this process will not affect a large population of developers. For everyone else—continue reading.

Approach #2: Version control system

Your version control system (VCS) of choice also offers some robust built-in security governance controls. Apart from their individual automation capabilities, all three platforms currently dominating the market (GitHub, GitLab, and Bitbucket) offer embedded code review and authorization controls that you can use to better control who can change what and where.

Through their native versioning and branching, developers can perform extensive testing without compromising any running production system and continue to collaborate and converse before making changes due to a test.

Embedding infrastructure security here gives you more fine-grained access and controls. Two of the more common methods build on the standard pull request (PR) review process:

  • PR annotations: Start simple by choosing the path of least resistance. Using a standard webhook or API interface, you can start by scanning every PR and annotating its results with concrete secure coding advice. The benefit of the annotation is that it’s frictionless. Developers can choose to adopt or ignore it—little frustration, but on the other hand, not a lot of governance.
  • Checks: With native CI now part of most git platforms, you can also run a standard CI pipeline directly from your VCS. The main benefit is a CI-like scan on each individual PR. It surfaces the same kind of output at an earlier stage and lets developers react before a build actually fails in the pipeline.
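To make the PR annotation idea concrete, here is a minimal sketch that turns scan findings into the text of a review comment. The finding fields (file, line, check_id, advice) are hypothetical, and actually posting the comment would go through your VCS platform's API or webhook integration, which is deliberately left out.

```python
def build_pr_comment(findings: list[dict]) -> str:
    """Format scan findings as a plain-text comment for a pull request."""
    if not findings:
        return "IaC scan: no policy violations found in this pull request."
    lines = [f"IaC scan: {len(findings)} policy violation(s) found."]
    # Sort by location so the comment reads in the same order as the diff.
    for item in sorted(findings, key=lambda i: (i["file"], i["line"])):
        lines.append(
            f'- {item["file"]}:{item["line"]} [{item["check_id"]}] {item["advice"]}'
        )
    return "\n".join(lines)


if __name__ == "__main__":
    example = [
        {"file": "storage.tf", "line": 12, "check_id": "EXAMPLE_001",
         "advice": "Enable encryption at rest for this bucket."},
        {"file": "network.tf", "line": 4, "check_id": "EXAMPLE_002",
         "advice": "Restrict inbound access to known CIDR ranges."},
    ]
    print(build_pr_comment(example))
```

Because the comment is advisory, a developer can act on it or ignore it. Turning the same findings into a failing check is what moves this from annotation to enforcement.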

How should you enforce security output?

Before you introduce intrusive controls and overwhelming security scanning results, it’s a good idea to set expectations for how security output should be actioned. Nobody wants to be bombarded by dozens of failure indications with no clear path to justify their decisions. If your IaC security strategy is noisy or ends up becoming a blocker, your team won’t adopt it.

Define how passive or active you want the output of your security controls to be. Should it actively block builds, or provide passive feedback to be addressed within a predefined SLA?
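The passive-versus-active decision can be sketched as a small gate that decides the exit code a pipeline step returns. The mode names, the Severity levels, and the block_at threshold are all assumptions made for illustration, not settings of any particular scanner.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def pipeline_exit_code(severities, mode="passive", block_at=Severity.HIGH):
    """Return 0 to let the build continue, or 1 to fail it.

    passive: always report, never block.
    active:  block when any finding is at or above the threshold.
    """
    if mode == "passive":
        return 0
    if mode == "active":
        return 1 if any(s >= block_at for s in severities) else 0
    raise ValueError(f"unknown enforcement mode: {mode!r}")


if __name__ == "__main__":
    found = [Severity.LOW, Severity.HIGH]
    print(pipeline_exit_code(found, mode="passive"))
    print(pipeline_exit_code(found, mode="active"))
```

Starting in passive mode and raising enforcement one severity level at a time is one way to introduce blocking gradually.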

Some scanners can even integrate further into pipelines and orchestrate more advanced actions, such as blocking a build from completing until testing is complete. Scanning in CI was generally meant to be passive; it wasn’t built for interactive testing and troubleshooting. Should a development pipeline hold the answers to the complex questions that come up when resolving a security error or compliance violation?

Ask yourself who is going to get the notification when that build fails. Continue by imagining if your CI console is where you want to troubleshoot a misconfiguration. Will you have sufficient context to solve the kind of issues your scanner is going to raise? Will you be able to deduce the impact of a change you’ll make to pass a CI scan?

In addition to the previously mentioned benefits of hinging your IaC security strategy on your VCS, your VCS platform offers more fine-grained controls such as:

  • Approval rules: With a VCS, you can define how many approvals a pull request needs before it can be merged. Some even let you select which specific users those approvals must come from.
  • Code owners: If you have admin access to your main repo, consider using a CODEOWNERS file to define a stricter approval protocol for more sensitive files.
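The two controls above are configured in the VCS platform itself, but the logic they express can be sketched in a few lines. The rules, team handles, and default below are hypothetical, and fnmatch-style patterns are a simplification of the pattern syntax real CODEOWNERS files use.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: (path pattern, approvals required, owning team).
REVIEW_RULES = [
    ("prod/*", 2, "@platform-security"),
    ("*.tf", 1, "@infra-team"),
]
DEFAULT_APPROVALS = 1


def review_requirements(changed_files):
    """Return (approvals needed, sorted owners) for a set of changed files.

    The strictest matching rule sets the approval count, and every
    matching rule contributes its owner as a required reviewer.
    """
    approvals = DEFAULT_APPROVALS
    owners = set()
    for path in changed_files:
        for pattern, needed, owner in REVIEW_RULES:
            # Note: "*" here also matches "/" characters in the path.
            if fnmatchcase(path, pattern):
                approvals = max(approvals, needed)
                owners.add(owner)
    return approvals, sorted(owners)


if __name__ == "__main__":
    print(review_requirements(["prod/network/main.tf", "README.md"]))
    print(review_requirements(["README.md"]))
```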

Some of the more mature tools also support advanced functionalities such as importing custom rule files and applying suppression and skips as code annotations. Those should help you make sure that only relevant checks eventually crop up.
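As a rough illustration of suppression as a code annotation, the sketch below filters out findings that a developer has explicitly skipped with a justification. The "# iac-skip:" comment format is invented for this example. Real scanners define their own annotation syntax and usually scope a skip to a single resource rather than to a whole file.

```python
import re

# Hypothetical annotation: "# iac-skip: CHECK_ID reason for the exception".
# A skip without a written reason is intentionally not honored.
SKIP_PATTERN = re.compile(
    r"#\s*iac-skip:\s*(?P<check_id>[A-Z0-9_]+)\s+(?P<reason>\S.*)$"
)


def suppressed_checks(source: str) -> dict[str, str]:
    """Map each suppressed check ID to its stated justification."""
    skips = {}
    for line in source.splitlines():
        match = SKIP_PATTERN.search(line)
        if match:
            skips[match.group("check_id")] = match.group("reason").strip()
    return skips


def relevant_findings(failed_check_ids, source):
    """Drop findings the file has explicitly suppressed."""
    skips = suppressed_checks(source)
    return [check_id for check_id in failed_check_ids if check_id not in skips]


EXAMPLE_SOURCE = """resource "aws_s3_bucket" "site" {
  # iac-skip: EXAMPLE_001 public static website, approved by security
  bucket = "example-site"
}
# iac-skip: EXAMPLE_002
"""

if __name__ == "__main__":
    print(suppressed_checks(EXAMPLE_SOURCE))
    print(relevant_findings(["EXAMPLE_001", "EXAMPLE_002"], EXAMPLE_SOURCE))
```

Requiring a reason keeps each exception reviewable in the same pull request as the change it applies to.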

Conclusion

There is no one-size-fits-all strategy for infrastructure security, and we as an industry are still defining the best practices for embedding it into IaC. It’s up to every security team to work closely with their engineering and DevOps teams to make security consumable and, ultimately, to prevent risk as early in the development life cycle as possible.
