AWS CDK vs. Terraform

First published 2021-10-28 in the Metosin Blog with my colleague Kimmo Koskinen.

Introduction

We are cloud enthusiasts at Metosin, and we have used or tried quite a few infrastructure as code (IaC) tools on all “big three” cloud platforms (AWS, Azure, and GCP). Kari has written some blog posts regarding these experiences.

This autumn, Kari got a chance to try the AWS Cloud Development Kit (CDK) since he had to implement a data pipeline using a relatively new AWS service that still does not have Terraform support. So, we decided to create the data pipeline using AWS CDK and the rest of the AWS infra using Terraform. This blog post describes our experiences using AWS CDK and how it compares to Terraform, our preferred IaC tool (at least for now).

Terraform

Let’s first refresh our memory regarding the old IaC kid on the block, Terraform.

Terraform is an excellent declarative Infrastructure as Code tool. The main advantage of Terraform is that you can use it with all major cloud providers (AWS, Azure, and GCP). Terraform also has a large ecosystem of providers. Using these providers, you can manage infrastructure for all kinds of services (e.g., there is a Terraform provider for MasterCard).

The Terraform language, which is based on HCL, is well suited for expressing the configuration of infrastructure objects as data, in a way that can also be expressed in JSON. The latest stable version is 1.0.9 (as of writing this blog post).

Terraform’s power is that it is almost purely declarative and, therefore, an excellent way to declare cloud infrastructures. However, there are times when some procedural logic is needed, and in these situations, a procedural language would be easier to work with.

resource "aws_subnet" "public_subnet" {
  count                   = var.public-subnet-count
  availability_zone       = data.aws_availability_zones.main.names[count.index]
  cidr_block              = "10.0.${count.index}.0/24"
  map_public_ip_on_launch = true
  vpc_id                  = aws_vpc.vpc.id

  tags = {
    Name       = "${local.res_prefix}-public-subnet-${count.index}"
    SubnetType = "public"
  }
}

Terraform language example: a subnet configuration.

Kari’s AWS Cloud Development Kit Initial Experience

With the AWS Cloud Development Kit (CDK), you can use your favorite programming language to define cloud infrastructures (as of writing, the supported languages are TypeScript, JavaScript, Python, Java, and C#; it might be interesting to try Clojure with CDK). Since I use the programming language as scripting glue to create and integrate cloud resources, I used Python, which, in my short experience with CDK, turned out to be an excellent fit.

I was astonished at how easy it was to start using AWS CDK. The learning curve was almost non-existent (which you definitely cannot say about Terraform). I’m pretty excited about the new IoT project, and I didn’t have time to read lengthy tutorials about using AWS CDK, so I just created an AWS CDK app using the initialization command (provided in Your first AWS CDK app):

cdk init app --language python

This command creates a new empty CDK app. Then I just read the AWS CDK Python Reference on creating the AWS resources and integrating them. I must point out that the AWS documentation is excellent: I had almost no issues whatsoever creating a data pipeline (AWS IoT Analytics => S3) as described in my previous blog post AWS IoT First Reflections. My biggest surprise was that I could implement the whole pipeline in one day. That was with a new IaC tool (I had no previous experience with AWS CDK) and with AWS IoT Analytics, a service that I had experimented with just once in the AWS Console.

my_name = f'{self.etsi_system_name}_{self.etsi_env_name}_{self.module}_channel'
iot_analytics_channel = iot_a.CfnChannel(
    self,
    my_name,
    channel_name = my_name,
    channel_storage = iot_a.CfnChannel.ChannelStorageProperty(
        customer_managed_s3 = iot_a.CfnChannel.CustomerManagedS3Property(
            bucket = iot_analytics_bucket.bucket_name,
            role_arn = iot_analytics_storage_iam_role.role_arn,
            key_prefix = 'raw/',
        ),
    ),
)
iot_analytics_channel.node.add_dependency(iot_analytics_bucket)
iot_analytics_channel.node.add_dependency(iot_analytics_storage_iam_role)

AWS CDK language example using Python: an IoT Analytics Channel.

Comparison

Configuration management

You can handle configuration management quite easily using both tools:

Terraform: Create a file in which you can give default values for each environment (e.g., RDS instance size…) as a map, and then provide other maps for each environment to override the default values. Merge the maps — and you have simple configuration management using Terraform.

AWS CDK: It’s easy to use the programming language constructs. E.g., you can create a method that applies the default tags to a resource and pass the AWS CDK construct (the resource) as an argument to the method.
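The defaults-plus-overrides pattern described above can be sketched in plain Python for the CDK side. This is a minimal illustration, not our project code: the keys, instance classes, and the function name config_for are hypothetical. In Terraform, the same merge is typically done with the built-in merge() function.

```python
# Hypothetical configuration values; the keys and sizes are illustrative only.
DEFAULTS = {"rds_instance_class": "db.t3.micro", "rds_multi_az": False}

# Per-environment overrides; an empty dict means "use the defaults as-is".
OVERRIDES = {
    "dev": {},
    "prod": {"rds_instance_class": "db.m5.large", "rds_multi_az": True},
}


def config_for(env: str) -> dict:
    """Return the defaults with the given environment's overrides applied."""
    if env not in OVERRIDES:
        raise KeyError(f"unknown environment: {env}")
    # Later entries win, so overrides replace defaults key by key.
    return {**DEFAULTS, **OVERRIDES[env]}
```

A stack could then read config_for("prod")["rds_instance_class"] when it creates the database resource.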

Modularity

Modularity is simple using both tools:

Terraform: You can easily create re-usable Terraform modules and pass parameters to those modules. You can even store the modules in git repositories.

AWS CDK: You have the power of your favorite programming language to modularize your solution any way you want, and even share and use the modules as libraries.

Consistent Resource Naming

We are pretty stringent about naming our cloud resources in a certain way: e.g., providing the prefix + environment in every resource name. This way, we can easily search resources using either prefix (e.g., “SystemX”) or environment (e.g., “dev”) or the specific deployment (e.g., “SystemX-dev”).

Terraform: It is pretty easy to create a local variable for naming all entities for the current module and then use this local variable to concatenate consistent resource names for each resource entity.

AWS CDK: Very straightforward since you can use a real programming language. In our current project, we create the AWS IoT => S3 pipeline using AWS CDK and the other parts of the AWS infrastructure using Terraform. We used the same resource naming scheme in both solutions.
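As a rough sketch of such a naming scheme on the CDK side, a small helper can build every name from the same parts. The function name resource_name and the example values are hypothetical; the underscore-separated layout follows the IoT Analytics channel example earlier in this post.

```python
def resource_name(system: str, env: str, module: str, kind: str) -> str:
    """Build a consistent resource name from system, environment, module, and kind."""
    parts = [system, env, module, kind]
    # Refuse empty parts so that names never contain doubled separators.
    if not all(parts):
        raise ValueError("every name part must be a non-empty string")
    return "_".join(parts)
```

For example, resource_name("systemx", "dev", "iotanalytics", "channel") gives a name that can be searched by system, by environment, or by both.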

Consistent Resource Tagging

Just as we are pretty stringent with resource naming, we also want to be strict with resource tagging. We want to add a set of default tags to all resources created by the IaC tool, plus some extra tags for certain specific resources.

Terraform: Quite easy. You can provide the default tags, e.g., in the common environment configuration. Then you merge default and custom tags when creating the actual resource (and provide the merged tags). Nowadays, the AWS provider also supports default tags directly (the default_tags block in the provider configuration).

AWS CDK: Also relatively easy. After creating a resource, we just call a method to add the same default tags to every resource, e.g.:

self.add_default_tags(iot_analytics_channel)
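A method like add_default_tags might look roughly like the sketch below. This is not our actual implementation: the tag keys are hypothetical, and the tagging call is passed in as the add_tag parameter so that the sketch runs without the CDK libraries installed.

```python
from typing import Callable, Dict, Optional

# Hypothetical default tags; a real project would derive these from configuration.
DEFAULT_TAGS = {"System": "SystemX", "Env": "dev", "ManagedBy": "cdk"}


def add_default_tags(
    resource: object,
    add_tag: Callable[[object, str, str], None],
    extra_tags: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Apply the default tags plus optional extras; extras win on key clashes."""
    tags = {**DEFAULT_TAGS, **(extra_tags or {})}
    for key, value in tags.items():
        # add_tag is whatever actually tags the resource (injected by the caller).
        add_tag(resource, key, value)
    return tags
```

In a CDK app, add_tag could be a small wrapper around the Tags.of(resource).add(key, value) call from the aws_cdk package.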

Language

How powerful is the language? How easy is it to provide a declarative configuration of cloud infrastructure? How easy is it to provide customization and conditional situations (e.g., create this entity only if condition X exists…)?

Terraform: The Terraform language is almost purely declarative. That is both its power and its weakness. Terraform guides you to a particular declarative configuration that keeps infrastructure code clean. On the other hand, certain conditional situations are a bit clumsy (though usually possible). One example is creating conditional resources by using the count meta-argument.

AWS CDK: You can use a real programming language, so the solutions are often as different as the developers who write them. In this respect, Terraform pushes you towards a more uniform declarative infrastructure solution. On the other hand, you don’t need tricks to handle conditional situations (as with Terraform); you can express any conditional logic in your favorite programming language.
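To illustrate the difference, here is a minimal plain-Python sketch of conditional resource creation. It only collects resource names instead of instantiating real CDK constructs, and the function name, flag, and resource names are all hypothetical. In Terraform, the usual equivalent is an expression such as count = var.enable_analytics ? 1 : 0 on the resource.

```python
def resources_for(env: str, enable_analytics: bool) -> list:
    """Return the names of the resources this deployment would create."""
    resources = [f"systemx-{env}-bucket"]
    # Plain `if`: in a CDK app, the construct would be instantiated here.
    if enable_analytics:
        resources.append(f"systemx-{env}-iotanalytics-channel")
    # An extra resource that only exists in the production environment.
    if env == "prod":
        resources.append(f"systemx-{env}-alarm")
    return resources
```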

IDE Support

We are using IntelliJ IDEA and Emacs. IDE support is excellent in both editors. Highlighting, navigation, suggestions, etc., work out of the box (the IntelliJ Terraform plugin, and Terraform LSP servers that you can use with Emacs).

Deployment Process

Can you see a plan phase before actual deployment — i.e., what resources will be added/modified/deleted? How complicated is the deployment process? What happens if something goes wrong?

Terraform: Terraform supports a plan phase. The actual deployment (apply) is pretty straightforward in 99% of cases. Terraform does not try to roll back if something goes haywire during the deployment. Usually this is a good thing.

AWS CDK: AWS CDK is an abstraction over AWS CloudFormation. There are two steps involved in an AWS CDK deployment:

  • CloudFormation template synthesis: cdk synth. In this phase, you can see the CloudFormation template that is going to be used to create or update the stack. (The separate cdk diff command shows the changes against the currently deployed stack.)
  • Deploying the actual CF stack: cdk deploy.

Example:

λ> cdk synth
...
Resources:
  XXXdevt1iotanalyticsbucketYYYYY:
    Type: AWS::S3::Bucket
...

λ> cdk deploy
IotanalyticsStack: deploying...
IotanalyticsStack: creating CloudFormation changeset...
...

 ✅  IotanalyticsStack

Stack ARN:
arn:aws:cloudformation:xx-yyyy-1:9999999999:stack/IotanalyticsStack/xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx

If something goes wrong during the deployment, CDK (through CloudFormation) rolls back the stack. This might be a good thing, or not.

Multi-cloud Capability

Can you use the IaC tool with all three major clouds (AWS, Azure, and GCP)? If you do multi-cloud work, there is a significant benefit to using either Terraform or Pulumi. If you are using only AWS, you can choose between AWS CDK and Terraform/Pulumi.

AWS CDK or CloudFormation?

We have also used native CloudFormation, and we would say we didn’t enjoy working with those JSON or YAML files that much. If you decide to use CloudFormation, we strongly suggest using AWS CDK instead of native CloudFormation JSON/YAML to create your cloud infrastructure code.

Terraform and CDK Integration

If you use Terraform to create your infrastructure, but you realize that you need an AWS service that Terraform does not yet support, you can create only that part of the infra using CDK and then integrate that part with Terraform.

We have created IoT Analytics using CDK in the example below since Terraform does not yet support it. In the Terraform code, we query the ARN of the IAM role created on the CDK side, which we need to inject into an IoT Core topic rule together with the IoT Analytics channel name:

#############################
# RESOURCES FROM THE CDK SIDE
# This is the integration point to the resources created on the CDK side,
# since as of writing this, Terraform does not yet support IoT Analytics.

# Query the ARN of the IAM role (the external data source expects JSON on stdout).
data "external" "cdk-iotanalytics-role-arn-program" {
  program = ["aws", "iam", "get-role",
    "--role-name", "${var.prefix}-${terraform.workspace}-iotanalytics-channel-iam-role",
    "--query", "{\"role-arn\": Role.Arn}",
    "--output", "json"]
}
...
# And use the arn.
resource "aws_iot_topic_rule" "test-topic-rule" {
  name        = "${var.prefix}_${terraform.workspace}_test_topic_rule"
  enabled     = true
  sql         = "SELECT * FROM '${local.test_topic}/1'"
  sql_version = "2016-03-23"
  iot_analytics {
    channel_name = local.iotanalytics-channel-name
    role_arn     = data.external.cdk-iotanalytics-role-arn-program.result["role-arn"]
  }
}
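The external data source runs a program and expects it to print a flat JSON object whose values are all strings. The Python sketch below shows that contract under stated assumptions: the function name to_external_result is hypothetical, and the input is assumed to be the JSON that aws iam get-role prints, with the ARN under Role.Arn (the same path that the --query expression above relies on).

```python
import json


def to_external_result(get_role_output: str) -> str:
    """Convert `aws iam get-role` JSON output into an external-data-source result."""
    role = json.loads(get_role_output)["Role"]
    # The external data source expects a flat JSON object of string values.
    return json.dumps({"role-arn": str(role["Arn"])})
```

Terraform then exposes the printed object as the result map, which is why the topic rule reads result["role-arn"].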

Conclusions

Both Terraform and AWS CDK are excellent IaC tools to create and maintain cloud infrastructures. Terraform is more purely declarative, while with AWS CDK, you can use your favorite programming language in an imperative style.

The writer is working at Metosin using Clojure in cloud projects. If you are interested in starting a cloud or Clojure project in Finland, or in getting cloud or Clojure training in Finland, you can contact me by sending an email to my Metosin email address or via LinkedIn.

Kari Marttila

Kari Marttila’s Home Page on LinkedIn: https://www.linkedin.com/in/karimarttila/