(Re)Create Apache Kafka® topics with Terraform

Terraform can orchestrate your infrastructure the same way every time, giving you confidence in your consistent platform. Idempotency for the win.

One of the joys of cloud data platforms is the ability to quickly and easily create and configure services. Often I'm working on more than one project at a time, so by using Terraform to create/recreate the setup, I can swiftly get going again when I return to the task. I've recently been working with Apache Kafka® quite a bit, using Terraform to recreate the topics that my application uses, and I'll be sharing my approach in this post.

Before we begin

I encourage you to try these steps out yourself, and adapt the examples as you go along to meet your own requirements. To do that, you will need:

  • Terraform set up on your computer.
  • An Aiven account (sign up if you don't have one already, there's a free trial).

Aiven Terraform provider

Aiven publishes a Terraform provider, which gives you everything required to create, configure and delete services on the Aiven platform. You can find more information about the Aiven Terraform provider in the documentation, but configuring the basics is a good place to start.

Set up the provider

For a new project, create a file called provider.tf in the directory where you will keep your *.tf files, containing the text below, then run terraform init there to download the provider. This tells Terraform which provider to use (aiven), including the version (3.7 or later). The provider is essentially an API client, so you also need to declare a variable for the API token that the Aiven provider can use.

If you already have a Terraform project, you can add these blocks alongside your other provider configurations.

```hcl
terraform {
  required_providers {
    aiven = {
      source  = "aiven/aiven"
      version = ">= 3.7"
    }
  }
}

provider "aiven" {
  api_token = var.aiven_api_token
}
```

Configure the variables

For this project, there are two variables:

  • the Aiven API token
  • the project name

Tip: you can also use a data source to represent the project, but using the project name is simple and works too!
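If you'd like to try the data source approach instead, a minimal sketch looks like this (the `myproject` label and the project name are placeholders to adapt):

```hcl
# Look up an existing Aiven project by name rather than passing it as a variable
data "aiven_project" "myproject" {
  project = "dev-project"
}
```

Other resources can then reference `data.aiven_project.myproject.project` instead of `var.project_name`.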

Working with variables is done in two steps. The first step is defining the variables that are required, which I'm doing in a file called variables.tf:

```hcl
variable "aiven_api_token" {
  description = "Aiven API token"
  type        = string
}

variable "project_name" {
  description = "Aiven project name"
  type        = string
}
```

The variables are not very exciting without their values. The second step is supplying those values, in a variable definitions file called aiven.tfvars (we're including an example API token value; don't worry, this one isn't real):

```hcl
project_name    = "dev-project"
aiven_api_token = "oalk0W1m+Lhss0CPrMfOqvXBp+LDB0LAC4lxbSzEYS7A/dAhnTZOiM5leC1NzIUZnVLHAr9eVESKSB41tgttFGqAWbxiYI5iPNB8CZTohwi91dsULj5uwXyHfho+M94yhC8srl84oEsnXExksNkLolvKcvwJ6IIw5c14c3Mt+FUwcenl9BA2LkC9DNJ/TDoM3qfHXXLaTknW3IbB3SIUR4YFE+ru/i7REEfYcj41YhdBqXANzRM0ETwSraOCVV7cuOyZR5UrWuwFzgWf54Qqy/mILxQR9PwSXzRSuZ6pBMH2chkPF4mZlGoJjjDWuE+CPFo9EysGlWYWAAZThtvor11iQM9/JtvjzLYqfvZzvbgP"
```

The *.tf files don't contain any secret values, so they can safely be checked into source control. The aiven.tfvars file doesn't get checked into source control, but is supplied to Terraform operations as needed. Mixing configuration and values feels like it might be a useful shortcut, but it usually leads to regrets.
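To make sure the values file never lands in version control by accident, an entry like this in your .gitignore (a sketch, adjust to your repository layout) does the job:

```
# Keep variable definition files containing secrets out of git
*.tfvars
```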

Set up Apache Kafka

For non-trivial setups, recreating the topics can be laborious. In a dev environment, it's tempting to allow Kafka to create the topics automatically when they're accessed. This is ideal for prototyping but makes it difficult to keep track of the topics in use, and too easy to work with a slightly misspelled topic without realising. Being able to quickly recreate a specific set of topics is very useful and Terraform lets us do that.

This example also creates an Apache Kafka® resource, but you could equally use a data source if your goal is to configure topics on an existing cluster. Here's the file with the cluster defined:

```hcl
resource "aiven_kafka" "project_kafka" {
  project                 = var.project_name
  cloud_name              = "google-europe-west1"
  plan                    = "business-4"
  service_name            = "lorna-kafka-demo"
  maintenance_window_dow  = "monday"
  maintenance_window_time = "10:00:00"

  kafka_user_config {
    kafka_version   = "3.2"
    schema_registry = true
    kafka_rest      = true

    kafka {
      default_replication_factor = 2
    }
  }
}
```

The Aiven for Apache Kafka Terraform resource type has a lot of available configuration options. Take a look at the documentation for the aiven_kafka resource type for a very long list of what you can use in this configuration block. This example enables the Karapace Schema Registry and the REST API, which is generally a good default.
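For the existing-cluster case mentioned above, the equivalent data source sketch could look like this (the `existing` label and service name are placeholders):

```hcl
# Reference an already-running Aiven for Apache Kafka service
data "aiven_kafka" "existing" {
  project      = var.project_name
  service_name = "my-existing-kafka"
}
```

Topic resources would then point their service_name at `data.aiven_kafka.existing.service_name` instead of the resource.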

Define the desired topics

The topics are in a different file, using the resource name of the cluster already defined to set the service name for each topic:

```hcl
resource "aiven_kafka_topic" "user_activity" {
  project      = var.project_name
  service_name = aiven_kafka.project_kafka.service_name
  topic_name   = "user_activity"
  partitions   = 3
  replication  = 2
}

resource "aiven_kafka_topic" "avatars" {
  project      = var.project_name
  service_name = aiven_kafka.project_kafka.service_name
  topic_name   = "avatars"
  partitions   = 3
  replication  = 2
}

resource "aiven_kafka_topic" "event_logs" {
  project      = var.project_name
  service_name = aiven_kafka.project_kafka.service_name
  topic_name   = "event_logs"
  partitions   = 3
  replication  = 2
}
```

For more options on configuring the topic, visit the configuration documentation for this resource type.

Describing the topics like this is a nice way to think about how they should be configured, and of course the text-based approach means that changes can be tracked easily, versions compared, or setups shared with others for their own use.
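If repeating a near-identical block per topic feels laborious, Terraform's for_each can stamp them out from a set. This is a sketch assuming all topics share the same partition and replication settings:

```hcl
# One resource block creating a topic per name in the set
resource "aiven_kafka_topic" "topics" {
  for_each = toset(["user_activity", "avatars", "event_logs"])

  project      = var.project_name
  service_name = aiven_kafka.project_kafka.service_name
  topic_name   = each.value
  partitions   = 3
  replication  = 2
}
```

The trade-off is that individual blocks make per-topic overrides more obvious at a glance, so pick whichever reads better for your setup.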

Deploy the cluster and topics

With the provider, variables, Aiven for Apache Kafka cluster and desired topics configured, Terraform can bring our dreams to life. Start by running the terraform plan command. Since we put the variable values in a separate file, we use the -var-file switch to specify it. Your command looks something like:

terraform plan -var-file=aiven.tfvars

The (long-winded) output of this command will end by announcing the number of resources to add, change or destroy. In this case, with one cluster and three topics, there are four resources to add. If the plan matches what you expect, then go ahead and use the apply command to enact the changes:

terraform apply -var-file=aiven.tfvars
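Once apply finishes, an output block is a handy way to surface the connection details without digging through the console. `service_uri` is an exported attribute of the `aiven_kafka` resource, and marking it sensitive keeps it out of plain-text plan output:

```hcl
# Show the Kafka connection URI after apply
# (retrieve it later with: terraform output kafka_service_uri)
output "kafka_service_uri" {
  value     = aiven_kafka.project_kafka.service_uri
  sensitive = true
}
```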

Terraform for fast and repeatable service configuration

I use a lot of scripts like this for different demo or work-in-progress projects, and I hope you'll find this approach useful in your own work as well. Rather than trying to take notes or remember what to do, write it once and run it as many times as needed.

For more starter Terraform configuration, visit the Aiven Terraform cookbook where you will find plenty of ingredients for you to cook up something to meet your own needs.

If you are new to Terraform, try our step-by-step guide to get you started with detailed instructions.