Serverless event-driven architecture with AWS Lambda functions and Apache Kafka®
Build serverless event-driven architectures (EDA) by combining Apache Kafka® with AWS Lambda functions. Learn how to trigger Lambda functions based on events flowing in an Apache Kafka topic.
Despite the number of tools available to perform stream processing, like Apache Kafka® Streams or Apache Flink®, sometimes you want to write serverless code or functions to process events. You can do this with AWS Lambda, letting you concentrate on writing the parsing logic without having to worry about managing or scaling the compute.
In this tutorial we'll cover the basics of how to invoke a Lambda function using an Apache Kafka trigger.
The Components
We'll use AWS Lambda triggers, in particular the Apache Kafka trigger, to consume from a specific Kafka topic and invoke the Lambda function on every event (or batch of events).
Getting started with Apache Kafka
Lambda's Apache Kafka trigger works with any Apache Kafka cluster; for simplicity, this tutorial showcases how to create one with Aiven by:
1. If you are creating a new account, select whether it's Personal or Business and give the project a name
2. At Services, under Create new service choose Apache Kafka
3. To finish creating your Apache Kafka® service, choose the version (the default is fine) and AWS as your cloud provider. Create the Apache Kafka service in the same region where you'll run the AWS Lambda function to minimize latency.
4. Click on Create service
5. Click Skip this step to jump directly to the Apache Kafka service details while the service builds in the background. Enable the Apache Kafka REST API (Karapace) toggle so we can view and produce messages in a Kafka topic using the Aiven Console.

When the cluster is up and running, navigate to the Topics tab and create a topic called test. Click on the topic name to view it, then click Messages to consume and produce messages in the topic.
Store Apache Kafka credentials in AWS Secrets Manager
The next step is to define a secure link between Aiven for Apache Kafka and the AWS Lambda function. Aiven for Apache Kafka offers a client certificate authentication method out of the box; we can store the related secrets in AWS Secrets Manager by:
1. In AWS Secrets Manager, click Store a new secret and select Other type of secret
2. Select the Plaintext editor and include the following JSON
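A minimal sketch of the secret payload, assuming the certificate and privateKey field names that Lambda's client certificate (mutual TLS) configuration expects:

```json
{
  "certificate": "<ACCESS_CERTIFICATE>",
  "privateKey": "<ACCESS_KEY>"
}
```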
The <ACCESS_CERTIFICATE> and <ACCESS_KEY> are respectively the Access certificate and Access key you can find in the Aiven Console, under the Aiven for Apache Kafka service Overview tab.
The access certificate should start with -----BEGIN CERTIFICATE----- and end with -----END CERTIFICATE-----, the whole content should be included.
The access key should start with -----BEGIN PRIVATE KEY----- and end with -----END PRIVATE KEY-----, the whole content should be included.
3. Click Next
4. Define a secret name (e.g. prod/AppLambdaTest/Kafka), a description, tags and resource permissions
5. Click Next
6. Define the rotation configuration and click Next
7. Review the secret details and click Next
The secret contains the Aiven for Apache Kafka access certificate and key. We need to repeat the same steps to store the CA certificate. The JSON format for this second secret is:
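Again a sketch, assuming the same format with a single certificate field:

```json
{
  "certificate": "<CA_CERTIFICATE>"
}
```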
The <CA_CERTIFICATE> is the CA certificate you can find in the Aiven Console, under the Aiven for Apache Kafka service Overview tab. You need to include everything from -----BEGIN CERTIFICATE----- to -----END CERTIFICATE-----, including those opening and closing lines.
Name the CA-related secret prod/AppLambdaTest/KafkaCA.
Define a Lambda function
The next step is to define the Lambda function to perform some simple logging of the events coming in. We can create a Lambda function by:
1. In the AWS Lambda console, click Create function and select Author from scratch; give the function a name and choose a Python runtime
2. In the Permissions section select Create a new role with basic Lambda permissions
3. Click on Create function
Enable the new AWS role to access the Apache Kafka secrets
The basic role created during the Lambda definition allows only CloudWatch logging by default. We need to enable the role to read the secrets containing the Apache Kafka certificates needed to connect to the Aiven service. We can do that by:
1. In the AWS Lambda function details page, click on the Configuration tab
2. Select the Permissions section
3. Click on Edit
4. Click on the View the <ROLENAME> link at the bottom of the page to edit the newly created role (the <ROLENAME> is a string combining the Lambda function name, the word role and a random suffix); this opens AWS Identity and Access Management (IAM)
5. Click on Add permissions and select Create inline policy
6. Type Secrets Manager in the Select a service search screen and select Secrets Manager
7. Expand the Read section and select GetSecretValue
8. In the Resources section, optionally restrict the Amazon Resource Names (ARNs) the role will be able to access, or select All to allow the role to access all secrets
9. Give the new policy a name
10. Click on Next and Create policy
The new inline policy is added to the newly created role.
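For reference, the resulting inline policy should look roughly like the following sketch, here with a wildcard resource (swap in your secret ARNs if you restricted access):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "secretsmanager:GetSecretValue",
      "Resource": "*"
    }
  ]
}
```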
Create the Lambda trigger pointing to Apache Kafka
Back on the AWS Lambda service page, the newly created function now has all the privileges needed to access the secrets and connect to Apache Kafka. We can set up the trigger by:
1. In the Function overview click on + Add trigger
2. Select Apache Kafka
3. Under Bootstrap servers click on Add and include the Aiven for Apache Kafka bootstrap server (host:port) that you can find in the Aiven Console, on the service Overview tab
4. Set 1 as batch size, so the function is called on every event
5. Set test as topic name
6. Select CLIENT_CERTIFICATE_TLS_AUTH as Authentication
7. Under Secrets Manager key, select the secret containing the access certificate and key (prod/AppLambdaTest/Kafka)
8. Under Encryption, select the secret containing the CA certificate (prod/AppLambdaTest/KafkaCA)
9. Click on Add
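If you prefer to script this step instead of clicking through the console, the same trigger can be created via the AWS SDK. A sketch using boto3, where the function name, bootstrap server and secret ARNs are placeholders to substitute:

```python
import boto3

lambda_client = boto3.client("lambda")

# All values below are placeholders: substitute your function name,
# the Aiven for Apache Kafka bootstrap server and the two secret ARNs
response = lambda_client.create_event_source_mapping(
    FunctionName="<FUNCTION_NAME>",
    Topics=["test"],
    BatchSize=1,  # invoke the function on every event
    SelfManagedEventSource={
        "Endpoints": {"KAFKA_BOOTSTRAP_SERVERS": ["<HOST>:<PORT>"]}
    },
    SourceAccessConfigurations=[
        {"Type": "CLIENT_CERTIFICATE_TLS_AUTH", "URI": "<ACCESS_SECRET_ARN>"},
        {"Type": "SERVER_ROOT_CA_CERTIFICATE", "URI": "<CA_SECRET_ARN>"},
    ],
)
print(response["UUID"])  # identifier of the new event source mapping
```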
Write the Lambda logic
The last step in our journey is to define what the Lambda function should do. To do that, write the following in the lambda_function section:
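A minimal sketch of the handler; the exact print format is illustrative, while the pizza and name fields match the test message we'll produce later:

```python
import base64
import json

def lambda_handler(event, context):
    # Records are grouped under a "topic-partition" key, e.g. "test-0"
    partition = list(event['records'].keys())[0]
    for data in event['records'][partition]:
        # Message values arrive base64 encoded in the trigger payload
        value = json.loads(base64.b64decode(data['value']).decode('utf-8'))
        print(value['name'] + ' ordered a pizza ' + value['pizza'])
    return {
        'statusCode': 200,
        'body': json.dumps('Events processed')
    }
```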
In the above code we:
- import json (and base64) to parse the JSON values pushed to the Kafka topic
- retrieve the partition with partition=list(event['records'].keys())[0]; to understand where the partition name comes from, check the details of an example event in the AWS documentation (a trimmed example follows this list)
- get the messages and their value section (data['value']), then print the pizza and name fields, after decoding the payload from base64
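For reference, the incoming event has roughly this shape (trimmed down from the example in the AWS documentation; value carries the base64-encoded message):

```json
{
  "eventSource": "SelfManagedKafka",
  "bootstrapServers": "<HOST>:<PORT>",
  "records": {
    "test-0": [
      {
        "topic": "test",
        "partition": 0,
        "offset": 0,
        "timestamp": 1663363834,
        "timestampType": "CREATE_TIME",
        "value": "eyJuYW1lIjogIkZyYW5rIiwgInBpenphIjogIk1hcmdoZXJpdGEifQ==",
        "headers": []
      }
    ]
  }
}
```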
It's time to deploy the function with the Deploy button.
Check the pipeline results
To check if the pipeline is working, we need to:
1. Push records to the Apache Kafka test topic
2. Check in the logs that the messages are printed
We can test it by:
1. In the Aiven Console, navigate to the Aiven for Apache Kafka service
2. Click on the Topics section
3. Click on the test topic
4. Click on the Messages button
5. Click on the Produce message button
6. Select json as format
7. Select the Value tab
8. Paste the following into the main editing area
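Any JSON payload with the name and pizza fields the function reads will do; for example:

```json
{
  "name": "Frank",
  "pizza": "Margherita"
}
```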
To check that the Lambda function works:
1. In the AWS Lambda page, navigate to the Monitor tab
2. In the Metrics subtab, you should see the number of invocations, success and error rates
3. In the Logs subtab, you should be able to check the log details. Opening the latest logs you should see an entry like the following
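Assuming the example message and print statement used above, the log entry would contain a line like:

```
Frank ordered a pizza Margherita
```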
Scalable event-driven architecture with AWS Lambda and Apache Kafka
AWS Lambda functions are widely adopted by developers to ingest and parse variable volumes of data without having to pre-provision a dedicated service.
The combination of Lambda functions and Apache Kafka offers an interesting option for building scalable, serverless, event-driven architectures. It lets you read data from Apache Kafka without having to code a dedicated application or deploy a service for that purpose. This reduces your overall infrastructure cost and the complexity of your system.