
Welcome to the Aiven Rolling with re:Invent challenge, an easy way for you to explore Aiven for Apache Kafka® and Aiven for Apache Flink®.
With Aiven for Apache Flink® we added a new way to manipulate your Apache Kafka® streaming data via SQL statements, providing the best combination of tools for real-time data transformation.
For this challenge, we'll be using the Aiven fake data generator on Docker to produce a series of symbols. The challenge consists of understanding the overall meaning of those symbols by transforming the original series of data with Apache Flink.
Let's dive right in.
The goal is to make sense of the incoming stream of data.
1. Create an Aiven free trial account: sign up for free.
2. Create an Aiven for Apache Kafka® and an Aiven for Apache Flink® service.
3. Set up an integration between the Aiven for Apache Kafka® and Aiven for Apache Flink® services.
4. Create a new Aiven authentication token. (Steps 2-4 can also be done from the command line; see the Aiven CLI sketch below.)
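If you prefer the command line, here's a minimal sketch of steps 2-4 using the Aiven CLI (avn); the service names, plan, and cloud region are placeholders to replace with your own:

```bash
# Create the Kafka and Flink services (names, plan, and cloud are placeholders)
avn service create kafka-rolling --service-type kafka --plan business-4 --cloud aws-us-east-1
avn service create flink-rolling --service-type flink --plan business-4 --cloud aws-us-east-1

# Integrate the two services so Flink can read from Kafka
avn service integration-create --integration-type flink \
  --source-service kafka-rolling --dest-service flink-rolling

# Create an authentication token for the fake data producer
avn user access-token create --description "rolling challenge"
```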
5. Clone the Aiven fake data generator on Docker with:

```bash
git clone https://github.com/aiven/fake-data-producer-for-apache-kafka-docker
```
6. Copy the file conf/env.conf.sample to conf/env.conf and edit the following parameters:
| Parameter Name | Parameter Value |
|---|---|
| PROJECT_NAME | Name of the Aiven project where the Aiven for Apache Kafka service is running |
| SERVICE_NAME | Name of the running Aiven for Apache Kafka service |
| TOPIC | Name of the topic to write messages to; rolling for the challenge |
| PARTITIONS | 5 |
| REPLICATION | 2 |
| NR_MESSAGES | 0 |
| MAX_TIME | 0 |
| SUBJECT | rolling |
| USERNAME | Your Aiven account username |
| TOKEN | Your Aiven authentication token |
| PRIVATELINK | NO |
| SECURITY | SSL |
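For reference, a filled-in conf/env.conf might look like the sketch below (the sample file uses shell-style assignments; every value here is a placeholder for your own setup):

```bash
# conf/env.conf - all values below are placeholders
PROJECT_NAME="my-aiven-project"
SERVICE_NAME="kafka-rolling"
TOPIC="rolling"
PARTITIONS=5
REPLICATION=2
NR_MESSAGES=0
MAX_TIME=0
SUBJECT="rolling"
USERNAME="you@example.com"
TOKEN="your-aiven-token"
PRIVATELINK="NO"
SECURITY="SSL"
```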
7. Build the Docker image:

```bash
docker build -t fake-data-producer-for-apache-kafka-docker .
```
8. Run the Docker image:

```bash
docker run fake-data-producer-for-apache-kafka-docker
```
9. Check the fake messages being produced by Docker.
10. In the Aiven Console, navigate to the Aiven for Apache Flink service page.
11. Play with the Aiven for Apache Flink Application tab and try to make sense of the data.
Tip
The source table can be mapped in Aiven for Apache Flink with the following SQL, using the rolling topic as source:

```sql
CREATE TABLE ROLLING_IN (
    ts BIGINT,
    val STRING,
    ts_ltz AS TO_TIMESTAMP_LTZ(ts, 3),
    WATERMARK FOR ts_ltz AS ts_ltz - INTERVAL '10' SECOND
) WITH (
    'connector' = 'kafka',
    'properties.bootstrap.servers' = '',
    'topic' = 'rolling',
    'value.format' = 'json',
    'scan.startup.mode' = 'earliest-offset'
)
```
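As a starting point for experimentation (a hypothetical sketch, not the solution), one option is to collapse symbols that arrive close together into a single row per burst with a session window; the 5-second gap is a guess, not part of the challenge spec:

```sql
-- Hypothetical sketch: concatenate each burst of symbols into one string.
-- The 5-second session gap is an assumption; tune it against the data.
SELECT
    SESSION_START(ts_ltz, INTERVAL '5' SECOND) AS burst_start,
    LISTAGG(val, '') AS symbols  -- join symbols with no separator
FROM ROLLING_IN
GROUP BY SESSION(ts_ltz, INTERVAL '5' SECOND);
```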
12. When you find the solution, email a screenshot to challenge@aiven.io
Some tips that could help in solving the challenge:

- kcat is a tool to explore data in Apache Kafka topics; check the dedicated documentation to understand how to use it with Aiven for Apache Kafka.
- jq is a helpful tool to parse JSON payloads; read the instructions on how to install it, and check the following useful flags:
  - -r retrieves the raw output
  - -j doesn't create a new line for every message
  - -c shows data in compact view
- If you're stuck visualizing kcat consumer data with jq, check the -u flag as per the dedicated example.
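Putting the two tools together, a consumer pipeline could look like this sketch. It assumes you've created a kcat.config file holding your Aiven for Apache Kafka connection settings (bootstrap servers and SSL certificates) as described in the documentation above:

```bash
# Consume the rolling topic unbuffered (-u) and show each JSON
# message in compact form; kcat.config is an assumed config file.
kcat -F kcat.config -C -t rolling -u | jq -c .
```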
For any questions about the challenge, head over to our community forum.
All individuals who submit a valid proof will be entered into a drawing. The winner will be announced during Aiven's live stream on the last day of re:Invent, where a special prize will be awarded.
Are you attending AWS re:Invent? Complete the challenge and show us your proof at the Aiven booth 1629 (near the Builder's Fair in the Data Zone) and get an extra piece of swag!