With Aiven for Apache Flink® we added a new way to manipulate your Apache Kafka® streaming data via SQL statements, providing the best combination of tools for real-time data transformation.
For this challenge, we'll use the Aiven fake data generator on Docker to generate a series of symbols. The challenge is to understand the overall meaning of those symbols by transforming the original data stream with Apache Flink.
Let's dive right in.
The goal is to make sense of the incoming stream of data.
Create an Aiven free trial account: sign up for free.
Create a new Aiven authentication token.
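If you prefer the command line, the token can also be created with the Aiven CLI; this is a sketch assuming the avn client is installed, and the description text is an arbitrary label:

```shell
# Log in once, then mint a token for the fake data producer
avn user login
avn user access-token create --description "rolling-challenge"
```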
Clone the Aiven fake data generator on Docker with:
git clone https://github.com/aiven/fake-data-producer-for-apache-kafka-docker
Copy the file conf/env.conf.sample to conf/env.conf and edit the following parameters:
| Parameter Name | Parameter Value |
| --- | --- |
| PROJECT_NAME | Name of the Aiven project where the Aiven for Apache Kafka service is running |
| SERVICE_NAME | Name of the Aiven for Apache Kafka service |
| TOPIC | Name of the topic to write messages to; rolling for the challenge |
| PARTITIONS | 5 |
| REPLICATION | 2 |
| NR_MESSAGES | 0 |
| MAX_TIME | 0 |
| SUBJECT | |
| USERNAME | Your Aiven account username |
| TOKEN | Your Aiven authentication token |
| PRIVATELINK | |
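For reference, a filled-in conf/env.conf might look like this; every value below is a placeholder to adapt to your own project (SUBJECT and PRIVATELINK are omitted here, set them as required by your setup):

```shell
# conf/env.conf -- placeholder values, replace with your own
PROJECT_NAME="my-project"
SERVICE_NAME="my-kafka"
TOPIC="rolling"
PARTITIONS=5
REPLICATION=2
NR_MESSAGES=0
MAX_TIME=0
USERNAME="name@example.com"
TOKEN="my-aiven-token"
```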
Build the Docker image:
docker build -t fake-data-producer-for-apache-kafka-docker .
Run the Docker image:
docker run fake-data-producer-for-apache-kafka-docker
Check the fake messages being produced by Docker
In the Aiven Console, navigate to the Aiven for Apache Flink service page
Play with the Aiven for Apache Flink Application tab and try to make sense of the data.
Tip: the source table can be mapped in Aiven for Apache Flink with the following SQL, using the rolling topic as source:

CREATE TABLE ROLLING_IN(
    ts BIGINT,
    val STRING,
    ts_ltz AS TO_TIMESTAMP_LTZ(ts, 3),
    WATERMARK FOR ts_ltz AS ts_ltz - INTERVAL '10' SECOND
) WITH (
    'connector' = 'kafka',
    'properties.bootstrap.servers' = '',
    'topic' = 'rolling',
    'value.format' = 'json',
    'scan.startup.mode' = 'earliest-offset'
)
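As a purely illustrative next step (not the solution), a Flink SQL group-window aggregation could collect symbols that arrive close together in time; the table and column names follow the ROLLING_IN definition above, while the 5-second session gap is an arbitrary assumption:

```sql
-- Concatenate symbols that arrive within the same session window
SELECT
    SESSION_START(ts_ltz, INTERVAL '5' SECOND) AS window_start,
    LISTAGG(val) AS symbols
FROM ROLLING_IN
GROUP BY SESSION(ts_ltz, INTERVAL '5' SECOND)
```

Experimenting with different window types and sizes is part of making sense of the stream.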
12. When you find the solution, email a screenshot to firstname.lastname@example.org
Some tips that could help in solving the challenge:
kcat is a tool to explore data in Apache Kafka topics; check the dedicated documentation to understand how to use it with Aiven for Apache Kafka.
jq is a helpful tool to parse JSON payloads; read the instructions on how to install it, and check the following useful flags:
-r retrieves the raw output
-j doesn't create a new line for every message
-c shows data in compact view
If you're stuck visualizing kcat consumer data with jq, check the -u flag as per the dedicated example.
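Putting those flags together, here is a sketch of how a message could be reshaped with jq; the message shape ({"ts": ..., "val": ...}) follows the ROLLING_IN table, while the SSL file names and the $KAFKA_SERVICE_URI variable in the kcat pipeline are assumptions based on a standard Aiven SSL setup:

```shell
# Illustrative message shaped like the ROLLING_IN table (ts + val)
msg='{"ts":1669852800000,"val":"A"}'

echo "$msg" | jq -c '.'     # compact, one line per message
echo "$msg" | jq -r '.val'  # raw output, no JSON quoting
echo "$msg" | jq -j '.val'  # joined output, no trailing newline

# Streaming the real topic through the same filter (requires the service's
# SSL files downloaded locally; -u keeps kcat output unbuffered):
# kcat -b $KAFKA_SERVICE_URI -C -t rolling -o beginning -u \
#   -X security.protocol=ssl \
#   -X ssl.ca.location=ca.pem \
#   -X ssl.certificate.location=service.cert \
#   -X ssl.key.location=service.key | jq -c '.'
```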
For any questions about the challenge, head over to our community forum.
Winner and prizes
All individuals who submit a valid proof will be entered into a drawing. The winner will be announced during Aiven's live stream on the last day of re:Invent, where a special prize will be awarded.
Are you attending AWS re:Invent? Complete the challenge and show us your proof at the Aiven booth 1629 (near the Builder's Fair in the Data Zone) and get an extra piece of swag!
Crafted by developers for developers
Don't miss our technical guides to get the most out of your Open Source data platform delivered straight to your inbox monthly!
Learn how to use Apache Kafka® as a source and sink to process streaming data, and how to deploy that with Terraform. A part of Aiven's Terraform Cookbook.