Sample dataset generator for Aiven for Apache Kafka®
Learning to work with streaming data is much more fun when actual data is flowing, so to get you started on your Apache Kafka® journey, this guide shows how to produce fake streaming data to a topic.
The following example assumes you have an Aiven for Apache Kafka® service running. You can create one following the dedicated instructions.
Fake data generator on Docker
To learn data streaming, you need a continuous flow of data and for that you can use the Dockerized fake data producer for Aiven for Apache Kafka®. To start using the generator:
-
Clone the repository:
git clone https://github.com/aiven/fake-data-producer-for-apache-kafka-docker
-
In the cloned repository, copy the file conf/env.conf.sample to conf/env.conf
-
Create a token in the Aiven Console or using the following command in the Aiven CLI, changing max-age-seconds appropriately for the duration of your test:
avn user access-token create \
  --description "Token used by Fake data generator" \
  --max-age-seconds 3600 \
  --json | jq -r '.[].full_token'
Tip: The above command uses jq (https://stedolan.github.io/jq/) to parse the result of the Aiven CLI command. If you don't have jq installed, you can remove the | jq -r '.[].full_token' section from the above command and parse the JSON result manually to extract the token.
-
Edit the conf/env.conf file, filling in the following placeholders:
  my_project_name: the name of your Aiven project
  my_kafka_service_name: the name of your Aiven for Apache Kafka instance
  my_topic_name: the name of the target topic, can be any name
  my_aiven_email: the email address used as username to log in to Aiven services
  my_aiven_token: the personal token generated during the previous step
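The token and placeholder steps above can also be scripted. The following is a minimal sketch, not part of the generator itself: extract_full_token and fill_env_conf are hypothetical helper names, the first is a crude jq-free fallback that assumes the token contains no escaped quotes, and the second assumes the placeholder strings appear literally in the file and that your values contain none of the characters |, & or \.

```shell
# Print the value of every "full_token" field in the JSON read from stdin.
# Crude jq-free fallback: assumes the token contains no escaped quotes.
extract_full_token() {
  tr -d '\n' \
    | grep -o '"full_token"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed -e 's/^"full_token"[[:space:]]*:[[:space:]]*"//' -e 's/"$//'
}

# Replace the five placeholders in a config file with real values.
# Usage: fill_env_conf FILE PROJECT SERVICE TOPIC EMAIL TOKEN
# Assumes the values contain none of | & \ (special to sed in this command).
fill_env_conf() {
  conf_file=$1
  tmp_file="${conf_file}.tmp"
  sed \
    -e "s|my_project_name|$2|g" \
    -e "s|my_kafka_service_name|$3|g" \
    -e "s|my_topic_name|$4|g" \
    -e "s|my_aiven_email|$5|g" \
    -e "s|my_aiven_token|$6|g" \
    "$conf_file" > "$tmp_file" && mv "$tmp_file" "$conf_file"
}

# Example usage (hypothetical values), run from the repository root:
#   TOKEN=$(avn user access-token create \
#     --description "Token used by Fake data generator" \
#     --max-age-seconds 3600 --json | extract_full_token)
#   fill_env_conf conf/env.conf demo-project demo-kafka pizza-orders me@example.com "$TOKEN"
```

Writing to a temporary file and then moving it into place avoids relying on sed -i, whose syntax differs between GNU and BSD sed.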
-
Build the Docker image with:
docker build -t fake-data-producer-for-apache-kafka-docker .
Tip: Every time you change any parameters in the conf/env.conf file, rebuild the Docker image to start using them.
-
Start the streaming data flow with:
docker run fake-data-producer-for-apache-kafka-docker
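Because changes to conf/env.conf only take effect after a rebuild, it can help to wrap the build and run commands together. The sketch below assumes you run it from the repository root; run_producer is a hypothetical helper name, and the --rm flag (which removes the container once it exits) is an addition not present in the original commands.

```shell
IMAGE=fake-data-producer-for-apache-kafka-docker

# Rebuild the image so the current conf/env.conf is picked up, then run it.
# The run only starts if the build succeeds.
run_producer() {
  docker build -t "$IMAGE" . && docker run --rm "$IMAGE"
}

# Example usage:
#   run_producer
```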
-
Once the Docker container is running, check in the target Aiven for Apache Kafka® service that the topic is populated. You can do this in the Topics tab of the Aiven Console, if the Kafka REST option is enabled. Alternatively, you can use a tool like kcat.
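As an illustration of the kcat option, the sketch below reads the first few messages from the topic. It assumes the service uses SSL client-certificate authentication and that the access key, access certificate and CA certificate have been saved in the current directory as service.key, service.cert and ca.pem; these file names, the peek_topic helper name, and the host and port in the usage line are hypothetical, so substitute the values shown for your own service.

```shell
# Read the first few messages from a topic over SSL with kcat.
# Usage: peek_topic HOST:PORT TOPIC [COUNT]
# Assumes service.key, service.cert and ca.pem (hypothetical file names)
# are in the current directory.
peek_topic() {
  kcat -b "$1" \
    -X security.protocol=SSL \
    -X ssl.key.location=service.key \
    -X ssl.certificate.location=service.cert \
    -X ssl.ca.location=ca.pem \
    -C -t "$2" -o beginning -c "${3:-5}" -e
}

# Example usage (hypothetical host, port and topic):
#   peek_topic kafka-demo.example.com:12345 pizza-orders 10
```

Here -C selects consumer mode, -o beginning starts from the earliest offset, -c limits the number of messages, and -e makes kcat exit when it reaches the end of the topic instead of waiting for more data.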