Kafka / Python Workshop

Welcome to our workshop!

Thank you so much for taking part in our very first hands-on workshop: Learn Apache Kafka with Python.

We hope you will join us on YouTube LIVE on Tuesday. What follows are written instructions for those keeners who would like to jump ahead, and for newer folks who might hit bumps along the way during the workshop.

If you want to have a look beforehand, here is the GitHub repo.

A tutorial following similar steps is also available in Aiven’s DevCenter.

Pre-requisites

To get the most out of your workshop experience, we recommend doing the following ahead of time:

  1. Get signed in to Aiven Console
  2. Authenticate with Gitpod
  3. Set Gitpod’s Python client environment

We strongly suggest using Google Chrome for the workshop.

Get signed in to Aiven Console

In order to eliminate issues between individual machines, we’ll be using Apache Kafka and Apache Flink on the Aiven platform.

To get started, follow the instructions at Creating an Aiven account.

Authenticate with Gitpod

Gitpod is an entire development environment running in the cloud, accessible from your browser, including an IDE and Jupyter notebooks. Once again, we’ll use this tool to eliminate issues with individual machines during the workshop.

This step will connect Gitpod to your GitHub account. The process and the related privileges are explained in the dedicated Gitpod documentation.

  1. Go to Dashboard and click Continue with GitHub.
  2. A new window will open, and you’ll be directed to enter your GitHub credentials. Fill out the form and click Sign in.
  3. You’ll be presented with a screen to authorize Gitpod to view your GitHub email address. Click Authorize gitpod-io. If you have 2FA (two-factor authentication) set up, you will need to complete that step as well.
  4. Back at the previous screen from step 1, click Continue with GitHub.
  5. You should now see Gitpod’s welcome screen. Click Continue with 10 hours per month.
  6. Either leave defaults, or make whatever customizations you’d like at the How are you going to use Gitpod? screen. Click Continue.
  7. Answer the demographic questions however you’d like and hit Continue.
  8. Click Continue from the welcome page, leaving the default settings.
  9. You will now have to enter a mobile number to verify your account. Once you receive the 6-digit code, enter it and click Validate account.
  10. Your account has now been successfully validated!
  11. Once you click Continue, you’ll be back at the New Workspace screen. Click Continue, leaving the default settings.
  12. Congratulations! You will now find yourself in the Gitpod editing environment.

Set Python client environment

The workshop’s materials are stored in Jupyter Notebooks to make it easy to follow the examples and always have all the code in front of you.

  1. From the Gitpod EXPLORER panel at the left, click on > notebooks, which expands to show the files inside that folder.

  2. Click the file called 1-produce.ipynb to load the first notebook.

  3. In the upper right, press Select Kernel.

  4. Your first time, you’ll be directed to choose a kernel source. Select Install/Enable suggested extensions Python + Jupyter.

  5. You’ll then be prompted to install and synchronize the ‘Python’ extension. Choose Install. Do the same for the ‘Jupyter’ extension.

  6. Press Select Kernel again, and choose Python Environments. The Recommended selection of Python 3.12.1 should be OK.

  7. Congratulations! You should now see Python 3.12.1 where the “Select Kernel” button used to be. You’re all set to start the workshop!
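
If you'd like to double-check which interpreter the notebook is actually running on, a quick cell along these lines should do it. This is only a sketch: `kernel_matches` is a hypothetical helper (not part of the workshop repo), and the expected version 3.12 is simply taken from the kernel name in the steps above.

```python
import sys


def kernel_matches(expected=(3, 12), actual=None):
    """Return True if the Python major.minor version equals `expected`.

    Hypothetical helper for illustration; `actual` defaults to the
    interpreter that is executing this cell.
    """
    if actual is None:
        actual = sys.version_info[:2]
    return tuple(actual[:2]) == tuple(expected)


# Show the full version string, then compare major.minor only.
print("Running Python", sys.version.split()[0])
print("Matches the workshop kernel:", kernel_matches())
```

A different patch version (say 3.12.3 rather than 3.12.1) still counts as a match here, since only the major and minor numbers are compared.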

Troubleshooting / FAQs

I’ve already used Aiven before and used all of my trial credits. Can I still participate in the workshop?

You can create a new account with another email address. This entitles you to a new trial with $300 of credits and a month to spend them.

When I opened Gitpod, I saw a bunch of little error boxes on my screen!

That’s OK. Just click on the x for each of them to make them go away. The errors listed are related to Jupyter notebooks and do not affect the success of the workshop.

Here to answer any questions if necessary

…although it looks as if no-one had problems needing my help :slight_smile:

Which just means job well done in my book :wink:

Thanks nevertheless for your assistance, @tibs_at_aiven ! :smiley:

For anyone else out there who might be catching up on the workshop material at a later time, we’re here to help! :slight_smile:

…and of course, we will run this workshop again in the future!

It’s been 14 min and I am getting the following message when trying to consume the messages in 2-consume.ipynb:

%4|1711529846.503|MAXPOLL|myclient#consumer-1| [thrd:main]: Application maximum poll interval (300000ms) exceeded by 168ms (adjust max.poll.interval.ms for long-running message processing): leaving group

I can see that messages still exist from the portal by clicking the ‘Fetch Messages’ button. Can anyone help?

Thanks for following up here, and sorry we weren’t able to sort it out in the actual workshop session.

My first thought is to wonder whether there simply weren’t any messages to consume. In all the Kafka consumer examples, the consumer starts by reading only new messages, which is why the code prints a reminder to run the producer again.

If that’s not the case, let us know…

(The “Fetch Messages” dialogue counts as its own consumer, so it carries on from where it last left off, which might explain why it saw things that the Python code didn’t.)
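
For anyone who wants to experiment with the two settings that come up in this thread, here is a minimal sketch. It assumes the notebooks use the confluent-kafka Python client (the log line quoted above is in librdkafka's format, which that client wraps). `build_consumer_config` is a hypothetical helper, not something from the workshop repo, and the address and group id are placeholders. The TLS and connection settings that an Aiven service needs are left out. Whether either setting explains the behaviour above depends on the notebook code, so treat this as something to try rather than a diagnosis.

```python
def build_consumer_config(bootstrap_servers, group_id,
                          from_beginning=False,
                          max_poll_interval_ms=300_000):
    """Return a dict of librdkafka-style consumer properties.

    Hypothetical helper for illustration; not part of the workshop code.
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        # Applies only when the group has no committed offset yet:
        # "latest" reads only messages produced after the consumer starts,
        # "earliest" starts from the oldest message still in the topic.
        "auto.offset.reset": "earliest" if from_beginning else "latest",
        # Longest allowed gap between poll() calls before the consumer
        # leaves the group (300000 ms, i.e. 5 minutes, in the log above).
        "max.poll.interval.ms": max_poll_interval_ms,
    }


config = build_consumer_config(
    "kafka.example.invalid:9092",  # placeholder address (assumption)
    "myclient-group",              # placeholder group id (assumption)
    from_beginning=True,
    max_poll_interval_ms=600_000,  # 10 minutes instead of the default 5
)
print(config["auto.offset.reset"], config["max.poll.interval.ms"])
```

With confluent-kafka installed, a dict like this would be passed to `confluent_kafka.Consumer(config)`, together with the security settings for your own service.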
