Highlights from Berlin Buzzwords 2022

Our team was thrilled to be back in person at Berlin Buzzwords 2022. Read on for a recap of our favorites on storing, processing, streaming and searching data.

23 June 2022

Last week, some of the Aiven team and I had the pleasure of meeting the developer community in person at Berlin Buzzwords 2022. As always, the conference was packed with great content and people from all over the world, so personally, I was quite torn between the sessions and the expo hall! Thankfully, some of my colleagues kindly shared their highlights with me, so keep reading for a recap of our favorite talks.

Stream

There were lots of great streaming talks and I’m excited to start off with the one by our wonderful Developer Advocate, Olena Kutsenko. If you’re curious about Apache Kafka and how it works, Olena’s session “Apache Kafka® simply explained” is a great way to start, with a very entertaining example that you can also check out on GitHub.

Speaking of Apache Kafka®, Amrit Sarkar’s session is worth checking out for a dive into the metrics and indicators that matter most when running Kafka at scale. For those who are struggling with scaling Kafka pipelines, this talk by Opher Dubrovsky and Ido Nadler could come in handy.

Christoph Schubert gave a comprehensive introduction to the Kafka Streams architecture and talked about best practices for running your Kafka Streams application smoothly in production. And if you’re a fan of Apache Flink® (I know we are!), this talk by Timo Walther gives a really good overview of the stream processing framework and its capabilities.

It was also interesting to learn more about Apache Druid®, an open source analytics database for modern data-intensive applications. Luckily, Sergio Ferragut gave a great introduction to its architecture. (Shameless plug: there will also be another opportunity to hear Sergio speak about Apache Druid® at Uptime!)

Scale

“A smart person learns from their own mistakes, but a truly wise person learns from the mistakes of others.” A great theme for a talk! Noaa Barki shared lessons and recurring patterns her team learned from reviewing 100+ Kubernetes post-mortems, so you can avoid the same mistakes.

Another interesting Kubernetes use case was shared by Ramiro Alvarez Fernandez, Álvaro Panizo, and Daniel Hernández Alfageme. The Empathy.co team showed how they migrated from a cluster in the cloud to Kubernetes, with some actual figures on cost and performance.

For some great advice and strategies for validating large systems, we recommend checking out “Effective CI/CD for Large Systems” by Josh Reed. We recommend it not only because Josh is our colleague, but also because this talk is just plain useful when testing large systems, integrating new changes, and ensuring good code quality.

Search

One of the highlights for our team was the chance to meet the OpenSearch and Apache Lucene® folks in person and hear some great talks about these technologies! Uwe Schindler talked about the future of Lucene's MMapDirectory. It is a deep and very specific technical subject, but the talk also gave a nice overview of memory-mapped files in Java from the ground up. Eli Fisher talked about how search engine technologies like OpenSearch®, and their features, have evolved and been adopted for many other use cases, such as log analytics and security analytics.

If you’re curious about some other interesting search engine examples, Atita Arora introduced the Vespa architecture and its advanced features, and explained how to approach Vespa whether you come to it fresh or with a Lucene-based mindset. It’s also worth checking out Richard Goodman’s talk about using Apache Solr® unconventionally to serve 26bn+ documents.

Store

For those of you who are excited about all the different OLAP solutions out there, Chinmay Soman’s talk about Apache Pinot® could be interesting to check out. Chinmay goes over Pinot’s capabilities and what makes it a unique OLAP platform.

It was quite interesting to hear from Ricardo Ferreira about OpenTelemetry (which will also be covered at Uptime!) and how it allows the creation of custom metrics in a standard, scalable, and reusable way.

Fans of PostgreSQL® (hi!) will enjoy this engaging session by our colleague Francesco Tisiot. Francesco explains how recursion works in PostgreSQL with an interesting example of solving the knapsack problem.
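To give a flavor of the idea, here is a minimal sketch of a recursive query that solves a tiny knapsack problem. It is not the example from Francesco's talk: the item list, the table layout, and the `best_knapsack` helper are all made up for illustration, and it runs against an in-memory SQLite database from Python so that it works without a server. SQLite's `WITH RECURSIVE` syntax is close to PostgreSQL's, but the query may need small adjustments there.

```python
import sqlite3

# Hypothetical items for illustration: (name, weight, value)
ITEMS = [
    ("tent", 12, 4),
    ("map", 2, 2),
    ("compass", 1, 1),
    ("snack", 1, 2),
    ("water", 4, 10),
]

# The non-recursive part starts one candidate per item that fits.
# The recursive part extends each candidate with an item that has a
# higher id (so every combination is built only once) and still fits.
QUERY = """
WITH RECURSIVE picks(last_id, names, total_weight, total_value) AS (
    SELECT id, name, weight, value
    FROM items
    WHERE weight <= :capacity
    UNION ALL
    SELECT i.id,
           p.names || ',' || i.name,
           p.total_weight + i.weight,
           p.total_value + i.value
    FROM picks AS p
    JOIN items AS i ON i.id > p.last_id
    WHERE p.total_weight + i.weight <= :capacity
)
SELECT names, total_weight, total_value
FROM picks
ORDER BY total_value DESC, total_weight ASC
LIMIT 1
"""


def best_knapsack(items, capacity):
    """Return (names, total_weight, total_value) of the best combination,
    or None if no single item fits within the capacity."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE items ("
            "id INTEGER PRIMARY KEY, name TEXT, weight INTEGER, value INTEGER)"
        )
        conn.executemany(
            "INSERT INTO items (name, weight, value) VALUES (?, ?, ?)", items
        )
        return conn.execute(QUERY, {"capacity": capacity}).fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    print(best_knapsack(ITEMS, 15))
```

This sketch enumerates every combination that fits, so it only suits small inputs. For the approach Francesco actually presents, watch the session recording.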

Session recordings are already available to binge-watch in the Berlin Buzzwords 2022 YouTube playlist.
