Aiven’s favorite Apache Kafka® connectors
Getting started with Aiven for Apache Kafka®? Check out this list of our most used Kafka connectors!
The beauty of open source software is the wealth of community-driven frameworks, plugins and additional code available to anyone who chooses to use it. Apache Kafka® is no exception. Indeed, with Apache Kafka® Connect, the Kafka project standardized its connection framework, making integration with Kafka smoother than before. However, with so many options for connectors, it can be difficult to know which to choose.
We’ve written this article as an overview of the most useful Kafka connectors for a variety of use cases. These use cases include:
- Event sourcing: One of Kafka’s most compelling use cases is event sourcing and event-driven applications – using events happening in one system, like a user frontend, to trigger actions in another system, such as a monitoring system. This is particularly useful for application developers, retail, or anyone whose use case involves interacting with an end user and a graphical user interface.
- Analytics and metrics: Another way to utilize the data your applications generate is to send that data to analytics and metrics reporting. More advanced use cases in this vein include fraud reporting and anomaly detection. Kafka can structure data consistently for any number of consumers, making that data easy to transport, read and act on. Most industries have a need for some kind of analytics or metrics monitoring, but we find that the most pressing use cases generally come from high-tech companies for whom service reliability is key.
- Data streaming: Kafka can act as a transport layer in near-real-time data streaming applications. Most industries don’t need real-time data monitoring, but we find that the energy and financial services sectors value the ability to stream in real time when needed.
We selected connectors for this article based on overall utility, ongoing community support/popularity, and readiness for an enterprise production environment.
To use any of these connectors with your Aiven for Apache Kafka service, spin up an Aiven for Apache Kafka Connect service alongside your Kafka service, and connect the two.
Kafka connectors fall into two categories: source and sink. Source connectors send data from a specific kind of system into Kafka, and sink connectors send data from Kafka to a specific kind of system.
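To make the source/sink distinction concrete, here is a minimal Python sketch. It assumes the standard Kafka Connect REST API, where a connector is created by posting a JSON body with a `name` and a `config` map. The helper `connector_payload`, the connector names, and the `com.example.*` classes are hypothetical placeholders, not real connectors.

```python
import json


def connector_payload(name, connector_class, settings):
    """Build the JSON body that Kafka Connect's REST API accepts on POST /connectors."""
    config = {"connector.class": connector_class, "tasks.max": "1"}
    config.update(settings)
    return {"name": name, "config": config}


# Source connector: reads from an external system and writes into Kafka.
# Target topics are chosen by connector-specific settings, so none are set here.
source = connector_payload(
    "example-source", "com.example.ExampleSourceConnector", {}
)

# Sink connector: reads from Kafka and writes to an external system.
# Sink connectors are told which topics to consume via "topics" (or "topics.regex").
sink = connector_payload(
    "example-sink", "com.example.ExampleSinkConnector", {"topics": "example-topic"}
)

print(json.dumps(sink, indent=2))
```

The practical difference shows up in the configuration: a sink always names the Kafka topics it consumes, while a source decides where to write through its own settings.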
Aiven’s general purpose Kafka Connectors
Aiven maintains Kafka connectors that simplify connecting a number of different data services to Aiven. These include both sink and source connectors.
- Kafka Connect JDBC: The JDBC Source Connector connects a variety of relational databases to Kafka, including MySQL and PostgreSQL. The Aiven platform supports automatic connections via the console for databases deployed to Aiven, but if you need to connect to another database, this do-it-all connector is the thing to use. Learn how to use it with the [Using Kafka Connect JDBC Source](https://aiven.io/developer/using-kafka-connect-jdbc-source-a-postgresql-example) tutorial!
- Commons for Kafka Connect: The Aiven Commons for Kafka Connect provides connectors to a number of data buckets on popular clouds. This is useful when reading in data subsets for analysis in other systems, or writing analytics data to cold storage. It provides connections to Google Cloud Storage, Amazon S3 and Azure Blob Storage.
- HTTP Sink Connector: We like the Aiven HTTP sink connector for sending data over HTTP. Not every piece of data can go directly into a data store. The HTTP Sink Connector is our recommended go-to for sending any and all information from Kafka to any other system using HTTP. We recommend learning how to use it even if you’re connecting Kafka to a known quantity like Postgres or MySQL – at some point, we find almost all Kafka implementations connect to something that a preexisting connector doesn’t work for.
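As a rough illustration of how the JDBC source and HTTP sink connectors can be chained, the Python sketch below builds one configuration for each. The hostnames, credentials, table, and connector names are made-up placeholders. The property names reflect our understanding of the Aiven JDBC and HTTP connector settings, so treat them as assumptions and check them against the connector documentation for your version.

```python
import json

TABLE = "orders"          # hypothetical table name
TOPIC_PREFIX = "shop_"    # hypothetical prefix

# JDBC source: polls a PostgreSQL table and writes new rows to Kafka.
jdbc_source = {
    "name": "pg-orders-source",
    "config": {
        "connector.class": "io.aiven.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:postgresql://db.example.com:5432/shop?sslmode=require",
        "connection.user": "connect_user",
        "connection.password": "CHANGE_ME",
        "table.whitelist": TABLE,
        "mode": "incrementing",            # pick up rows with a growing id
        "incrementing.column.name": "id",
        "topic.prefix": TOPIC_PREFIX,      # topic name becomes prefix + table
    },
}

# The JDBC source writes each table to "<topic.prefix><table>".
source_topic = TOPIC_PREFIX + TABLE

# HTTP sink: forwards every record from that topic to an HTTP endpoint.
http_sink = {
    "name": "orders-http-sink",
    "config": {
        "connector.class": "io.aiven.kafka.connect.http.HttpSinkConnector",
        "topics": source_topic,
        "http.url": "https://hooks.example.com/orders",
        "http.authorization.type": "none",
    },
}

print(json.dumps([jdbc_source, http_sink], indent=2))
```

The one detail worth noticing is the topic name: the sink has to subscribe to the prefixed topic the source produces, not to the bare table name.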
Purpose specific connectors
Sometimes it's useful to use a connector that targets a specific technology or use case. At Aiven, we see these the most in production:
- Debezium: Debezium is our go-to for change data capture implementations using Kafka. Learn more about using Debezium to implement change data capture using Amazon RDS, Azure SQL, and Postgres tables with logical decoding.
- BigQuery: For those working with Google Cloud, BigQuery offers a unique way to access your data at scale.
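For change data capture on Postgres with logical decoding, a Debezium connector configuration might look like the Python sketch below. The host, credentials, slot, publication, and table names are hypothetical. The property names follow the Debezium PostgreSQL connector as we understand it for Debezium 2.x, where `topic.prefix` replaced the older `database.server.name`, so verify them against the Debezium version you run.

```python
import json

debezium_pg = {
    "name": "pg-cdc",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "db.example.com",
        "database.port": "5432",
        "database.user": "cdc_user",
        "database.password": "CHANGE_ME",
        "database.dbname": "shop",
        "plugin.name": "pgoutput",            # PostgreSQL's built-in logical decoding plugin
        "slot.name": "debezium_shop",         # replication slot (placeholder name)
        "publication.name": "debezium_shop_pub",
        "topic.prefix": "shop",               # Debezium 2.x naming
        "table.include.list": "public.orders",
    },
}


def change_topic(prefix, schema, table):
    """Debezium names change-event topics <prefix>.<schema>.<table>."""
    return f"{prefix}.{schema}.{table}"


topic = change_topic(debezium_pg["config"]["topic.prefix"], "public", "orders")
print(topic)
print(json.dumps(debezium_pg, indent=2))
```

Consumers of the change stream subscribe to the derived topic, one per captured table.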
Get more from your Kafka implementation
Apache Kafka is a service best paired with other services to build out data pipelines for all sorts of interesting things. The following resources can help you use Kafka to its fullest:
- The Aiven for Apache Kafka cookbook offers Terraform implementations of some of the most popular integrations.
- Tiered storage lets you save data that passes through Kafka externally for later use. Learn how to use tiered storage with our guide!
- Use the Aiven Console to one-click integrate with popular services like Datadog and more.