Aiven Blog

Oct 1, 2025

The Open-Source BigQuery Sink Connector Saga

The story of how this connector switched hands three times and how Aiven came to maintain it.

Hugh Evans


Developer advocate and community manager with a particular interest in data and AI.

The BigQuery Sink connector is a critical piece of Kafka infrastructure that allows you to offload your Kafka topic data into BigQuery in real time. It is the third most-used connector among Kafka users (next to Google Cloud Managed Service for Apache Kafka and the original WePay sink connector), but its history has had its fair share of plot twists. Here's the story of how this connector switched hands three times and how we ultimately ended up rebuilding it.

The Connector That Almost Disappeared

The BigQuery sink connector has a complex history that nearly led to a valuable open source tool becoming effectively proprietary. Originally developed at WePay in 2016 and released under the Apache 2.0 license, the connector served the community well for years. WePay maintained it effectively, adding features like table partitioning and GCS (Google Cloud Storage) batch loading while welcoming community contributions.

Confluent began more actively developing the connector as they started offering BigQuery support for their customers. To streamline PR reviews, they struck a collaborative agreement with WePay to have Confluent own the repo while WePay got maintainer rights. This seemed to work well—until Google introduced the Storage Write API.

The Feature That Changed Everything

Google's Storage Write API was a game-changer for BigQuery ingestion: it costs half as much as the older insertAll API, and the first 2 TiB per month are free. The community immediately recognized its value, and implementation work began on the open source connector.

However, after the Storage Write API support was merged into the repository, something unexpected happened: all related commits were reverted! Instead, Confluent released a "V2" connector with Storage Write API support exclusively for their cloud platform, with no plans to make it available to the broader open source community.

The message was clear: the open source connector was now secondary to commercial interests.

Aiven Steps In

Recognizing that a critical piece of open source infrastructure was being left behind, Aiven forked the connector in 2024. We restored the Storage Write API functionality, updated the project infrastructure, and released version 2.6.0 with the features the community had been requesting.

But we didn't stop there. We've continued active development, worked with the BigQuery team to publish comprehensive documentation that stays current with the codebase, and maintained the project with genuine open source principles in mind.

Why It Matters Today

It's being used: Our connector is currently the third most-used connector among Kafka users, demonstrating real-world adoption and trust from the community.

Google endorses it: Google now includes our connector in their official documentation for Managed Service for Apache Kafka, showing that the creators of BigQuery see value in what the Kafka community has built.

It has the features you need: Unlike the original open source version, our fork includes Storage Write API support, giving you access to Google's most cost-effective ingestion method.

It's truly open source: No vendor lock-in, no proprietary features, no limitations based on which platform you're using. And it’s going to stay that way.

Getting Started

If you're working with Kafka and BigQuery, you don't have to choose between paying for a commercial solution and using an outdated open source alternative. Our connector is available now at github.com/aiven-open/bigquery-connector-for-apache-kafka, with full documentation at aiven-open.github.io/bigquery-connector-for-apache-kafka.

Whether you're running your own Kafka Connect clusters or using a managed service, the connector works the same way. Configuration is straightforward, and you'll have access to all the cost-saving benefits of Google's Storage Write API.
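To give a feel for what that configuration might look like, here is a minimal sketch in Python that builds a request body for the Kafka Connect REST API (POST /connectors). Treat it as an illustration under stated assumptions, not as the project's reference configuration: the connector class name and the property names (project, defaultDataset, keyfile, useStorageWriteApi) are assumptions that should be checked against the project documentation, and the connector name, topics, GCP project, dataset, and key file path are all hypothetical placeholders.

```python
import json


def build_connector_request(name, topics, project, dataset, keyfile_path):
    """Build a JSON-serializable body for Kafka Connect's POST /connectors."""
    config = {
        # ASSUMPTION: class name inherited from the original WePay code base;
        # verify it against the connector documentation.
        "connector.class": "com.wepay.kafka.connect.bigquery.BigQuerySinkConnector",
        # Standard Kafka Connect sink settings.
        "topics": ",".join(topics),
        "tasks.max": "1",
        # ASSUMPTION: connector-specific property names; check the docs.
        "project": project,
        "defaultDataset": dataset,
        "keyfile": keyfile_path,
        # ASSUMPTION: flag that switches ingestion to the Storage Write API.
        "useStorageWriteApi": "true",
    }
    return {"name": name, "config": config}


# Hypothetical values, for illustration only.
request_body = build_connector_request(
    name="bigquery-sink-example",
    topics=["orders", "payments"],
    project="my-gcp-project",
    dataset="kafka_data",
    keyfile_path="/path/to/service-account.json",
)

# Print the payload you would send to a Kafka Connect worker.
print(json.dumps(request_body, indent=2))
```

The sketch only prints the payload. On a self-managed cluster you would send it to your Kafka Connect worker's REST endpoint; on a managed service you would typically paste the "config" section into the provider's connector setup instead.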

Looking Forward

Kafka Connect is a crucial piece of modern-day data infrastructure. It and projects like it help Kafka solidify its status as the standard for real-time data integration. Open source data infrastructure works best when the community can rely on projects being maintained with genuine open source principles. Putting table-stakes features of open-source components behind a paywall inhibits the adoption of Kafka. Because we believe that a rising tide lifts all boats, we are committed to keeping this connector fully open.

The fact that Google has chosen to highlight our connector in their official documentation shows that quality open source solutions can compete with and complement commercial offerings. We're proud to be maintaining this critical piece of infrastructure for the community.

If you are interested in contributing to open-source data infrastructure, we are hiring strong Kafka engineers in Europe.

Ready to try the Aiven BigQuery Connector for Apache Kafka? Check out our GitHub repository or dive into the documentation to get started.

