Mar 21, 2022

Karapace strengthens schema management

Companies are increasingly turning to Karapace, started by Aiven, as an open source alternative to Confluent Schema Registry and the REST API for Apache Kafka®.

Soumya Bijjal
VP Product at Aiven

Apache Kafka® is an essential technology for building digital services today. Businesses need access to a wide stream of data. What’s more, that data must be well-integrated to enable real-time business use cases, such as predictive analytics, personalized web experiences, IoT sensor ingestion, customer behavior analysis and online fraud detection.

Why schemas matter

At its core, Apache Kafka simply transfers data as bytes and does not care what kind of data is being sent or received. Kafka doesn’t “understand” the data that flows through it, but producers and consumers need to. For them to understand each other, they must agree on a common data format, described by a schema. This is where Karapace fits in.
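To make the idea concrete, here is a minimal sketch in Python of a producer and a consumer agreeing on a layout for the bytes they exchange. The field names and the use of the standard struct module are illustrative assumptions; real deployments would typically rely on a schema format such as Avro, JSON Schema or Protobuf rather than this hand-rolled layout.

```python
import struct

# Hypothetical shared schema: (field name, struct format code).
# "q" is a signed 64-bit integer, "?" is a boolean.
SCHEMA = [("user_id", "q"), ("amount_cents", "q"), ("flagged", "?")]

# ">" selects big-endian byte order with no padding between fields.
LAYOUT = ">" + "".join(code for _, code in SCHEMA)


def encode(record: dict) -> bytes:
    """Producer side: turn a record into the bytes Kafka would carry."""
    return struct.pack(LAYOUT, *(record[name] for name, _ in SCHEMA))


def decode(payload: bytes) -> dict:
    """Consumer side: the bytes are only meaningful with the same schema."""
    values = struct.unpack(LAYOUT, payload)
    return {name: value for (name, _), value in zip(SCHEMA, values)}


record = {"user_id": 42, "amount_cents": 1999, "flagged": False}
payload = encode(record)          # opaque bytes, as far as Kafka is concerned
assert decode(payload) == record  # round-trips only because both sides share SCHEMA
```

If the producer changed SCHEMA without telling the consumer, decode would either fail or silently return wrong values, which is the problem a schema registry is meant to prevent.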

Karapace is a piece of software that resides outside your Kafka cluster and handles the distribution and evolution of schemas. Karapace is an open source alternative to Confluent Schema Registry and the Apache Kafka REST proxy. With this project, Aiven enables businesses to build applications and services using Apache Kafka in the open source world.
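As a rough illustration of how a client might hand a schema to the registry, the sketch below builds a registration request in the style of the Confluent Schema Registry REST API, which Karapace aims to be compatible with. The registry address, the subject name and the Avro record are assumptions made up for this example, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Assumptions for illustration: a registry listening locally, plus a
# hypothetical subject name and Avro record.
REGISTRY_URL = "http://localhost:8081"
SUBJECT = "payments-value"
AVRO_SCHEMA = {
    "type": "record",
    "name": "Payment",
    "fields": [
        {"name": "user_id", "type": "long"},
        {"name": "amount_cents", "type": "long"},
    ],
}


def build_register_request(base_url: str, subject: str, schema: dict) -> urllib.request.Request:
    """Build (but do not send) a request that registers a new schema version."""
    # The API expects the schema itself as a JSON-encoded string.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/subjects/{subject}/versions",
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


request = build_register_request(REGISTRY_URL, SUBJECT, AVRO_SCHEMA)
# To send it against a running registry:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Producers and consumers can then look the schema up by subject instead of hard-coding it, which is what lets schemas evolve without redeploying every client at once.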

Aiven and Karapace

Aiven announced Karapace in July 2020 in response to the licensing change in Confluent Schema Registry back in 2018. We wanted to ensure that we could continue to offer a well-maintained, supported open source alternative for the Kafka REST proxy and Schema Registry to all Apache Kafka users, whether or not they were our customers.

(And being users of Apache Kafka ourselves, we naturally wanted that capability too.)

Aiven’s open source mission is to help the tooling and environment to evolve so that companies can avoid vendor lock-in, and this was an important step on that road.

The future of Karapace

Aiven is committed to continuing the development of Karapace in the open, and ensuring 1:1 compatibility between Confluent Schema Registry and Karapace to the fullest possible extent.

With Instaclustr joining the project, the future of Karapace is even stronger. Their recent contribution adds support for the Google Protocol Buffers (Protobuf) format in the Karapace schema registry. Protobuf supports code generation across languages, which is valuable for organizations using multiple programming languages. In such organizations, building an event streaming platform with strong guarantees on data quality means that the chosen encoding format must be interoperable between those languages.
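For a sense of what this looks like in practice, here is a small sketch of the JSON body a client might send to register a Protobuf schema. It assumes Karapace accepts the same request shape as the Confluent Schema Registry API, where the .proto source travels as a string and schemaType is set to PROTOBUF. The Payment message is a hypothetical example.

```python
import json

# Hypothetical message definition, shown only as an example.
PROTO_SOURCE = """\
syntax = "proto3";

message Payment {
  int64 user_id = 1;
  int64 amount_cents = 2;
}
"""


def protobuf_registration_body(proto_source: str) -> str:
    """Return the JSON body for registering a Protobuf schema."""
    return json.dumps({"schemaType": "PROTOBUF", "schema": proto_source})


body = protobuf_registration_body(PROTO_SOURCE)
```

The same .proto definition can then be fed to the Protobuf compiler for each language in use, so that every team generates its classes from one registered source.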

By creating Karapace as an open source project from the start, Aiven invited other companies and individuals to contribute new features and make Karapace more robust, flexible and versatile. Everyone gains when open source tooling improves, regardless of who is doing the improving. Open source is a prime example of coopetition, where companies work together to share knowledge and resources even as they continue to compete for market share with their products. These two interactions take place at different stages of the value chain. The companies share, for example, lower-level research insights and components, while still keeping their own products and services that have been built upon the (now shared) foundation.

This is why we at Aiven are delighted to find ourselves collaborating with Instaclustr in the spirit of joint commitment to open source. This will benefit Aiven’s and Instaclustr’s customers equally, as well as all other users of Apache Kafka. After all, we’re in the business of providing our customers with the best open source technologies.

Further reading