Oct 28, 2022
Apache Kafka®: Confluent vs self-managed vs Aiven
Aiven isn't the only thing out there. In this post, we compare Confluent's offering with Aiven's, and match both against self-managed solutions. Find out more!
Information and content updated in February 2024
Should you self-manage, or should you get Apache Kafka® managed and hosted? Should you go for an on-premises solution, or deploy your streaming data infrastructure in a cloud provider of your choice? And if you choose managed cloud, which provider should you trust? In this blog post, we’ll be asking some searching questions to help you figure out which Apache Kafka alternative is right for you. We’ll compare three potential setups that will get your streaming data going:
- Self-managed Apache Kafka
- Fully-managed Apache Kafka available from Confluent
- Aiven for Apache Kafka
| Self-managed Apache Kafka is good for… | Aiven for Apache Kafka is good for… | Confluent’s managed Apache Kafka is good for… |
|---|---|---|
| + Self-managed Apache Kafka configuration | + Pre-existing default setups based on your use case and scalability potential of your application | + Pre-existing default setups based on your use case and scalability potential of your application |
| + Full up-close access control | + External experts to manage your Apache Kafka environment | + External experts to manage your Apache Kafka environment |
| + Full control of premises | + No costs from premises: walls, A/C, power, security… | + No costs from premises: walls, A/C, power, security… |
| | + No need to pay for hardware | + No need to pay for hardware |
| + No need to sign up for a cloud account | + Can leverage existing cloud account (Bring-Your-Own-Cloud) | |
| + You own your servers | + Dedicated servers for VMs | + Dedicated servers for VMs (higher pricing tiers only****) |
| + Highly granular costs | + Predictable price with networking and storage costs included | + Low price at start (price hike comes with more features) |
| | + Wide range of integrations and connectors | + Wide range of integrations and connectors |
| | + Out of the box monitoring AND plug-in monitoring solutions | + Out of the box monitoring AND plug-in monitoring solutions |
| | + Migration service in and out without additional data transfer costs | Migration service in and out, data transfer costs added to cluster cost |
| | + Consistent availability (SLA of 99.99%) across all plans | + Consistent availability (SLA of 99.95% across Basic and Single AZ clusters, SLA of 99.99% only on Multi AZ clusters*) |
| | + Best ‘Quality of Support’ for stream processing platforms*** | |
Let’s go over some of the key questions you should ask when deciding whether to manage Apache Kafka yourself or choose a fully managed Apache Kafka service. Some of them might surprise you!
Q1: Who should manage and maintain your Apache Kafka environment?
I want full control of everything - I have the qualified staff and no problems hiring.
If control at any cost is your priority, then a self-hosted on-premises Apache Kafka might be good for you. The learning curve is steep, but since you say you have the qualified staff with enough time, that won’t be a problem.
Just remember, getting the clusters set up is only the start; you’ll need to implement security, install patches and updates, manage access rights and so on. (We don’t want to sound negative, but it actually is quite a lot of work.)
I don’t want my company to have to manage Apache Kafka, my staff don’t have the time or the skills.
If you want to forget about your Apache Kafka clusters 99.99% of the time and still have them be operational 99.99% of the time, get a managed solution.
The ease of operations with a managed Apache Kafka service is great if you don’t want to spend money on hiring experts. You can let your chosen vendor operate and maintain your Apache Kafka for you.
You do need someone on your side with a cursory understanding of the systems you use, but finding and hiring that talent is far easier than finding someone able to operate, manage and maintain your Apache Kafka clusters and environment in production.
Q2: Where should your streaming data infrastructure be physically located?
At my own premises - we’re in a regulated industry
Companies in regulated industries may need an on-premises installation for legal compliance. This will vary depending on the country.
However, if the legal requirement is only for the data to be located within the borders of a certain country or area, some fully-managed services (like Aiven) offer the option of selecting a specific data center and Availability Zone to deploy your Apache Kafka clusters in.
With Aiven’s easy migration process, you can also employ a hybrid model. Leave the applications handling confidential information on your on-premises environment, and migrate non-confidential topics / data streams to the cloud. You can always specify an AZ that is still fully compliant with your local regulations.
Aiven for Apache Kafka cloud providers and AZs are free to browse; play around with the pricing widget to see if you can benefit.
In a cloud - I don’t want to pay for walls
You might need your Apache Kafka clusters to be available in a wide area. Or perhaps you just don’t want to pay for hardware, not to mention facilities which tend to include things like physical access control, electricity, air conditioning and building rent. If the traffic costs and lag aren’t an issue--possibly because your business is local--then deployment to a single public cloud environment may fit your needs.
The degree to which you want to self-manage this environment is a separate issue; see above for guidance.
In a global cloud environment, a hybrid environment, or multiple clouds - I need wide, instantaneous availability
If you expect your data streaming cluster to be accessed from a wide geographic area and via multiple public clouds, a multi-region, multi-cloud deployment may make sense. If this is your choice, a managed environment is usually the better fit; managing multi-cloud deployments yourself is an order of magnitude more difficult than maintaining a single cluster.
Q3: If in the cloud, do you have an existing cloud account?
Yes, and I want to leverage that
If you already have a cloud account, the easiest option may seem to be deploying Apache Kafka to that cloud yourself, or using the provider’s own managed Kafka service if it offers one.
But that’s not the only way. Many data platforms, Aiven included, offer a Bring Your Own Cloud (BYOC) option. This allows you to benefit from any bonuses you’ve accumulated while also taking advantage of the benefits of a managed service.
No, and I’m fine subscribing through a provider
By far the easiest way to set up an Apache Kafka based streaming infrastructure on the cloud is to sign up via a vendor offering fully-managed Apache Kafka. But which one? To compare just the two under discussion:
With Confluent your networking costs are invoiced on top of the existing service fee** and ingress/egress fees per cloud provider.
With Aiven your networking costs are included in the flat service fee.
Q4: Do you need dedicated servers for your VMs in the cloud?
You’ll probably want your VMs to reside on a dedicated server. This allows you to avoid having to share computing resources with potentially noisy neighbors, and it also reduces security risks.
If you self-manage an on-premises solution, this is easy - just have one server per system, since you control the physical machines.
But going to the cloud doesn’t necessarily mean giving up on this requirement. At Aiven, all VMs run on dedicated servers in all pricing plans, and even Confluent offers dedicated servers, although only on their Dedicated Clusters**** that come with higher pricing tiers.
Q5: What do you want to use for monitoring?
I have / will build my own custom solution
If your monitoring needs are so specific and complex that you’ve built your own system, then of course you need to choose your cloud solution based on which one can accommodate them. It’s hard to give specific advice here, other than to do plenty of research to ensure that the environment is suitable.
I’m using a range of specific monitoring systems
Research is also required when choosing how to implement a ready-built solution hinging on a number of monitoring tools, whether open source or proprietary. Some might be offered by a data platform provider as managed solutions (such as Aiven for Grafana), or you might be able to connect to an external one. Aiven for Apache Kafka allows you to connect to external monitoring tools and add your Apache Kafka service specific information to your existing monitoring infrastructure.
I want an out of the box solution
The easiest way out of the dilemma, barring special requirements, is to take advantage of a managed monitoring package.
As fully managed Apache Kafka alternative solutions, both Confluent Cloud and Aiven for Apache Kafka offer out-of-the-box monitoring tools. Confluent Cloud has a packaged Control Center that you can use to monitor the whole Confluent platform at once as well as integrations with external monitoring systems.
Aiven on the other hand offers a Grafana-based monitoring screen out of the box, but also a super-easy way to integrate other services.
Q6: What else needs to live in the same ecosystem?
I want to extend it with PostgreSQL, OpenSearch, Redis... basically, a rich selection of open source technologies
It’s the nature of a stream processing system that your Apache Kafka environment will need to connect to external sources and sinks (be it on the cloud or not). Managing and connecting those systems adds extra complexity to your data stack. With a managed service like Aiven for Apache Kafka, you can seamlessly integrate your Apache Kafka streaming data to other Aiven managed services, such as PostgreSQL, OpenSearch, Redis to name a few.
I’ll build my own extensions and integrations
Managing and maintaining extensions and connections to your ecosystem can be a daunting task for your operations teams. Version upgrades are especially demanding, because you need to ensure backwards compatibility between the connectors and systems in your data stack. Be prepared for your team to take on some extra work and maintenance when it comes to connectors and extensions.
Q7: What is your availability requirement?
I want four nines!
If your SLA requirement for your Apache Kafka clusters is 99.99%, then a managed service might be the best option. Aiven for Apache Kafka ensures four-nines across all plans, backed by a dedicated SRE team.
Other Apache Kafka providers in the market might offer a 99.95% SLA on some of their plans and 99.99% only on the pricier options. Be sure to research what impact this difference might have on your organization, and choose accordingly.
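To make the gap between these two SLA levels concrete, here is a minimal arithmetic sketch. It assumes availability is measured over a 365-day year; real SLAs are often measured per month and define their own exclusions, so treat the function name and the period as illustrative rather than as any vendor’s definition.

```python
# Allowed downtime implied by an availability SLA (plain arithmetic, no vendor data).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(sla_percent: float, period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Return the maximum downtime (in minutes) that still meets the SLA."""
    return period_minutes * (1 - sla_percent / 100)


for sla in (99.95, 99.99):
    per_year = allowed_downtime_minutes(sla)
    per_month = per_year / 12
    print(f"{sla}% -> {per_year:.1f} min/year, {per_month:.1f} min/month")
```

Under these assumptions, 99.95% allows roughly 4.4 hours of downtime a year, while 99.99% allows a little under 53 minutes.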
It’s okay to have frequent service breaks, I don’t need to pay extra for that
If frequent service breaks are okay for you in your data streaming, then self-managing a Kafka environment might be a good enough option. For non-production, non-mission critical applications, your team might well be the best at operating the environment.
But once your services move to production or have to handle critical business events and data, additional planning and extra resources might be needed. This can have a big impact on the total cost of ownership of such a solution or application.
Q8: Will you be migrating from elsewhere, or do you want to stay migration-ready?
No, I’m fine, we’re starting from scratch
In that case, you can ignore this section and move on to the next one! … But wait. Are you quite sure that your migrating days are over? What happens if you choose to self-manage and your team quits? Maybe you should keep reading.
Yes, I need to bring my existing Apache Kafka clusters to the new system
If you want to bring your old Apache Kafka clusters into a managed service, then you might want to look into open source tooling such as Aiven for Apache Kafka MirrorMaker 2. It offers a smooth migration process to a cloud environment (and across cloud environments), and it also covers disaster recovery scenarios.
Yes, I want to be able to pick up and leave when I want to
Same as above. With open source tooling like MirrorMaker 2, you can easily pick up your data streams where you left off while remaining flexible at all times.
Q9: What about the price and billing?
I want to pay for network costs separately and extensively customize my system based on price - no paying for nothing!
For extensive cost-based customization, a managed service alternative like Confluent Cloud and their pricing might be a fine option. But stop to ask yourself a few questions first:
- How do you expect the networking costs to evolve over time?
- What is the mid- to long-term impact of more streaming data flowing through your services?
- How does that impact your TCO in the long run?
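One way to explore these questions is a small back-of-the-envelope model. The sketch below compares a flat monthly fee with a base fee plus per-GB networking charges as traffic grows. Every number and name in it (`flat_fee`, `base_fee`, `price_per_gb`, the growth rate) is a hypothetical placeholder, not a price from Aiven, Confluent, or any cloud provider; substitute figures from the vendors’ pricing pages before drawing conclusions.

```python
# Hypothetical cost model: every number below is an illustrative placeholder,
# not a price from Aiven, Confluent, or any cloud provider.

def flat_monthly_cost(flat_fee: float) -> float:
    """Flat plan: networking is bundled, so traffic does not change the bill."""
    return flat_fee


def metered_monthly_cost(base_fee: float, traffic_gb: float, price_per_gb: float) -> float:
    """Metered plan: a base fee plus a per-GB networking charge."""
    return base_fee + traffic_gb * price_per_gb


def project(months: int, start_traffic_gb: float, monthly_growth: float,
            flat_fee: float, base_fee: float, price_per_gb: float):
    """Yield (month, traffic_gb, flat_cost, metered_cost) with compounding traffic growth."""
    traffic = start_traffic_gb
    for month in range(1, months + 1):
        yield (month, traffic,
               flat_monthly_cost(flat_fee),
               metered_monthly_cost(base_fee, traffic, price_per_gb))
        traffic *= 1 + monthly_growth


if __name__ == "__main__":
    # Assumed scenario: 1,000 GB/month today, growing 10% per month for two years.
    rows = project(months=24, start_traffic_gb=1_000, monthly_growth=0.10,
                   flat_fee=1_000.0, base_fee=600.0, price_per_gb=0.10)
    for month, traffic, flat, metered in rows:
        marker = "  <- metered plan now costs more" if metered > flat else ""
        print(f"month {month:2d}: {traffic:8.0f} GB  flat={flat:8.2f}  metered={metered:8.2f}{marker}")
```

The point of the sketch is the shape of the curves rather than the numbers: a metered bill tracks traffic growth, while a flat fee only changes when you switch plans.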
I want it easy, just one invoice to pay
In self-managed data streaming solutions, paying all the separate bills and targeting the costs appropriately is a headache. Even in managed solutions, Confluent for example invoices network costs separately.
If you want to keep things simple, Aiven’s predictable pricing might be the best solution. With Aiven for Apache Kafka, not only do you get a fully managed Kafka solution but you can also easily predict what the monthly cost of your service will be — without worrying about excess costs for networking or egress/ingress. Check out the Aiven pricing page for more details.
The price has to be low, that’s all that matters
Cost efficiency is something that every organization and developer team should take into account. And self-managing open source software like Apache Kafka might be the easiest alternative to get you started.
However, as the requirements evolve and increase, so might the necessary level of expertise (be it Apache Kafka development expertise or operational know-how around Apache Kafka).
You need to carefully plan the lifecycle of the service. It might be a factor in future costs, such as hiring and retaining the appropriate talent and ensuring that SLA guarantees are being met.
Footnotes
* Information according to Confluent’s pricing page.
** Information according to Confluent’s pricing page for networking.
*** Quality of support of 8.4 for Aiven vs Confluent at 8.2 - according to data from G2 Crowd, February 2024.
**** Confluent Cloud’s documentation page mentions single-tenant deployments only for dedicated clusters.
To get the latest news about Aiven and our services, plus a bit of extra around all things open source, subscribe to our monthly newsletter! Daily news about Aiven is available on our LinkedIn and Twitter feeds.
If you just want to find out about our service updates, follow our changelog.
Are you still looking for a managed data platform? Sign up for a free trial at https://console.aiven.io/signup!