Cut your TCO by up to 80%
Unify real-time streaming and the Lakehouse
Experience the power of Inkless™
Diskless topics for Apache Kafka®
Bring your own cloud (BYOC) deployed as a stateless service, directly in your VPC, with no disks to manage.
Inkless implements diskless topics that write data directly to object storage, like AWS S3, using a leaderless architecture where any broker can handle any partition. This eliminates expensive local disk replication for topic data, though brokers still leverage small amounts of disk for metadata.
Apache Kafka stores and replicates data across multiple broker servers using local disks, with a designated leader broker handling writes to each partition while follower brokers maintain copies. This architecture requires cross-zone disk replication for high availability, which generates expensive network traffic as data is duplicated across different cloud availability zones.
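To make that cost driver concrete, here is a minimal back-of-the-envelope sketch in Python of how much cross-AZ traffic classic replication generates. The 1 GiB/s ingest figure is illustrative, and it assumes producers are spread evenly across zones and consumers read in-zone; it is not the calculator's actual model.

```python
# Back-of-the-envelope cross-AZ traffic for classic Kafka: RF=3 across 3 AZs.
# Assumptions (illustrative): producers spread evenly across zones, consumers
# read in-zone, so only replication and the producer-to-leader hop cross AZs.

ingest_gib_per_s = 1.0  # sustained produce rate (illustrative)

# Each partition has 2 followers; with brokers spread across 3 AZs,
# both follower copies land in a different AZ than the leader.
replication_crossings = 2.0

# A producer sits in the leader's AZ only 1/3 of the time on average.
producer_to_leader_crossings = 2.0 / 3.0

cross_az_gib_per_s = ingest_gib_per_s * (replication_crossings
                                         + producer_to_leader_crossings)
print(f"{cross_az_gib_per_s:.2f} GiB/s of cross-AZ traffic "
      f"for {ingest_gib_per_s:.1f} GiB/s of ingest")
# -> 2.67 GiB/s: every byte produced crosses zone boundaries ~2.7 times.
```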
Tired of unpredictable Kafka bills and the high cost of data retention? Your traditional Kafka setup is costing you more than you think. Use our interactive calculator to see how Aiven's diskless architecture can dramatically reduce your TCO. Input your current usage and discover your potential savings in minutes.
-$538,492 /month compared to Kafka 3AZ, broken down as follows:

Cost category | Monthly savings |
|---|---|
Network | -$225,994
Storage | -$221,486
Broker | -$51,138
Personnel | -$39,875
One Kafka cluster to rule all streams, slashing TCO by up to 80% and staying 100% compatible with every client, connector, and tool you already use.
Run sub-100 ms streams and 80% cheaper batch topics in the same cluster. No silos, no cluster sprawl: just one cloud-native engine for every workload.
Leaderless architecture writes straight to S3/GCS, erasing cross‑AZ fees and disks. Scale storage and compute independently, and pay only for what you actually use.
Keep every client, connector, and tool — just upgrade and go. It’s upstream-aligned, fully open source, and offers zero lock-in. Kafka, reimagined.
Metric | Max seen in production | Production limits (tested) | Production limits (future) |
|---|---|---|---|
Data In | 1.8 GB/sec | 10 GB/sec | Unlimited |
P99 Diskless Latency | 1,500 ms | 2,000 ms | 800 ms
Partitions | 68,000 | 154,000 | Unlimited |
Connections | 120,000 | Unlimited | Unlimited |
Inkless is the name of Aiven's innovative Apache Kafka service, purpose-built for the cloud. Inkless modernizes Kafka's design by incorporating diskless topics to slash running costs.
Why the name Inkless? Apache Kafka draws its name from Franz Kafka, the novelist whose ink-on-paper craft mirrors traditional data systems' reliance on disk I/O. Inkless Kafka reimagines this paradigm, replacing disk-bound writes with cloud-native object storage and enabling data persistence through scalable, decentralized architectures.
The Kafka cost estimator reflects a real-world deployment across three availability zones (AZs). It includes key features like Tiered Storage and Fetch-from-Follower, SSD-backed brokers with built-in capacity headroom, and realistic cloud pricing for compute, storage, and cross-AZ traffic.
When diskless topics are selected, the model also accounts for the lower operational effort required to run these clusters.
1. High availability and replication
All Kafka clusters modeled in the calculator are spread across three AZs. Each topic uses a replication factor of 3, ensuring durability and availability.
This replication level is fixed in the estimator and matches Apache Kafka’s default recommendation for production environments.
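As a concrete illustration, here is a minimal Python sketch, using the confluent_kafka client, of creating a topic with the replication factor of 3 that the estimator assumes; the broker address and topic name are placeholders.

```python
from confluent_kafka.admin import AdminClient, NewTopic

# Placeholder bootstrap address; replace with your cluster's endpoint.
admin = AdminClient({"bootstrap.servers": "broker-1:9092"})

# Replication factor 3 matches the estimator's fixed assumption and
# Apache Kafka's recommended default for production durability.
topic = NewTopic("orders", num_partitions=6, replication_factor=3)

futures = admin.create_topics([topic])
for name, future in futures.items():
    future.result()  # raises if creation failed
    print(f"Created topic {name} with replication.factor=3")
```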
2. Kafka features enabled by default
The estimator assumes two key features are always on (a configuration sketch follows the list):

- Tiered Storage, which offloads older log segments from broker disks to object storage.
- Fetch-from-Follower (KIP-392), which lets consumers fetch from a replica in their own availability zone instead of the leader, cutting cross-AZ read traffic.
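A minimal sketch of what enabling these features typically looks like, assuming Kafka 3.6+ with a remote storage plugin already configured; the property names are the upstream Apache Kafka ones, and the addresses are placeholders.

```python
from confluent_kafka import Consumer

# Broker-side settings (server.properties), shown here as comments:
#   replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector
#   remote.log.storage.system.enable=true   # plus a remote storage manager plugin
#
# Tiered Storage is then enabled per topic with:
#   remote.storage.enable=true

# Consumer side of Fetch-from-Follower: declare the client's AZ so the
# broker can route fetches to an in-zone replica.
consumer = Consumer({
    "bootstrap.servers": "broker-1:9092",   # placeholder address
    "group.id": "analytics",
    "client.rack": "us-east-1a",            # the consumer's availability zone
})
consumer.subscribe(["orders"])
```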
3. Disk and capacity guardrails
To reflect realistic operational behavior, the estimator includes resource usage limits for each broker:
Constraint | Reason |
|---|---|
SSD-only storage | HDDs increase tail latency; SSDs match Aiven’s Kafka fleet |
≤ 40% disk usage per broker | Leaves headroom for rebalancing and unexpected traffic spikes |
≤ 80% CPU and network utilization | Reduces jitter and prevents resource throttling |
Maximum disk size: 16 TiB | Larger EBS volumes slow down restarts significantly on AWS |
If a workload can tolerate HDD latency, the calculator may favor Diskless Topics. Offloading data to remote storage removes the local disk I/O bottleneck.
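To show how these guardrails drive broker counts, here is a minimal sizing sketch; the workload numbers are illustrative and the logic is a simplification of what the real estimator does.

```python
TIB = 1024**4

def min_brokers_for_storage(retained_bytes: int,
                            max_disk_bytes: float = 16 * TIB,
                            max_disk_utilization: float = 0.40) -> int:
    """Brokers needed so no disk exceeds 40% of a 16 TiB volume."""
    usable_per_broker = int(max_disk_bytes * max_disk_utilization)  # 6.4 TiB
    return -(-retained_bytes // usable_per_broker)  # ceiling division

# Illustrative workload: 100 MiB/s ingest, RF=3, 7-day retention.
ingest = 100 * 1024**2
retained = ingest * 3 * 7 * 24 * 3600
print(min_brokers_for_storage(retained))  # -> 28 brokers for storage alone
```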
4. Cluster sizing and partition limits
To avoid excessive partition reassignments and reduce mean time to recovery (MTTR), the calculator applies Kafka community sizing guidance on per-broker and per-cluster partition counts. When these thresholds are exceeded, the model assumes a second cluster is created.
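A sketch of that decision logic follows. The thresholds are illustrative, taken from commonly cited community guidance for ZooKeeper-based clusters, not the calculator's actual values.

```python
# Illustrative community guidance (not the calculator's exact values):
# ~4,000 partitions per broker and ~200,000 per cluster are often cited.
MAX_PARTITIONS_PER_BROKER = 4_000
MAX_PARTITIONS_PER_CLUSTER = 200_000

def clusters_needed(total_partitions: int, brokers_per_cluster: int) -> int:
    per_cluster_cap = min(MAX_PARTITIONS_PER_CLUSTER,
                          brokers_per_cluster * MAX_PARTITIONS_PER_BROKER)
    return -(-total_partitions // per_cluster_cap)  # ceiling division

print(clusters_needed(total_partitions=250_000, brokers_per_cluster=60))  # -> 2
```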
5. Cross-AZ network pricing
The estimator uses actual cloud provider pricing for cross-AZ data transfer:
Cloud provider | Cross-AZ cost (per GiB) |
|---|---|
AWS | $0.02 (bidirectional) |
Google Cloud | $0.01 |
Azure | $0.00 (within region only) |
Cross-region traffic is not included by default. To model inter-region mirroring, enable the corresponding option in the calculator.
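Applying these rates, here is a minimal sketch of the cross-AZ line item for a classic cluster on AWS; the ingest rate is illustrative, and the sketch counts only replication traffic (consumers are assumed to read in-zone via Fetch-from-Follower).

```python
# Cross-AZ replication cost on AWS at $0.02/GiB (bidirectional),
# with RF=3 across 3 AZs: each byte is copied to 2 other zones.
ingest_mib_per_s = 500            # illustrative sustained ingest
seconds_per_month = 30 * 24 * 3600

cross_az_gib = ingest_mib_per_s * seconds_per_month * 2 / 1024
monthly_cost = cross_az_gib * 0.02
print(f"${monthly_cost:,.0f}/month")  # -> $50,625/month for replication alone
```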
6. Operational effort
The estimator includes operational staffing assumptions based on Aiven’s internal telemetry:
Cluster type | Staffing estimate (FTE per 100 MiB/s sustained ingest) |
|---|---|
Classic Kafka | 0.5 FTE |
Diskless Kafka | 0.125 FTE |
Clusters using only Diskless Topics require less manual intervention. By offloading data to object storage, diskless topics remove the need for local disk management. This simplification also reduces the impact of incidents. The staffing estimates used in the calculator are intentionally conservative compared to public total cost of ownership (TCO) models.
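Translated into arithmetic, here is a sketch of the staffing line using the table's factors; the 400 MiB/s workload is illustrative.

```python
# FTE factors from the table above, per 100 MiB/s of sustained ingest.
FTE_PER_100_MIBS = {"classic": 0.5, "diskless": 0.125}

def staffing_fte(ingest_mib_per_s: float, cluster_type: str) -> float:
    return (ingest_mib_per_s / 100) * FTE_PER_100_MIBS[cluster_type]

for kind in ("classic", "diskless"):
    print(kind, staffing_fte(400, kind))  # classic: 2.0 FTE, diskless: 0.5 FTE
```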
BYOC (Bring Your Own Cloud) is Aiven’s deployment model that runs Diskless Topics - and optionally other Aiven services - directly inside your own cloud environment. This model gives you full control over your environment while Aiven manages operations through its control plane.
BYOC tiers are flat-fee subscription levels based on the sustained compressed ingress throughput of your Diskless Topics BYOC deployment.
Each tier includes the same operational benefits. Pricing scales with the volume of data (in MB/s) streamed through Kafka: higher throughput maps to a higher tier.
Running Kafka at scale requires engineering expertise and operational maturity. With Diskless Topics in BYOC, data is stored in your own object storage, and compute runs in your own cloud account—enabling you to benefit from your provider’s cost optimizations. Aiven handles deployment, monitoring, upgrades, and recovery.
The pricing model is designed to keep the cost per MB/s low while meeting Aiven’s reliability and automation standards. It simplifies operations and offers a cost-effective alternative to managing Kafka infrastructure in-house.
Sample BYOC tiers
The following tiers represent 95th percentile sustained throughput ranges. They are not hard limits. Occasional short bursts above the defined range are allowed. If sustained traffic increases, you can upgrade tiers without downtime.
Tier | Sustained ingest (MB/s) | Example use cases |
|---|---|---|
T1 – Pilot | ≤ 20 | CI/CD pipelines, proof-of-concept environments |
T2 – Starter | 21 – 50 | Single-team apps, small SaaS products |
T3 – Growth | 51 – 100 | Multi-team workloads, analytics pipelines |
T4 – Scale | 101 – 200 | Regional IoT ingestion, clickstream data |
T5 – Large | 201 – 500 | Heavy telemetry, personalization engines |
T6 – XL | 501 – 1 000 | Ad tech, financial data feeds |
T7 – XXL | 1 001 – 1 500 | Game backends, national-scale telemetry |
T8 – Ultra | 1 501 – 2 000 | CDN logs, multi-region streaming workloads |
Custom | > 2 000 | High-throughput systems beyond standard tiers |
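As an illustration of how sustained throughput maps to a tier, here is a small lookup sketch; the tier names and boundaries come from the table above, while the function itself is hypothetical.

```python
# Upper bounds in MB/s of sustained (95th percentile) ingest, from the table.
TIERS = [
    ("T1 - Pilot", 20), ("T2 - Starter", 50), ("T3 - Growth", 100),
    ("T4 - Scale", 200), ("T5 - Large", 500), ("T6 - XL", 1000),
    ("T7 - XXL", 1500), ("T8 - Ultra", 2000),
]

def byoc_tier(sustained_mb_per_s: float) -> str:
    for name, upper_bound in TIERS:
        if sustained_mb_per_s <= upper_bound:
            return name
    return "Custom"  # beyond 2,000 MB/s

print(byoc_tier(75))    # -> T3 - Growth
print(byoc_tier(2500))  # -> Custom
```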
Flexibility and cloud consistency
These tiers are designed to be flexible starting points, not rigid constraints, and Aiven's team can adjust them to fit your workload.
The pricing remains consistent across AWS, Google Cloud, and Azure. You retain any savings from reserved instances, committed use discounts, or storage tiering available in your own account.
Diskless topics are now an integrated feature of the Aiven for Kafka service, specifically for Bring Your Own Cloud (BYOC) deployments. To use them, enable the feature when creating your service, either in the Console UI or via the CLI, API, or Terraform.
Diskless topics are fully compatible with traditional Kafka topics. They use the same producer and consumer APIs, preserve message ordering and offsets, support exactly-once semantics, and can run alongside classic topics in the same cluster—no application changes needed.
The difference is in how data is stored. Instead of writing to broker disks, diskless topics stream data directly to cloud object storage.
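Because the client protocol is unchanged, producing to a diskless topic looks exactly like producing to any Kafka topic. A minimal Python sketch with the confluent_kafka client follows; the broker address and topic name are placeholders.

```python
from confluent_kafka import Producer

# Standard Kafka producer; nothing diskless-specific on the client side.
producer = Producer({"bootstrap.servers": "broker-1:9092"})  # placeholder

def on_delivery(err, msg):
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()}[{msg.partition()}] @ {msg.offset()}")

# "clickstream" could be a diskless topic; the produce call is identical.
producer.produce("clickstream", key="user-42", value="page_view",
                 callback=on_delivery)
producer.flush()  # waits for the broker (and object storage) to acknowledge
```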
This approach offers three key benefits:

- Lower network cost: writing straight to object storage eliminates cross-AZ replication traffic.
- Cheaper retention: object storage costs far less per GB than SSD-backed broker disks.
- Independent scaling: compute and retention scale separately, so you pay only for what you actually use.
Diskless topics are not ideal for ultra-low-latency workloads that require sub-500 ms end-to-end delivery. For those scenarios, traditional disk-based Kafka is a better fit.
However, for use cases that can tolerate one to two seconds of latency between producing and consuming data, diskless topics offer significant advantages. They reduce storage costs and allow compute and retention to scale independently.
Where diskless topics shine
Workload | Why the trade-off pays off | Supporting insight |
|---|---|---|
Application & infrastructure logs | High-volume data with long retention requirements; real-time dashboards can tolerate some delay. | Object storage offers lower per-GB cost with high durability (for example, "eleven nines"). |
Telemetry and metrics | Time-series data (such as Prometheus, OpenTelemetry, or similar) streams continuously, but dashboards typically refresh every few seconds. | Remote tiers handle sustained write spikes without re-sharding disks. |
IoT sensor data | Millions of small messages; cost is a bigger concern than sub-second speed. | Typical object storage write latency (100–200 ms) is acceptable for these workloads. |
Clickstream and user analytics | Web/mobile events feed near-real-time dashboards and nightly batch jobs. | Tiered storage decouples compute from retention as data volume grows. |
Change-data capture (CDC) | Slight lag is fine when capturing database changes into data lakes. | Object storage aligns with downstream formats like Parquet and Iceberg. |
ML feature logging and training | Models train on massive histories; replaying events is asynchronous. | Diskless keeps costs low while preserving Kafka ordering and semantics. |
Security trails and audit logs | Compliance requires long retention, but speed is less important. | Offloading to object storage avoids expanding broker disk usage. |
Backup and replay queues | Used for batch workflows or disaster recovery; prioritizes durability over speed. | Data streams directly into durable object storage for long-term recovery. |