Dec 5, 2024
How Data Stack Modernization is Helping Kroo Bank Secure its Challenger Status
The UK’s fintech market is crowded, but Kroo Bank is making waves as one of the top four digital-only banks. I recently spoke to Andrey Fadeev, Staff Software Engineer at Kroo Bank, who shared his insights on how Kroo is leveraging data stack modernization to support growth and sharpen its competitive edge.
Company growth and burgeoning volumes of data
From the beginning, Kroo’s architecture was designed with future scalability in mind, using PostgreSQL® as its database and Apache Kafka® as the mechanism for message distribution. By embracing an asynchronous infrastructure, Kroo can also independently scale different parts of its system, improving efficiency and responsiveness.
After securing its full UK banking license in 2022, Kroo launched its current account. In the next 12 months alone, it secured an impressive 150,000 customers. That rapid uptake brought a corresponding surge in transaction and data volumes.
Limitations of a self-managed Kafka cluster
Fadeev told me, “This growth had an impact on our thinking. Our Kafka cluster had never failed us. The asynchronous set-up meant that scaling one part of the system didn’t put too much stress on the rest. Our worst-case scenario was that we built up some queues somewhere, but we could still scale other parts.”
But would it hold up under ever-growing amounts of data traffic? One challenge was that the bank’s storage capacity was approaching its limits as data volumes increased. As Fadeev describes it: “We had our self-hosted Kafka deployed in Amazon ECS, but with Amazon EBS volumes.”
He and the team spent a lot of time figuring out how to achieve the scale they needed. Eventually, they adopted Debezium for change data capture (CDC). Debezium reads the database’s write-ahead log to capture updates reliably, enabling a rapid, automated response to changes in the database.
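To make the CDC mechanism concrete, here is a minimal sketch of how a downstream service might apply Debezium-style change events. The table, field names, and values are hypothetical, not Kroo’s schema; the envelope shape (`before`/`after`/`op`) follows Debezium’s standard event format, where `op` marks a create (`c`), update (`u`), or delete (`d`).

```python
# A minimal sketch (hypothetical field names) of applying a
# Debezium-style change event to an in-memory key/value view.
# Debezium reads PostgreSQL's write-ahead log and publishes one
# envelope per row change.
import json

def apply_change(state: dict, event: dict) -> dict:
    """Apply a single Debezium change event to an in-memory view."""
    payload = event["payload"]
    # Deletes carry the row in "before"; everything else in "after".
    key = payload["after"]["id"] if payload["after"] else payload["before"]["id"]
    op = payload["op"]
    if op in ("c", "u", "r"):   # create, update, or snapshot read
        state[key] = payload["after"]
    elif op == "d":             # delete: remove the row
        state.pop(key, None)
    return state

# Example envelope, roughly as Debezium would emit for an UPDATE:
event = json.loads("""
{
  "payload": {
    "before": {"id": 42, "balance": 100},
    "after":  {"id": 42, "balance": 250},
    "op": "u",
    "source": {"table": "accounts"}
  }
}
""")

accounts = {42: {"id": 42, "balance": 100}}
accounts = apply_change(accounts, event)
```

The key point is that the database remains the source of truth: consumers replay the log rather than polling tables, which is what makes the response to changes both fast and automated.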
Moving to Aiven for Apache Kafka®
Facing the limitations of their existing infrastructure, Fadeev decided they needed a managed solution. That’s when Kroo partnered with Aiven to leverage Aiven for Apache Kafka.
Kroo was experiencing a pattern we had seen many times before — a growing business choosing to offload the management of on-premises or self-managed technologies into the cloud in order to focus development time on revenue-generating activities.
I asked Fadeev how the team at Kroo was optimizing Kafka in its managed state for both performance and data observability.
“There are no magic tricks here,” he told me. “We mainly use the defaults from the Kafka cluster, which work well for our use case. One key optimization was to enable Zstandard compression in Kafka. Around 90% of Kroo’s data flows into Kafka through the Kafka Connector, so enabling compression as a global setting benefited nearly every topic. We enabled compression for the 10% of data that flows through Kafka Streams as well.”
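As a sketch of what that optimization looks like in practice, here is a hypothetical producer configuration with Zstandard enabled. The broker address and client id are placeholders, not Kroo’s settings; `compression.type` is the standard Kafka property on both the producer and the topic.

```python
# Hypothetical producer settings illustrating the global compression
# choice described above. Broker address and client id are placeholders.
producer_config = {
    "bootstrap.servers": "kafka.example.com:9092",  # placeholder
    "client.id": "example-service",                 # placeholder
    # Zstandard: batches are compressed on the producer, stored
    # compressed on the broker, and decompressed by consumers.
    "compression.type": "zstd",
}

# The same property can also be set per topic on the broker side,
# which is how a cluster-wide default can be overridden selectively.
topic_overrides = {"compression.type": "zstd"}
```

Because compression happens per producer batch, a single setting like this benefits every topic the producer writes to, which matches the “global setting” approach described above.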
From Fadeev’s perspective, having access to Aiven’s expertise meant they could make the right decisions around default settings and configurations. “There are more than 300 configuration options,” he rightly notes. “It’s pretty cool that we can just rely on the defaults, and it will work.”
A multi-cloud journey
Another topic we discussed was the need for a multi-cloud infrastructure. It’s increasingly becoming a regulatory requirement in some industries — and an important part of many business continuity plans.
Currently, Kroo’s main provider is AWS, and most of its workloads are also on AWS. Its setup is pretty simple. It has a single region with multiple availability zones, which is fairly standard for startups and small companies.
But Kroo is now on the way to adopting multi-cloud provisioning. It has started using Google Cloud for some workloads, like BigQuery, and uses Vertex AI for some products. “Being a fintech and operating as a licensed bank means we’re in a heavily regulated environment,” says Fadeev. “So we have to think about disaster recovery — not just because it’s the right thing to do, but to ensure that regulators are satisfied with our approach.”
“Currently, we are focused on being able to roll our services out to different regions. We have a structure for our databases, so we keep backups available in multiple regions. That allows us to quickly spin up database replicas in a different region,” Fadeev says.
It’s a familiar multi-cloud journey among our customers. Fortunately, cloning or moving an Aiven service from one region to another in the same cloud works broadly the same way as moving it to a different cloud, with all data persisted and no downtime.
Value of open source
Lastly, Fadeev also highlighted the crucial role of open source in Kroo's success, emphasizing how it drives cost efficiency and enhances business value. A key factor in Kroo's decision to partner with Aiven was the ability to maintain close alignment with the open-source Kafka ecosystem.
Having access to the Kafka community and ecosystem is a significant asset for growing businesses. It gives them access to technologies that accelerate growth without being locked into a single vendor.
“We were looking for ways to eliminate the headache of maintenance and version upgrades,” says Fadeev. “What we have out-of-the-box is pretty standard and close to the open-source version of Kafka. It’s one reason why Aiven has worked really well for us.”
“For example, Kroo has around 70 or 80 connectors, essentially the number of services with their own databases that need to publish events to Kafka. It's quite a high number,” Fadeev says. “But we know the load isn’t too high, so we can fit all those connectors into a relatively small Kafka Connect deployment.”
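The reason so many connectors can share a small deployment is simple arithmetic: Kafka Connect distributes connector tasks across workers, so low-traffic connectors pack densely. The sketch below uses illustrative numbers, not Kroo’s actual sizing, and `max_tasks_per_worker` is an assumed operational limit rather than a Kafka setting.

```python
# A back-of-envelope sizing sketch for packing many low-traffic
# connectors into a small Kafka Connect cluster. All numbers are
# illustrative, not Kroo's actual figures.
import math

def workers_needed(num_connectors: int, tasks_per_connector: int,
                   max_tasks_per_worker: int) -> int:
    """Round up the worker count so every task has a slot."""
    total_tasks = num_connectors * tasks_per_connector
    return math.ceil(total_tasks / max_tasks_per_worker)

# ~80 single-task CDC connectors, assuming a worker comfortably
# runs ~30 light tasks:
print(workers_needed(80, 1, 30))  # → 3
```

Because CDC connectors typically run one task per database, the connector count, not the task fan-out, dominates, and a handful of workers suffices as long as per-connector throughput stays modest.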
Unlocking cost-efficiencies with tiered storage
So what’s next for Fadeev and for Kroo? What’s the next evolution of its architecture and infrastructure?
Fadeev is excited to explore tiered storage for Apache Kafka from Aiven, which lets users offload older data to cheaper storage and keep far longer retention windows. “Currently, we have a fairly restrictive retention policy set to around two weeks or maybe a month for some topics. If we can increase that retention for specific topics, especially to an infinite retention policy, it would help optimize costs significantly.”
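To illustrate the trade-off tiered storage changes, here is a sketch of per-topic settings. The property names follow Kafka’s tiered storage configuration (KIP-405); the values are illustrative, not Kroo’s, and `retention.ms = -1` is Kafka’s convention for unlimited retention.

```python
# Hypothetical per-topic settings showing how tiered storage changes
# the retention trade-off. Property names follow Kafka's tiered
# storage (KIP-405) configuration; values are illustrative.
topic_config = {
    # Offload closed log segments to remote object storage:
    "remote.storage.enable": "true",
    # Keep only a few days on the brokers' local (expensive) disks:
    "local.retention.ms": str(3 * 24 * 60 * 60 * 1000),
    # -1 = retain indefinitely; old segments live in cheap storage:
    "retention.ms": "-1",
}
```

With this split, local disk sizing is driven by recent hot data only, while historical data accumulates in object storage at a much lower cost per gigabyte, which is what makes an “infinite retention” policy economically plausible.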
Another focus will be on streaming and streaming applications. With Kafka in place, mainly for asynchronous communication between microservices, there’s an untapped opportunity to use it for real-time data streaming and data pipelines.
Enabling data scientists
Kroo also wants to open up its data platform to more of its data teams, particularly those in data science, so they can build more real-time applications. “We’re currently using Vertex AI for some tasks, like automatic alert closure, which has significantly reduced the operational team's workload by handling many false positives,” Fadeev says.
“We expose only the bits of data they need, rather than opening the entire Kafka cluster. Right now, the microservices control the data, using Kafka as a bus to communicate. We need to ensure that the data is in the right shape and that its quality is maintained.”
It’s a challenge that many companies face when exposing data across their organization. But it is one that I am sure Fadeev, and the wider team at Kroo, will overcome. I am very much looking forward to seeing what they accomplish next.