Building a modern data architecture

Corporate data architectures tend to have long shelf-lives, especially if their development began sometime around when the pre-internet Dark Ages were just ending. It's not surprising that these ancient edifices no longer support modern business goals. Read how to modernize your ancient data environment and set your business up for the future.

26 January 2021
Auri Poso, Technical Copywriter at Aiven

Have you been to Venice? It’s soooo beautiful. There are hundreds of graceful bridges, countless churches and palaces from the awesome to the charming. The city's canals, filled with iconic gondolas, stir the heart. If you pick your time right, you can even do your sightseeing without being mobbed by thousands of other tourists. In fact, outside of the tourist season, it’s a very quiet city.

Like… really quiet.

That’s because not a lot of people actually choose to live in Venice. The roads are uneven, the buildings aren’t readily wheelchair and pushchair accessible, the plumbing sucks and heating is hard to install in those old buildings. The scenic canals actually stink. What’s more, the city has to contend with regular flooding, getting more severe as climate change progresses. Don’t get me wrong, I love Venice, but the city that was once called the Queen of the Adriatic wouldn’t merit that title today.

Now with that image in mind, think about your company’s data architecture. Are you seeing any similarities? Structures that no longer match your company’s modern needs? Stagnant information flows? Awkward routing between destinations? Cobbled-together data highways? If your data is living in a digital Venice, it’s time to make a change!

A look at classical data architecture

Data architecture refers to your data environment, and how it’s designed to store and move bytes of information to support your business goals. It encompasses the products and tools you use to manage data, as well as the processes designed to deliver that data to business users in ways and forms that are useful and convenient for them.

(This is as opposed to a data platform, which is the database engine itself and the dataset creation framework. It's the hard technological core, if you will, of the data architecture.)

The classical take on data architecture is the data warehouse: a static structure designed by the IT department. It houses all the data that accumulates during business operations. If done well, it can provide tailored data streams for reporting and analysis purposes; if done poorly, it is little more than a glorified data dump. In either case it needs a lot of maintenance and upgrading, and it isn’t very flexible or responsive.

In classical data architecture, the focus falls on maintenance and day-to-day operations like server maintenance, upgrades and access management. There are limited resources for implementing business-driven changes. On a high level, this results in a lack of responsiveness to business requirements. The world moves on, while your business intelligence systems keep doing the same old thing.

And that thing is old. No business can foresee what kind of system it will need in ten years’ time, nor should it try to. The way to do good business is not to hazard guesses and bet money on them, but to be responsive to change that is actually happening in the present moment.

On a more practical level, the lack of flexibility impacts what types of data can be ingested and the data sources that the system can utilize. With the classical approach, most data streams originate within the organisation. This is because using external data sources is complex and may pose security issues.

A classic corporate warehouse can't really handle new types of data, like social media or IoT, that are expressed with nested structures. At least not well, and not without major reworking. Companies can choose between not using new datasets, or building long pipelines to handle them. But pipelines are fragile, and a minimal change in the source data format can have a drastic impact downstream.
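To make that fragility concrete, here is a minimal Python sketch. The IoT reading, its field names and the flatten_strict helper are all hypothetical, invented purely for illustration. It flattens one nested record into warehouse-style rows, then shows how a single renamed field in the source is enough to stop the pipeline.

```python
# Hypothetical IoT reading with a nested structure (illustrative data only).
reading = {
    "device_id": "sensor-17",
    "location": {"site": "helsinki", "floor": 3},
    "metrics": [
        {"name": "temperature", "value": 21.5},
        {"name": "humidity", "value": 40.0},
    ],
}


def flatten_strict(record):
    """Flatten one nested reading into flat rows, one row per metric.

    Every field is treated as required, so any change in the source
    format raises KeyError.
    """
    return [
        {
            "device_id": record["device_id"],
            "site": record["location"]["site"],
            "floor": record["location"]["floor"],
            "metric": metric["name"],
            "value": metric["value"],
        }
        for metric in record["metrics"]
    ]


rows = flatten_strict(reading)

# A minimal upstream change: the source renames "location" to "place".
changed = {key: value for key, value in reading.items() if key != "location"}
changed["place"] = reading["location"]

try:
    flatten_strict(changed)
    pipeline_broke = False
except KeyError:
    # The strict pipeline cannot cope with the renamed field.
    pipeline_broke = True
```

One nested record becomes two flat rows, and one renamed key breaks the whole step. In a long pipeline, every stage downstream of that step would fail with it.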

Being inflexible also means that it’s hard to use existing data for new purposes. As time goes on, you’ll discover the need for new types of analytics and outputs. Building on top of the current system will only take you so far.

Modern architecture for a modern world

Rebuilding an old data architecture into a modern one is hard. It means creating not just the technological infrastructure but also new processes, new tools, and a new culture for how data is conceptualized and utilized.

Now that you’ve hopefully decided to invest your time, effort and funds more productively, it’s time to start planning.

Planning always starts by knowing where you’re going. Start with the needs of the people who use your data (customers and business users) and build back towards the data sources. Ask the users (or failing that, ask yourself on the users’ behalf) what they need in order to grow your business and be responsive. Ask customers what they want and need in the real world. Only then start thinking where you’ll get the data to do that.

Keep your ultimate goal in mind every step of the way: an architecture that is

  • flexible
  • extendable
  • collaborative
  • resilient
  • secure
  • and simple to maintain.

For that, you want something that isn’t Renaissance Venice. Instead, how about implementing a lean, clean, minimalist design? And honestly, nothing is as lean, clean and minimalist as a managed open-source cloud-based database service designed by a company from Finland!

Secure base

Consider security from two perspectives. Firstly, access rights: how you are planning to make sure that the right people, and only the right people, have access to the data that they need in a timely fashion. And secondly, how you plan to defend against outside attacks.

It is worth noting that open source is often considered to offer better security than proprietary solutions, since the code is open to public scrutiny. It is true that there are several legitimate concerns about outside attacks when using cloud services. However, these are counterbalanced by the advanced user management that cloud services usually offer. The risks are relatively easy to offset by signing up with a managed open source cloud database provider like Aiven, which takes security issues seriously and can offer you a safe home base.
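The first perspective, access rights, can be illustrated with a small Python sketch. The roles, datasets and the is_allowed helper below are hypothetical and do not describe how any particular service implements user management. The point is simply that access is denied unless it has been explicitly granted.

```python
# Hypothetical mapping of roles to the (dataset, action) pairs they may use.
ROLE_GRANTS = {
    "analyst": {("sales", "read")},
    "engineer": {("sales", "read"), ("sales", "write"), ("events", "read")},
}


def is_allowed(role, dataset, action):
    """Return True only if the role has an explicit grant (deny by default)."""
    return (dataset, action) in ROLE_GRANTS.get(role, set())


analyst_can_read = is_allowed("analyst", "sales", "read")
analyst_can_write = is_allowed("analyst", "sales", "write")
stranger_can_read = is_allowed("visitor", "sales", "read")
```

Here the analyst can read sales data but not write it, and an unknown role gets nothing at all.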

Flexible structures

You want your data architecture to be easily scalable and extendable to keep your business agile and responsive. A quality DBaaS gives you the freedom and flexibility to scale up and down with changing capacity needs. You also only pay for the capacity that you’re actually using.

With a DBaaS, or at least with Aiven, you can also easily add more services to your plan. This lowers the threshold for trying out new features and capabilities, and helps you stay on top of business trends. If you don’t go for a managed service, at least go for open source. It leaves you much more freedom to build up a comprehensive service stack, because you aren’t tied to proprietary connectors and APIs.

The use case itself should govern the choice of the target datastore (usually a choice between relational databases and NoSQL databases such as key-value stores). Don’t be limited by what’s currently available at your company.

Functional design

By functional design, in this context, we mean an architecture that is easy to use and easy to maintain.

You want to enable your users to collaborate freely. They don't need any extra hassle about credentials and compatibility. Managed services typically offer SSO out of the box and all their services play nice together. This gives your users access to the entire ecosystem.

And as to flexible data structures, having datastores that neither need updating nor break outright when data structures evolve is as functional as it gets.
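On the consuming side, the same idea might look like the Python sketch below. The customer records and the read_customer helper are hypothetical. The reader picks out only the fields it needs, supplies defaults for anything missing and ignores fields it doesn't know about, so older and newer record shapes can coexist.

```python
def read_customer(record):
    """Extract only the fields this consumer needs, tolerating schema changes."""
    return {
        "id": record["id"],            # the one field that is truly required
        "email": record.get("email"),  # optional: None if absent
        "country": record.get("address", {}).get("country", "unknown"),
    }


# An older record written before the address field existed.
old_record = {"id": 1, "email": "a@example.com"}

# A newer record with extra fields that this consumer does not know about.
new_record = {
    "id": 2,
    "email": "b@example.com",
    "address": {"country": "FI", "city": "Helsinki"},
    "loyalty_tier": "gold",
}

old_view = read_customer(old_record)
new_view = read_customer(new_record)
```

Both records are read without error: the old one falls back to "unknown" for the country, and the new one's extra fields are simply ignored.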

And as we may have mentioned before, a managed cloud service like Aiven’s is unparalleled in terms of ease of maintenance.

Wrapping up

Dismantling your classically constructed (and beautiful!) corporate Venice may be rather heartbreaking. But when you do, you get to move to a contemporary environment. That modern space will be open, extendable, efficient, transparent, clean and above all built to match the needs of your business. You’ll be able to achieve your current goals more quickly and more cheaply. What’s more, you can change the direction of your business and reformulate new goals much more flexibly than before. Get off the gondola and board the managed open-source service speedboat now!

Not using Aiven services yet? Sign up now for your free trial at https://console.aiven.io/signup!

In the meantime, make sure you follow our changelog and blog RSS feeds or our LinkedIn and Twitter accounts to stay up-to-date with product and feature-related news.

