An introduction to time series databases

In this post, we take a deep dive into time series databases. We'll look at what they are, why they're needed, and what makes them uniquely suited for their workload.

18 November 2020
Kyle Buzzell
Head of Growth Marketing at Aiven

Time series databases are one of the more recent specialized database models to emerge since the introduction of the relational database management system (RDBMS). They have been growing steadily in popularity since 2015 and, according to DB-Engines, have outpaced all other database models by a wide margin over the past two years.

In this article, we’ll take a deeper look at time series databases, their distinguishing features, benefits, and common use cases.

What is a time series database?

As the name implies, a time series database (TSDB) is built for data indexed by time: it makes it possible to efficiently and continuously add, process, and track massive quantities of real-time data. While other database models have been used for these kinds of workloads in the past, TSDBs use algorithms and architecture designed specifically for them.

A time series database stores data as pairs of time(s) and value(s). Storing data this way makes it easy to analyze a time series, that is, a sequence of points recorded in order over time. A TSDB can also handle many concurrent series, measuring different variables or metrics in parallel.
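To make the time and value pairing concrete, here is a minimal sketch of the data model in Python. The in-memory `series` dictionary, the `record` helper, and the metric names are hypothetical and only for illustration; a real TSDB adds persistence, indexing, and compression on top of this idea.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical in-memory store: series name -> list of (timestamp, value) pairs.
series = defaultdict(list)

def record(name, timestamp, value):
    """Append one (timestamp, value) point to the named series."""
    series[name].append((timestamp, value))

t0 = datetime(2020, 11, 18, 12, 0, tzinfo=timezone.utc)
t1 = datetime(2020, 11, 18, 12, 1, tzinfo=timezone.utc)

# Two concurrent series, sampled at the same points in time.
record("cpu_load", t0, 0.42)
record("cpu_load", t1, 0.57)
record("temperature_c", t0, 21.5)
record("temperature_c", t1, 21.7)

# The most recent point of every series.
latest = {name: points[-1] for name, points in series.items()}
```

Each series stays ordered by time simply because points are appended as they arrive, which is the common case for this kind of workload.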

Early time series databases were primarily used for processing volatile financial data and streamlining securities trading. However, the world’s changed a lot since they were first introduced and many new use cases have emerged as technology has continued to evolve.

For example, the internet of things (IoT) concept and its associated sensors that constantly collect and stream data underlie a number of modern workloads such as powering industrial applications, predicting sales demand, analyzing temperature readings, and providing medical information from wearable devices. As you can imagine, the data produced is staggering...

Well, it can be staggering for more traditional databases whose requirements and design decisions restrict their suitability for such workloads. Luckily, there are time series databases, and they are getting better and better at dealing with the ever-growing demands of this data type.

What are the benefits of a time series database?

There’s a reason more and more developers and organizations are using time series databases. They deliver a number of benefits, which we’ll now briefly explore.

Benefit: More accurate and meaningful time series measurement

The most obvious benefit is that a time series database makes it easy to measure how datasets change over time. With a time series database in place, you can view past and present data alongside forecasts, for reporting that is more accurate and meaningful.

Benefit: Resource-efficient data storage

By its very nature, time series data can require massive amounts of storage, which can be difficult to manage and very expensive. Time series databases provide tooling to aggregate data into predetermined time periods (downsampling), drop data streams that are no longer needed (retention), and apply compression algorithms that optimize storage.
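As a rough illustration of aggregation and retention, the sketch below averages raw points into fixed time windows and drops points older than a cutoff. The function names `downsample` and `apply_retention`, and the use of plain integer seconds as timestamps, are assumptions made for this example and not the API of any particular database.

```python
from collections import defaultdict
from statistics import mean

def downsample(points, window_seconds):
    """Average (timestamp, value) points into fixed windows of window_seconds."""
    buckets = defaultdict(list)
    for ts, value in points:
        bucket_start = ts - (ts % window_seconds)  # align to the window start
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]

def apply_retention(points, now, max_age_seconds):
    """Keep only points that are at most max_age_seconds old."""
    return [(ts, value) for ts, value in points if now - ts <= max_age_seconds]

# Raw readings: (seconds since start, value).
raw = [(0, 10.0), (20, 20.0), (40, 30.0), (60, 40.0), (100, 60.0)]

per_minute = downsample(raw, 60)
# per_minute == [(0, 20.0), (60, 50.0)]

recent = apply_retention(raw, now=100, max_age_seconds=60)
# recent == [(40, 30.0), (60, 40.0), (100, 60.0)]
```

Five raw points become two per-minute averages, which is where the storage saving comes from; the trade-off is that the original resolution is lost once the raw data is dropped.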

Benefit: Lightning-fast data queries

A TSDB also makes it easy to query and retrieve data based on specific periods. For example, imagine someone who doesn’t remember the title of a book they recently read but knows they read it three months ago. A time series database can help them figure out what the book was without resorting to a bunch of wildcard searches: because data is organized by time, you can quickly retrieve information for a given timeframe.
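One way to see why time-based lookups can be fast: if points are kept sorted by timestamp, a time range can be located with binary search instead of scanning every record. The sketch below uses Python's standard `bisect` module; the `query_range` helper, the day numbers, and the book titles are made up for illustration, and real databases use more elaborate on-disk indexes.

```python
from bisect import bisect_left, bisect_right

def query_range(timestamps, values, start, end):
    """Return (timestamp, value) pairs with start <= timestamp <= end.

    Assumes timestamps is sorted in ascending order.
    """
    lo = bisect_left(timestamps, start)
    hi = bisect_right(timestamps, end)
    return list(zip(timestamps[lo:hi], values[lo:hi]))

# Hypothetical reading log: day of the year on which each book was finished.
days = [10, 45, 80, 95, 120]
titles = ["Book A", "Book B", "Book C", "Book D", "Book E"]

# "Which books did I finish between day 75 and day 100?"
matches = query_range(days, titles, 75, 100)
# matches == [(80, "Book C"), (95, "Book D")]
```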

Common time series database use cases

1. Accessing IoT data

Most IoT deployments, like connected water, energy, and temperature meters, require constant data collection and reporting at regular intervals. Time series analysis of these timestamped data points makes it possible to identify seasonal patterns, average usage, and inefficiencies. For example, a pH meter connected to a TSDB might tell a technician tasked with maintaining a specific pH level that a certain vat of water is becoming too acidic. IoT endpoints also collect massive amounts of data, requiring highly scalable time series databases.
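The pH example can be sketched as a simple check over a stream of readings. The code below is an assumption-laden illustration: the `acidity_alerts` function, the three-reading window, and the 6.8 threshold are invented for this example. It averages the most recent readings so that a single noisy measurement does not trigger an alert.

```python
from collections import deque

def acidity_alerts(readings, min_ph, window=3):
    """Return timestamps where the rolling average pH drops below min_ph."""
    recent = deque(maxlen=window)  # holds the last `window` pH values
    alerts = []
    for ts, ph in readings:
        recent.append(ph)
        if len(recent) == window and sum(recent) / window < min_ph:
            alerts.append(ts)
    return alerts

# Hypothetical readings from one vat: (minute, pH).
readings = [(0, 7.1), (1, 7.0), (2, 6.9), (3, 6.6), (4, 6.4), (5, 6.3)]

alerts = acidity_alerts(readings, min_ph=6.8)
# alerts == [4, 5]
```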

2. Monitoring web services, applications and infrastructure

Companies can also use time series databases to measure the performance of their applications and web properties. For example, the open source monitoring system Prometheus includes a time series database that enables developers to keep tabs on performance trends over time. This makes it easy to detect when problems are occurring, which in turn allows teams to plan maintenance and rapidly respond to incidents to sustain an optimal user experience.

Some web and mobile applications store in-app events (such as a button click, playing a video, or sharing some content) in a TSDB. These events allow them to map a user’s journey, identify frustrations or performance bottlenecks, and streamline more complex processes.

3. Understanding financial trends

Using time series data to accurately predict financial trends is very difficult. However, a TSDB can provide a wealth of contextual data to help analysts. Take the stock market as an example: a sudden increase in airline stock may coincide with holiday travel, or an executive leadership purge may spook investors, causing a stock to temporarily tumble. Time series databases make it easy to cross-reference data, providing a richer, clearer picture.

4. Processing self-driving car data

Self-driving cars typically collect about 4,000 GB of data per day, which is beyond the scope of what a typical relational database can process. Time series databases enable faster data ingest and queries and stronger data compression. As a result, they are ideal for processing massive volumes of real-time data that can be used to improve the safety of self-driving cars.

5. Sales forecasting

Retail stores are continuously challenged to predict future sales in order to accurately stock their shelves with products. Thanks to time series databases, retailers can use statistical models in conjunction with historical data and cross-reference it with consumer behavior trends to predict future patterns and make informed decisions about which products to keep in stock and when.

For instance, retailers are now using forecasting to plan ahead and restock bicycles, which are in short supply due to the pandemic. Retailers are using data to predict when new products will become available again, what the demand will be like, and what alternative transportation options consumers are buying in lieu of bicycles (e.g., trikes and rollerblades).
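As a very simple stand-in for the statistical models mentioned above, the sketch below forecasts the next season by averaging the same position in previous seasons (a seasonal naive average). The function name, the quarterly season length, and the sales figures are all made up for illustration; real forecasting would use richer models and far more history.

```python
def seasonal_average_forecast(history, season_length, periods=2):
    """Forecast one season ahead from the last `periods` full seasons.

    Each forecast value is the average of the values observed at the same
    position within each of the most recent `periods` seasons.
    """
    if len(history) < season_length * periods:
        raise ValueError("not enough history for the requested periods")
    forecast = []
    for i in range(season_length):
        past = [history[-season_length * k + i] for k in range(1, periods + 1)]
        forecast.append(sum(past) / periods)
    return forecast

# Hypothetical quarterly unit sales for the last two years.
sales = [100, 120, 90, 150, 110, 130, 100, 170]

next_year = seasonal_average_forecast(sales, season_length=4)
# next_year == [105.0, 125.0, 95.0, 160.0]
```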

Wrapping up

As you can see, the number of use cases for time series databases is growing. But we’re just getting started when it comes to realizing their full promise. As more capable solutions become available, such as M3, we will see a virtuous circle form where new use cases are unlocked.

Even better, managed services will help fuel the expansion of their use cases because users will be able to focus on the problem space instead of standing up and managing the infrastructure. To learn more about the latest evolution in time series databases, make sure you read this article.

But there's more...

To find out what the difference is between time series and event data, read this excellent article by Lorna Mitchell: Time series or event data?

For help on choosing the right time series database, you could start with How to choose a time series database: 7+3 considerations and then for more product-specific information take a look at M3 vs other time series databases.

For a case of Grafana combined with M3, have a look at Lorna Mitchell's post Metrics and graphs with M3 and Grafana.

And these are not the only time series related articles on the Aiven blog, so use the Category menu to find more!

Otherwise, stay up-to-date with the latest industry news, our opinions, and Aiven by subscribing to our blog and changelog RSS feeds and following us on Twitter.

Apache Kafka, Apache Kafka Connect, Apache Kafka MirrorMaker 2, M3, M3 Aggregator, Apache Cassandra, Elasticsearch, PostgreSQL, MySQL, Redis, InfluxDB, Grafana are trademarks and property of their respective owners. All product and service names used in this website are for identification purposes only and do not imply endorsement.