Skip to main content

Check data distribution between storage devices in Aiven for ClickHouse®'s tiered storage

Monitor how your data is distributed between the two layers of your tiered storage: SSD and object storage.

If you have the tiered storage feature enabled, your data in Aiven for ClickHouse is distributed between two storage devices (tiers). You can check on what storage devices your databases and tables are stored. You can also preview their total sizes as well as part counts, minimum part sizes, median part sizes, and maximum part sizes.

Prerequisites

Check data distribution in Aiven Console

You can use the Aiven Console to check if tiered storage is enabled on a table and, if it is, how much storage is used on each tier (local SSD and remote object storage) for this particular table.

To access tiered storage's status information, go to the Aiven Console > your Aiven for ClickHouse service's Overview > Databases and tables > your database > your table > View details > Storage details.

Run a data distribution check with the ClickHouse client

  1. Connect to your Aiven for ClickHouse service using, for example, the ClickHouse client.

  2. Run the following query:

    SELECT
    database,
    table,
    disk_name,
    formatReadableSize(sum(data_compressed_bytes)) AS total_size,
    count(*) AS parts_count,
    formatReadableSize(min(data_compressed_bytes)) AS min_part_size,
    formatReadableSize(median(data_compressed_bytes)) AS median_part_size,
    formatReadableSize(max(data_compressed_bytes)) AS max_part_size
    FROM system.parts
    GROUP BY
    database,
    table,
    disk_name
    ORDER BY
    database ASC,
    table ASC,
    disk_name ASC

    You can expect to receive the following output:

    ┌─database─┬─table─────┬─disk_name─┬─total_size─┬─parts_count─┬─min_part_size─┬─median_part_size─┬─max_part_size─┐
    │ datasets │ hits_v1 │ default │ 1.20 GiB │ 6 │ 33.65 MiB │ 238.69 MiB │ 253.18 MiB │
    │ datasets │ visits_v1 │ S3 │ 536.69 MiB │ 5 │ 44.61 MiB │ 57.90 MiB │ 317.19 MiB │
    │ system │ query_log │ default │ 75.85 MiB │ 102 │ 7.51 KiB │ 12.36 KiB │ 1.55 MiB │
    └──────────┴───────────┴───────────┴────────────┴─────────────┴───────────────┴──────────────────┴───────────────┘

The query returns a table with data distribution details for all databases and tables that belong to your service: the storage device they use, their total sizes as well as parts counts and sizing.

What's next