Enable the consumer lag predictor for Aiven for Apache Kafka®
Limited availability
The consumer lag predictor in Aiven for Apache Kafka® provides visibility into the time between message production and consumption, allowing for improved cluster performance and scalability.
Prerequisites
Before you start, ensure you have the following:
- An Aiven account.
- A running Aiven for Apache Kafka® service.
- A Prometheus integration set up for your Aiven for Apache Kafka® service to extract metrics.
- The permissions needed to modify service configurations.
- Activation of the consumer lag predictor on your Aiven account. The consumer lag predictor for Aiven for Apache Kafka® is a limited availability feature. Contact the sales team at sales@aiven.io to request activation.
Enable and configure the consumer lag predictor
To enable the consumer lag predictor for your Aiven for Apache Kafka® service using the Aiven Console:
1. Once the consumer lag predictor is activated for your account, log in to the Aiven Console, select your project, and choose your Aiven for Apache Kafka® service.
2. On the Overview page, click Service settings in the sidebar.
3. Go to the Advanced configuration section, and click Configure.
4. In the Advanced configuration window, click Add configuration options.
5. Set `kafka_lag_predictor.enabled` to Enabled. This enables the lag predictor to compute predictions for all consumer groups and topics.
6. Configure the following options:
   - Set `kafka_lag_predictor.group_filters`: Specify the consumer group pattern to include only the desired consumer groups in the lag prediction. By default, the consumer lag predictor calculates the lag for all consumer groups, but you can restrict this by specifying group patterns.

     Example group patterns:
     - `consumer_group_*`: Matches any consumer group that starts with `consumer_group_`, such as `consumer_group_1` or `consumer_group_a`.
     - `important_group`: Matches exactly the consumer group named `important_group`.
     - `group?-test`: Matches consumer groups like `group1-test` or `groupA-test`, where the `?` represents any single character.
   - Set `kafka_lag_predictor.topics`: Specify which topics to include in the lag prediction. By default, predictions are computed for all topics, but you can restrict this by using topic names or patterns.

     Example topic patterns:
     - `important_topic_*`: Matches any topic that starts with `important_topic_`, such as `important_topic_1` or `important_topic_data`.
     - `secondary_topic`: Matches exactly the topic named `secondary_topic`.
     - `topic?-logs`: Matches topics like `topic1-logs` or `topicA-logs`, where the `?` represents any single character.
7. Click Save configuration to save your changes and enable consumer lag prediction.
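To check which names a set of patterns would select before saving them, the pattern examples can be tried out locally. The following is a minimal sketch, assuming the filters follow shell-style glob semantics as the examples suggest. Python's `fnmatch` module is a stand-in here, not the matcher Aiven uses, and the group names and the `matching_names` helper are hypothetical.

```python
from fnmatch import fnmatchcase


def matching_names(names, patterns):
    """Return the names that match at least one glob-style pattern."""
    return [n for n in names if any(fnmatchcase(n, p) for p in patterns)]


# Hypothetical consumer group names, used only for illustration.
groups = [
    "consumer_group_1",
    "consumer_group_a",
    "important_group",
    "group1-test",
    "groupAB-test",  # two characters where `?` allows exactly one
    "other_group",
]

# The example group patterns from this page.
group_filters = ["consumer_group_*", "important_group", "group?-test"]

for name in matching_names(groups, group_filters):
    print(name)
```

Under these assumptions, `groupAB-test` and `other_group` are not selected: the first has two characters where `?` accepts one, and the second matches none of the patterns.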
To enable the consumer lag predictor for your Aiven for Apache Kafka® service using the Aiven CLI:

1. Ensure the consumer lag predictor feature is activated for your account by contacting the sales team at sales@aiven.io. The consumer lag predictor is a limited availability feature and needs to be activated for your account.
2. Get the project information:

       avn project details

   If you need details for a specific project, use:

       avn project details --project PROJECT_NAME

3. Get the name of the Aiven for Apache Kafka service:

       avn service list

   Make a note of the `SERVICE_NAME` corresponding to your Aiven for Apache Kafka service.
4. Enable the consumer lag predictor for your service:

       avn service update SERVICE_NAME -c kafka_lag_predictor.enabled=true

   Replace `SERVICE_NAME` with your service name.

   Note: This enables the lag predictor to compute predictions for all consumer groups across all topics.
5. Configure the consumer groups and topics to be included in the lag prediction:
   - For consumer groups: Set the `kafka_lag_predictor.group_filters` option to specify which consumer groups should be included in the lag prediction. By default, the consumer lag predictor calculates the lag for all consumer groups, but you can restrict this by specifying group patterns.

         avn service update SERVICE_NAME \
           -c kafka_lag_predictor.group_filters='["example_consumer_group_1", "example_consumer_group_2"]'

     - Replace `SERVICE_NAME` with the actual name or ID of your Aiven for Apache Kafka® service.
     - Replace `example_consumer_group_1` and `example_consumer_group_2` with your consumer group names.
   - For topics: Set the `kafka_lag_predictor.topics` option to specify which topics should be included in the lag prediction. By default, predictions are computed for all topics, but you can restrict this by using topic names or patterns.

         avn service update SERVICE_NAME \
           -c kafka_lag_predictor.topics='["important_topic_*", "secondary_topic"]'

     Replace `important_topic_*` and `secondary_topic` with your topic names or patterns.
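The list-valued options are passed as a JSON array, which is easy to misquote by hand. The sketch below shows one way to build the same `avn service update` arguments from a script. It only assembles and prints the command; the helper name `lag_predictor_update_command` and the service name `my-kafka-service` are hypothetical.

```python
import json
import shlex


def lag_predictor_update_command(service_name, option, values):
    """Build an `avn service update` argument list for a list-valued option."""
    # json.dumps renders the list as a JSON array, the same form used in
    # the -c values shown in the CLI steps.
    setting = f"kafka_lag_predictor.{option}={json.dumps(values)}"
    return ["avn", "service", "update", service_name, "-c", setting]


cmd = lag_predictor_update_command(
    "my-kafka-service",  # hypothetical service name
    "topics",
    ["important_topic_*", "secondary_topic"],
)

# shlex.join quotes the JSON value so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Keeping the command as an argument list also means it can be passed to `subprocess.run` without going through a shell, which avoids the quoting question entirely.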
Monitor metrics with Prometheus
After enabling the consumer lag predictor, you can use Prometheus to access and monitor detailed metrics that provide insights into your Apache Kafka cluster's performance:
Metric | Type | Description |
---|---|---|
kafka_lag_predictor_topic_produced_records_total | Counter | Represents the total count of records produced. |
kafka_lag_predictor_group_consumed_records_total | Counter | Represents the total count of records consumed. |
kafka_lag_predictor_group_lag_predicted_seconds | Gauge | Represents the estimated time lag, in seconds, for a consumer group to catch up to the latest message. |
For example, you can monitor the average estimated time lag in seconds for a consumer group to consume produced messages using the following PromQL query:
avg by(topic,group)(kafka_lag_predictor_group_lag_predicted_seconds_gauge)
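To make the aggregation concrete, here is a minimal Python sketch of what `avg by(topic,group)` computes. It assumes each sample carries `topic`, `group`, and `partition` labels; the sample values and the `avg_by_topic_group` helper are hypothetical and are not output from a real cluster.

```python
from collections import defaultdict


def avg_by_topic_group(samples):
    """Average sample values per (topic, group), dropping all other labels."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for labels, value in samples:
        key = (labels["topic"], labels["group"])
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical predicted-lag samples in seconds, one per partition.
samples = [
    ({"topic": "orders", "group": "billing", "partition": "0"}, 2.0),
    ({"topic": "orders", "group": "billing", "partition": "1"}, 4.0),
    ({"topic": "orders", "group": "audit", "partition": "0"}, 10.0),
]

for (topic, group), lag in avg_by_topic_group(samples).items():
    print(f"{topic}/{group}: {lag:.1f}s")
```

In this sketch the two `billing` partitions collapse into one averaged value, while `audit` keeps its single sample.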
Another useful metric to monitor is the consume/produce ratio. You can monitor this per topic and partition for consumer groups by using the following PromQL query:
sum by(group, topic, partition)(
  kafka_lag_predictor_group_consumed_records_total_counter
)
/ on(topic, partition) group_left()
sum by(topic, partition)(
  kafka_lag_predictor_topic_produced_records_total_counter
)
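The ratio query joins each consumer group's consumed count to the produced count for the same topic and partition. A rough Python equivalent, using hypothetical counter values and a hypothetical `consume_produce_ratio` helper, looks like this:

```python
def consume_produce_ratio(consumed, produced):
    """Divide consumed counts by the produced count for the same topic and partition.

    consumed: {(group, topic, partition): records consumed}
    produced: {(topic, partition): records produced}
    """
    ratios = {}
    for (group, topic, partition), count in consumed.items():
        total = produced.get((topic, partition))
        # Simplification: skip pairs with no produced counter or a zero count.
        # PromQL also drops unmatched series, but does not skip zero divisors.
        if total:
            ratios[(group, topic, partition)] = count / total
    return ratios


# Hypothetical counter values for one partition of one topic.
consumed = {
    ("billing", "orders", 0): 90,
    ("audit", "orders", 0): 45,
}
produced = {("orders", 0): 100}

for key, ratio in consume_produce_ratio(consumed, produced).items():
    print(key, ratio)
```

A ratio close to 1 suggests the group has consumed nearly everything produced to that partition, while a lower ratio suggests it is behind.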