Track service restore progress using the API
You can track restore progress for individual nodes during service node replacement by using the Aiven API. For example, you can monitor restore progress when you fork a service or when maintenance is applied.
The service object exposes restore progress under node_states[].progress_updates:
- service.node_states[] contains per-node state entries.
- When a node is restoring or catching up, its state is typically syncing_data.
- When the state is syncing_data, the node may include progress_updates with one or more phase objects.
- Other node states don't include restore progress data.
progress_updates may be missing or empty even when a node is in syncing_data. This
can occur when a restore completes before detailed progress is reported or when the
service does not emit detailed progress counters.
API endpoints
Restore progress fields are part of the standard service response payload.
- Get a single service (recommended for polling): GET /project/{project}/service/{service_name}
- List services in a project: GET /project/{project}/service
Request:
curl -H "Authorization: aivenv1 API_TOKEN" https://api.aiven.io/v1/project/PROJECT_NAME/service/SERVICE_NAME
Replace the following placeholders:
- API_TOKEN: Your Aiven API token.
- PROJECT_NAME: Your Aiven project name.
- SERVICE_NAME: The name of your service.
Response:
{
  "service": {
    ...
    "node_states": [
      {
        "node_name": "...",
        "state": "syncing_data",
        "progress_updates": [
          {
            "phase": "basebackup",
            "completed": false,
            "current": 3410567,
            "min": 0,
            "max": 7569280,
            "unit": "bytes_uncompressed"
          }
        ]
      }
    ],
    ...
  }
}
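The same request can be made from a script. The following Python sketch uses only the standard library and assumes the endpoint and Authorization header format shown in the curl example. The function names fetch_service and restore_progress_by_node are illustrative and are not part of any Aiven client library.

```python
import json
import urllib.request

API_BASE = "https://api.aiven.io/v1"


def fetch_service(project: str, service_name: str, token: str) -> dict:
    """Call GET /project/{project}/service/{service_name} and return the 'service' object."""
    url = f"{API_BASE}/project/{project}/service/{service_name}"
    request = urllib.request.Request(url, headers={"Authorization": f"aivenv1 {token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["service"]


def restore_progress_by_node(service: dict) -> dict:
    """Map node_name to its list of phase objects for nodes in syncing_data.

    Nodes in other states are skipped. A missing or null progress_updates
    field is treated the same as an empty list.
    """
    result = {}
    for node in service.get("node_states") or []:
        if node.get("state") == "syncing_data":
            result[node.get("node_name")] = node.get("progress_updates") or []
    return result
```

For example, restore_progress_by_node(fetch_service("PROJECT_NAME", "SERVICE_NAME", "API_TOKEN")) returns an empty dictionary when no node is restoring.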
Node states
Common values for node_states[].state include:
- setting_up_vm: The virtual machine is being created or initialized.
- syncing_data: The node is restoring data or catching up.
- running: The node is operating normally.
- leaving: The node is leaving the cluster.
- unknown: A transient or error state.
progress_updates data model
progress_updates is a list of phase objects. When present, phases appear in the
following order:
1. prepare
2. basebackup
3. stream
4. finalize
Each phase object includes the following fields:
{
  "completed": false,
  "current": 3410567,
  "max": 7569280,
  "min": 0,
  "phase": "basebackup",
  "unit": "bytes_uncompressed"
}
Field semantics
- phase: String, required. The restore phase. Possible values: prepare, basebackup, stream, and finalize.
- completed: Boolean, required. Whether the phase is complete.
- current: Number or null, optional. The current progress value. This field can be missing or null.
- min: Number or null, optional. The starting value for the phase. This field can be missing or null.
- max: Number or null, optional. The expected total value for the phase. This value can be missing, null, or change while the restore is in progress.
- unit: String or null, optional. The unit for current, min, and max. New unit values can appear over time.

When reading these fields:
- Treat unit as an opaque identifier. Unknown values can appear.
- max may change while a restore is in progress.
- Not all phases report numeric counters. Some services only indicate phase completion.
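One way to honor these field semantics in a client is to parse each phase object into a small typed record that keeps optional counters as None and never validates unit against a fixed list. The sketch below is a minimal illustration; the PhaseProgress class and the parse_phase function are hypothetical names and are not part of the API.

```python
from dataclasses import dataclass
from typing import Optional, Union

Number = Union[int, float]


@dataclass
class PhaseProgress:
    phase: str
    completed: bool
    current: Optional[Number] = None
    min: Optional[Number] = None
    max: Optional[Number] = None
    unit: Optional[str] = None  # opaque identifier, deliberately not validated

    @property
    def has_counters(self) -> bool:
        """True only when current, min, and max are all present and non-null."""
        return all(value is not None for value in (self.current, self.min, self.max))


def parse_phase(raw: dict) -> PhaseProgress:
    """Build a PhaseProgress from one progress_updates entry.

    phase and completed are required; every other field may be missing or null.
    """
    return PhaseProgress(
        phase=raw["phase"],
        completed=bool(raw["completed"]),
        current=raw.get("current"),
        min=raw.get("min"),
        max=raw.get("max"),
        unit=raw.get("unit"),
    )
```

Because max can change between polls, a client built this way would re-parse the phase objects on every response rather than cache earlier values.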
Why max values can change
The current, min, and max values are best-effort progress indicators. They can be
based on estimates or on system state that changes over time. Treat max as the latest
known expected total, not as a fixed guarantee.
Common reasons max can change include:
- The restore process discovers additional work after it starts, such as files, segments, or objects that become visible only after metadata is read.
- New data is added on the backend while the node is catching up, which moves the completion point forward. This is common during incremental catch-up phases.
- Progress is calculated from system state, such as replication lag, rather than from a fixed work queue. As the system state changes, the value is recalculated.
- The service switches restore strategies during the operation, for example from snapshot restore to replication catch-up, which changes what the counters represent.
As a result:
- Phase percentage can decrease even when the restore operates normally.
- Remaining-time estimates based on max are unreliable.
- Sudden changes in max are expected and are not a concern unless the node remains in syncing_data longer than expected.
Restore phase meanings
Phase names are standardized, but the underlying work and the meaning of the counters are service-specific.
- prepare: Prepares the node for restore.
- basebackup: Restores the full backup.
- stream: Applies incremental changes, such as replication or log replay.
- finalize: Completes final steps before serving traffic.
Not all restores include every phase.
Progress units
Known unit values include:
- binlogs: MySQL binlog-based progress.
- bytes_compressed: Compressed backup data.
- bytes_uncompressed: Uncompressed backup data.
- wal_lsn: PostgreSQL® write-ahead log sequence number.
Aiven can add new unit values as new functionality becomes available.
Compute phase progress percentages
You cannot reliably compute overall restore progress. You can compute a phase-specific
progress percentage when min, max, and current are present and max != min.
pct = round(((current - min) / (max - min)) * 100, 1)
When handling progress values:
- If any of min, max, or current is null or missing, display n/a.
- If max == min, treat the percentage as undefined.
- Expect that the percentage can decrease when max changes.
- Clamp displayed values to the range [0, 100].
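The formula and handling rules above can be sketched in Python as follows. The function names phase_percentage and format_percentage are illustrative. Returning None for an undefined percentage is a choice made for this sketch, not something the API prescribes.

```python
from typing import Optional


def phase_percentage(phase: dict) -> Optional[float]:
    """Return the phase progress as a percentage in [0, 100], or None if undefined."""
    current = phase.get("current")
    low = phase.get("min")
    high = phase.get("max")
    # Any missing or null counter means no percentage can be computed.
    if current is None or low is None or high is None:
        return None
    # max == min would divide by zero, so the percentage is undefined.
    if high == low:
        return None
    pct = round((current - low) / (high - low) * 100, 1)
    # Clamp, because current can fall outside [min, max] when max changes.
    return max(0.0, min(100.0, pct))


def format_percentage(phase: dict) -> str:
    """Render the percentage for display, using n/a when it is undefined."""
    pct = phase_percentage(phase)
    return "n/a" if pct is None else f"{pct}%"
```

With the sample phase object shown earlier (current 3410567, min 0, max 7569280), format_percentage returns "45.1%".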
Polling guidance
Progress updates are best-effort and refresh every 10 seconds while a node is in
syncing_data. Poll the service state every 10 to 30 seconds. More frequent polling
does not provide additional detail.
For each node_states[] entry:
- If state is not syncing_data, no restore progress is available.
- If state is syncing_data:
  - If progress_updates is missing or empty, the node is restoring without detailed progress data.
  - Otherwise, the current phase is the last phase where completed is false.
Stop polling when all nodes reach the running state or when a stall is detected.
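The per-node rules and the stop condition above can be sketched as a small polling loop. This is a minimal illustration under stated assumptions: fetch_service is a caller-supplied function that returns the service object from the single-service endpoint, and interval_seconds, max_wait_seconds, and the injectable sleep parameter are choices made for this sketch. The overall time limit stands in for whatever stall rule you apply.

```python
import time
from typing import Callable, Optional


def current_phase(node: dict) -> Optional[dict]:
    """Return the last phase whose completed flag is false, or None.

    None means either that the node is not in syncing_data or that it is
    restoring without detailed progress data.
    """
    if node.get("state") != "syncing_data":
        return None
    updates = node.get("progress_updates") or []
    incomplete = [phase for phase in updates if not phase.get("completed")]
    return incomplete[-1] if incomplete else None


def poll_until_running(
    fetch_service: Callable[[], dict],
    interval_seconds: float = 15.0,
    max_wait_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until every node is running. Return False if the time limit passes first."""
    deadline = time.monotonic() + max_wait_seconds
    while True:
        nodes = fetch_service().get("node_states") or []
        if nodes and all(node.get("state") == "running" for node in nodes):
            return True
        for node in nodes:
            phase = current_phase(node)
            label = phase.get("phase") if phase else "no detailed progress"
            print(f"{node.get('node_name')}: {node.get('state')} ({label})")
        if time.monotonic() >= deadline:
            return False
        sleep(interval_seconds)
```

Passing sleep as a parameter keeps the loop testable without real delays. In normal use the default time.sleep applies.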
Stall detection
The API does not provide per-phase timestamps. To detect stalls, use a time-based
threshold, such as a node remaining in syncing_data longer than expected.
Do not rely on counters or max values to estimate remaining time.
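Because the API provides no per-phase timestamps, the client has to keep its own clock. The following sketch records when each node was first observed in syncing_data and flags nodes that exceed a threshold. The StallDetector class, its threshold_seconds parameter, and the injectable clock are hypothetical names introduced for illustration. An appropriate threshold depends on your service and data size.

```python
import time
from typing import Callable, Dict, List


class StallDetector:
    """Flag nodes that stay in syncing_data longer than a caller-chosen threshold."""

    def __init__(self, threshold_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.threshold_seconds = threshold_seconds
        self._clock = clock
        # node_name -> time at which the node was first seen in syncing_data
        self._syncing_since: Dict[str, float] = {}

    def observe(self, node_states: List[dict]) -> List[str]:
        """Record one poll result and return the names of nodes considered stalled."""
        now = self._clock()
        syncing = [
            node.get("node_name")
            for node in node_states
            if node.get("state") == "syncing_data"
        ]
        # Forget nodes that left syncing_data or disappeared, so that a later
        # restore of the same node starts a fresh timer.
        self._syncing_since = {
            name: since for name, since in self._syncing_since.items() if name in syncing
        }
        for name in syncing:
            self._syncing_since.setdefault(name, now)
        return [
            name
            for name, since in self._syncing_since.items()
            if now - since > self.threshold_seconds
        ]
```

Call observe with the node_states list from each poll. The detector measures only elapsed time and ignores the counters, in line with the guidance above.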
Service-specific behavior
Not all services emit detailed progress updates. When available, counters follow these patterns:
- PostgreSQL
  - basebackup: byte-based counters, commonly bytes_uncompressed
  - stream: WAL catch-up, often reported as wal_lsn
- MySQL
  - basebackup: byte-based counters, commonly bytes_compressed
  - stream: binlog catch-up, reported as binlogs
Unit values can vary based on the restore mechanism, service version, and implementation. Some phases may not include numeric counters.
Aiven may introduce new unit values over time. API clients must tolerate missing phases,
missing counters, changing max values, and unknown units.