Reindex Aiven for DataHub search and graph indices
Limited availability
Rebuild your OpenSearch indices for search and graph data if your search results or relationship graphs differ from the data in your metadata database. This is useful:
- After OpenSearch® data loss
- When an index is corrupt or inconsistent
- After wiping a cluster or re-provisioning the search backend
- After a schema or mapping change that requires a full reindexing
- For disaster recovery where SQL is intact, but OpenSearch is not
To reindex, run the RestoreIndices upgrade task. This task rebuilds
the indices from the metadata_aspect_v2 SQL table, which is the source of truth,
by replaying every aspect from the database back into the search and graph stores.
You can run this at any time. Events are replayed asynchronously and existing reads keep working. Always test reindexing in a staging environment first.
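The replay idea can be illustrated with a toy sketch. This is not DataHub's implementation: the in-memory SQLite table is a simplified stand-in that borrows the metadata_aspect_v2 name, its three columns and sample URNs are made up for illustration, and a plain dictionary plays the role of the search index. The function replay_aspects is hypothetical. It reads rows in batches and resumes each batch after the last key it saw, which is one common way to paginate by key instead of by offset.

```python
import sqlite3


def replay_aspects(conn, index, batch_size=2):
    """Copy every (urn, aspect) row into `index`, one batch at a time."""
    last_urn, last_aspect = "", ""  # key of the last row already replayed
    batches = 0
    while True:
        # Key-based pagination: fetch the next rows after the last key seen.
        rows = conn.execute(
            "SELECT urn, aspect, metadata FROM metadata_aspect_v2 "
            "WHERE urn > ? OR (urn = ? AND aspect > ?) "
            "ORDER BY urn, aspect LIMIT ?",
            (last_urn, last_urn, last_aspect, batch_size),
        ).fetchall()
        if not rows:
            return batches
        for urn, aspect, metadata in rows:
            index[(urn, aspect)] = metadata  # rebuild the entry from SQL
        last_urn, last_aspect = rows[-1][0], rows[-1][1]
        batches += 1


# Toy stand-in for the SQL source of truth (simplified, assumed columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata_aspect_v2 (urn TEXT, aspect TEXT, metadata TEXT)")
conn.executemany(
    "INSERT INTO metadata_aspect_v2 VALUES (?, ?, ?)",
    [
        ("urn:li:dataset:a", "datasetProperties", '{"name": "a"}'),
        ("urn:li:dataset:a", "ownership", '{"owners": []}'),
        ("urn:li:dataset:b", "datasetProperties", '{"name": "b"}'),
    ],
)

search_index = {}  # empty, as after data loss in the search backend
batches = replay_aspects(conn, search_index)
print(len(search_index), "entries restored in", batches, "batches")
```

Because the SQL table is never modified, the replay can be repeated safely: running it again only overwrites index entries with the same values.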
Prerequisites
Get the URL for the GMS app:
- In your DataHub service, in the DataHub resources section, open the Aiven App that ends in -gms.
- In the Connection information section, copy the Application URL.
Run the restore indices task
- In your DataHub service, go to the DataHub resources section.
- Open the Aiven App that ends in -upgrade.
- In the Environment variables section, click Edit.
- On the Variables tab, add the following variables:

  | Key | Value | Description |
  | --- | --- | --- |
  | UPGRADE_JOB | RestoreIndices | The restore indices task. |
  | KAFKA_SCHEMAREGISTRY_URL | GMS_APP_URL/schema-registry/api/ | Queries Kafka topic schemas for re-emitting events. |

  The GMS_APP_URL is the application URL for the GMS app.
- Optional: Add UPGRADE_JOB_ARGS to include additional arguments:

  | Arg | Description |
  | --- | --- |
  | -a clean | Wipes each index before repopulating. Use when an index has stale documents that you don't want to carry over. |
  | -a batchSize | Number of records per batch. |
  | -a urnBasedPagination | Uses URN-based pagination instead of offset-based. Set to true for large datasets. |
  | -a aspectNames | Comma-separated list of aspect names to reindex. Use to speed up partial recoveries. For example, aspectNames=datasetProperties,ownership. |
  | -a urnLike | SQL LIKE pattern to filter URNs. Use to target specific entity types. For example, urnLike=urn:li:dataset:% to reindex all datasets. |
- Click Save.

  After setting the variables, the upgrade app restarts automatically. It's in the Building state until the reindexing completes.
- When the upgrade app is in the Powered off state, remove the variables you added.
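To avoid typos when entering the variables, the values can be assembled ahead of time. The following is a minimal sketch, not part of any Aiven or DataHub tooling: build_upgrade_variables is a hypothetical helper, https://gms.example.com is a placeholder for your GMS Application URL, and the batchSize value of 1000 is purely illustrative. It also assumes that multiple arguments in UPGRADE_JOB_ARGS are separated by spaces, which the steps above do not state explicitly.

```python
def build_upgrade_variables(gms_app_url, job_args=None):
    """Return the environment variables for a RestoreIndices run."""
    variables = {
        "UPGRADE_JOB": "RestoreIndices",
        # The schema registry URL is the GMS app URL plus a fixed path.
        "KAFKA_SCHEMAREGISTRY_URL": gms_app_url.rstrip("/") + "/schema-registry/api/",
    }
    if job_args:
        parts = []
        for name, value in job_args.items():
            # A value of None means a bare flag such as `-a clean`.
            parts.append(f"-a {name}" if value is None else f"-a {name}={value}")
        # Assumption: arguments are joined with single spaces.
        variables["UPGRADE_JOB_ARGS"] = " ".join(parts)
    return variables


variables = build_upgrade_variables(
    "https://gms.example.com",  # placeholder, replace with your GMS Application URL
    {"clean": None, "batchSize": 1000, "aspectNames": "datasetProperties,ownership"},
)
for key, value in variables.items():
    print(f"{key}={value}")
```

Copy the printed keys and values into the Variables tab. Leaving out job_args produces only the two required variables.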