Replication and Backup Metrics
This page documents Prometheus metrics related to data replication, backup, restore, load, and unload workflows in Yellowbrick. These operations are essential for durability, disaster recovery, and data movement across environments.
Purpose
These metrics allow you to track:
- Replication behavior: Duration, cycle counts, errors, retries, throughput per replica
- Operations: Backup, restore, load, and unload counts and durations
- Operational health: Success/failure breakdowns and replica state transitions
They are critical for monitoring long-running data operations, detecting replication lag or errors, validating backup freshness, and ensuring recovery readiness.
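As a rough illustration of how these signals might be queried, the sketch below assembles a few PromQL expressions as Python strings. The metric and label names come from the table on this page. The `status="error"` label value, the default time windows, and the helper function names are assumptions made for illustration, not documented Yellowbrick values.

```python
# Illustrative PromQL builders. Metric and label names are taken from this page;
# the "error" status value and the default windows are assumptions.

def replica_error_ratio(window: str = "1h") -> str:
    """Fraction of replication cycles that ended in error, per replica."""
    errored = f"increase(yb_lime_replica_errored_cycles_total[{window}])"
    total = f"increase(yb_lime_replica_cycles_total[{window}])"
    return f"sum by (replica_id) ({errored}) / sum by (replica_id) ({total})"


def failed_operations(operation: str, window: str = "24h",
                      error_status: str = "error") -> str:
    """Count of failed backups, restores, loads, or unloads over the window."""
    if operation not in {"backups", "restores", "loads", "unloads"}:
        raise ValueError(f"unknown operation: {operation}")
    # Doubled braces emit literal { } around the PromQL label matcher.
    return (
        f'sum(increase(yb_lime_{operation}_total'
        f'{{status="{error_status}"}}[{window}]))'
    )


def replica_send_rate(window: str = "15m") -> str:
    """Bytes per second sent for each replica, averaged over the window."""
    return f"sum by (replica_id) (rate(yb_lime_replica_sent_bytes_total[{window}]))"


if __name__ == "__main__":
    print(replica_error_ratio())
    print(failed_operations("backups"))
    print(replica_send_rate())
```

Because every metric here is listed with a 5m frequency, a range window shorter than about 10 minutes is unlikely to contain the two samples that `rate()` and `increase()` need, which is why the defaults above are comparatively wide.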
Metrics
| Name | Type | Freq | Labels | Version Introduced | Version Deprecated | Description |
|---|---|---|---|---|---|---|
| yb_lime_active_backups_duration | histogram | 5m | - | 7.4.0 | - | Duration of active backups in seconds |
| yb_lime_active_loads_duration | histogram | 5m | - | 7.4.0 | - | Duration of active loads in seconds |
| yb_lime_active_restores_duration | histogram | 5m | - | 7.4.0 | - | Duration of active restores in seconds |
| yb_lime_active_unloads_duration | histogram | 5m | - | 7.4.0 | - | Duration of active unloads in seconds |
| yb_lime_backup_chain_age | histogram | 5m | - | 7.4.0 | - | Age in days of each backup |
| yb_lime_backups_total | counter | 5m | status | 7.4.0 | - | Number of backups completed in error/success states |
| yb_lime_completed_backups_duration | histogram | 5m | - | 7.4.0 | - | Duration of completed backups in seconds |
| yb_lime_completed_loads_duration | histogram | 5m | - | 7.4.0 | - | Duration of completed loads in seconds |
| yb_lime_completed_restores_duration | histogram | 5m | - | 7.4.0 | - | Duration of completed restores in seconds |
| yb_lime_completed_unloads_duration | histogram | 5m | - | 7.4.0 | - | Duration of completed unloads in seconds |
| yb_lime_loads_total | counter | 5m | status | 7.4.0 | - | Number of bulk loads completed in error/success states |
| yb_lime_replica_cycles_total | counter | 5m | replica_id | 7.4.0 | - | Number of replication cycles completed (both success and errors included) for the corresponding replica |
| yb_lime_replica_elapsed_seconds_total | counter | 5m | replica_id | 7.4.0 | - | Number of seconds spent actively replicating for the corresponding replica (cumulative over all replication cycles) |
| yb_lime_replica_errored_cycles_total | counter | 5m | replica_id | 7.4.0 | - | Number of replication cycles that ended with error for the corresponding replica |
| yb_lime_replica_retries_total | counter | 5m | replica_id | 7.4.0 | - | Number of retries for the corresponding replica (cumulative over all replication cycles) |
| yb_lime_replica_sent_bytes_total | counter | 5m | replica_id | 7.4.0 | - | Number of bytes sent for the corresponding replica (cumulative over all replication cycles) |
| yb_lime_replica_states | gauge | 5m | state | 7.4.0 | - | Number of replicas in each state |
| yb_lime_replica_written_bytes_total | counter | 5m | replica_id | 7.4.0 | - | Number of bytes written for the corresponding replica (cumulative over all replication cycles) |
| yb_lime_restores_total | counter | 5m | status | 7.4.0 | - | Number of restores completed in error/success states |
| yb_lime_unloads_total | counter | 5m | status | 7.4.0 | - | Number of bulk unloads completed in error/success states |
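
Because the per-replica counters are cumulative, a useful figure usually comes from the difference between two samples rather than from a single reading. The sketch below is one possible way to derive average throughput while actively replicating, dividing the increase in `yb_lime_replica_sent_bytes_total` by the increase in `yb_lime_replica_elapsed_seconds_total` for the same `replica_id`. The function names and the sample values are hypothetical, and treating a decreasing counter as a reset to zero follows the usual Prometheus convention rather than anything stated on this page.

```python
from typing import Optional


def counter_delta(previous: float, current: float) -> float:
    """Increase between two samples of a cumulative counter.

    If the value went backwards, assume the counter was reset to zero
    between the samples and count the current value as the increase.
    """
    return current if current < previous else current - previous


def active_throughput(sent_prev: float, sent_curr: float,
                      elapsed_prev: float, elapsed_curr: float) -> Optional[float]:
    """Average bytes per second while actively replicating, between two samples.

    Returns None when no replication time elapsed between the samples.
    """
    sent = counter_delta(sent_prev, sent_curr)
    elapsed = counter_delta(elapsed_prev, elapsed_curr)
    if elapsed == 0:
        return None
    return sent / elapsed


if __name__ == "__main__":
    # Hypothetical samples for one replica, taken 5 minutes apart.
    print(active_throughput(1_000_000, 4_000_000, 100, 160))
```

Dividing by replication seconds rather than wall-clock time gives throughput during active cycles only, so idle gaps between cycles do not drag the figure down.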