Replication and Backup Metrics
This page documents Prometheus metrics related to data replication, backup, restore, load, and unload workflows in Yellowbrick. These operations are essential for durability, disaster recovery, and data movement across environments.
Purpose
These metrics allow you to track:
- Replication behavior: Duration, cycle counts, errors, retries, throughput per replica
- Operations: Backup, restore, load, and unload counts and durations
- Operational health: Success/failure breakdowns and replica state transitions
They are critical for monitoring long-running data operations, detecting replication lag or errors, validating backup freshness, and ensuring recovery readiness.
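As an illustration of detecting new errors from a cumulative counter, the sketch below computes how much a counter grew across a series of scraped values, treating any drop as a counter reset (for example, after a process restart). It is a simplified approximation of what PromQL's `increase()` does, without extrapolation. The function name `counter_increase` and the sample values are hypothetical and not part of Yellowbrick.

```python
from typing import Iterable


def counter_increase(samples: Iterable[float]) -> float:
    """Return the total growth of a cumulative counter across samples.

    A drop between two consecutive samples is treated as a counter reset,
    so the post-reset value is counted as growth from zero.
    """
    total = 0.0
    prev = None
    for value in samples:
        if prev is not None:
            if value >= prev:
                total += value - prev  # normal monotonic growth
            else:
                total += value  # counter reset: count from zero
        prev = value
    return total


# Hypothetical scraped values of an errored-cycles counter for one replica.
new_errors = counter_increase([3, 5, 5, 1, 4])
```

A non-zero result over a recent window suggests that replication cycles have failed since the window began and is a reasonable trigger for an alert.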
Metrics
| Name | Type | Freq | Labels | Description |
|---|---|---|---|---|
| yb_lime_backup_chain_age | histogram | 5m | - | Age in days of each backup |
| yb_lime_backups_duration | histogram | 5m | - | Duration in seconds of active backups |
| yb_lime_backups_total | counter | 5m | status | Number of backups in success/error/active states |
| yb_lime_loads_duration | histogram | 5m | - | Duration in seconds of active loads |
| yb_lime_loads_total | counter | 5m | status | Number of bulk loads in success/error/active states |
| yb_lime_replica_cycles_total | counter | 5m | replica_id | Number of replication cycles completed (both success and errors included) for the corresponding replica |
| yb_lime_replica_elapsed_seconds_total | counter | 5m | replica_id | Number of seconds spent actively replicating for the corresponding replica (cumulative over all replication cycles) |
| yb_lime_replica_errored_cycles_total | counter | 5m | replica_id | Number of replication cycles that ended with error for the corresponding replica |
| yb_lime_replica_retries_total | counter | 5m | replica_id | Number of retries for the corresponding replica (cumulative over all replication cycles) |
| yb_lime_replica_sent_bytes_total | counter | 5m | replica_id | Number of bytes sent for the corresponding replica (cumulative over all replication cycles) |
| yb_lime_replica_states | gauge | 5m | state | Number of replicas in each state |
| yb_lime_replica_written_bytes_total | counter | 5m | replica_id | Number of bytes written for the corresponding replica (cumulative over all replication cycles) |
| yb_lime_restores_duration | histogram | 5m | - | Duration in seconds of active restores |
| yb_lime_restores_total | counter | 5m | status | Number of restores in success/error/active states |
| yb_lime_unloads_duration | histogram | 5m | - | Duration in seconds of active unloads |
| yb_lime_unloads_total | counter | 5m | status | Number of bulk unloads in success/error/active states |
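
The per-replica counters above can be combined into derived indicators. The following is a minimal sketch, assuming the counter values for a single `replica_id` have already been fetched from Prometheus. The `ReplicaSample` type, its field names, and the example numbers are hypothetical and are not part of any Yellowbrick API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplicaSample:
    """Counter values for one replica_id (hypothetical container)."""
    cycles: float           # yb_lime_replica_cycles_total
    errored_cycles: float   # yb_lime_replica_errored_cycles_total
    elapsed_seconds: float  # yb_lime_replica_elapsed_seconds_total
    sent_bytes: float       # yb_lime_replica_sent_bytes_total


def error_ratio(s: ReplicaSample) -> Optional[float]:
    """Fraction of completed cycles that ended in error, or None if no cycles."""
    if s.cycles <= 0:
        return None
    return s.errored_cycles / s.cycles


def send_throughput(s: ReplicaSample) -> Optional[float]:
    """Average bytes sent per second of active replication, or None if idle."""
    if s.elapsed_seconds <= 0:
        return None
    return s.sent_bytes / s.elapsed_seconds


# Example values chosen for illustration only.
sample = ReplicaSample(cycles=40, errored_cycles=2,
                       elapsed_seconds=100, sent_bytes=5000)
ratio = error_ratio(sample)
throughput = send_throughput(sample)
```

Because these counters are cumulative over all replication cycles, the results are lifetime averages. To get values for a recent window, apply the same arithmetic to the difference between two scrapes.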