Replication and Backup Metrics

This page documents Prometheus metrics related to data replication, backup, restore, load, and unload workflows in Yellowbrick. These operations are essential for durability, disaster recovery, and data movement across environments.

Purpose

These metrics allow you to track:

  • Replication behavior: Duration, cycle counts, errors, retries, throughput per replica
  • Operations: Backup, restore, load, and unload counts and durations
  • Operational health: Success/failure breakdowns and replica state transitions

They are critical for monitoring long-running data operations, detecting replication lag or errors, validating backup freshness, and ensuring recovery readiness.

Metrics

| Name | Type | Freq | Labels | Description |
| --- | --- | --- | --- | --- |
| yb_lime_backup_chain_age | histogram | 5m | - | Age in days of each backup |
| yb_lime_backups_duration | histogram | 5m | - | Duration in seconds of active backups |
| yb_lime_backups_total | counter | 5m | status | Number of backups in success/error/active states |
| yb_lime_loads_duration | histogram | 5m | - | Duration in seconds of active loads |
| yb_lime_loads_total | counter | 5m | status | Number of bulk loads in success/error/active states |
| yb_lime_replica_cycles_total | counter | 5m | replica_id | Number of replication cycles completed (both success and errors included) for the corresponding replica |
| yb_lime_replica_elapsed_seconds_total | counter | 5m | replica_id | Number of seconds spent actively replicating for the corresponding replica (cumulative over all replication cycles) |
| yb_lime_replica_errored_cycles_total | counter | 5m | replica_id | Number of replication cycles that ended with error for the corresponding replica |
| yb_lime_replica_retries_total | counter | 5m | replica_id | Number of retries for the corresponding replica (cumulative over all replication cycles) |
| yb_lime_replica_sent_bytes_total | counter | 5m | replica_id | Number of bytes sent for the corresponding replica (cumulative over all replication cycles) |
| yb_lime_replica_states | gauge | 5m | state | Number of replicas in each state |
| yb_lime_replica_written_bytes_total | counter | 5m | replica_id | Number of bytes written for the corresponding replica (cumulative over all replication cycles) |
| yb_lime_restores_duration | histogram | 5m | - | Duration in seconds of active restores |
| yb_lime_restores_total | counter | 5m | status | Number of restores in success/error/active states |
| yb_lime_unloads_duration | histogram | 5m | - | Duration in seconds of active unloads |
| yb_lime_unloads_total | counter | 5m | status | Number of bulk unloads in success/error/active states |
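
Because the replica counters are cumulative, they are most useful as differences between two scrapes. The following Python sketch shows one possible way to turn two samples for the same replica_id into an error ratio and an average throughput per second of active replication. The ReplicaSample class and the replica_health function are hypothetical names introduced for illustration and are not part of Yellowbrick. How the values are fetched (for example, from a Prometheus query) is left out, and treating a decreasing counter as a reset is an assumption based on general Prometheus counter behavior.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ReplicaSample:
    """Counter values for one replica_id at a single scrape (hypothetical container)."""
    cycles_total: float           # yb_lime_replica_cycles_total
    errored_cycles_total: float   # yb_lime_replica_errored_cycles_total
    elapsed_seconds_total: float  # yb_lime_replica_elapsed_seconds_total
    sent_bytes_total: float       # yb_lime_replica_sent_bytes_total


def replica_health(prev: ReplicaSample, curr: ReplicaSample) -> Dict[str, Optional[float]]:
    """Derive windowed figures from two scrapes of the same replica."""
    cycles = curr.cycles_total - prev.cycles_total
    errored = curr.errored_cycles_total - prev.errored_cycles_total
    elapsed = curr.elapsed_seconds_total - prev.elapsed_seconds_total
    sent = curr.sent_bytes_total - prev.sent_bytes_total

    if min(cycles, errored, elapsed, sent) < 0:
        # Assumption: a counter that went backwards was reset, so skip this window.
        return {"error_ratio": None, "bytes_per_active_second": None}

    return {
        # Share of cycles in the window that ended with an error.
        "error_ratio": errored / cycles if cycles > 0 else None,
        # Bytes sent per second spent actively replicating in the window.
        "bytes_per_active_second": sent / elapsed if elapsed > 0 else None,
    }
```

A value of None means the window had no completed cycles or no active replication time, so the figure is undefined rather than zero. Keeping that distinction avoids reporting an idle replica as a perfectly healthy one.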