Catalog Maintenance Manager

The Catalog Maintenance Manager helps maintain Yellowbrick system catalogs to keep them healthy and efficient — without relying on external scripts. It manages operations such as VACUUM and REINDEX through simple SQL commands.

Why Catalog Maintenance Matters

System catalogs grow as objects are created, modified, or dropped. Without regular maintenance, catalog tables can accumulate:

Dead tuples (deleted rows that still occupy space)
Invalidated indexes

These conditions can slow query planning and increase memory usage. In extreme cases, this can lead to the database going into read-only mode or not accepting new connections. The Catalog Maintenance Manager ensures that catalogs stay lean, accurate, and performant with minimal human intervention.

How It Works

Maintenance operations are coordinated through three components:

SQL Control Layer — SQL commands that start, stop, and monitor maintenance operations.
Health Evaluation Function — Identifies which tables or indexes require attention.
Maintenance Service — Executes the maintenance actions in the background and reports progress.

When maintenance runs, it automatically:

Pauses background services such as GC, Analyzer, and Flusher.
Determines which objects need VACUUM or REINDEX.
Runs those operations in parallel.
Resumes all services once complete.

All of this happens under the control of a single SQL command.

Checking Catalog Maintenance Recommendations

Before running maintenance, you can preview what would be cleaned or rebuilt on a per database basis using the function:

sql

SELECT * FROM sys.catalog_maintenance_recommendations(target_db_oid [, top [, shared_only [, include_all]]]);

Arguments

Option	Description	Default
`target_db_oid`	OID of the database you want recommendations. Required. Pass NULL to use the current database.	`-`
`top`	Limit result to the `top` first recommendations, by order of priority. `-1` shows all recommendations, without prioritization.	`-1`
`shared_tables_only`	Only show maintenance recommendations for shared catalog tables.	`false`
`include_all`	Show maintenance recommendations for both shared and target database catalog tables.	`false`

Note:

Options shared_tables_only and include_all cannot be used at the same time
shared_tables_only and include_all can only be used when target_db_oid is the oid of the yellowbrick database (4400).

This function lists the top objects needing maintenance, along with the maintenance action recommended, why and some useful information:

sql

select * from sys.catalog_maintenance_recommendations(4400, 5);
 dbname          | schemaname |           relname            | relkind | relid | maintenance_action |              reason               | dead_tuples | live_tuples | relpages | size_mb | reclaimable_mb |      last_vacuum       |
-----------------+------------+------------------------------+---------+-------+--------------------+-----------------------------------+-------------+-------------+----------+---------+----------------+------------------------+
 yellowbrick     | pg_catalog | pg_proc                      | r       |  1255 | VACUUM FULL        | severe_bloat_requires_full_vacuum |       20000 |           0 |    20000 |     625 |            625 | 2000-01-01 00:00:00+00 |
 yellowbrick     | pg_catalog | pg_description               | r       |  2609 | VACUUM             | dead_ratio_exceeds_threshold      |        1000 |        1000 |    10000 |   312.5 |         156.25 | 2000-01-01 00:00:00+00 |
 yellowbrick     | pg_catalog | pg_constraint_contypid_index | i       |  2666 | REINDEX            | index_not_ready                   |           0 |           0 |        0 |       0 |              0 | 2000-01-01 00:00:00+00 |
 yellowbrick     | pg_catalog | pg_description_o_c_o_index   | i       |  2675 | REINDEX            | invalid_index                     |           0 |           0 |        0 |       0 |              0 | 2000-01-01 00:00:00+00 |
(4 rows)

Priority order

When ordering using top option, order follows the below prioritization:

VACUUM FULL — Severe bloat
VACUUM — Dead tuples exceed threshold
REINDEX — Index invalid or not ready

Within each group:

Sort by reclaimable_mb

Starting Maintenance

To begin an automated maintenance run, use:

sql

ALTER SYSTEM MAINTENANCE START [ WITH ( option [, option] )]

Options

Option	Description	Default
`max_duration`	Maximum runtime for the maintenance. Valid units for this parameter are "ms", "s", "min", "h", and "d".	`1h`
`pause_services`	List of background services to pause while maintenance is running.	`gc,flusher,analyzer,autovacuum`
`parallelism`	Number of databases to process in parallel.	`4`
`retries`	Retry count for transient errors (locks, timeouts) per action.	`3`
`shared_tables_only`	Only perform maintenance of shared catalog tables.	`false`
`include_all`	Perform maintenance of both shared and per-database catalog tables.	`false`
`top`	Number of highest-priority objects per database to process.	`100`
`dry_run`	When `true`, reports what would be done without executing.	`false`
`verbose`	Enables logging of each step in the Lime logs.	`false`

Maintenance will pause the specified services and run until either all work is complete, the duration limit is reached, or it is stopped manually. Once done, all services are automatically set back to their previous state.

Examples

sql

-- Use all default values
ALTER SYSTEM MAINTENANCE START;

-- Override some options
ALTER SYSTEM MAINTENANCE START
WITH (
    max_duration = '15min',
    parallelism = 4,
    retries = 3,
    shared_tables_only = true,
    include_all = false,
    top = 5,
    dry_run
);

Stopping Maintenance

To stop a maintenance run currently in progress in the background:

sql

ALTER SYSTEM MAINTENANCE STOP;

This command interrupts the current run, ensures all paused services are resumed, and finalizes progress reporting.

Monitoring Progress

Using status function

While maintenance is running — or after it finishes — check the status:

sql

SELECT * FROM sys.catalog_maintenance_status();

This will return the status of the last recorded maintenance run on the system, as a JSON object.

The output includes all actions taken in each database and its status (ok or failed).

The overall status of the maintenance run is given in the state field and can be one of :

idle : No run recorded
starting: Starting up a new maintenance run
running: Maintenance in progress
complete: Maintenance is done, with no failures
failed: Maintenance completed with one or more failures
timeout: Maintenance timed out and stopped
stopping: Maintenance is in the middle of stopping, after ALTER SYSTEM MAINTENANCE STOP
stopped: Maintenance was stopped using ALTER SYSTEM MAINTENANCE STOP

Note: If Lime is restarted, the status record from current or past maintenance run is lost.

Example:

json

yellowbrick_test=# select sys.catalog_maintenance_status();
                                                     catalog_maintenance_status
-------------------------------------------------------------------------------------------------------------------------------------
 {                                                                                                                                  +
   "state" : "complete",                                                                                                            +
   "startedAt" : "2025-11-04T10:25:34.811554461Z",                                                                                  +
   "endedAt" : "2025-11-04T10:25:35.062168162Z",                                                                                    +
   "elapsed" : 0,                                                                                                                   +
   "totalDbs" : 9,                                                                                                                  +
   "totalRelations" : 6,                                                                                                            +
   "processed" : 6,                                                                                                                 +
   "failed" : 0,                                                                                                                    +
   "lastAction" : "a: REINDEX pg_catalog.pg_description_o_c_o_index",                                                               +
   "lastError" : null,                                                                                                              +
   "opts" : {                                                                                                                       +
     "sharedTablesOnly" : false,                                                                                                    +
     "includeAllTables" : true,                                                                                                     +
     "topN" : 100,                                                                                                                  +
     "dryRun" : false,                                                                                                              +
     "retries" : 3,                                                                                                                 +
     "pauseServices" : "gc,flusher,analyzer,autovacuum",                                                                            +
     "maxDuration" : "3600s",                                                                                                       +
     "verbose" : true,                                                                                                              +
     "parallelism" : 4                                                                                                              +
   },                                                                                                                               +
   "databaseActions" : {                                                                                                            +
     "a" : [                                                                                                                        +
       {"schema":"pg_catalog","relname":"pg_proc","action":"VACUUM FULL","result":"ok","elapsedMs":67,"error":null},                +
       {"schema":"pg_catalog","relname":"pg_constraint_contypid_index","action":"REINDEX","result":"ok","elapsedMs":6,"error":null},+
       {"schema":"pg_catalog","relname":"pg_description_o_c_o_index","action":"REINDEX","result":"ok","elapsedMs":10,"error":null}  +
     ],                                                                                                                             +
     "yellowbrick_test" : [                                                                                                         +
       {"schema":"pg_catalog","relname":"pg_proc","action":"VACUUM FULL","result":"ok","elapsedMs":68,"error":null},                +
       {"schema":"pg_catalog","relname":"pg_constraint_contypid_index","action":"REINDEX","result":"ok","elapsedMs":6,"error":null},+
       {"schema":"pg_catalog","relname":"pg_description_o_c_o_index","action":"REINDEX","result":"ok","elapsedMs":11,"error":null}  +
     ]                                                                                                                              +
   },                                                                                                                               +
   "dbSummaries" : {                                                                                                                +
     "a" : {"processed":3,"failed":0,"reclaimedMb":535.0340281641912,"state":"complete","error":null},                              +
     "yellowbrick" : {"processed":0,"failed":0,"reclaimedMb":0.0,"state":"complete","error":null},                                  +
     "_YBPG_stats" : {"processed":0,"failed":0,"reclaimedMb":0.0,"state":"complete","error":null},                                  +
     "yellowbrick_test" : {"processed":3,"failed":0,"reclaimedMb":591.8280384451493,"state":"complete","error":null},               +
     "yellowbrick_test_utf8" : {"processed":0,"failed":0,"reclaimedMb":0.0,"state":"complete","error":null},                        +
     "yellowbrick_unit_tests" : {"processed":0,"failed":0,"reclaimedMb":0.0,"state":"complete","error":null},                       +
     "maint_test" : {"processed":0,"failed":0,"reclaimedMb":0.0,"state":"complete","error":null},                                   +
     "glinda_test_utf8" : {"processed":0,"failed":0,"reclaimedMb":0.0,"state":"complete","error":null},                             +
     "glinda_test_latin9" : {"processed":0,"failed":0,"reclaimedMb":0.0,"state":"complete","error":null}                            +
   }                                                                                                                                +
 }
(1 row)

From the Lime logs

When running in verbose mode, the maintenance operation will log each operation it runs in the Lime logs.

It will also log a small progress status every 15 seconds with the total number of relations to be processed, how many are done / failed and the overall status of the maintenance run.

text

Progress: 3/15 processed, 0 failed (state=running)

Configuration Parameters

Use ALTER SYSTEM to tune global maintenance thresholds and behavior:

Parameter	Default	Description
yb_maintenance_dead_tuple_ratio_threshold	`0.10`	Dead/live tuple ratio that triggers a `VACUUM` recommendation.
yb_maintenance_min_table_mb	`8.0`	Minimum table size considered for maintenance recommendations.
yb_maintenance_include_indexes	`true`	Include indexes in recommendations.
yb_maintenance_max_duration_s	`3600s (1 hour)`	Maximum runtime for a maintenance run.
yb_maintenance_parallelism	`4`	Number of databases to process in parallel during maintenance.
yb_maintenance_retry_attempts	`3`	Number of retry attempts per operation during maintenance.
yb_maintenance_pause_services	`"gc,flusher,analyzer, autovacuum"`	Background services to pause during maintenance.

Note: To make these settings apply to Lime’s maintenance sessions, use ALTER SYSTEM (not SET) so that they persist across processes.

Best Practices

Run catalog maintenance during low workload periods to avoid lock conflicts and impacting regular workloads.
Use dry_run = true to preview scope before running a full session.
Keep yb_maintenance_parallelism modest (2–4) on smaller instances.
Periodically review sys.catalog_maintenance_status() for failures or skipped objects.

Troubleshooting

Symptom	Likely Cause	Resolution
Session stops early with status `timeout`	`max_duration` reached	Increase duration limit.
Some database actions are marked as failed	Locks or transient errors	Adjust `retries` or rerun maintenance in isolation.
Given `max_duration` parameter not applied during maintenance	Invalid unit was given	Make sure the `max_duration` argument unit is one of "ms", "s", "min", "h", or "d"
Configuration variable settings not applied during maintenance	Session configuration variable settings not visible to Lime	Use `ALTER SYSTEM` to persist them.

Workload Management

Distributing Data

Bulk Loading Tables

Bulk Load Examples

Running a Bulk Load

Loading Tables from Parquet Files

ybload Command

Load Data with SQL

Loading Data from Object Storage

Loading from Amazon S3

Loading from Azure Blob Storage

Loading Tables with Spark

Setting up and Running a Spark Job

Setting Up the ybrelay Service

Trickle Loading Data via JDBC

Unloading Data to Object Storage

Unloading Data to Parquet Files

ybunload Command

Installing ybtools

Setting Up a Database Connection

Configuring SSL/TLS for Tools and Drivers

Secure Connections for ODBC/JDBC Clients and ybsql

Appliance

Appliance: Disk Encryption

Setting Up Encrypted Drives

Observability

Observability Metrics

Remote Diagnostics

System Alerts

Creating an Alert Endpoint

Using the System Management Console

ybcli Reference

ybcli: config

Cloud

Configuring

Vanity DNS

Yellowbrick Manager

Installing

CLI Install Instructions

Public Install Instructions

Private Install Instructions

Self-Managed Install Instructions

Permissions

Kubernetes Guides

Observability

Observability Alerts

Observability Metrics

Databases

Backup & Restore

Overview

ybbackup Commands

ybbackupctl Commands

ybrestore Commands

Database Replication

Managing Replication

Setting Up Replication

Encrypting Sensitive Data

LDAP Integration

LDAP Authentication

Synchronizing Users and Groups

Metering

System Views

sys.lock

Workload Management

Creating WLM Resource Pools

Creating WLM Rules

Compatibility Parameters

Data Processing and Formatting

Feature Enablement

General

Tuning

Yellowbrick Row Store (YRS) Alerting Parameters

ybsql \copy Command

ybsql Properties and Variables

SQL Commands

CREATE EXTERNAL FORMAT

CREATE EXTERNAL TABLE

CREATE TABLE

GRANT

Plan Hinting