Skip to content

Auto-Analyzing Tables

The ANALYZE command generates updated statistics for the number of distinct values in table columns. This information is useful to the query planner and needs to be kept up-to-date, especially for tables with columns that appear in the WHERE clause of your queries.

The following operations always include an automatic and immediate analyze operation as part of their own post-processing:

  • Bulk load operations with ybload
  • INSERT INTO...SELECT
  • CREATE TABLE AS statements (CTAS)
  • Flush operations

When these operations complete, their target tables are guaranteed to have up-to-date statistics.

The following operations do not include automatic analyze operations:

  • INSERT INTO...VALUES
  • UPDATE
  • DELETE

When these operations complete, the statistics in the target tables may need to be updated. In that case, they wait for the next auto-analyze operation that is scheduled for the system.

By default, all tables are analyzed automatically on a regular schedule. The system checks tables every 5 minutes (300 seconds) and determines which tables require an ANALYZE operation. If at least 10% of a table's rows have changed, a background ANALYZE operation is run.

Tip: To find out when a table was last analyzed, either query the sys.log_analyze view or use Yellowbrick Manager. In Yellowbrick Manager, go to Databases and select the specific database, schema, and table. On the Details screen, go to Analyze Activity.

In general, the default ANALYZE behavior is optimal and is likely to be the best choice for most workloads. However, the following sections describe two distinct use cases that may require immediate ANALYZE operations to be disabled.

Disabling Immediate ANALYZE Operations for INSERTs

If your workload requires frequent small INSERT INTO...SELECT operations into a large fact table and you want to optimize performance, you can disable the immediate ANALYZE operations that happen after each INSERT. To disable these ANALYZE operations, set the ybd_analyze_after_writes configuration parameter to OFF. This parameter defaults to ON.

After changing this parameter, you have to reload the system configuration (or restart the database) before the new behavior takes effect. To set the parameter globally, use an ALTER SYSTEM command:

yellowbrick=# alter system set ybd_analyze_after_writes to off;

To set the parameter at the session level, you can use ALTER ROLE. For example:

premdb=# create role insert_select;
CREATE ROLE
premdb=# alter role insert_select set ybd_analyze_after_writes to off;
ALTER ROLE

After a configuration reload, immediate ANALYZE operations will be disabled for all sessions run when the insert_select user runs INSERT INTO...SELECT operations. The advantage to disabling the parameter in this way is that it will remain enabled for other write operations.

Disabling Immediate ANALYZE Operations for Bulk Loads

If your workload consists of a large number of continuous trickle-feed loads using ybload, you may need to disable the immediate ANALYZE operations that happen after each ybload operation. To disable these ANALYZE operations, set the ybd_analyze_after_loads configuration parameter to OFF. This parameter defaults to ON.

After changing this parameter, you have to reload the system configuration (or restart the database) before the new behavior takes effect. To set the parameter globally, use an ALTER SYSTEM command:

yellowbrick=# alter system set ybd_analyze_after_loads to off;

To set the parameter at the session level, you can use ALTER ROLE. For example:

premdb=# create role ybload_user;
CREATE ROLE
premdb=# alter role ybload_user set ybd_analyze_after_loads to off;
ALTER ROLE

After a configuration reload, immediate ANALYZE operations will be disabled for all ybload sessions run by ybload_user. The advantage to disabling the parameter in this way is that it will remain enabled for other write operations.

Note that setting this parameter to OFF does not prevent an immediate ANALYZE from being run after intermediate load operations, such as after an INSERT into a target table from a staging table. However, setting ybd_analyze_after_writes to OFF will prevent an immediate ANALYZE in these cases.

Parent topic:Managing Tables and Views