Auto-Analyzing Tables
The ANALYZE command generates updated statistics for the number of distinct values in table columns. This information is useful to the query planner and needs to be kept up-to-date, especially for tables with columns that appear in the WHERE clause of your queries.
- Bulk load operations with
ybload
- INSERT INTO...SELECT
- CREATE TABLE AS statements (CTAS)
- Flush operations
- INSERT INTO...VALUES
- UPDATE
- DELETE
When these operations complete, the statistics in the target tables may need to be updated. In that case, they wait for the next auto-analyze operation that is scheduled for the system.
By default, all tables are analyzed automatically on a regular schedule. The system checks tables every 5 minutes (300 seconds) and determines which tables require an ANALYZE operation. If at least 10% of a table's rows have changed, a background ANALYZE operation is run.
sys.log_analyze
view or use the SMC. In the SMC, go to
Databases, select the database, double-click the row for the
table you want to check, and click Analyze Activity. In general, the default ANALYZE behavior is optimal and is likely to be the best choice for most workloads. However, the following sections describe two distinct use cases that may require immediate ANALYZE operations to be disabled.
Disabling Immediate ANALYZE Operations for INSERTs
If your workload requires frequent small INSERT INTO...SELECT operations into a large fact
table and you want to optimize performance, you can disable the immediate ANALYZE operations
that happen after each INSERT. To disable these ANALYZE operations, set the
ybd_analyze_after_writes
configuration parameter to OFF
.
This parameter defaults to ON
.
yellowbrick=# alter system set ybd_analyze_after_writes to off;
premdb=# create role insert_select;
CREATE ROLE
premdb=# alter role insert_select set ybd_analyze_after_writes to off;
ALTER ROLE
After a configuration reload, immediate ANALYZE operations will be disabled for all
sessions run when the insert_select
user runs INSERT INTO...SELECT
operations. The advantage to disabling the parameter in this way is that it will remain
enabled for other write operations.
Disabling Immediate ANALYZE Operations for Bulk Loads
If your workload consists of a large number of continuous trickle-feed loads using
ybload
, you may need to disable the immediate ANALYZE operations that
happen after each ybload
operation. To disable these ANALYZE operations,
set the ybd_analyze_after_loads
configuration parameter to
OFF
. This parameter defaults to ON
.
yellowbrick=# alter system set ybd_analyze_after_loads to off;
premdb=# create role ybload_user;
CREATE ROLE
premdb=# alter role ybload_user set ybd_analyze_after_loads to off;
ALTER ROLE
After a configuration reload, immediate ANALYZE operations will be disabled for all
ybload
sessions run by ybload_user
. The advantage to
disabling the parameter in this way is that it will remain enabled for other write
operations.
Note that setting this parameter to OFF
does not prevent an immediate
ANALYZE from being run after intermediate load operations, such as after an INSERT into a
target table from a staging table. However, setting
ybd_analyze_after_writes
to OFF
will prevent an
immediate ANALYZE in these cases.