inlist_threshold (integer)
- Default value (as of this release):
10
This configuration parameter specifies the threshold on the number of elements in an IN-list before that IN-list is considered for a semi-join rewrite.
How It Works
Consider a table named result_table
that contains a single integer column named x
.
sql
create table result_table(x int);
insert into result_table(x) values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11);
If the number of elements in the IN-list exceeds the value of guc_inlist_threshold
, the query planner may rewrite the query into a semi-join, which can optimize performance for larger datasets.
sql
EXPLAIN SELECT * FROM result_table WHERE x IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11);
QUERY PLAN
---------------------------------------------------------------------------------------------------------
id rows_planned workers node
10 11 all SELECT
1 11 all SEMI LEFT INNER HASH JOIN ON (#inlist_100#.val = result_table.x)
4 11 all |-SCAN result_table
| result_table.x = bloom(7) AND scan_constraints: min_max(result_table.x)
7 11 all |-BUILD
9 11 all SCAN TEMP 3
For smaller IN-lists (below the threshold), the query is typically processed as-is without rewriting.
sql
EXPLAIN SELECT * FROM result_table WHERE x IN (1, 2, 3, 4, 5, 6, 7, 8, 9);
QUERY PLAN
-----------------------------------------------------------------------------------
id rows_planned workers node
5 6 all SELECT
2 6 all SCAN result_table
((result_table.x = $8) OR ((result_table.x = $7) OR ((result_table.x = $6) OR ((result_table.x = $5) OR ((result_table.x = $4) OR ((result_table.x = $3) OR ((result_table.x = $2) OR ((result_table.x = $0) OR (result_table.x = $1)))))))))