Appearance
Join Elimination Optimization
Overview
Join elimination is a query optimization in which the planner removes joins that can be proven unnecessary. Since joins are among the most expensive SQL operations, eliminating them can significantly improve query performance.
Join elimination applies to both INNER and OUTER joins in Yellowbrick. It relies on declared PRIMARY KEY, FOREIGN KEY, and UNIQUE constraints to prove that removing a join does not change the query’s results.
Example Schema
The following schema will be used in the examples below:
sql
CREATE TABLE customer (
customer_id INT PRIMARY KEY,
name VARCHAR(20) NOT NULL
);
CREATE TABLE address (
address_id INT PRIMARY KEY,
customer_id INT UNIQUE,
street VARCHAR(50),
CONSTRAINT fk_address_customer FOREIGN KEY (customer_id)
REFERENCES customer(customer_id)
);
CREATE TABLE payment (
payment_id INT PRIMARY KEY,
customer_id INT,
amount INT NOT NULL,
CONSTRAINT fk_payment_customer FOREIGN KEY (customer_id)
REFERENCES customer(customer_id)
);- customer → root entity
- address → one-to-one relationship (unique per customer)
- payment → one-to-many relationship (a customer may have multiple payments)
When Join Elimination Applies
INNER JOIN elimination
If a child table joins to its parent on a declared foreign key, and no columns from the parent are projected or filtered, the join can be removed.
Example:
sql
SELECT c.name
FROM customer c
JOIN address a ON c.customer_id = a.customer_id;The join can be eliminated since the address table does not contribute to the result.
LEFT OUTER JOIN elimination
- One-to-one case: If the inner table has a
UNIQUEorPRIMARY KEYon the join column, the join cannot duplicate rows from the outer table. - With DISTINCT: If the outer relation’s output is made distinct (via
DISTINCTor grouping), duplicates from one-to-many relationships can be eliminated.
Examples:
sql
-- One-to-one join
SELECT c.name
FROM customer c
LEFT JOIN address a ON c.customer_id = a.customer_id;
-- DISTINCT removes duplicates
SELECT DISTINCT c.name
FROM customer c
LEFT JOIN payment p ON c.customer_id = p.customer_id;Insert valid data:
sql
INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO address VALUES (101, 1, 'Main St'), (102, 2, '2nd Ave');
INSERT INTO payment VALUES (201, 1, 100), (202, 2, 200), (203, 2, 300);Results:
- INNER JOIN (customer ↔ address): returns all customers.
- LEFT JOIN (customer ↔ address): returns all customers with their addresses.
- LEFT JOIN + DISTINCT (customer ↔ payment): returns Alice and Bob once each.
In this case, join elimination is correct and improves performance.
Risk of Wrong Results
Because Yellowbrick does not enforce PRIMARY KEY, FOREIGN KEY, or UNIQUE constraints, the planner assumes declared keys are valid. If constraints are not valid, join elimination may produce wrong results:
- Non-matching rows may appear.
- Rows that should be preserved with NULL values may disappear.
- Duplicates may be introduced or lost.
This is a semantic error, not just a performance issue.
Examples:
Now insert invalid data:
sql
-- Orphaned address (customer_id=99 does not exist)
INSERT INTO address VALUES (103, 99, 'Ghost Lane');
-- Payment with invalid customer
INSERT INTO payment VALUES (204, 77, 500);Effects:
INNER JOIN (customer ↔ address):
- Expected: only valid customer-address pairs.
- With elimination: orphaned address rows may appear as if valid.
LEFT JOIN (customer ↔ address):
- Expected: Bob appears with
NULLif no address exists. - With elimination: Bob’s row may disappear entirely.
- Expected: Bob appears with
LEFT JOIN + DISTINCT (customer ↔ payment):
- Expected: one row per valid customer.
- With elimination: phantom customers may appear, or duplicates may be miscounted.
Disabling Join Elimination
If constraint validity cannot be guaranteed, you can disable join elimination with the configuration parameter enable_join_elimination.
sql
SET enable_join_elimination = off;See Also
- How to Simplify Your Queries with the Yellowbrick Planner - Join Elimination — detailed walkthrough with examples.