Appearance
TABLESAMPLE
Prerequisite: while the feature is in beta, you must explicitly enable it by setting
enable_full_samplescantoON.
Use the TABLESAMPLE clause to retrieve a random sample of rows from a table. This clause is typically appended to a FROM clause and is useful for quickly analyzing a representative subset of a large data set.
sql
FROM <table_name> [ AS ] <alias> TABLESAMPLE <sampling_method> ( <sample_percentage> ) [ REPEATABLE ( seed ) ]Parameters
- table_name
- Name of the table from which to sample rows.
- sampling_method
- The method used to perform the sampling. Supported methods:
- BERNOULLI — The only supported sampling method. Performs sampling at the row level, evaluating each row independently and including it in the sample with the specified probability.
- SYSTEM — Currently this is just an alias for using BERNOULLI.
- sample_percentage
- The approximate percentage of the table's rows to return. Must be a numeric literal between 0 and 100. For example,
TABLESAMPLE BERNOULLI (10)returns approximately 10% of the table. - REPEATABLE ( seed )
- Optional clause to make the sampling deterministic. If specified, the same seed value will always return the same sampled rows (provided that the underlying data hasn't changed). When omitted, the seed is set randomly using a random number generator. The RNG can be seeded by the
setseedfunction, which is another way to achieve determinism.
Notes
TABLESAMPLEreturns an approximate number of rows. There is no guarantee that the exact percentage specified will be returned.- The clause only supports sampling entire tables, not derived tables or views.
- The reproducibility of samples is also influenced by the
setseed()function (which sets the random seed for the session). However, the seed provided in theREPEATABLEclause takes precedence over any value set withsetseed().
Examples
Sample approximately 1% of the rows from the team table using the BERNOULLI method with the seed set to 2:
sql
CREATE TABLE t AS SELECT * FROM sys.rowgenerator LIMIT 1000000;
SELECT COUNT(*) FROM t TABLESAMPLE BERNOULLI (1) REPEATABLE (2); count
-------
10127
(1 row)