Skip to content

TABLESAMPLE

Prerequisite: while the feature is in beta, you must explicitly enable it by setting enable_full_samplescan to ON.

Use the TABLESAMPLE clause to retrieve a random sample of rows from a table. This clause is typically appended to a FROM clause and is useful for quickly analyzing a representative subset of a large data set.

sql
FROM <table_name> [ AS ] <alias> TABLESAMPLE <sampling_method> ( <sample_percentage> ) [ REPEATABLE ( seed ) ]

Parameters

table_name
Name of the table from which to sample rows.
sampling_method
The method used to perform the sampling. Supported methods:
  • BERNOULLI — The only supported sampling method. Performs sampling at the row level, evaluating each row independently and including it in the sample with the specified probability.
  • SYSTEM — Currently this is just an alias for using BERNOULLI.
sample_percentage
The approximate percentage of the table's rows to return. Must be a numeric literal between 0 and 100. For example, TABLESAMPLE BERNOULLI (10) returns approximately 10% of the table.
REPEATABLE ( seed )
Optional clause to make the sampling deterministic. If specified, the same seed value will always return the same sampled rows (provided that the underlying data hasn't changed). When omitted, the seed is set randomly using a random number generator. The RNG can be seeded by the setseed function, which is another way to achieve determinism.

Notes

  • TABLESAMPLE returns an approximate number of rows. There is no guarantee that the exact percentage specified will be returned.
  • The clause only supports sampling entire tables, not derived tables or views.
  • The reproducibility of samples is also influenced by the setseed() function (which sets the random seed for the session). However, the seed provided in the REPEATABLE clause takes precedence over any value set with setseed().

Examples

Sample approximately 1% of the rows from the team table using the BERNOULLI method with the seed set to 2:

sql
CREATE TABLE t AS SELECT * FROM sys.rowgenerator LIMIT 1000000;
SELECT COUNT(*) FROM t TABLESAMPLE BERNOULLI (1) REPEATABLE (2);
 count
-------
 10127
(1 row)