External Tables
This section covers user-defined external tables that are stored in, and accessed from, an NFS-mounted file system outside the database. For information about loading tables from object storage via SQL commands, see the documentation for the Yellowbrick Test Drive.
NFS-mounted external tables support use cases for both exporting data and importing it. You
can create external tables solely for the purpose of writing data out to external files. You
can also read from external tables as part of standard SQL queries, and you can use them to
load other tables via INSERT
and CTAS
statements.
ybload
and
ybunload
operations to read and write data. ybload
operations. External table operations
run ybload
and ybunload
processes on the manager node
and must share limited compute resources with other critical processes. In addition,
external table loads and unloads access source and target files via NFS file systems,
which are typically much slower than local file systems.
For these reasons,
ybload
and ybunload
performance for external table
operations may be significantly slower than performance for standard loads and
unloads. For larger files, the performance difference may be severe.
NFS Mount Points for External Tables
CREATE EXTERNAL
MOUNT
command. This command defines a short reference name for the mount
point and a complete path to the NFS-mounted directory where flat files will be read or
written. For example:
create external mount '/qumulo/yb100' as 'nfs://qumulo:/data/yb100/' with (gid 0, uid 0);
When an external mount is in place, you can read from files that already exist in that location or write to files that will be created in that location. Mounts created for external tables are logged to the sys.mount view.
Definition and Use of External Tables
You can create an external table with a CREATE EXTERNAL TABLE SQL command.
Instead of a regular table name, you provide the external file location. The table is not
registered in the system catalog, and its schema is derived from a query. (This type of
table is also known as a transient external table.) A special USING
clause defines format
choices for writing the data out to the file.
To query an external table, you use special SELECT...FROM EXTERNAL
syntax
and specify the schema of the table in terms of a column list. You can query external tables
continuously as long as their storage files remain in the specified location and the
associated NFS mount is active. A single query can process a combination of regular table
data and external table data. For example, you can join an external table to a regular
table, and you can return the union of an external table and a regular table.
You can also write CTAS statements that select from external tables, and you can insert
data into other tables by selecting from external tables (using INSERT INTO
SELECT
or SELECT INTO
).
When external tables are created or queried, background ybload
and
ybunload
operations generate detailed log files and write them to an NFS
directory relative to the mount point you created. See External Table Examples.
For details about external table command syntax and options, see the SQL Commands section (or use the
ybsql help
command).
Examples of Creating and Selecting From External Tables
- To write data to an external file and create an external table, use a
CREATE EXTERNAL TABLE
statement. This statement runs a query against database tables and writes the results to an external file. For example:create external table '/local_external_tables/ext_season.csv' using(format text) as select * from season;
- To read data from an external table, use a
SELECT...FROM EXTERNAL
statement. This statement runs a query against an external file and returns the data to the client. For example:select c5, c6 from external '/local_external_tables/ext_match.csv' (c1 int, c2 date, c3 int, c4 int, c5 char(3), c6 char(3)) using(format text) order by 1 desc limit 10;
- To create a regular table based on data in an external table, use a
CTAS
statement. For example:create table match_results as select c5, c6 from external '/local_external_tables/ext_match.csv' (c1 int, c2 date, c3 int, c4 int, c5 char(3), c6 char(3)) using(format text) order by 1 desc;
- To read data from an external table and write it into a regular table, use an
INSERT INTO SELECT...FROM EXTERNAL
statement. You do not have to declare the column list for the file in this case; it is optional. TheUSING
clause is required. For example:premdb=# insert into season select * from external 'ext_db/season.csv' using(format csv); INSERT 0 25
The external table in this statement uses the schema of the target table (
season
). The columns in the external data file must be in the same order as they are in the target table, and every column in the target table must also exist in the external file.
Limitations on External Tables
- You cannot create views by selecting from external tables.
- External tables are not backed up or restored.
- External tables cannot be updated, deleted, or truncated.