LOAD TABLE
Bulk load a table from external storage by using a SQL command (instead of the
ybload
client).
LOAD TABLE table_name
[ SOURCE FIELDS (col1 type [, col2 type ] [, ...]) ]
[ FROM ('string1' [,'string2'] [, ...]) ]
EXTERNAL LOCATION location_name
[ EXTERNAL FORMAT { format_name |
(TYPE file_type WITH (option value [, option value ] [, ...])) } ]
[ WITH (option value [, option value ] [, ...]) ]
- table_name
- Name of a regular user table that is the target for the load. The
source files will be read from external storage and written to the
table. To run
LOAD TABLE
commands, you must haveBULK LOAD
privilege on the database andINSERT
privilege on the target table. - SOURCE FIELDS
- Specify all of the fields found in the source files and their data
types. These fields map to the columns in the target table that you want
to load. You must specify all of the fields that appear in the source
files, in order, but you can load a table that only contains a
subset of those fields. For example, you can load a table with four
columns from a source file that has five or more fields. You do not have
to specify the same order of columns in the
CREATE TABLE
statement. The load operation can detect which columns correspond to which source fields by name and data type.In this example, the source file contains five fields:... source fields(seasonid smallint, matchday timestamp, htid smallint, atid smallint, moment varchar(5)) ...
You can load a table that has one or more columns that correspond to these fields. See also LOAD TABLE Examples.
Conversely, you can also load a table that has columns that do not have corresponding fields in the source file ifDEFAULT
column constraints are specified in theCREATE TABLE
statement. For example, you can load a two-column table from a source file that has only one field. In this case,SOURCE FIELDS
identifies the one field that exists in the source file; the second column in the table is loaded with the specified default value. For example:create table t1 (c1 int, c2 int not null default 1); ... load table t1 source fields (c1 int) ...
Note: TheSOURCE FIELDS
option is not available in the Load Data Assistant. - FROM('string1','string2', ...)
- Specify one or more character strings that represent a list of the S3
source files or a "prefix" that identifies those files. These files will
be loaded from the specified external location, which is a bucket name.
If you do not specify a
FROM
list, the load operation will scan the entire bucket and load any files that have a schema that matches the columns in the target table. Therefore, it is recommended that you specify aFROM
list or prefix.If a string exactly matches the key of a specific file stored in the bucket, only that S3 file will be loaded. Otherwise, the string is checked for matches against the beginning part of the key of any objects in the bucket. All of the files that match will be loaded.
For example, the following commands all produce the same results, loading the same two files:premdb=# load table match from ('/premdb/match.csv','/premdb/matchcopy.csv') external location premdbs3data; INSERT 0 17212 premdb=# load table match from ('/premdb/match') external location premdbs3data; INSERT 0 17212
Note: Wildcard patterns are not supported for S3 source files. For example, you cannot use the followingFROM
clause to load files that have a prefix ofmatch
:from ('/premdb/mat*.csv')
Instead you could use:from ('/premdb/mat')
- EXTERNAL LOCATION
- Name of an external location object. See CREATE EXTERNAL LOCATION.
- EXTERNAL FORMAT
- Name of an external format object. See CREATE EXTERNAL FORMAT. Alternatively, you can define
the format in the
TYPE
clause. You can also omit theEXTERNAL FORMAT
clause completely and use the default format settings for the table. - TYPE...WITH
- Specify the source file type:
CSV
,TEXT
,PARQUET
, orBCP
. You must enclose theTYPE
clause inside parentheses, and the followingWITH
clause inside its own set of parentheses:(TYPE format_type WITH(...))
TheWITH
clause that followsTYPE
defines formatting load options, such as the line separator, field delimiter, and field definitions for different data types. Each name-value pair must consist of a valid option (in this context) and a valid setting for that option. For example:(type csv with (skip_blank_lines true, delimiter '|'))
- WITH
- The
WITH
clause at the end of theLOAD TABLE
statement defines general load-processing options. Each name-value pair must consist of a valid option (in this context) and a valid setting for that option. These are separate and distinct from the formatting load options defined by the external format object or in theTYPE WITH()
list.For example:with (source_compression 'gz')
See LOAD TABLE Options.
LOAD TABLE Examples
load table match
from ('/premdb/match.csv')
external location premdbs3data;
INSERT 0 8606
/premdb/newmat
:load table newmatchstats
from ('/premdb/newmat')
external location premdbs3data;
INSERT 0 24785280
load table newmatchstats
from ('/premdb/newmatchstats25mil.gz')
external location premdbs3data
with(source_compression 'gz');
INSERT 0 24785280
load table newmatchstats
from ('/premdb/newmatchstats')
external location premdbs3data
with(source_compression 'gz');
ERROR: [bulksvc-load] Unable to open channel for s3://ybpremdb/premdb/newmatchstats25mil.csv CAUSED BY: Input is not in the .gz format
load table newmatchstats
from ('/premdb/newmatchstats25mil.csv')
external location premdbs3data
external format premdbs3format;
load table newmatchstats
from ('/premdb/newmatchstats25mil.csv')
external location premdbs3data
external format (type csv with(skip_blank_lines true, delimiter ','));
LOAD
TABLE
statement.
create table matchday as
select seasonid, matchday, htid, atid from newmatchstats;
SELECT 24785280
delete from matchday;
DELETE 24785280
load table matchday
source fields(seasonid smallint,
matchday timestamp,
htid smallint,
atid smallint,
moment varchar)
from('/premdb/newmatch')
external location premdbs3data;
INSERT 24785280
c1
and
c2
:create table t1(c1 int, c2 int);
CREATE TABLE
statement:load table t1
source fields(c3 int, c2 int, c1 int)
...
LOAD TABLE Options
The WITH
clause in the LOAD TABLE
statement
supports some load-processing options and source options. These options are listed
alphabetically.
- log_level INFO | OFF | WARN | ERROR | DEBUG | TRACE
- Specify the logging level for the load output. The default level is
INFO
. Other options provide more verbose output. - max_bad_rows NUMBER
- Set the maximum number of rejected rows that a load will tolerate before
aborting and starting to roll back the transaction. The default is
0
(no bad rows); make sure that the source data you are about to load is clean and consistent if you use the default.Note:max_bad_rows 100
means 100 bad rows are allowed; the load will fail on the 101st bad row. - read_sources_concurrently ALWAYS | NEVER | ALLOW | <number>
- Define the behavior for reading source files in parallel:
ALWAYS
,NEVER
,ALLOW
(the default), or a specific number of source files. See "ybload Advanced Processing Options" in the main Yellowbrick documentation. - send_compression ALWAYS | NEVER | AUTO
- Define the compression policy for data buffers before they are sent from the
client to the worker nodes. This option is equivalent to the
--compression-policy
option inybload
. See "ybload Advanced Processing Options" in the main Yellowbrick documentation. - source_compression GZ | BZIP2 | XZ | PACK200 | LZ4
- This option explicitly defines the type of compression used by source data. This option applies to loads from all supported source types and overrides other compression detection methods.
Order of Precedence for Format Specifications
LOAD
TABLE
operation. The following order of precedence is enforced: - In-line format options, as specified with
EXTERNAL FORMAT (TYPE...WITH(…))
- An in-line format name, as specified with
EXTERNAL FORMAT format_name
- A default format, if one exists, for the external location object specified
in the
LOAD TABLE
command
ybload
operations, which means that some values are
automatically detected and the load is attempted with those settings. (The load may
fail in this case.)