Appearance
ybload Options
This section contains detailed descriptions of the bulk load options. Note the following points about the format of these options and their values:
Options are listed in alphabetical order for quick reference.
Option names are shown in lowercase; they are case-sensitive.
Specific valid option values (such as
true
andfalse
) are shown in lowercase. Variables for option values, such asSTRING
, are shown in uppercase. Option values are not case-sensitive.The requirements for quoting option strings vary by client platform. Values are shown without quotes, but quotes are sometimes required. For example, if you specify the
#
character in a Linux shell, it must be enclosed by single or double quotes. If you are using a Windows client, see also Escaping Quotes in Windows Clients.(stdin) : Load from
stdin
(standard input). To load fromstdin
instead of named source files, enter the single-dash character (-) at the end of theybload
command. This must be the last character on the command line, and it must be prefixed with--
to indicate that option parsing is complete. For example:./ybload -d premdb --username bobr -t match -- -
- @file
Specify a file that includes a set of options and values to use for the load. See Saving Load Options to a File.
- --bad-row-file STRING
Define the location and name of a file where rejected rows will be logged. If you do not specify this option, the file defaults to the name
SOURCE_FILENAME.TIMESTAMP.bad
and is written to a location that is reported early in the console or log file output. If the file already exists, it is truncated.- --bigint-field-options
See ybload Field Options.
- --boolean-field-options
See ybload Field Options.
- --bytes-per-transaction
Set the number of bytes to load per commit. The default is 1TB (1099511627776 bytes). You can set this option to modify the frequency of commits when bulk loads are running. This option works in conjunction with
--rows-per-transaction
. The threshold that is met first is applied.- --cacert STRING
Customize trust with secured communication; use this option in combination with the
--secured
option. Enter the file name of a custom PEM-encoded certificate or the file name and password for a Java KeyStore (JKS).For PEM format, the file must be named with a
.pem
,.cert
,.cer
,.crt
, or.key
extension. For example:--cacert cacert.pem
For JKS format, files are always password-protected. Use the following format:
--cacert yellowbrick.jks:changeit
where the
:
character separates the file name from the password of the keystore.See also SSL/TLS Settings For ybtools.
- --char-field-options
See ybload Field Options.
- --comment-char ASCII_CHARACTER
Define the comment character that is used in source files. The default value is the pound sign (#). The value must be a single ASCII character or a valid escape sequence. If
--skip-comment-lines
is set, commented rows in the source file are skipped, not rejected, and do not appear in the bad rows file.When the
--format
option is set, the comment character may be a single-byte or multi-byte character.- --compression-policy
Define the compression policy for data buffers before they are sent from the client to the worker nodes. See ybload Advanced Processing Options.
- --convert-ascii-control, --no-convert-ascii-control
Allow the caret control ASCII character
^@
to be parsed as a single-byte representation ofnull
within a character string. The default is--no-convert-ascii-control
. See also ybload Field Options.- --convert-c-escape, --no-convert-c-escape
Convert (or do not convert) C-style escape sequences when they appear in
CHAR
andVARCHAR
fields. For example, the two-character sequence\t
can be converted into a single tab character (0x09
) or it can be loaded unchanged.This option does not apply to fields with data types other than
CHAR
andVARCHAR
. For example, in anINTEGER
field, the value4\x35
, where\x3
is the C-style escape sequence for the number 5, returns an error:'\' is invalid digit
. (However, the ybsql \copy command will convert C-style escape sequences in all fields.)This option is only supported when the
--format
option is specified. If--format text
is used,--convert-c-escape
is the default behavior. If--format csv
or--format bcp
is used,--no-convert-c-escape
is the default.- --date-field-options
See ybload Date Formats.
- --date-style YMD | DMY | MDY | MONDY | DMONY | Y2MD | DMY2 | MDY2 | MONDY2 | DMONY2
Define the date format in terms of the order and style of the date parts, to avoid any ambiguity in parsing dates. For example,
--date-style MDY
means accept date values such as08-13-2016
.If you specify one of the
Y2
values, such asDMY2
, you must also specify a--y2base
value.You can specify the
--date-style
option more than once in a singleybload
command. For example:--y2base 1990 --date-style MDY --date-style MONDY --date-style DMY2
Note that you can use JSON-style formatting to abbreviate this syntax.
If
--date-style
is not specified, the defaults derive from--date-field-options
.- --dbname, -d
Name of the destination database. See Setting up a Database Connection.
- --decimal-field-options
See ybload Field Options.
- --default-field-options {JSON formatted <FieldOptions>}
Specify the field options to use for fields that do not have their own per-field options or per-type options. See ybload Field Options. The default value is
{}
.- --delimiter SPECIAL_CHARACTER
Define the special character that the source file uses as its field delimiter. All of the following are supported:
- A single Unicode character
- A hex value that corresponds to any ASCII control code (such as
0x1f
) - A valid escape sequence
When the
--format
option is set, the delimiter may be a multi-byte character.If you do not specify a field delimiter,
ybload
auto-detects it from among the following characters:,
|
\t
\us
\uFFFA
See also Setting the --format Option.
- --disable-trust, -k
Disable SSL/TLS trust when using secured communications. Trust is enabled by default. See also Enabling SSL/TLS Encryption.
Important: This option is not supported for use on production systems and is only recommended for testing purposes. It may be useful to disable trust during testing, then enable it when a formal signed certificate is installed on the appliance.
- --double-field-options
See ybload Field Options.
- --dryrun
Do a dry run of the load operation without committing any data to the table. See Using the Dry Run Option.
- --emptymarker STRING_WITH_ESCAPES
Define a marker that matches the character used to represent an empty string in your source file (an empty
CHAR
orVARCHAR
field). The value may be a string, a character, or a valid escape sequence.You may need to quote the
emptymarker
character. For example, you can use--emptymarker '\t'
on the command line, but not--emptymarker \t
.Note: To use
"
as theemptymarker
character, you have to quote strings with a different character ("
is the default). For example:--emptymarker '\"' --quote-char '|'
If null bytes exist within strings, use the
--on-zero-char REMOVE
option.See also NULL and Empty Markers.
- --encoding ENCODING_NAME | -E ENCODING_NAME
Specify the source file's encoding (character set:
UTF8
,LATIN9
,ISO-8859-1
,UTF16
). If no encoding is specified, the encoding of the source file is assumed to match the database encoding. If the encoding of the source file does not match the encoding of the destination database,ybload
will use Java transcoding to "translate" the source data to the encoding of the destination database.Check the documentation for the version of Java you are using on the
ybload
client system. For example, if you are using the Oracle JVM, check this list of supported character sets. Names from either of the first two columns can be used with the--encoding
option. See also the discussion of encodings in Creating Databases.For the fastest load performance under
ybload
, export the data from the source system in the character set of the destination database. For example, load exportedUTF8
data into aUTF8
database. Any required transcoding is likely to have an impact on load performance.Note: To load UTF-16 data, create a
UTF8
database and set the--encoding
option toUTF16
.- --escape-char SPECIAL_CHARACTER
Specify an escape character: any single Unicode character. The behavior of this option depends on the
--format
choice:--format CSV
: used to escape embedded quote characters inside quoted fields. The default escape character is"
.--format TEXT
: used to escape embedded field delimiters and line separators. The default escape character is\
.--format BCP
: disallowed withBCP
format, which does not use escape characters.
When the
--format
option is set, the escape character may be a multi-byte character.- --field-names COLNAME,COLNAME,...
Define a comma-separated list of source field names. *You do not need to specify a list if the source fields match the order and number of the destination table columns.*The
--parse-header-line
option overrides the--field-names
option.If the target table has default values assigned for certain columns, you can specify a partial column list with this option and the default values will be loaded for the other columns. For example, if a table
t1
has three columns,c1
,c2
, andc3
, andc1
was created with aDEFAULT
constraint, the--field-names
list could bec2, c3
.- --format CSV | TEXT | BCP
Specify the formatting style of the incoming data (how the source files were formatted by the export or unload tool that produced them). See also Setting the --format Option. In particular, this option refers to how field delimiters are protected in the data:
CSV
: Delimiters in field values were protected by wrapping the field values in quotes. For example:"2012, Mini Cooper S, ALL4"
TEXT
: Delimiters in field values were protected by preceding them with a backslash escape character. For example:2012\, Mini Cooper S\, ALL4
BCP
: Delimiters in fields were not protected (for compatibility with the Microsoft SQL Serverbcp
tool). For example:2012, Mini Cooper S, ALL4
- --help
Return basic usage information for the
ybload
command and its options.- --help-advanced
Return more advanced usage information for the
ybload
command and its options.- --host
Host name. See Setting up a Database Connection.
- --ignore-emptymarker-case, --no-ignore-emptymarker-case
The first option supports case-insensitive empty-marker comparisons. If you use
--ignore-emptymarker-case
, values ofEMPTY
,Empty
, andempty
are all recognized as empty values when--emptymarker
is set toNONE
. These options apply globally; you cannot specify them per data type or per field. They are applied globally regardless of how the--emptymarker
option was specified.- --ignore-nullmarker-case, --no-ignore-nullmarker-case
The first option supports case-insensitive null-marker comparisons. If you use
--ignore-nullmarker-case
, values ofNULL
,Null
, andnull
are all recognized as null values when--nullmarker
is set toNULL
. These options apply globally; you cannot specify them per data type or per field. They are applied globally regardless of how the--nullmarker
option was specified.- --initial-connection-timeout NUMBER
Number of seconds to wait for initial connections to the database. The default is
120
. This timeout option ensures thatybload
does not wait too long when there is a basic problem with incorrect connection parameters, or a firewall is preventing connection errors from reaching the client. To turn off this option and allow an unlimited wait time, set this option to0
.- --integer-field-options
See ybload Field Options.
- --ip-field-options
See ybload Field Options.
- --java-version
Return the Java version that is running on the client system. The client tools require the 64-bit version of Java 8 (also known as Java 1.8). Java 9 and 10 are not supported.
- --linesep LINE_SEPARATOR
Define the line separator (or row separator) that is used in source files: any single Unicode character or the
\r\n
escape sequence.If you do not specify a line separator,
ybload
auto-detects it from among the following characters:\n
\r\n
\rs
\uFFFB
When the
--format
option is set, the line separator may be a multi-byte character.- --locale LOCALE_NAME | -L LOCALE_NAME
The name of the locale to use for parsing dates, timestamps, and so on. If the locale is not specified, the database locale is assumed to be
C
.Locale names must be of the following form:
<language code>[_<country code>[_<variant code>]]
For example:
--locale en --locale en_US --locale zh_CN
Variant codes are rarely used; for details, see the Java documentation.
- --logfile STRING
Specify the name and location of a log file for the load operation. If the specified file already exists, it will be truncated. If this option is not specified, no log file is written. When you specify this
--logfile
option, also specify a--logfile-log-level
value other thanOFF
.- --log-level OFF | ERROR | WARN | INFO | DEBUG | TRACE
Specify the logging level for the default console output. The default level is
INFO
. (Use the--logfile-log-level
option to specify the logging level for a named log file.)- --logfile-log-level OFF | ERROR | WARN | INFO | DEBUG | TRACE
Specify the logging level for a given log file (as defined with the
--logfile
option). If the level is not specified, it defaults to the--log-level
value. You must specify a--logfile-log-level
value other thanOFF
when you specify the--logfile
option.- --mac-field-options
Specify field options for
MACADDR
andMACADDR8
fields. See ybload Field Options.- --max-bad-rows NUMBER
Set the maximum number of rejected rows that
ybload
will tolerate before aborting and starting to roll back the transaction. (Additional bad rows may be reported before the transaction has finished aborting.) The default is-1
, which means do not abort and roll back, regardless of the number of bad rows.Note:
--max-bad-rows 32
means 32 bad rows are allowed; the load will fail on the 33rd bad row.Rejected rows are written to the location specified with the
--bad-row-file
option.- --nullmarker STRING
Define a string that matches the string used to represent
null
in your source file. If this option is unspecified or set to an empty string, adjacent delimiters without text between them are parsed asNULL
values. This option supports valid escape sequences.You cannot load data in which more than one value in the same column is intended to be parsed as
NULL
. However, you can load data in which different columns have different values forNULL
. See Specifying NULL Behavior for Different Columns and NULL and Empty Markers.- --num-cores NUMBER_MIN_1
Set the number of CPU cores
ybload
will attempt to use to1
or greater. By default,ybload
tries to saturate all of the CPU cores on the host computer. For example, the default values for--num-readers
and the number of actual concurrent readers used by--read-sources-concurrently ALWAYS
are both based on the number of cores on the host computer. The primary purpose of this setting is to restrictybload
to the use of fewer resources when it shares the host computer with other programs.- --num-header-lines [ NUMBER ]
Ignore one or more header lines at the top of the source file. (The first or only header line in a file is typically a list of field names.) The default is
0
, or1
if--parse-header-line
is specified. The maximum number is5
. If you specify multiple source files, the first line is skipped in all of them.- --num-readers
Define the behavior for reading input files in parallel. See ybload Advanced Processing Options.
Note: When you are loading from named pipes, setting this option equal to the number of pipes is recommended. Not loading from pipes concurrently may cause a deadlock with the program that the pipes communicate with.
- --num-parsers-per-reader
Define multiple parsers per reader. See ybload Advanced Processing Options.
- --object-storage-*
See ybload Object Storage Options. These options apply to loads from Azure, AWS S3, and S3-compatible systems.
- --on-extra-field [ REMOVE | ERROR ]
Specify
REMOVE
to allow rows to be loaded when the source file contains more fields than the table has columns. The removed fields must be at the end of the line in the source file. SpecifyERROR
to reject rows with extra fields.ERROR
is the default.- --on-invalid-char [ REPLACE | ERROR ]
Specify the action to take when the source file contains characters that cannot be represented in the database (or characters that are invalid in the source file itself):
- The replacement character for a LATIN9 database is the question mark:
0x3F
(?
). - The replacement character for a UTF8 database is the Unicode replacement character:
U+FFFD
(a question mark in a diamond).
The default is
REPLACE
.- The replacement character for a LATIN9 database is the question mark:
- --on-missing-field [ SUPPLYNULL | ERROR ]
Allow rows to be loaded when the source file contains fewer fields than the table has columns. Specify
SUPPLYNULL
to load the missing columns at the end of the row withnull
values. SpecifyERROR
to reject rows with missing fields.ERROR
is the default.- --on-string-too-long TRUNCATE | ERROR
Truncate character strings that are longer than the specified column width (or return an error). The default is
ERROR
.- --on-unescaped-embedded-quote [ PRESERVE | ERROR ]
Preserve or return an error when unescaped quotes are found inside quoted strings. The default is
ERROR
.- --on-zero-char [ REMOVE | ERROR ]
Remove null bytes (
0x00
characters) that appear within strings inCHAR
andVARCHAR
fields (or return an error). The default isERROR
.- --parse-header-line
Use the header line at the top of the source file (a list of field names) to determine the column names in the target table. This option overrides the
--field-names
option.- --password
Prompt for the
ybload
user's password. See Setting up a Database Connection. The user who runs the load must have INSERT permissions on the table (but does not have to own the table).- --per-field-options {JSON Object}
Specify parsing options for individual fields. See ybload Field Options.
- --port
Port number. See Setting up a Database Connection.
- --quiet
Do not write any output to console. This option is suitable for
cron
invocations of the loader. If--quiet
is specified, you must also specify--logfile
.- --quote-char SPECIAL_CHARACTER
Define the character that is used in source files to quote field values that contain embedded delimiters. Specify any single Unicode character. The default is
"
. This option only applies when you are using--format CSV
.When
--format CSV
is set, the delimiter may be a multi-byte character.- --read-sources-concurrently ALWAYS | NEVER | ALLOW | <NUMBER>
Define the behavior for reading source files in parallel:
ALWAYS
,NEVER
,ALLOW
(the default), or a specific number of source files. See ybload Advanced Processing Options.Note: If you are loading from multiple pipes and this option is not set to
ALWAYS
,ybload
sets it toALWAYS
and returns anINFO
message stating that change. (Not reading from pipes concurrently could cause a deadlock with the program that they communicate with.)- --real-field-options
See ybload Field Options.
- --resume-partial-load-from-offset NUMBER
Skip the specified number of bytes in the source file in order to resume a failed bulk load. The default value is
0
. See Resuming a Partial Load.- --rows-per-transaction
Set the number of rows to load per commit. The default is set to the maximum number of rows that can be loaded (263 – 1). You can reduce this number to increase the frequency of commits when bulk loads are running. This option works in conjunction with
--bytes-per-transaction
. The threshold that is met first is applied.- --secured
Use SSL/TLS to secure all communications. The default is not secured. See also Enabling SSL/TLS Encryption.
- --skip-blank-lines, --no-skip-blank-lines
Skip blank lines in the source file (
true
) or detect blank rows as bad rows (false
). The default value istrue
.- --skip-comment-lines, --no-skip-comment-lines
Skip lines that are commented out in the source file or detect them as bad rows. The default value is
false
(do not skip).Note: If you use
--skip-comment-lines
, make sure the data does not contain any lines that begin with the comment character (which defaults to#
but may bet set to a different character with the--comment
option).- --smallint-field-options
See ybload Field Options.
- --source-compression GZ | BZIP2 | XZ | PACK200 | LZ4
This option explicitly defines the type of compression used by source data and is primarily intended for data sources that do not have file names (such as
STDIN
). This option applies to loads from all supported source types and overrides other compression detection methods: detection based on file name extensions and detection based on discovery of a compression signature in the header of the source data stream.Note that detection based on a compression signature in the source data only works for loads from regular files; it does not apply to loads from
STDIN
, named pipes, or object storage.- --table (or -t)
Name the target table to load and optionally its schema:
schema_name.table_name
. If you do not specify the schema name, the table is assumed to be in thepublic
schema. The schema of the target table is not based on the user'ssearch_path
, regardless of how it is set. The table must exist in the database that you are connecting to for the load. (Do not try to specify the database name as part of the table name. Use theYBDATABASE
environment variable,-d
, or--dbname
.)Note: If you used a quoted case-sensitive identifier to create the target table, you must quote the table name and escape the quotes in the
ybload
command line. For example, if your table is namedPremLeagueStats
:-t \"PremLeagueStats\"
- --time-field-options
See ybload Field Options.
- --timestamp-field-options
See ybload Field Options.
- --timestamptz-field-options
See ybload Field Options.
- --trim-white, --no-trim-white
Trim or retain leading and trailing whitespace characters in each field. With
--format csv
, whitespace characters inside of quotes are always preserved. The default value isfalse
(preserved).- --username
Database username. See Setting up a Database Connection.
- --uuid-field-options
See ybload Field Options.
- --version
Display the version of
ybload
you are running (as part ofybtools
). This option is not intended to be combined with other options. For example:$ ybload --version ybload version 1.2.2-5563
- --y2base YEAR
Define the pivot year (such as
1970
) for two-digit year values (such as97
and16
). For example, if--y2base
is set to1970
, two-digit years of70
and later are assumed to be in the 1900s. Values of69
and earlier are assumed to be in the 2000s.If you want to specify one of the
Y2
values for--date-style
, such asDMY2
, you must also specify a--y2base
value.
Parent topic:ybload Command