ybload Command
Each ybload command defines:
- The target table, which must exist in the database
- An absolute or relative path to an input file or multiple input files (local or S3)
- Database connection options
- Options for processing the load, returning output, and parsing the input data
Note: Do not run bulk loads as a superuser. The user must have INSERT privileges on the target table but does not have to own the table.
Basic Syntax
ybload [options] -t [SCHEMA.]TABLENAME SOURCE [SOURCE]...
The -t option supports either TABLENAME or SCHEMA.TABLENAME. If you do not specify the schema name, the table is assumed to be in the public schema. The schema of the target table is not based on the user's search_path, regardless of how it is set.
DATABASE.SCHEMA.TABLENAME is not supported. If you want to specify the database name for a table, use the -d option.
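As an illustration of the naming rules above, the following sketch builds two command lines and prints them without running the loader. The database name sales_db, the table orders, and the file path /data/orders.csv are hypothetical placeholders, and connection options are omitted for brevity.

```shell
# Hypothetical placeholders: sales_db, orders, /data/orders.csv.
# The commands are only printed, not executed; connection options are omitted.
db="sales_db"
src="/data/orders.csv"

# Unqualified table name: the table is assumed to be in the public schema.
cmd_default="ybload -d $db -t orders $src"

# Schema-qualified table name; the database is still given with -d,
# because DATABASE.SCHEMA.TABLENAME is not supported by -t.
cmd_schema="ybload -d $db -t public.orders $src"

printf '%s\n' "$cmd_default" "$cmd_schema"
```

According to the rules above, both command lines address the same table, because an unqualified name resolves to the public schema.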
Note: If you used a quoted case-sensitive identifier to create the target table, you must quote the table name and escape the quotes in the ybload command line. For example, if your table is named PremLeagueStats:
-t \"PremLeagueStats\"
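To see why the escaping matters, the following sketch uses printf in place of ybload to show the argument a POSIX shell actually passes to a program in each case. It assumes a Linux shell such as bash; quoting rules on the Windows command prompt differ and are not covered here.

```shell
# printf stands in for ybload here; it simply echoes the argument it receives.

# Backslash-escaped double quotes reach the program as literal characters.
escaped=$(printf '%s' \"PremLeagueStats\")

# Wrapping the quoted name in single quotes is an equivalent shell form.
single=$(printf '%s' '"PremLeagueStats"')

# Unescaped double quotes are consumed by the shell before the program runs.
stripped=$(printf '%s' "PremLeagueStats")

printf '%s\n' "$escaped" "$single" "$stripped"
```

The first two lines printed keep the double quotes and the third does not, which is why an unescaped name would lose its quoting before the loader receives it.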
The ybload client tool assumes that the character encoding of the source file (LATIN9 or UTF-8, for example) matches the encoding of the target database. If this is not the case, see the --encoding option.
Calling the ybload Utility
The ybload command connects to the database via JDBC as a Java client application. After you have downloaded and installed the bulk loader client (as part of the ybtools package), you can run the Linux and Windows executable programs:
- ybload on Linux
- ybload.exe on Windows
Getting Online Help
Use the following ybload options to return online help text:
- -? or --help for a usage summary and coverage of the basic options.
- --help-advanced for a usage summary and a list of advanced options only; to see all options, specify both --help and --help-advanced:
$ ./ybload --help --help-advanced
The following command runs the bulk loader and returns basic help text for load options.
$ ./ybload --help
ybload version 5.4.0-20220314113018
ybload loads files into an existing table in a Yellowbrick Database.
All files must have the same field layout.
...
Bulk Load Manager and Data Transfer Ports
In addition to the YBPORT setting for the database connection, the bulk loader uses the following default ports:
- Bulk Load Manager port: 11111 for managing load transactions and progress.
- Data transfer ports: 31000-41000. On a cluster with 15 active compute nodes, 30 ports in this range are used to load the data across the cluster (2 per active node).
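The port count in the example above follows from the stated ratio of 2 data transfer ports per active compute node. A minimal sketch of that arithmetic, which can help when sizing firewall rules, is shown here; the variable names are illustrative only and are not ybload settings.

```shell
# Illustrative variable names; these are not ybload options.
active_nodes=15      # active compute nodes in the cluster
ports_per_node=2     # ratio stated in the documentation

# Number of data transfer ports used within the 31000-41000 range.
ports_needed=$((active_nodes * ports_per_node))

echo "data transfer ports needed: $ports_needed"
```

For a cluster of a different size, change active_nodes to the number of active compute nodes.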
Setting Other Options
The bulk loader comes with a large number of options that define ways to process the load, log results, and parse incoming data correctly. See ybload Options and Common Options in ybtools for details. Frequently used settings include null markers and expected formats for dates and times. You can define some of the more advanced parsing options for specific data types and fields very flexibly by using JSON objects.
Note: By default, ybload appends rows to tables (also referred to as inserting rows). However, you can use the --write-op option to set up a load that deletes, updates, or "upserts" rows.