ybload Command
Each ybload command defines:
- The target table, which must exist in the database
- An absolute or relative path to an input file or multiple input files (local or S3)
- Database connection options
- Options for processing the load, returning output, and parsing the input data
Basic Syntax
ybload [options] -t [SCHEMA.]TABLENAME SOURCE [SOURCE]...
The -t option supports either TABLENAME or SCHEMA.TABLENAME. If you do not specify the schema name, the table is assumed to be in the public schema. The schema of the target table is not based on the user's search_path, regardless of how it is set.
DATABASE.SCHEMA.TABLENAME is not supported. If you want to specify the database name for a table, use the -d option.
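The naming rule above can be sketched as a small shell function. This is only an illustration of the documented behavior, not how ybload resolves names internally: qualify_table is a hypothetical helper, stats is a made-up schema name, and the sketch ignores quoted identifiers that themselves contain dots.

```shell
# Hypothetical helper that mirrors the documented -t naming rule.
# Not part of ybload; quoted identifiers containing dots are not handled.
qualify_table() {
  case "$1" in
    *.*.*)  # two or more dots: DATABASE.SCHEMA.TABLENAME is rejected
      echo "error: DATABASE.SCHEMA.TABLENAME is not supported; use -d" >&2
      return 1 ;;
    *.*)    # one dot: the schema was given explicitly
      echo "$1" ;;
    *)      # no dot: the public schema is assumed, search_path is ignored
      echo "public.$1" ;;
  esac
}

qualify_table match                        # prints public.match
qualify_table stats.match                  # prints stats.match
qualify_table premdb.public.match || true  # prints an error to stderr
```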
Note: If you used a quoted case-sensitive identifier to create the target table, you must quote the table name and escape the quotes in the ybload command line. For example, if your table is named PremLeagueStats:
-t \"PremLeagueStats\"
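To see why the escaping matters, the following sketch prints the arguments a program would receive from a POSIX shell. show_args is a hypothetical stand-in for ybload. The single-quoted form is a standard shell alternative that produces the same argument; it is not taken from the ybload documentation, and Windows shells quote differently.

```shell
# Hypothetical stand-in for ybload: prints each received argument on its own line.
show_args() { printf '%s\n' "$@"; }

# Escaped double quotes reach the program as part of the argument.
show_args -t \"PremLeagueStats\"    # second line printed: "PremLeagueStats"

# Wrapping the quoted name in single quotes yields the same argument.
show_args -t '"PremLeagueStats"'    # second line printed: "PremLeagueStats"

# Unescaped double quotes are consumed by the shell, so the case-sensitive
# quoting never reaches the program.
show_args -t "PremLeagueStats"      # second line printed: PremLeagueStats
```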
The ybload client tool assumes that the character encoding of the source file (LATIN9 or UTF-8, for example) matches the encoding of the target database. If this is not the case, see the --encoding option.
Calling the ybload Utility
The ybload command connects to the database via JDBC as a Java client application. After you have downloaded and installed the bulk loader client (as part of the ybtools package), you can run the Linux and Windows executable programs:
- ybload on Linux
- ybload.exe on Windows
Getting Online Help
Use the following ybload options to return online help text:
- -? or --help for a usage summary and coverage of the basic options.
- --help-advanced for a usage summary and a list of advanced options only; to see all options, specify both --help and --help-advanced:
$ ./ybload --help --help-advanced
The following command runs the bulk loader and returns basic help text for load options.
me@mydesk.io:~/ybtools/bin$ ./ybload --help
ybload version 2.0.0-10349
ybload loads source files into an existing table in a Yellowbrick Database.
...
Setting Up a Database Connection
To run the bulk loader, you have to start a database session on the server where the destination database and table reside. This session requires connection information that you can provide either as ybload options on the command line or as current values for environment variables:
ybload Options | Environment Variable | Description | Example |
---|---|---|---|
-d or --dbname | YBDATABASE | Destination database name. Default: yellowbrick. See also SQL Identifiers. | --dbname premdb or export YBDATABASE=premdb |
-h or --host | YBHOST | Destination server host name. Default: localhost. | -h test.ybsystem.io or export YBHOST=test.ybsystem.io |
-p or --port | YBPORT | Destination server port number. Default: 5432. | --port 5433 or export YBPORT=5433 |
-U or --username | YBUSER | Database login username. No default. Note: Do not run bulk loads as a superuser. The user must have INSERT privileges on the target table. | -U bobr or export YBUSER=bobr |
-W or --password | YBPASSWORD | Interactive prompt for the database user's password. No default. | --password or export YBPASSWORD=******** |
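As a minimal sketch, the two ways of supplying connection details might look like the following. The connection values come from the examples in the table; the table name public.match and the source path /data/match.csv are hypothetical, and dry_run is a stand-in that prints the command line instead of running ybload.

```shell
# Connection values taken from the examples in the table above.
export YBHOST=test.ybsystem.io
export YBPORT=5433
export YBDATABASE=premdb
export YBUSER=bobr

# Hypothetical stand-in: prints the command instead of executing it.
dry_run() { printf '%s\n' "$*"; }

# Form 1: connection details come from the YB* environment variables;
# -W prompts interactively for the password.
dry_run ybload -W -t public.match /data/match.csv

# Form 2: the same details spelled out as command-line options.
dry_run ybload -h "$YBHOST" -p "$YBPORT" -d "$YBDATABASE" -U "$YBUSER" -W \
  -t public.match /data/match.csv
```

Remove the dry_run prefix to run the load for real. Prompting with -W keeps the password out of the shell history.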
Bulk Load Manager and Data Transfer Ports
In addition to the YBPORT setting for the database connection, the bulk loader uses the following default ports:
- Bulk Load Manager port: 11111 for managing load transactions and progress.
- Data transfer ports: 31000-41000. On an appliance with 15 active compute nodes, 30 ports in this range are used to load the data across the cluster (2 per active node).
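The port arithmetic above can be checked with a short calculation, which may help when sizing firewall rules. The sketch assumes that the 2-ports-per-active-node ratio from the 15-node example also holds for other cluster sizes; ports_in_use is a hypothetical helper.

```shell
# Data transfer ports in use: 2 per active compute node
# (assumed to hold for any node count, based on the 15-node example).
ports_in_use() { echo $(( $1 * 2 )); }

ports_in_use 15                   # prints 30

# Number of ports in the default range 31000-41000, inclusive.
echo $(( 41000 - 31000 + 1 ))     # prints 10001
```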
Setting Other Options
The bulk loader comes with a large number of options that define ways to process the load, log results, and parse incoming data correctly. See ybload Options for details. Common examples include setting null markers and defining expected formats for dates and times. You can define some of the more advanced parsing options for specific data types and fields very flexibly by using JSON objects.