ybload Command

Each ybload command defines:
  • The target table, which must exist in the database
  • An absolute or relative path to an input file or multiple input files (local or S3)
  • Database connection options
  • Options for processing the load, returning output, and parsing the input data

Basic Syntax

ybload [options] -t [SCHEMA.]TABLENAME SOURCE [SOURCE]...

The -t option accepts either TABLENAME or SCHEMA.TABLENAME. If you do not specify the schema name, the table is assumed to be in the public schema. ybload does not use the user's search_path to resolve the schema of the target table, regardless of how search_path is set.

DATABASE.SCHEMA.TABLENAME is not supported. If you want to specify the database name for a table, use the -d option.
Note: If you used a quoted case-sensitive identifier to create the target table, you must quote the table name and escape the quotes in the ybload command line. For example, if your table is named PremLeagueStats:
-t \"PremLeagueStats\"
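To see why the backslashes are needed, the following minimal sketch (assuming a POSIX-style shell such as bash) prints the value that the shell hands on after it processes the quotes. The variable names target and unquoted are used only for illustration.

```shell
# With escaped quotes, the shell drops the backslashes and keeps the
# double quotes, so the value is "PremLeagueStats" including the quotes.
target=\"PremLeagueStats\"
printf '%s\n' "$target"

# With plain quotes, the shell removes them, so the case-sensitive
# name would be passed on without any quotes.
unquoted="PremLeagueStats"
printf '%s\n' "$unquoted"
```

The first line printed includes the double quotes; the second does not.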

The ybload client tool assumes that the character encoding of the source file (LATIN9 or UTF-8, for example) matches the encoding of the target database. If this is not the case, see the --encoding option.
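If the target database uses UTF-8, one way to check a source file before loading is to test whether it decodes cleanly as UTF-8. The sketch below relies on the standard iconv utility, which is not part of ybtools; the function name is_valid_utf8 and the path /data/match.csv are hypothetical.

```shell
# Return success (0) if the file decodes as UTF-8, failure otherwise.
is_valid_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Hypothetical source file; the check is skipped if it does not exist.
src=/data/match.csv
if [ -f "$src" ]; then
  if is_valid_utf8 "$src"; then
    echo "$src decodes as UTF-8"
  else
    echo "$src is not valid UTF-8; see the --encoding option"
  fi
fi
```

A file that passes this check is not guaranteed to have been written as UTF-8 (plain ASCII passes too), so treat the result as a hint rather than proof.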

Calling the ybload Utility

The ybload command connects to the database via JDBC as a Java client application. After you have downloaded and installed the bulk loader client (as part of the ybtools package), you can run the Linux and Windows executable programs:
  • ybload on Linux
  • ybload.exe on Windows

Getting Online Help

Use the following ybload options to return online help text:
  • -? or --help for a usage summary and coverage of the basic options.
  • --help-advanced for a usage summary and a list of advanced options only; to see all options, specify both --help and --help-advanced:
    $ ./ybload --help --help-advanced
The following command runs the bulk loader and returns basic help text for load options.
me@mydesk.io:~/ybtools/bin$ ./ybload --help
ybload version 2.0.0-10349
ybload loads source files into an existing table in a Yellowbrick Database.

Setting Up a Database Connection

To run the bulk loader, you have to start a database session on the server where the destination database and table reside. This session requires connection information that you can provide either as ybload options on the command line or as current values for environment variables:
  • -d or --dbname (environment variable: YBDATABASE)
    Destination database name. Default: yellowbrick. See also SQL Identifiers.
    Examples:
    --dbname premdb
    export YBDATABASE=premdb
  • -h or --host (environment variable: YBHOST)
    Destination server host name. Default: localhost
    Examples:
    -h test.ybsystem.io
    export YBHOST=test.ybsystem.io
  • -p or --port (environment variable: YBPORT)
    Destination server port number. Default: 5432
    Examples:
    --port 5433
    export YBPORT=5433
  • -U or --username (environment variable: YBUSER)
    Database login username. No default.
    Note: Do not run bulk loads as a superuser. The user must have INSERT privileges on the target table.
    Examples:
    -U bobr
    export YBUSER=bobr
  • -W or --password (environment variable: YBPASSWORD)
    Interactive prompt for the database user's password. No default.
    Example:
    export YBPASSWORD=********
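The two ways of supplying connection information can be compared side by side. The following sketch builds one command that passes everything as options and one that relies on environment variables, then prints both for review instead of running them. The table name match and the source path /data/match.csv are hypothetical; the connection values are the examples used in this section.

```shell
# Form 1: connection details passed as ybload options.
# -W prompts for the password interactively.
opts_cmd='ybload -d premdb -h test.ybsystem.io -p 5433 -U bobr -W -t match /data/match.csv'

# Form 2: the same details set as environment variables,
# leaving only the password prompt, target table, and source on the command line.
export YBDATABASE=premdb
export YBHOST=test.ybsystem.io
export YBPORT=5433
export YBUSER=bobr
env_cmd='ybload -W -t match /data/match.csv'

# Print the commands rather than executing them.
printf '%s\n' "$opts_cmd" "$env_cmd"
```

Exporting the variables once per shell session keeps repeated ybload commands short, which can be convenient when you load several files into the same database.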

Bulk Load Manager and Data Transfer Ports

In addition to the YBPORT setting for the database connection, the bulk loader uses the following default ports:
  • Bulk Load Manager port: 11111 for managing load transactions and progress.
  • Data transfer ports: 31000-41000. On an appliance with 15 active compute nodes, 30 ports in this range are used to load the data across the cluster (2 per active node).
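The port count in the example above follows from simple arithmetic, which may help when you plan firewall rules for a system of a different size. This is a sketch only; the variable names are illustrative, and the figure of 2 ports per active node is taken from the description above.

```shell
# Number of data transfer ports used for a load:
# 2 ports for each active compute node.
active_nodes=15
ports_per_node=2
ports_needed=$((active_nodes * ports_per_node))

echo "Data transfer ports needed: $ports_needed"   # prints 30 for 15 nodes
```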

Setting Other Options

The bulk loader comes with a large number of options that define ways to process the load, log results, and parse incoming data correctly. See ybload Options for details. Common examples include setting null markers and defining expected formats for dates and times. You can define some of the more advanced parsing options for specific data types and fields very flexibly by using JSON objects.

Note: By default, ybload appends rows to tables (also referred to as inserting rows). However, you can use the --write-op option to set up a load that deletes, updates, or "upserts" rows.