Bulk Loading Tables

The Yellowbrick bulk loader (ybload) is a Java-based utility that you invoke from a client system. You can use it to load very large data files from remote systems. Yellowbrick recommends using the bulk loader to load all of your database tables.

The loader distributes the data in parallel directly to compute nodes, based on the distribution key in the CREATE TABLE statement. The utility loads a single destination table that you specify, by default appending the loaded rows to any existing rows in the table (including any duplicate rows). You can also use ybload to update, delete, or "upsert" rows. An upsert updates existing rows or inserts new rows in a table as part of a single ybload operation. For updates and upserts, you can manage how duplicate rows are processed.
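The difference between the default append behavior and an upsert can be sketched in a few lines of Python. This is only a conceptual model of the row-level semantics described above, not ybload itself: the function names, the `id` key column, and the sample rows are all hypothetical, and the sketch assumes that each key appears at most once in the incoming data.

```python
# Conceptual model only: ybload performs these operations inside the database.
# All names and sample rows here are hypothetical.

def append_rows(table, incoming):
    """Default load: add every incoming row, keeping any duplicates."""
    return table + incoming

def upsert_rows(table, incoming, key):
    """Update rows whose key already exists; insert rows whose key is new."""
    result = [dict(row) for row in table]
    # Map each key value to the position of its row in the result.
    index = {row[key]: i for i, row in enumerate(result)}
    for row in incoming:
        if row[key] in index:
            result[index[row[key]]] = dict(row)   # update existing row
        else:
            index[row[key]] = len(result)
            result.append(dict(row))              # insert new row
    return result

existing = [{"id": 1, "city": "Leeds"}, {"id": 2, "city": "York"}]
incoming = [{"id": 2, "city": "Hull"}, {"id": 3, "city": "Bath"}]

appended = append_rows(existing, incoming)
upserted = upsert_rows(existing, incoming, key="id")

print(len(appended))   # 4 rows: id 2 now appears twice
print(upserted)        # 3 rows: id 2 updated, id 3 inserted
```

In the append case, the table ends up with two rows for `id` 2. In the upsert case, the existing row for `id` 2 is replaced and the row for `id` 3 is added.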

Yellowbrick recommends that you upgrade to the latest version of the ybtools clients when you upgrade the cluster so that your client and server versions correspond. The client tools are backward-compatible but not always forward-compatible.