Bulk Loading Tables
The Yellowbrick bulk loader (ybload) is a Java-based bulk data loader that you
invoke from a client system. You can load very large data files from remote systems by running
this utility. Yellowbrick recommends using the bulk loader to load all of your database
tables.
The loader distributes the data in parallel directly to the worker nodes, based on the
distribution key in the CREATE TABLE statement. The utility loads a single
destination table that you specify, by default appending the loaded rows to any
existing rows in the table (including any duplicate rows). You can also use
ybload to update, delete, or "upsert" rows. An upsert updates existing rows
or inserts new rows in a table as part of a single ybload operation. For
updates and upserts, you can manage how duplicate rows are processed.
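The difference between the default append behavior and an upsert can be sketched in a few lines of Python. This is only a conceptual illustration, not how ybload is implemented: the function names, the in-memory list-of-dicts "table", and the "last incoming row wins" rule for duplicate keys are all assumptions made for the example. The sketch also assumes that key values are unique in the existing table.

```python
def append_rows(table, incoming):
    """Default-style load: every incoming row is added, duplicates included."""
    return [dict(row) for row in table] + [dict(row) for row in incoming]


def upsert_rows(table, incoming, key):
    """Update rows whose key already exists; insert rows whose key is new.

    Hypothetical policy: if the incoming data repeats a key, the last
    occurrence wins. Real duplicate handling is configurable in the loader
    and is not modeled here.
    """
    result = [dict(row) for row in table]
    # Map each key value to its position in the result list.
    position = {row[key]: i for i, row in enumerate(result)}
    for row in incoming:
        k = row[key]
        if k in position:
            result[position[k]] = dict(row)   # update the existing row
        else:
            position[k] = len(result)
            result.append(dict(row))          # insert a new row
    return result


existing = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
incoming = [{"id": 2, "name": "B"}, {"id": 3, "name": "c"}]

appended = append_rows(existing, incoming)
upserted = upsert_rows(existing, incoming, key="id")
```

In this sketch, appending yields four rows (the row with id 2 appears twice), while the upsert yields three rows, with the row for id 2 replaced by the incoming version.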
Yellowbrick recommends that you upgrade to the latest version of the ybtools clients when you upgrade the cluster so that your client and server versions correspond. The client tools are backward-compatible but not always forward-compatible.