Unloading Data
This section explains how to unload tables and query results using the ybunload client tool. The ybunload client is a high-performance parallel export tool that unloads data to files on the client system. The client connects to the manager node, then distributes the work of reading and writing the data to the compute blades. Data can be returned to the client in compressed (GZIP) or uncompressed files. The unloaded data can be streamed out to multiple output files or combined into a single file, depending on your requirements.
Download and install the unload client (as part of the ybtools package) from Yellowbrick Manager or the Customer Support download site. Then run the executable program for your platform:
- ybunload on Linux
- ybunload.exe on Windows
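To confirm that the installation put the client on your search path, a small script can pick the program name for the current platform and look it up. This is a minimal sketch: the helper names `ybunload_executable` and `find_ybunload` are hypothetical and are not part of ybtools, and the sketch assumes the install directory was added to PATH.

```python
import platform
import shutil


def ybunload_executable(system=None):
    """Return the ybunload program name for an OS name such as 'Linux' or 'Windows'."""
    system = system or platform.system()
    # Only Linux and Windows are named in this section; any non-Windows
    # value falls back to the plain 'ybunload' name (an assumption).
    return "ybunload.exe" if system == "Windows" else "ybunload"


def find_ybunload(system=None):
    """Return the full path to the ybunload program if it is on PATH, else None."""
    return shutil.which(ybunload_executable(system))


if __name__ == "__main__":
    path = find_ybunload()
    print(path or "ybunload not found on PATH; check the ybtools installation")
```

If the lookup returns nothing, the ybtools install directory is probably missing from PATH, or the package was not installed for this user.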
Note: Yellowbrick recommends that you upgrade to the latest version of the ybtools clients when you upgrade the cluster so that your client and server versions correspond. The client tools are backward-compatible but not always forward-compatible.
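The compatibility note can be turned into a simple pre-flight check. The sketch below reads the note as "a client at least as new as the cluster is safe, an older client may not be". It assumes that version strings are purely numeric and dot-separated; the version values shown are made up for illustration, and how you obtain the two strings is left to you.

```python
def parse_version(text):
    """Turn a dotted numeric string such as '5.2.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def client_is_current(client_version, cluster_version):
    """Return True when the client is at least as new as the cluster."""
    client = parse_version(client_version)
    cluster = parse_version(cluster_version)
    # Pad the shorter tuple with zeros so that '5.2' compares equal to '5.2.0'.
    width = max(len(client), len(cluster))
    client += (0,) * (width - len(client))
    cluster += (0,) * (width - len(cluster))
    return client >= cluster


if __name__ == "__main__":
    # Hypothetical version numbers, for illustration only.
    if not client_is_current("5.2.9", "5.3.0"):
        print("ybtools client is older than the cluster; consider upgrading ybtools")
```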
The following table shows which formats and storage targets are supported for ybunload operations:
Supported Formats | Supported Storage Targets | Notes |
---|---|---|
csv | Local file system | csv is the default unload format. |
csv | S3 or S3-compatible object storage | |
text | Local file system | |
text | S3 or S3-compatible object storage | |
parquet | Local file system | |
parquet | S3 or S3-compatible object storage | |
parquet | Azure Blob storage, Azure Data Lake Storage Gen2 | Data unloaded to Azure object storage must be unloaded in parquet format. |
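If you generate unload jobs from scripts, it can help to encode the support matrix above and check a format and target pair before starting an unload. The following is a minimal sketch: the short target labels (`local`, `s3`, `azure`) and the function names are invented for this example and are not ybunload options.

```python
# Support matrix from the table above: format -> storage targets it can be unloaded to.
# 'azure' stands for both Azure Blob storage and Azure Data Lake Storage Gen2.
SUPPORTED_TARGETS = {
    "csv": {"local", "s3"},
    "text": {"local", "s3"},
    "parquet": {"local", "s3", "azure"},
}

# csv is the default unload format.
DEFAULT_FORMAT = "csv"


def is_supported(target, fmt=DEFAULT_FORMAT):
    """Return True if the given format can be unloaded to the given target."""
    return target in SUPPORTED_TARGETS.get(fmt, set())


def formats_for(target):
    """Return the sorted list of formats that can be unloaded to the target."""
    return sorted(fmt for fmt, targets in SUPPORTED_TARGETS.items() if target in targets)


if __name__ == "__main__":
    print(is_supported("azure"))             # default csv format to Azure
    print(is_supported("azure", "parquet"))  # parquet format to Azure
    print(formats_for("s3"))
```

Keeping the matrix in one dictionary means that a change in supported combinations only needs to be made in one place.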