ybunload Options
This section describes the options you can use in a ybunload command.
- @file
Specify a file that includes a set of options and values to use for the unload. See Saving Unload Options to a File. You can use this option and the --select-file option in the same ybunload command.
- --cacert STRING
Customize trust with secured communication; use this option in combination with the --secured option. Enter the file name of a custom PEM-encoded certificate or the file name and password for a Java KeyStore (JKS).
For PEM format, the file must be named with a .pem, .cert, .cer, .crt, or .key extension. For example:
--cacert cacert.pem
For JKS format, files are always password-protected. Use the following format:
--cacert yellowbrick.jks:changeit
where the : character separates the file name from the password of the keystore.
- --cancel-hung-uploads
Attempt to cancel any multi-part uploads to S3 that were left in a hung state because of a previous failure. You must use this option independently; you cannot unload data and cancel uploads in the same ybunload command. In this case, the -o option specifies the destination for the uploads that were failing.
For example:
$ ybunload -d premdb --username bobr -W -o s3://yb-tmp/premdb/premdb_unloads --cancel-hung-uploads
Warning: Any other uploads to the specified bucket may be cancelled if the files have the same prefix.
Note: Files that are uploaded to S3 are first written to temporary storage. After an interrupted unload, these files remain in this temporary location, but the --cancel-hung-uploads operation clears them. This operation does not clean up any files in the final unload destination (the actual path to the S3 bucket and its folders).
See also Unloading Data to an S3 Bucket and --max-file-size.
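The --max-file-size entry in this section relates the S3 part size to the largest file that a single multi-part upload can hold. As a rough illustration of that arithmetic, the following Python sketch multiplies a part size by the 10,000-part limit. The function name and the use of decimal units (1GB = 1,000MB) are assumptions made for this example; they are not part of ybunload.

```python
# Limit quoted in the --max-file-size note: a multi-part upload has at most 10,000 parts.
MAX_PARTS_PER_UPLOAD = 10_000


def max_upload_size_gb(part_size_mb):
    """Largest single-file upload, in GB, for a part size in MB (decimal units assumed)."""
    return part_size_mb * MAX_PARTS_PER_UPLOAD / 1000


# The default 6MB part size gives the 60GB per-file ceiling mentioned in the note.
print(max_upload_size_gb(6))  # 60.0
```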
- --compress, -c NONE | GZIP_FAST | GZIP | GZIP_MORE | GZIP_BEST | GZIP_STREAM_FAST | GZIP_STREAM | GZIP_STREAM_MORE | GZIP_STREAM_BEST
The default is NONE. The other options write data as GZIP compressed files in two different compression modes: "block mode" and "stream mode." The GZIP* options run in block mode, and the GZIP_STREAM* options run in stream mode. See ybunload Output Files.
GZIP and GZIP_FAST are synonyms. GZIP_MORE provides better compression but slower unload performance, and GZIP_BEST provides the best compression but much slower performance.
GZIP_STREAM and GZIP_STREAM_FAST are synonyms. GZIP_STREAM_MORE provides better compression but slower unload performance, and GZIP_STREAM_BEST provides the best compression but much slower performance.
The GZIP_STREAM_* options are intended to be used only if your downstream workflow tools cannot handle gzip files containing multiple compression blocks. Additionally, the GZIP_STREAM_* options consume significantly more network connections than the GZIP* options, meaning many network routers won't be able to handle the increased number of connections reliably.
- --dbname, -d
Name of the source database. The default is yellowbrick. See also Setting up a Database Connection.
- --delimiter STRING | UNICODE_CHARACTER
Define the field delimiter that will separate columns of data in output files. The default is a tab character ('\t') in text format, and a comma in csv format.
Valid delimiters include special characters such as '|', Unicode characters, and other supported escape sequences.
Note: The delimiter cannot be the null byte (0x00).
- --disable-trust, -k
Disable SSL/TLS trust when using secured communications. Trust is enabled by default. See also Enabling and Verifying SSL/TLS Encryption.
Important: This option is not supported for use on production systems and is only recommended for testing purposes. It may be useful to disable trust during testing, then enable it when a formal signed certificate is installed on the appliance.
- --force-quote 'column,column,…' | '*'
Specify a list of columns to quote in the output files, or specify '*' for all columns. This option is allowed only when you are using csv format. NULL values are not quoted, regardless of this option.
- --format csv | text
Select the output format to use: text (tab-delimited by default) or csv (comma-delimited by default). The default format is csv and creates files with a .csv extension. Text format creates files with a .txt extension.
- -?, --help
Return basic usage information for the ybunload command and its options.
- --help-advanced
Return more advanced usage information for the ybunload command and its options.
- -h, --host
Host name. See Setting up a Database Connection.
- --initial-connection-timeout NUMBER
Specify the number of seconds to wait for initial connections to the database. The minimum value is 0, which means wait forever. The default is 120.
- --linesep LF | CR | CRLF | RS
Specify a row separator. The default is LF.
RS: The ASCII Record Separator (ASCII code 30, Unicode INFORMATION SEPARATOR TWO, hex 0x001e).
- --java-version
Return the Java version that is running on the client system. The client tools require the 64-bit version of Java 8 (also known as Java 1.8). Java 9 and 10 are not supported.
- --log-level OFF | ERROR | WARN | INFO | DEBUG | TRACE
Specify the logging level for the default console output. The default level is INFO. (Use the --logfile-log-level option to specify the logging level for a named log file.)
- --logfile STRING
Specify the name and location of a log file for the unload operation. If the specified file already exists, it will be truncated. If this option is not specified, no log file is written. When you specify the --logfile option, also specify a --logfile-log-level value other than OFF.
Note: When object storage is used for loading or unloading data, logs must be written to the local file system. Specifying a log file in an object storage location, such as an S3 bucket, is not supported.
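Several options in this section depend on the log file settings: --logfile is paired with --logfile-log-level, --quiet requires --logfile, and --stdout is used with all three. The following Python sketch checks a planned argument list against those pairings before a command is run. It is only an illustration under stated assumptions: the helper names are hypothetical, the check looks only at the long option names plus -q, and it treats a missing --logfile-log-level as a problem even though that level can default to the --log-level value.

```python
def option_value(argv, name):
    """Return the value that follows option `name` in argv, or None if it is absent."""
    if name in argv:
        index = argv.index(name)
        if index + 1 < len(argv):
            return argv[index + 1]
    return None


def logging_problems(argv):
    """List the logging-related pairings from this section that argv does not follow."""
    problems = []
    logfile = option_value(argv, "--logfile")
    level = option_value(argv, "--logfile-log-level")
    quiet = "-q" in argv or "--quiet" in argv
    level_ok = level is not None and level != "OFF"

    if logfile is not None and not level_ok:
        problems.append("--logfile should be paired with a --logfile-log-level other than OFF")
    if quiet and logfile is None:
        problems.append("--quiet requires --logfile")
    if "--stdout" in argv and not (quiet and logfile is not None and level_ok):
        problems.append("--stdout should be combined with -q, --logfile, and --logfile-log-level")
    return problems


# Example argument list; the log file name is a placeholder.
planned = ["ybunload", "-d", "premdb", "-t", "AwayTeam", "--stdout",
           "-q", "--logfile", "unload.log", "--logfile-log-level", "INFO"]
print(logging_problems(planned))  # []
```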
- --logfile-log-level OFF | ERROR | WARN | INFO | DEBUG | TRACE
Specify the logging level for a given log file (as defined with the --logfile option). If the level is not specified, it defaults to the --log-level value. You must specify a --logfile-log-level value other than OFF when you specify the --logfile option.
- --max-file-size
Specify the maximum size of a file that ybunload can export. For example: 10GB or 1TB. Do not use fractional values. The default maximum is 50GB. The minimum file size is 1GB.
Note: AWS S3 uploads are made up of a series of "parts." Each part must be >5MB, and no multi-part upload may exceed 10,000 parts. The default part size is 6MB, which means that each file that is uploaded to S3 cannot exceed 60GB. This restriction does not limit the total amount of data that ybunload can upload to S3, only the size of individual files. Increasing the default part size is not recommended.
- --nullmarker
String to use as a null marker in the unloaded files. For example:
--nullmarker '[NULL]'
For --format text unloads, the default null marker is \N.
For other formats, no default null marker is used. In the output, NULL values will appear as adjacent delimiters with no text between them.
- -o
Name a local file directory where the unload directories and files will be placed. If you do not specify a location, your current working directory will be used. The default prefix for unload files is unload, but you can modify it by setting the --prefix option.
Note: If an output file with a given prefix already exists in the location where you are unloading data, ybunload returns an error. For example:
18:56:51.166 [ERROR] Unable to create output file: ./unload_1_1_.csv File already exists. Remove existing file or use --truncate-existing
You can work around this error by removing the file manually, by using the --truncate-existing option, or by setting the --prefix option.
- --parallel, --no-parallel
Enable parallel processing on all workers for the final sort operation that occurs when an unload query contains an ORDER BY clause. By default, the final ORDER BY sort runs on a single worker. If the unload query does not have an explicit ORDER BY clause, this option has no effect.
When you use the --parallel option, files are unloaded in streams from each worker, and the rows are guaranteed to be sorted within each file and within each stream. However, the complete set of files from all of the workers will not be unloaded in order.
If you intend to reload data that was unloaded with the --parallel option, it is recommended that you create the target table with a SORT ON column. The presence of a SORT ON column causes ybload source files to be loaded in fully sorted order. If you need to stitch the unload files together for loading or for use in other applications, you may need to write a script that checks the first and last lines of each unloaded file.
See also --select, -s and ybunload Output Files.
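A script of the kind mentioned above could start from something like the following Python sketch. It reads the first and last rows of each unloaded file, orders the files by their first sort key, and reports whether any key ranges overlap. The assumptions are significant and are not taken from ybunload itself: the files are uncompressed csv, the ORDER BY column is the first column, the sort is ascending, and keys compare correctly as plain strings. The function names are hypothetical, and the whole file is scanned for brevity rather than seeking to the last line.

```python
import csv


def first_and_last_keys(path, key_column=0):
    """Return the sort-key values of the first and last rows, or None for an empty file."""
    first = last = None
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines
            if first is None:
                first = row[key_column]
            last = row[key_column]
    return None if first is None else (first, last)


def plan_stitch(paths, key_column=0):
    """Order files by their first key and report whether any key ranges overlap."""
    ranges = []
    for path in paths:
        keys = first_and_last_keys(path, key_column)
        if keys is not None:
            ranges.append((keys[0], keys[1], str(path)))
    ranges.sort()
    ordered = [name for _, _, name in ranges]
    # After sorting by first key, any overlap shows up between neighbouring files.
    overlaps = any(earlier[1] > later[0] for earlier, later in zip(ranges, ranges[1:]))
    return ordered, overlaps
```

If plan_stitch reports no overlap, concatenating the files in the returned order yields a fully sorted result. If it reports an overlap, simple concatenation is not enough and the rows would need to be merged instead.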
- -p, --port
Port number. See Setting up a Database Connection.
- --prefix
Specify a prefix to attach to each unload file name. The default is unload. For example, if you use the prefix 04-30-18, the files are numbered consecutively in the following format:
04-30-18_1_1_.csv
04-30-18_1_2_.csv
04-30-18_1_3_.csv
04-30-18_1_4_.csv
04-30-18_1_5_.csv
...
The convention for naming files is as follows:
<prefix_><streamID_><partnumber_>.<extension>
prefix_: As defined, or unload by default.
streamID_: A number assigned to each data stream from the workers. The streams are not in any particular order relative to a specific worker, and a single worker may provide multiple streams.
partnumber_: An incrementing number starting from 1 for each stream.
.extension: File type, such as .csv or .gz.
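When post-processing unload output, it can help to split a file name back into the parts of this convention. The following Python sketch does that with a regular expression. The pattern is inferred from the example names shown above and is an assumption rather than a specification; the function name is hypothetical.

```python
import re

# <prefix>_<streamID>_<partnumber>_.<extension>, as in 04-30-18_1_3_.csv
UNLOAD_NAME = re.compile(
    r"^(?P<prefix>.+)_(?P<stream>\d+)_(?P<part>\d+)_\.(?P<extension>.+)$"
)


def parse_unload_name(file_name):
    """Split an unload file name into its parts, or return None if it does not match."""
    match = UNLOAD_NAME.match(file_name)
    if match is None:
        return None
    return {
        "prefix": match["prefix"],
        "stream": int(match["stream"]),
        "part": int(match["part"]),
        "extension": match["extension"],
    }


print(parse_unload_name("04-30-18_1_3_.csv"))
# {'prefix': '04-30-18', 'stream': 1, 'part': 3, 'extension': 'csv'}
```

Grouping the parsed names by stream and sorting by part number gives the files of each stream in the order they were written.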
- -q, --quiet
Do not write any output to console. This option is suitable for cron invocations. If --quiet is specified, you must also specify --logfile.
- --quote UNICODE_CHARACTER
Specify the character to use for quoting.
- --quote-escape UNICODE_CHARACTER
Specify the character to use for escaping quotes.
- --secured
Use SSL/TLS to secure all communications. The default is not secured. See also Enabling and Verifying SSL/TLS Encryption.
- --select, -s
Run a SQL SELECT statement to unload data; any valid query is allowed. You cannot unload the results of other SQL statements, such as INSERT or CREATE TABLE AS (CTAS).
Note: If you want to unload data to a single file, use the --select option and include an ORDER BY clause in the query. Sorting the unloaded data in this way requires the entire result set of the query to fit into memory (or memory plus spill space).
- --select-file, -f
Specify a file that contains the query you want to run in the ybunload operation (instead of entering the statement directly on the command line with the --select option). You can use this option and the @file option in the same ybunload command.
- --stdout
Unload data to stdout instead of a destination directory. When you use this option, also specify -q, --logfile, and a --logfile-log-level value other than OFF.
- --table, -t
This option supports TABLENAME, SCHEMA.TABLENAME, or DATABASE.SCHEMA.TABLENAME formats. If you do not specify the schema name, the table is assumed to be in the public schema. You do not have to quote table names that have mixed case. For example, all of the following table entries are valid for a table named AwayTeam:
-t public."AwayTeam"
-t 'public."AwayTeam"'
-t "AwayTeam"
-t AwayTeam
- --truncate-existing
Remove all the existing output files in the destination directory that have the same prefix as the files being unloaded. You cannot use this option when unloading data to AWS S3 buckets.
- -U, --username
Database username. See Setting up a Database Connection.
- --version
Display the version of ybunload you are running (as part of ybtools). This option is not intended to be combined with other options. For example:
$ ybunload --version
ybunload version 1.2.2-5653
- -W, --password
Database user's password. See Setting up a Database Connection.