Unloading Data to Azure Storage

These instructions assume that you already have a Microsoft Azure account, know the Azure location where you want to store the unloaded files, and have the appropriate credentials at hand. If not, follow the instructions on the Azure portal before proceeding with the information here.

These instructions apply to both Azure Blob Storage and Azure Data Lake Storage Gen 2.

Assuming that these prerequisites are in place, you can unload data from a table or a query to files stored in an Azure container.

To unload a table or the results of a query to an Azure container:

  1. Use one of the documented Azure Authentication Methods.

For example, to authenticate via the ybunload command-line options, you will need:

  • A storage account name (your "identity"). Specifying the Azure endpoint (https://<storage account>.blob.core.windows.net) in addition to the account name is optional.
  • An access key or a generated SAS token (your "credential").

You can access only one object store per ybunload operation. For example, you cannot specify connection details for both an Azure client and an S3 client in the same command, and you cannot unload data to multiple Azure endpoints or storage accounts.

  2. Identify the location in Azure where you want the unloaded files to be stored (container/blob).

At the end of the ybunload command line, you will specify the destination with the -o option and a URI, which is an abbreviated Azure path. For example:

-o azure://premdb/

  3. Run the ybunload command in the usual way, but be sure to include:
  • --format parquet

  • Other Parquet Processing Options, as needed.

  • Azure identity (storage account name) and credential (access key or SAS token)

    These options are not needed if you authenticate through the Azure CLI (az login). In that case, however, you must specify the --object-store-endpoint option.

    See Object Storage Options.

  • -o option (destination directory)

    If you want the unloaded files to be named in a specific, recognizable way, use the --prefix option.

For example, the following command unloads the match table from the premdb database and stores the unloaded files in Parquet format in an Azure container:

$ ybunload -d premdb --username bobr -W -t match --format parquet \
--object-store-credential '****************************************' \
--object-store-identity 'ybbobr' \
-o azure://premdb/
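
The same unload can be wrapped in a small script so that the access key is not typed on the command line and the output files get a recognizable name. The following is a minimal sketch, not a documented procedure. The environment variable names AZURE_ACCOUNT and AZURE_KEY, the prefix value match_2024, and the run helper are hypothetical; only the ybunload options themselves come from this topic, and --prefix is assumed to accept a plain string.

```shell
#!/bin/sh
# Hypothetical variable names; set them in your environment before running.
AZURE_ACCOUNT="${AZURE_ACCOUNT:-ybbobr}"
AZURE_KEY="${AZURE_KEY:-REPLACE_WITH_ACCESS_KEY_OR_SAS_TOKEN}"

# Hypothetical helper: prints the command when DRY_RUN is 1 (the default
# here), and executes it when DRY_RUN is 0.
# Note: a dry run echoes the credential value, so do not log its output.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Unload the match table as Parquet files whose names start with the
# illustrative prefix "match_2024".
run ybunload -d premdb --username bobr -W -t match \
  --format parquet \
  --object-store-identity "$AZURE_ACCOUNT" \
  --object-store-credential "$AZURE_KEY" \
  --prefix match_2024 \
  -o azure://premdb/
```

Review the printed command first, then set DRY_RUN=0 to run the unload.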

Alternatively, you can save the identity and credential values to a Java properties file and name the file in the ybunload command. See Object Storage Options and Azure Blob Storage URIs.
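
If you authenticate through the Azure CLI instead, the identity and credential options are dropped and the endpoint is supplied, as described in step 3. The sketch below assumes a hypothetical storage account named ybbobr and builds the endpoint from the pattern shown earlier (https://<storage account>.blob.core.windows.net). The STORAGE_ACCOUNT variable and the run helper are illustrative, not part of ybunload.

```shell
#!/bin/sh
# Hypothetical storage account name; replace with your own.
STORAGE_ACCOUNT="${STORAGE_ACCOUNT:-ybbobr}"
ENDPOINT="https://${STORAGE_ACCOUNT}.blob.core.windows.net"

# Hypothetical helper: prints each command when DRY_RUN is 1 (the default
# here), and executes it when DRY_RUN is 0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Sign in with the Azure CLI first; no identity or credential options
# are passed to ybunload afterwards.
run az login

# The endpoint option is required with this authentication method.
run ybunload -d premdb --username bobr -W -t match \
  --format parquet \
  --object-store-endpoint "$ENDPOINT" \
  -o azure://premdb/
```

With DRY_RUN left at its default, the script only prints the two commands so that you can check the endpoint before anything is executed.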
