Loading from Azure Blob Storage

This section explains how to use the ybload client to load tables from an Azure Blob object store.

These instructions assume that you already have access to an Azure Blob storage account that contains the data to be loaded.

If you do not have access to a storage account, follow the instructions in the Microsoft documentation before proceeding with the information in this section.

To bulk load a table from data in an Azure Blob container:

  1. Use one of the documented Azure Authentication Methods.

For example, to authenticate via the ybload command-line options, you will need:

  • A storage account name (your "identity"). Specifying the Azure endpoint (https://<storage account>.blob.core.windows.net) in addition to the account name is optional.
  • An access key or a generated SAS token (your "credential")

You can access only one object store per ybload operation. For example, you cannot specify connection details for both an Azure client and an S3 client in the same command, and you cannot load data from multiple Azure endpoints or storage accounts.
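The relationship between the storage account name and the optional endpoint can be sketched in shell. The helper name azure_blob_endpoint is hypothetical and is not part of ybload. The check reflects the Azure naming rule that storage account names are 3 to 24 characters long and contain only lowercase letters and digits.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ybload): derive the default Blob endpoint
# from a storage account name, rejecting names Azure would not accept.
azure_blob_endpoint() (
  # The function body runs in a subshell so the locale change stays local.
  LC_ALL=C   # make the a-z range below match ASCII lowercase letters only
  account="$1"
  case "$account" in
    ""|*[!a-z0-9]*)
      echo "invalid storage account name: $account" >&2
      return 1 ;;
  esac
  if [ "${#account}" -lt 3 ] || [ "${#account}" -gt 24 ]; then
    echo "storage account name must be 3 to 24 characters: $account" >&2
    return 1
  fi
  printf 'https://%s.blob.core.windows.net\n' "$account"
)
```

For the account used in this section, azure_blob_endpoint ybbobr prints https://ybbobr.blob.core.windows.net.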
  1. Identify the location in Azure of the data you want to load (container/blob).

At the end of the ybload command line, you will specify the source data in a URI, which is an abbreviated Azure path. For example:

azure://premdb/match.csv

You can load multiple files by specifying a prefix for the blob instead of a complete file name. For example, the following URI matches all files in the premdb container whose names begin with match0:

azure://premdb/match0

You can load data from multiple containers within the same ybload command, as long as they belong to the same storage account.
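To make the container/blob split and the prefix behavior concrete, here is a minimal shell sketch. The helper names are hypothetical and are not part of ybload, and the prefix match runs against a list of names passed as arguments rather than a real Azure listing.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of ybload) for azure://<container>/<blob-or-prefix>.

# Print the container: the text between "azure://" and the first slash.
azure_uri_container() {
  local rest="${1#azure://}"
  printf '%s\n' "${rest%%/*}"
}

# Print the blob name or prefix: everything after the first slash.
azure_uri_blob() {
  local rest="${1#azure://}"
  printf '%s\n' "${rest#*/}"
}

# Print each name (arguments 2..n) that begins with the prefix (argument 1).
matching_blobs() {
  local prefix="$1"; shift
  local name
  for name in "$@"; do
    case "$name" in
      "$prefix"*) printf '%s\n' "$name" ;;
    esac
  done
}
```

With these helpers, azure_uri_container azure://premdb/match0 prints premdb, and matching_blobs match0 match01.csv match02.csv match10.csv prints match01.csv and match02.csv but not match10.csv.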

  1. Run the ybload command in the usual way. Include the following details in the command line:
  • Standard ybload options such as --format and --delimiter

  • A local path on the client system for the bad row file. (If you specify a log file, it must also be on the local file system.)

  • Azure identity (storage account name) and credential (access key or SAS token)

    These options are not needed if you authenticate with the Azure CLI (az login). However, you must then specify the --object-store-endpoint option.

  • Location(s) of the source data (specified last on the command line, without an option name)

For example, the following command would load a single blob (match.csv) from the premdb container:

$ ybload -d premdb --username bobr -W -t match --format csv --delimiter ',' \
--bad-row-file '/home/brumsby/newazurebad' \
--object-store-credential '****************************************' \
--object-store-identity 'ybbobr' \
azure://premdb/match.csv
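For comparison, the Azure CLI variant described in the list above might look like the following sketch. It uses only options named in this section and reuses the placeholder values from the example. The function name and the YBLOAD override are hypothetical conveniences, not ybload features, and the sketch assumes that az login has already succeeded in the same environment.

```shell
#!/usr/bin/env bash
# Sketch only: load the match table using an existing `az login` session,
# so no identity or credential options are passed.
# YBLOAD is a hypothetical override used to preview the command.
load_match_with_az_login() {
  local account="$1"; shift
  "${YBLOAD:-ybload}" -d premdb --username bobr -W -t match \
    --format csv --delimiter ',' \
    --bad-row-file '/home/brumsby/newazurebad' \
    --object-store-endpoint "https://${account}.blob.core.windows.net" \
    "$@"   # remaining arguments: one or more azure:// source URIs
}
```

To preview the command without running a load, set YBLOAD=echo and call load_match_with_az_login ybbobr azure://premdb/match.csv. Unset YBLOAD to run ybload itself.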

Alternatively, you can save the identity and credential values to a Java properties file and name the file in the ybload command. See Object Storage Options and Azure Blob Storage URIs.