Loading from Azure Blob Storage
This section explains how to use the ybload client to load tables from an Azure Blob object store.
These instructions assume that you already have access to an Azure Blob storage account that contains the data to be loaded.
If you do not have access to a storage account, follow the instructions in the Microsoft documentation before proceeding with the information in this section.
To bulk load a table from data in an Azure Blob container:
- Use one of the documented Azure Authentication Methods.
  For example, to authenticate via the ybload command-line options, you will need:
  - A storage account name (your "identity"). Specifying the Azure endpoint (https://<storage account>.blob.core.windows.net) in addition to the account name is optional.
  - An access key or a generated SAS token (your "credential").
  You can only access one object store per ybload operation. For example, you cannot specify connection details for both an Azure client and an S3 client in the same command, and you cannot load data from multiple Azure endpoints or storage accounts.
- Identify the location in Azure of the data you want to load (container/blob).
  At the end of the ybload command line, you will specify the source data in a URI, which is an abbreviated Azure path. For example:
  azure://premdb/match.csv
  You can load multiple files by specifying a prefix for the blob instead of a complete file name. For example, the following URI matches all files in the premdb container that begin with match0:
  azure://premdb/match0
  You can load data from multiple containers within the same ybload command, as long as they belong to the same storage account.
- Run the ybload command in the usual way. Include the following details in the command line:
  - Standard ybload options such as --format and --delimiter
  - A local path on the client system for the bad row file. (If you specify a log file, it must also be on the local file system.)
  - Azure identity (storage account name) and credential (access key or SAS token)
    These options are not needed if you authenticate via the Azure CLI (az login). However, you will need to specify the --object-store-endpoint option.
  - Location(s) of the source data (specified last on the command line, without an option name)
  For example, the following command would load a single blob (match.csv) from the premdb container:
  $ ybload -d premdb --username bobr -W -t match --format csv --delimiter ',' \
      --bad-row-file '/home/brumsby/newazurebad' \
      --object-store-credential '****************************************' \
      --object-store-identity 'ybbobr' \
      azure://premdb/match.csv
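When authentication comes from the Azure CLI (az login), the identity and credential options are dropped and --object-store-endpoint is supplied. The following sketch only prints such a command line for review; it does not run ybload. The helper names azure_endpoint and ybload_azcli_cmd are hypothetical, the account name ybbobr and the paths are reused from the example above, and the endpoint is built from the pattern given in step 1. Treat it as an illustration under those assumptions rather than a verified invocation.

```shell
# Hypothetical helper: derive the default Blob endpoint from the
# storage account name, following the pattern shown in step 1.
azure_endpoint() {
  printf 'https://%s.blob.core.windows.net\n' "$1"
}

# Hypothetical helper: print (not run) a ybload command line for the
# az login case. Database, user, table, and paths are assumptions
# copied from the example above.
ybload_azcli_cmd() {
  account=$1   # storage account name
  uri=$2       # azure://container/blob
  # Each argument becomes one line ending in a shell continuation (\).
  printf '%s \\\n' \
    "ybload -d premdb --username bobr -W -t match --format csv --delimiter ','" \
    "  --bad-row-file '/home/brumsby/newazurebad'" \
    "  --object-store-endpoint '$(azure_endpoint "$account")'"
  # The source URI goes last, without an option name.
  printf '  %s\n' "$uri"
}

ybload_azcli_cmd ybbobr azure://premdb/match.csv
```

The printed text can be inspected and then pasted into a terminal once az login has succeeded. Printing instead of executing keeps the sketch safe to run on a machine where ybload is not installed.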
Alternatively, you can save the identity and credential values to a Java properties file and name the file in the ybload command. See Object Storage Options and Azure Blob Storage URIs.
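The prefix rule described in step 2 can be illustrated without contacting Azure. The sketch below is plain shell: it splits an azure://container/prefix URI and prints, from a list of blob names on standard input, those that begin with the prefix. The helper blobs_matching and the blob names are invented for illustration, and simple starts-with matching is an assumption about how the prefix is applied; the actual listing performed by ybload may differ in details such as ordering.

```shell
# Hypothetical helper: print blob names read from stdin that begin with
# the blob prefix of an azure://container/prefix URI.
blobs_matching() {
  uri=$1
  path=${uri#azure://}   # strip the scheme:    premdb/match0
  prefix=${path#*/}      # strip the container: match0
  while IFS= read -r name; do
    case $name in
      "$prefix"*) printf '%s\n' "$name" ;;   # keep names starting with prefix
    esac
  done
}

# Invented blob names standing in for the contents of the premdb container.
printf '%s\n' match.csv match01.csv match02.csv season.csv |
  blobs_matching azure://premdb/match0
```

With these invented names, only match01.csv and match02.csv are printed; match.csv is skipped because it does not begin with match0.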