CREATE EXTERNAL LOCATION

Create an external location that you can use either for primary database storage or as a reference when you load a table from external storage. This object defines either the target storage container for table data or the location of source files that you intend to load.

CREATE EXTERNAL LOCATION [ IF NOT EXISTS ] name
PATH 'path'
EXTERNAL STORAGE storage_name
[ EXTERNAL FORMAT format_name ]
[ USAGE PRIMARY DEFAULT | EXTERNAL ]

IF NOT EXISTS

Create the object if it does not already exist. If it does exist, do not create it and do not return an error.
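
The following is a minimal sketch of an idempotent create, assuming the premdbs3 storage object and the ybpremdb bucket from the examples later on this page already exist. Running the statement a second time should leave the existing object in place without returning an error.

```sql
-- Assumes the external storage object premdbs3 already exists.
-- If premdbs3data already exists, the statement does nothing and does not fail.
create external location if not exists premdbs3data
path 'ybpremdb'
external storage premdbs3;
```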

PATH

Name of the S3 bucket or Azure container that contains (or will contain) source files for the load or, for a primary storage location, the table data. For example:

path 'yb_premdb'

Do not specify any sub-folders or slash characters, just the bucket or container name. (If folders exist inside the bucket, you can specify them as part of the prefix string in the LOAD TABLE command.)
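
As an illustration of the note above, the sketch below keeps the PATH to the bare bucket name and places the folder in the prefix string at load time. The table name team and the prefix /premdb/team2019.csv are hypothetical, and the LOAD TABLE clauses shown are an assumption to check against the LOAD TABLE reference page.

```sql
-- The external location names only the bucket (no folders, no slashes).
create external location premdbs3data
path 'ybpremdb'
external storage premdbs3;

-- Hypothetical load: the folder inside the bucket is part of the prefix string.
load table team from ('/premdb/team2019.csv')
external location premdbs3data
external format premfiles;
```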

EXTERNAL STORAGE

Name of an external storage object. See CREATE EXTERNAL STORAGE. This parameter is required.

EXTERNAL FORMAT

Name of an external format object. See CREATE EXTERNAL FORMAT. The format you specify is used as the default format for parsing files for this external location object. This parameter is optional; when you load a table, you can define the format with a set of load-specific options that override this default.

USAGE PRIMARY DEFAULT | EXTERNAL

Whether the external location (and associated object storage) is intended for storing persistent table data or loading and unloading (data movement). A USAGE PRIMARY DEFAULT location will be used to store all the table data in the databases that belong to an instance. A USAGE EXTERNAL location will be used for loading and unloading data. If you intend to create a USAGE PRIMARY DEFAULT location, you must first deselect the Create initial external storage option in Yellowbrick Manager when you create the instance. See Configuring Primary Storage in AWS.
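
A minimal sketch of a location declared explicitly for data movement, based on the syntax diagram above. The location name premdbs3moves is hypothetical; premdbs3 and ybpremdb are reused from the examples on this page.

```sql
-- Location intended only for loading and unloading data,
-- not for storing persistent table data.
create external location premdbs3moves
path 'ybpremdb'
external storage premdbs3
usage external;
```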

You can use the same primary storage location for multiple data warehouse instances.

You must have the correct privileges to run this command. See ON EXTERNAL object.

Examples

Create an external location object that references an external storage object:

premdb=> create external location premdbazuredata 
path 'premdb' external storage "premdbAzure";
CREATE EXTERNAL LOCATION

Drop an external location object and re-create it with the optional external format now specified:

premdb=> drop external location if exists premdbs3data;
DROP EXTERNAL LOCATION

premdb=> create external location premdbs3data 
path 'ybpremdb'
external storage premdbs3
external format premfiles;
CREATE EXTERNAL LOCATION

Create an external location object that is used for primary database storage:

yellowbrick=# create external location azurevpc 
path 'ybownbucket' 
external storage azurevpc_storage 
usage primary default; 
CREATE EXTERNAL LOCATION

Configuring Primary Storage in AWS

Primary storage is per-instance object storage that is used for storing the data in Yellowbrick tables and other database objects. If you want to use your own primary storage, as opposed to allowing Yellowbrick to create that storage for you when you create an instance, follow these steps:

  1. In the AWS console, using the same account you used to install the VPC, create a new bucket. In the Permissions tab, add the following policy, where ybownbucket is the bucket name:
{
    "Version": "2012-10-17",
    "Id": "EnableBucketEncryption",
    "Statement": [
        {
            "Sid": "Deny non-TLS",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::ybownbucket",
                "arn:aws:s3:::ybownbucket/*"
            ],
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                }
            }
        }
    ]
}
  2. Create a new data warehouse instance with the Create initial external storage option deselected. (The option is selected by default.)
  3. Before creating any clusters, databases, or tables in this instance, create an external storage object and an external location object that references the storage object. Connect to the instance you just created (you do not need a cluster). For example:
create external storage inst4ownbucket 
type S3
endpoint 'https://s3.us-east-1.amazonaws.com'
region 'us-east-1'
identity '********************'
credential '****************************************';

create external location inst4awsvpc 
path 'ybownbucket' 
external storage inst4ownbucket 
usage primary default;
  4. Start creating clusters, databases, and tables in the new instance. The bucket named in the PATH clause of the CREATE EXTERNAL LOCATION command will be used to store Yellowbrick data.

Configuring Primary Storage in Microsoft Azure