S3 Authentication Methods

You can authenticate in several ways when you use ybload to load data from AWS S3 or S3-compatible object storage: implicitly, by using supported S3-specific mechanisms, or explicitly, by using ybload command-line options. Your organization's S3 administrator should provide instructions for the approach you should use. See Best Practices for Managing AWS Access Keys for further recommendations.

S3 credentials must be provided in a manner supported by ybload and the AWS Java SDK:

  • Secure methods (integrated into your organization's login/identity mechanism):

      • EC2 roles (when running on Amazon EC2 instances)

      • SAML 2.0-compatible identity provider

      • Custom identity provider bridge to AWS

  • Other methods:

      • Object Storage Options

      • Environment variables: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. Other AWS environment variables are also supported, including AWS_SESSION_TOKEN, AWS_PROFILE, AWS_REGION, and AWS_CREDENTIAL_PROFILES_FILE.

      • URI query parameters: aws_access_key_id and aws_secret_access_key

      • A credential file, typically located at ~/.aws/credentials (location may vary by platform)

  • Required Permissions

      • ybload requires permission to read both the object and the object metadata. Metadata access is a grantable permission, separate from read access. Attempting to load a file from a public bucket that offers the option to download the file using an HTTP URL causes ybload to fail with the following error:

        Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xxxxxxxxxxxxxxxx)

An installation of the AWS CLI is not required, but it does provide the aws configure command, which is useful for setting credentials. For details, see the AWS Command Line Interface documentation.
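As a rough illustration of the implicit sources listed above, the following Python sketch reports which of them appear to be set on the current machine. It is not part of ybload and does not reproduce the lookup logic of the AWS Java SDK. The function name implicit_credential_sources is hypothetical, and the sketch assumes that the credentials file uses the standard INI profile layout, that AWS_PROFILE selects the profile (falling back to default), and that AWS_CREDENTIAL_PROFILES_FILE, when set, points to the credentials file.

```python
import configparser
import os
from pathlib import Path


def implicit_credential_sources(env=None, credentials_path=None):
    """Return the implicit credential sources that appear to be set.

    Hypothetical helper for illustration only; it checks for presence,
    not validity, and never prints secret values.
    """
    env = os.environ if env is None else env

    # Assumption: AWS_CREDENTIAL_PROFILES_FILE overrides the default location.
    if credentials_path is None:
        credentials_path = env.get(
            "AWS_CREDENTIAL_PROFILES_FILE",
            str(Path.home() / ".aws" / "credentials"),
        )

    found = []

    # Both variables must be present for an environment-based key pair.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        found.append("environment")

    # Assumption: AWS_PROFILE names the profile, otherwise "default" is used.
    profile = env.get("AWS_PROFILE", "default")
    parser = configparser.ConfigParser()
    parser.read(credentials_path)  # a missing file is skipped without error
    if parser.has_section(profile) and parser.has_option(profile, "aws_access_key_id"):
        found.append(f"credentials file [{profile}]")

    return found


if __name__ == "__main__":
    print(implicit_credential_sources())
```

Running a check like this before scripting a ybload command can show whether credentials left over from an earlier aws configure session are present on the host.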

Order of Precedence for Authentication Methods

You may have multiple credential settings available to you, based on your AWS account setup and how you script the ybload command. Therefore, it is important to know which authentication mechanism takes precedence when the ybload command is run. Note that credentials set explicitly with ybload object storage options (either on the command line or in a properties file) always take precedence over implicit credentials set via aws configure or other methods.

The order of precedence is as follows:

  1. Access key ID and secret access key specified with the --object-store-identity and --object-store-credential command-line options
  2. Access key ID and secret access key specified with the yb.file. prefix in one of the following:
       • URI parameter
       • Properties file named with the --object-store-provider-config command-line option
  3. Access key ID and secret access key specified without the yb.file. prefix in one of the following:
       • URI parameter
       • Properties file named with the --object-store-provider-config command-line option
  4. Implicit authentication via environment variables and supported AWS configuration files (~/.aws/credentials or ~/.aws/config)
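
The order above can be modeled as a simple first-match lookup. The following Python sketch is only a model of the documented order, not ybload's implementation: the source labels, the Credentials alias, and the resolve_credentials function are hypothetical, and the sketch assumes that within each group a URI parameter is consulted before the properties file, which the list suggests but does not state outright.

```python
from typing import Mapping, Optional, Tuple

# (access key ID, secret access key); a hypothetical alias for this sketch.
Credentials = Tuple[str, str]

# Source labels from highest to lowest precedence, following the list above.
# Assumption: within each group, the URI parameter precedes the properties file.
PRECEDENCE = (
    "command-line options",
    "yb.file. prefix: URI parameter",
    "yb.file. prefix: properties file",
    "no prefix: URI parameter",
    "no prefix: properties file",
    "implicit: environment variables or ~/.aws files",
)


def resolve_credentials(
    available: Mapping[str, Credentials],
) -> Optional[Tuple[str, Credentials]]:
    """Return (source, credentials) for the highest-precedence source present."""
    for source in PRECEDENCE:
        if source in available:
            return source, available[source]
    return None  # no credentials were supplied by any source


if __name__ == "__main__":
    supplied = {
        "implicit: environment variables or ~/.aws files": ("ENV_ID", "env-secret"),
        "command-line options": ("CLI_ID", "cli-secret"),
    }
    source, _ = resolve_credentials(supplied)
    print(source)
```

In this model, explicit command-line credentials win even when environment variables are also set, which matches the note that explicit ybload options always take precedence over implicit credentials.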