Bulk Data Loading

Yellowbrick supports loading data through standard SQL INSERT and PostgreSQL \copy commands. This works well for small quantities of data, especially where the data needs to be inserted and available to query immediately. However, as with other columnar databases, loading tens of megabytes through terabytes of data efficiently calls for a different path: Yellowbrick supports a bulk mode that moves data directly to a compute cluster, bypassing the shared services.
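For small loads, the two non-bulk paths mentioned above look roughly like the following. This is a minimal sketch, not taken from the Yellowbrick documentation: the table `sales`, its columns, and the file `sales.csv` are hypothetical, and the `\copy` line uses standard PostgreSQL psql syntax, which is assumed to be accepted by the Yellowbrick client in the same form.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE sales (id INT, region VARCHAR(20), amount DECIMAL(10,2));

-- Standard SQL INSERT: suitable for a handful of rows that must be
-- queryable immediately.
INSERT INTO sales (id, region, amount)
VALUES (1, 'EMEA', 120.50),
       (2, 'APAC', 75.00);

-- Client-side \copy (psql-style meta-command, written on one line):
-- reads a local CSV file with a header row and sends the rows to the table.
\copy sales FROM 'sales.csv' WITH (FORMAT csv, HEADER true)
```

Both paths send data through the regular SQL interface, which is why the bulk mode is the better fit once volumes grow.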

Bulk loads insert data directly into database tables from source files in object storage, on NFS servers, or on local disk. There are several ways to bulk load data, outlined below.

Bulk Loading with `ybload`

Data can be loaded through a command-line tool called `ybload`, which is part of the client tools distribution. It supports a wide variety of file formats and protocols, as well as third-party integrations. For more information, see the `ybload` documentation.

Bulk Loading via SQL

On cloud platforms, a full SQL grammar supports loading data from external object storage. See SQL-Based Loads from External Storage and the LOAD TABLE command.

Using Yellowbrick Manager Load Assistant

Yellowbrick Manager contains a simple load assistant that can be used for importing data sets with minimal SQL knowledge. See Loading a Table via the Load Assistant for a walkthrough.
