Sharing System Resources

Data warehouse resources can be shared in various ways. The main resources that can be allocated to database queries and other operations are:

CPU
Memory
Temporary spill space

For example, on a system with a high degree of concurrency, you can allocate more CPU to specific queries by increasing their priority. For complex queries that require more memory or spill space, you can request more of these resources at different points during execution. Long-running queries can be queued behind very fast queries to avoid starving short-running queries of adequate resources.

Workload management (WLM) refers generally to the process of sharing system resources in an optimal way so that database operations can be done efficiently and with respect to some order of priority. The goal is to meet service-level agreements (SLAs) between users and database administrators. These SLAs typically define performance requirements for both end-user queries and other database operations, such as bulk loads, backups, and system maintenance.

A workload is a set of queries or other database requests that is to some extent a known quantity. For example, if a group of users run ad hoc queries against the same set of tables every day, that set of queries, though somewhat arbitrary, may be thought of as a known and expected workload. A resource-intensive report that is run by one user at the same time every morning may also be anticipated as a separate (and probably high-priority) workload. A third example is database administration work, such as bulk loads and backups, which may occur during a "maintenance window," when end users have minimal access to the system.

Workloads may be defined across many different dimensions: in terms of when they are run, the application or user that runs them, the type of work, their expected duration, whether they are resource-intensive, and so on. This variety is typical of massively parallel processing (MPP) database systems, which are rarely used for one type of query or by one type of user. Some familiar use cases that workload management can address are as follows:

Runaway queries: identify and stop long-running queries that, for example, select all of the rows from a very large table (whether issued naively, by mistake, or at a "bad time")
Short-query bias: give priority to queries that run very fast (subsecond speed) and prevent them from being queued behind longer-running queries for which an instant response is neither expected nor required
Ad hoc queries: place "browsing" or "discovery" queries at a lower priority in the queue than more critical queries that are needed to run the business
Time-sensitive queries: apply different rules at different times of the day or week. For example, weekly business roll-ups have the highest priority until they are done. All other queries have lower priority.
Admin queries: allocate resources to run sysadmin queries immediately, especially internally generated queries that maintain the database (for example, operations that flush and analyze new table rows).
Loads and updates: prevent write queries that perform batch loads, deletes, and updates from starving read queries.
Logging, auditing, and reporting: log user-defined messages and tag queries as they are executed; learn about system usage in order to adjust future workload management behavior; create audit trails for separate applications and user groups.
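
Several of the use cases above come down to classifying each query and then ordering the queue by that classification. The following Python sketch illustrates the idea only. The Query fields, the priority levels, the one-second and one-hour thresholds, and the function names are all assumptions made for this example; they are not Yellowbrick WLM objects or syntax.

```python
import heapq
from dataclasses import dataclass
from itertools import count

# Hypothetical priority levels for this sketch; a lower number runs first.
ADMIN, SHORT, NORMAL, AD_HOC = 0, 1, 2, 3


@dataclass
class Query:
    name: str
    user_type: str      # assumed labels: "admin", "analyst", "app"
    est_seconds: float  # assumed to be an available run-time estimate


def classify(q: Query) -> int:
    """Map a query to a priority level using simple heuristics."""
    if q.user_type == "admin":
        return ADMIN    # admin queries: run immediately
    if q.est_seconds < 1.0:
        return SHORT    # short-query bias: subsecond queries jump ahead
    if q.user_type == "analyst":
        return AD_HOC   # ad hoc browsing: lowest priority
    return NORMAL


def is_runaway(elapsed_seconds: float, limit: float = 3600.0) -> bool:
    """Flag a query that has run longer than an assumed time limit."""
    return elapsed_seconds > limit


def schedule(queries):
    """Return query names in execution order: by priority, then arrival."""
    arrival = count()  # tie-breaker that preserves submission order
    heap = [(classify(q), next(arrival), q.name) for q in queries]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


queries = [
    Query("weekly_rollup", "app", 900.0),
    Query("browse_sales", "analyst", 45.0),
    Query("lookup_order", "app", 0.2),
    Query("analyze_new_rows", "admin", 5.0),
]
print(schedule(queries))
# ['analyze_new_rows', 'lookup_order', 'weekly_rollup', 'browse_sales']
```

In this sketch the admin query runs first, the subsecond lookup is not queued behind the long roll-up, and the ad hoc browsing query runs last. A real system would also account for time-of-day rules and resource limits, which are omitted here.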

To optimize resource allocation based on workloads, Yellowbrick administrators create WLM objects called rules, resource pools, and profiles. These objects define a flexible set of heuristics to translate typical WLM use cases into an optimal strategy for resource allocation and scheduling. You can set up WLM objects either in Yellowbrick Manager or by using SQL commands.
