Restarting Queries

This section describes how WLM handles cases where the system restarts queries or an administrator (or rule) triggers a restart. See also Moving Queries.

Restartable Queries

Queries may be restarted by the system or by an administrator when they meet certain criteria. To be restartable, a query must be in one of the following states:

assemble
compile
acquire_resources
run
Restart On Error

Note: Restarting a query in the assemble, compile, acquire_resources, or run states runs the query again, going through all earlier states in the query's life cycle.

However, restarting a query during Restart On Error will only move the query to the target resource pool.

If a query has already returned its first row to the client (client_wait state), it cannot be restarted.

In addition, only the following types of queries can be restarted:

select
ctas
insert (INTO SELECT, but not VALUES)

If a query is processing an error or is already in the process of restarting, it cannot be restarted; however, a restart may be possible later. See the Default Error Recovery section.
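The eligibility criteria above can be summarized as a single check. The following is a minimal sketch for illustration only, not part of the WLM rule API: the isRestartable function and the fields on the query object (state, type, insertFromSelect, firstRowReturned, errorOrRestartInProgress) are hypothetical names introduced here.

```javascript
// Hypothetical model of the restart criteria described above.
// None of these names belong to the WLM rule API.
const RESTARTABLE_STATES = new Set([
  'assemble',
  'compile',
  'acquire_resources',
  'run',
  'Restart On Error',
]);

const RESTARTABLE_TYPES = new Set(['select', 'ctas', 'insert']);

function isRestartable(query) {
  // A query that has returned its first row (client_wait) cannot be restarted.
  if (query.firstRowReturned) return false;
  // A query that is processing an error or already restarting must wait.
  if (query.errorOrRestartInProgress) return false;
  if (!RESTARTABLE_STATES.has(query.state)) return false;
  if (!RESTARTABLE_TYPES.has(query.type)) return false;
  // INSERT INTO ... SELECT qualifies; INSERT ... VALUES does not.
  if (query.type === 'insert' && !query.insertFromSelect) return false;
  return true;
}

// Usage: a running CTAS qualifies; an INSERT ... VALUES does not.
const runningCtas = { state: 'run', type: 'ctas' };
const insertValues = { state: 'run', type: 'insert', insertFromSelect: false };
console.log(isRestartable(runningCtas), isRestartable(insertValues));
```

The checks are ordered so that the conditions that rule out a restart entirely (first row returned, restart already in progress) are evaluated before the state and type lists.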

Default Error Recovery

By default, the same query may be restarted only once before it is deemed to be unrecoverable. The restart policy in system-defined rules (global_restartErrorPolicy and global_restartErrorPolicySuperuser) enforces this behavior; in general, you do not need to change it. Depending on the type of error a query encounters, attempting to restart it more than once may not be of practical value. The query may error out in exactly the same way before it eventually aborts.

The system rules named global_restartErrorCodes and global_restartErrorCodesSuperuser constrain the default restart policy so that it applies only to queries that fail with the following subset of recoverable error codes:

KE038 RPCCHANNELCLOSED
KE039 RPCCHANNELBROKEN
KE041 Library file write
P0004 ERRCODE_ASSERT_FAILURE
WM001 ERRCODE_WORKER_OFFLINE
WM002 ERRCODE_SYSTEM_NOT_READY
WM003 ERRCODE_CANCEL_FOR_RESTART
YB044 RecoverableGeneric

Queries that fail with other recoverable error codes (or any other error code) are not subject to the policy defined by these rules. Again, you do not need to change this behavior, but you can create new, modified ErrorCodes rules that include other codes from the recoverable list. See Restart Rule Examples.
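As a rough illustration of how the default policy and the ErrorCodes rules combine, the sketch below models the decision as a lookup against the eight codes listed above plus a single-restart limit. The function name, the restartCount parameter, and the constant names are hypothetical and chosen for this example; the actual system rules are not reproduced here.

```javascript
// Hypothetical model of the default error recovery behavior.
// The codes come from the list above; everything else is illustrative.
const DEFAULT_RESTART_CODES = new Set([
  'KE038', // RPCCHANNELCLOSED
  'KE039', // RPCCHANNELBROKEN
  'KE041', // Library file write
  'P0004', // ERRCODE_ASSERT_FAILURE
  'WM001', // ERRCODE_WORKER_OFFLINE
  'WM002', // ERRCODE_SYSTEM_NOT_READY
  'WM003', // ERRCODE_CANCEL_FOR_RESTART
  'YB044', // RecoverableGeneric
]);

// By default, the same query may be restarted only once.
const MAX_DEFAULT_RESTARTS = 1;

function defaultPolicyAllowsRestart(errorCode, restartCount) {
  return (
    DEFAULT_RESTART_CODES.has(String(errorCode)) &&
    restartCount < MAX_DEFAULT_RESTARTS
  );
}

// Usage: first failure with WM001 is restarted; a second failure is not.
console.log(defaultPolicyAllowsRestart('WM001', 0));
console.log(defaultPolicyAllowsRestart('WM001', 1));
```

A code outside the set, such as EEOOM, is not covered by this default policy regardless of how many times the query has been restarted.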

Another system-defined rule (flex_expandResourcesErrorHandler) defines restart behavior for queries that run in the flex profile:

log.info(w + ' is restarting for error ' + w.errorCode);
if (String(w.errorCode).match(/53200|KE002|YB004|KE032|KE029|YB006|EEOOM/)) {

  // Try to expand resources; if that succeeds, retry the query with more resources.
  if (!wlm.assignMaximumResources(w)) {
    w.errorRecoverable = false;
    log.info(w + ' cannot expand resources; marked as not recoverable');
  } else {
    w.errorRecoverable = true;
    log.info(w + ' expanded resources for restart (memory ' + w.requestedMemoryMB + ', spill ' + w.requestedSpillMB + ')');
  }
}

This rule logs a message for every restarting query. For queries that failed with a specific subset of the recoverable error codes (errors that typically indicate conditions under which a query is likely to benefit from more resources), it also tries to expand the query's resources and marks the error as recoverable or not, depending on the outcome. For example, if a query runs out of memory, the flex profile can expand or contract its resources to accommodate different levels of concurrency, effectively making more (or all) of its memory available.

If wlm.assignMaximumResources returns true, the resources available to the pool (memory, temp space, priority/CPU) were expanded, and the INFO message logs the resulting memory and spill values. If the resources were not expanded, the INFO message reports that instead and the query is marked as not recoverable. Note that the wlm.assignMaximumResources function is in place for the purpose of error recovery and may not be of practical use within your own WLM rules. User-defined rules are more likely to benefit from setting requested resources, as defined by a different set of memory and spill space properties, or by setting query priority.
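To see the two branches of this rule outside the database, the rule body can be wrapped in a function and run against stand-in objects. This is only a sketch: the flexErrorHandler wrapper, the makeQuery helper, the stub wlm and log objects, and the memory and spill values are all assumptions made for this example and do not reflect how WLM supplies these objects at runtime.

```javascript
// The rule body from above, wrapped so it can run against stand-in objects.
function flexErrorHandler(w, wlm, log) {
  log.info(w + ' is restarting for error ' + w.errorCode);
  if (String(w.errorCode).match(/53200|KE002|YB004|KE032|KE029|YB006|EEOOM/)) {
    if (!wlm.assignMaximumResources(w)) {
      w.errorRecoverable = false;
      log.info(w + ' cannot expand resources; marked as not recoverable');
    } else {
      w.errorRecoverable = true;
      log.info(w + ' expanded resources for restart (memory ' +
        w.requestedMemoryMB + ', spill ' + w.requestedSpillMB + ')');
    }
  }
}

// Hypothetical stand-in for the query object w.
function makeQuery(errorCode) {
  return {
    errorCode,
    errorRecoverable: undefined,
    requestedMemoryMB: 0,
    requestedSpillMB: 0,
    toString() { return 'query#1'; },
  };
}

// Hypothetical stubs: one pool that can expand, one that cannot.
const messages = [];
const log = { info: (msg) => messages.push(msg) };
const canExpand = {
  assignMaximumResources(w) {
    w.requestedMemoryMB = 4096; // made-up value
    w.requestedSpillMB = 8192;  // made-up value
    return true;
  },
};
const cannotExpand = { assignMaximumResources: () => false };

const expanded = makeQuery('EEOOM');
flexErrorHandler(expanded, canExpand, log);

const stuck = makeQuery('KE002');
flexErrorHandler(stuck, cannotExpand, log);

const unrelated = makeQuery('WM001');
flexErrorHandler(unrelated, canExpand, log);

console.log(messages.join('\n'));
```

In this sketch the EEOOM query is marked recoverable, the KE002 query is marked not recoverable, and the WM001 query is left untouched because its code does not match the rule's pattern.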

Requesting Additional Resources for Restarted Queries

Administrators can restart a running query in another pool explicitly by using the RESTART query command. When you restart a query in this way, you can request specific resources (memory, priority, and spill space), which are allocated if they are available when the command is submitted.

For example:

premdb=# restart 347010 to wlm resource pool large 
with ( priority high, memory '500MB', memory '40%', spill '10%' );
RESTART

The same functionality is available via WLM rules that contain a restart action. The following rule shows how to request resources when a query restarts in a pool. The rule defines two actions based on a set of criteria.

if ((String(w.application).indexOf('ybsql') >= 0) &&
   w.user === 'bobr' &&
   w.errorCode === 'EEOOM') {
  w.requestedMemoryPercent = 100;
  w.restartInResourcePool('max_memory_pool');
}

This rule requests a restart in the max_memory_pool pool with 100% of its memory if an OOM error occurs for user bobr running a ybsql query.
