Restart Rule Examples

The following examples describe rules that are evaluated when queries are restarted because of an error.

Modified Restart on Error System Rule

The global_restartErrorCodes and global_restartErrorCodesSuperuser system rules trigger a one-time restart attempt for queries that return a small subset of recoverable error codes. If you want to extend these rules to apply to more error codes, Yellowbrick recommends that you create new rules rather than modify existing rules. (If system rule versions change during an upgrade, your modifications to those existing rules will be lost.)

Start by looking at the default definition of global_restartErrorCodes:
yellowbrick=# select * from sys.wlm_active_rule where rule_name ='global_restartErrorCodes';
-[ RECORD 1 ]+-----------------------------------------------------------------------------------------------------
profile_name | (global)
rule_name    | global_restartErrorCodes
rule_type    | restart_for_error
order        | 1
enabled      | t
superuser    | f
expression   | // Recoverable error codes:                                                                         +
             | //  - KE041 # Library file write                                                                    +
             | //  - YB044 # RecoverableGeneric                                                                    +
             | //  - WM001 # ERRCODE_WORKER_OFFLINE                                                                +
             | //  - KE039 # RPCCHANNELBROKEN                                                                      +
             | //  - KE038 # RPCCHANNELCLOSED                                                                      +
             | //  - P0004 # Assert failure                                                                        +
             | //  - WM002 # System not ready                                                                      +
             | //  - WM003 # Query restarting                                                                      +
             | if (w.errorRecoverable === undefined || w.errorRecoverable == null || w.errorRecoverable) {         +
             |   w.errorRecoverable = String(w.errorCode).match(/KE041|YB044|WM001|KE039|KE038|P0004|WM002|WM003/);+
             | }                                                                                                   +
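You can use the SMC or a SQL command to create a new rule with a different name that copies most of this rule definition. For example, use the same main settings as the record shown above: rule type restart_for_error, order 1, enabled, and not restricted to superusers.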

Now use the same rule definition, but add another error code from the recoverable list (KE001 in this example):
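As a minimal sketch of that change, the expression below copies the default rule body shown earlier and adds KE001 to the regular expression. The wrapper function and the sample query objects are hypothetical scaffolding for trying the logic outside the database; in an actual rule, only the body of the function would go into the rule definition, and w would be supplied by the workload manager.

```javascript
// Hypothetical wrapper so the rule body can be exercised outside the database.
// In a real rule, only the body of this function is the rule expression.
function restartErrorCodesKE001Added(w) {
  // Same guard as global_restartErrorCodes: skip queries that an earlier
  // rule has already marked as not recoverable.
  if (w.errorRecoverable === undefined || w.errorRecoverable == null || w.errorRecoverable) {
    // Default list of codes plus KE001.
    w.errorRecoverable = String(w.errorCode).match(/KE001|KE041|YB044|WM001|KE039|KE038|P0004|WM002|WM003/);
  }
  return w;
}

// Stand-in query objects (assumed shape: only the properties the rule touches).
const ke001Query = restartErrorCodesKE001Added({ errorCode: 'KE001' });
const unlistedQuery = restartErrorCodesKE001Added({ errorCode: 'XX000' });

console.log(Boolean(ke001Query.errorRecoverable));    // true
console.log(Boolean(unlistedQuery.errorRecoverable)); // false
```

Note that match() returns an array on success and null otherwise, so the property is truthy or null rather than a strict boolean, exactly as in the default rule.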

Now disable the existing global_restartErrorCodes rule and activate changes to make your new rule, global_restartErrorCodesKE001Added, take effect instead.

Alternatively, you can enable both rules. In that case, your new rule needs a higher rule order, such as 10, so that the existing rule is applied first. You would also need to remove the following enclosing condition from the new rule definition:
if (w.errorRecoverable === undefined || w.errorRecoverable == null || w.errorRecoverable) {   }
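To illustrate how the two rules might interact when both are enabled, the sketch below applies the default expression first and then an unguarded version of the new expression, in rule order. The runner function, the rule wrappers, and the sample objects are assumptions made for illustration; the actual evaluation is performed by the workload manager. Because the unguarded rule overwrites w.errorRecoverable, this sketch keeps the full list of codes in its regular expression rather than KE001 alone.

```javascript
// Default rule body (order 1), copied from global_restartErrorCodes.
function defaultRestartRule(w) {
  if (w.errorRecoverable === undefined || w.errorRecoverable == null || w.errorRecoverable) {
    w.errorRecoverable = String(w.errorCode).match(/KE041|YB044|WM001|KE039|KE038|P0004|WM002|WM003/);
  }
}

// New rule body (order 10) with the enclosing condition removed.
function addedRestartRule(w) {
  w.errorRecoverable = String(w.errorCode).match(/KE001|KE041|YB044|WM001|KE039|KE038|P0004|WM002|WM003/);
}

// Hypothetical runner: apply the rules in ascending rule order.
function applyInRuleOrder(w) {
  defaultRestartRule(w);
  addedRestartRule(w);
  return w;
}

const afterBothRules = applyInRuleOrder({ errorCode: 'KE001' });
console.log(Boolean(afterBothRules.errorRecoverable)); // true
```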

Note that there are two instances of the restart error codes rule: global_restartErrorCodesSuperuser for superusers and global_restartErrorCodes for non-superusers. You may need to create modified versions of both to suit your requirements.

Restart and Try to Expand Resources

The flex profile has a predefined rule, flex_expandResourcesErrorHandler, which attempts to increase the resources available to a query that errors out with one of several specified error codes. The attempt to expand resources happens when the query restarts and only applies to the flex profile.

The rule is defined as follows:
... + ' is restarting for error ' + w.errorCode);
if (String(w.errorCode).match(/53200|KE002|YB004|KE032|KE029|YB006|EEOOM/)) {

  // See if we can't expand resources; if we can, let's try the query with more resources.
  if (!wlm.assignMaximumResources(w)) {
     w.errorRecoverable = false;
     ... + ' cannot expand resources; marked as not recoverable');
  } else {
     w.errorRecoverable = true;
     ... + ' expanded resources for restart (memory ' + w.requestedMemoryMB + ', spill ' + w.requestedSpillMB + ')');
  }
}

The wlm.assignMaximumResources(w) method returns true if expanded resources (memory and spill space) are available from the flex pool. Additional resources may or may not be available, depending on concurrent query activity in that pool. This rule also logs INFO messages that either mark the query as not recoverable or list the memory and spill space requested for the restart.
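The two branches of this rule can be tried outside the database with stand-ins for the objects that the workload manager normally provides. In the sketch below, the wrapper function, the wlm stubs, and the sample query objects are hypothetical, and the logging calls are omitted; only the branch logic follows the rule definition.

```javascript
// Branch logic of flex_expandResourcesErrorHandler, wrapped in a hypothetical
// function that receives the query (w) and a workload manager stand-in (wlm).
function expandResourcesOnError(w, wlm) {
  if (String(w.errorCode).match(/53200|KE002|YB004|KE032|KE029|YB006|EEOOM/)) {
    if (!wlm.assignMaximumResources(w)) {
      // No extra memory or spill space available: do not restart.
      w.errorRecoverable = false;
    } else {
      // Resources were expanded: allow the restart.
      w.errorRecoverable = true;
    }
  }
  return w;
}

// Stubs that only imitate the return value of wlm.assignMaximumResources(w).
const poolHasRoom = { assignMaximumResources: function (w) { return true; } };
const poolIsFull = { assignMaximumResources: function (w) { return false; } };

const expanded = expandResourcesOnError({ errorCode: 'KE002' }, poolHasRoom);
const denied = expandResourcesOnError({ errorCode: 'KE002' }, poolIsFull);
const untouched = expandResourcesOnError({ errorCode: 'KE041' }, poolIsFull);

console.log(expanded.errorRecoverable);  // true
console.log(denied.errorRecoverable);    // false
console.log(untouched.errorRecoverable); // undefined
```

The last case shows that an error code outside this rule's list is left untouched, so other restart rules can still decide whether the query is recoverable.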

See also Recoverable Error Codes.