Resuming a Partial Load

If a bulk load consists of multiple transactions and fails after some transactions have been committed, you can resume the load from the last committed position. Use the --resume-partial-load-from-offset option to resume a partially loaded source file at a specific byte offset, as reported in the ybload failure messages. For example:

...
2018-01-17 17:02:59.989 [FATAL] <main>  FAILED BULK LOAD: Last commit occurred after 300 good rows
2018-01-17 17:02:59.989 [ WARN] <main>  At the time of the last commit:
   300 good row(s) had been committed
   3 bad row(s) had been skipped
   1 source(s) had been partially loaded

2018-01-17 17:02:59.989 [ WARN] <main>  
To resume loading from the last committed position, invoke ybload as follows:
   1) ybload <original options> --resume-partial-load-from-offset 150000 /data/tests/tmp1.csv

2018-01-17 17:02:59.989 [ WARN] <main>  BEWARE: Additional bad rows were written to the bad row file after the last commit
2018-01-17 17:02:59.989 [ WARN] <main>          When fixing rows in the bad row file, ignore any bad rows that follow this message:
2018-01-17 17:02:59.990 [ WARN] <main>          "----- successful commit after 3 bad rows -----"

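If you capture the ybload log output, the suggested resume offset and source file can be scraped from it rather than copied by hand. The following is a minimal sketch, not part of ybload itself; it assumes the message format shown in the example above, and the function name and regular expression are illustrative:

```python
import re

# Matches the resume command that ybload prints after a failed bulk load.
# Assumes the message format shown in the example log above.
OFFSET_RE = re.compile(r"--resume-partial-load-from-offset\s+(\d+)\s+(\S+)")

def resume_hint(log_text):
    """Return (byte_offset, source_path) from the suggested command, or None."""
    match = OFFSET_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

Applied to the log above, this would recover the offset 150000 and the source path /data/tests/tmp1.csv for use in a scripted resume.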
In the following example, the first three source files were completely loaded before the failure, so the load can restart from the beginning of the fourth file. The --resume-partial-load-from-offset option is therefore not necessary:

...
2018-01-17 17:03:06.949 [FATAL] <main>  FAILED BULK LOAD: Last commit occurred after 300 good rows
2018-01-17 17:03:06.949 [ WARN] <main>  At the time of the last commit:
   300 good row(s) had been committed
   3 bad row(s) had been skipped
   3 source(s) had been completely loaded
   6 source(s) had not started to load

2018-01-17 17:03:06.950 [ WARN] <main>  
To resume loading from the last committed position, invoke ybload as follows:
   1) ybload <original options>  \
   /data/tests/tmp4.csv  \
   /data/tests/tmp5.csv  \
   /data/tests/tmp6.csv  \
   /data/tests/tmp7.csv  \
   /data/tests/tmp8.csv  \
   /data/tests/tmp9.csv

2018-01-17 17:03:06.950 [ WARN] <main>  BEWARE: Additional bad rows were written to the bad row file after the last commit
2018-01-17 17:03:06.950 [ WARN] <main>          When fixing rows in the bad row file, ignore any bad rows that follow this message:
2018-01-17 17:03:06.950 [ WARN] <main>          "----- successful commit after 3 bad rows -----"

In the third example, one source file was partially loaded and another had not started to load, so two separate ybload operations must be run to complete the load:

...
2018-01-17 17:03:12.118 [FATAL] <main>  FAILED BULK LOAD: Last commit occurred after 800 good rows
2018-01-17 17:03:12.118 [ WARN] <main>  At the time of the last commit:
   800 good row(s) had been committed
   8 bad row(s) had been skipped
   1 source(s) had been completely loaded
   1 source(s) had been partially loaded
   1 source(s) had not started to load

2018-01-17 17:03:12.119 [ WARN] <main>  
To resume loading from the last committed position, invoke ybload as follows:
   1) ybload <original options> --resume-partial-load-from-offset 100000 /data/tests/big2.csv
   2) ybload <original options>  \
   /data/tests/big3.csv

2018-01-17 17:03:12.119 [ WARN] <main>  BEWARE: Additional bad rows were written to the bad row file after the last commit
2018-01-17 17:03:12.119 [ WARN] <main>          When fixing rows in the bad row file, ignore any bad rows that follow this message:
2018-01-17 17:03:12.119 [ WARN] <main>          "----- successful commit after 8 bad rows -----"
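The warnings above also apply when repairing the bad-row file: bad rows written after the last commit will be encountered again by the resumed load, so only the rows before the final commit marker need to be fixed. A minimal sketch of that filtering, assuming the commit marker appears in the bad-row file exactly as quoted in the warning (the function name is illustrative):

```python
# The marker line ybload writes into the bad-row file at each commit,
# as quoted in the warning messages above.
MARKER_PREFIX = "----- successful commit after"

def rows_to_fix(lines):
    """Return only the bad rows recorded before the last commit marker."""
    last_marker = -1
    for i, line in enumerate(lines):
        if line.startswith(MARKER_PREFIX):
            last_marker = i
    # No marker found: every bad row predates the last commit, so fix them all.
    kept = lines if last_marker < 0 else lines[:last_marker]
    # Drop any earlier marker lines; they are separators, not data rows.
    return [line for line in kept if not line.startswith(MARKER_PREFIX)]
```

Rows returned by this helper are the ones worth correcting and reloading; anything after the last marker can be ignored, as the warning advises.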