Parquet Schema Mapping and Type Casting
This section describes how Parquet types map to Yellowbrick data types (the types that can be stored in the columns of Yellowbrick tables) and which of those mappings require a cast.
Mapping for Parquet Boolean Type
The Parquet boolean type maps directly to the Yellowbrick BOOLEAN data type. No other mappings are supported.
Mappings for Parquet INT32 Types
The following table shows which Parquet INT32 types map to Yellowbrick data types, either directly or with a cast.
The first row of the table covers the Parquet primitive type; the subsequent rows cover its annotated logical types.
| Parquet type | CHAR, VARCHAR | BOOLEAN | SMALLINT | INT | BIGINT | REAL | DOUBLE | DECIMAL | DATE | TIME | TIMESTAMP, TIMESTAMPTZ | UUID | IPV4, IPV6 | MACADDR, MACADDR8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| INT32 | Yes, with cast | No | Yes, with cast | Yes | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | No | No | No | No | No | No |
| INT/UINT (8/16/32, sign) | Yes, with cast | No | Yes: INT(8), UINT(8), INT(16) | Yes: UINT(16), INT(32) | Yes: UINT(32) | Yes, with cast | Yes, with cast | Yes, with cast | No | No | No | No | No | No |
| DECIMAL (1-9) | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes | No | No | No | No | No | No |
| DATE | Yes, with cast | No | No | No | No | No | No | No | Yes | No | Yes, with cast | No | No | No |
| TIME (MILLIS) | Yes, with cast | No | No | No | No | No | No | No | No | Yes | No | No | No | No |
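The direct (no-cast) entries in the INT/UINT row are consistent with a simple range rule: each annotated type goes to the smallest Yellowbrick integer type that can hold all of its values. The Python sketch below reproduces that rule. It assumes that SMALLINT, INT, and BIGINT are 16-, 32-, and 64-bit signed integers, and `smallest_fit` is a hypothetical helper written for illustration, not part of any Yellowbrick tool.

```python
from typing import Optional, Tuple

# Assumed signed ranges for the Yellowbrick integer types (16/32/64-bit).
YB_INT_RANGES = {
    "SMALLINT": (-2**15, 2**15 - 1),
    "INT": (-2**31, 2**31 - 1),
    "BIGINT": (-2**63, 2**63 - 1),
}


def parquet_int_range(bit_width: int, signed: bool) -> Tuple[int, int]:
    """Value range of a Parquet INT(bit_width, signed) logical type."""
    if signed:
        return -2**(bit_width - 1), 2**(bit_width - 1) - 1
    return 0, 2**bit_width - 1


def smallest_fit(bit_width: int, signed: bool) -> Optional[str]:
    """Hypothetical helper: smallest integer type that holds every value."""
    low, high = parquet_int_range(bit_width, signed)
    # Dict order is narrowest to widest, so the first match is the smallest.
    for name, (type_low, type_high) in YB_INT_RANGES.items():
        if type_low <= low and high <= type_high:
            return name
    return None  # no integer type is wide enough


for width, signed in [(8, True), (8, False), (16, True),
                      (16, False), (32, True), (32, False)]:
    label = ("INT" if signed else "UINT") + "(" + str(width) + ")"
    print(label, "->", smallest_fit(width, signed))
```

Under these assumptions, UINT(16) lands in INT rather than SMALLINT because its maximum value, 65535, exceeds the signed 16-bit maximum of 32767.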
Mappings for Parquet INT64 Types
The following table shows which Parquet INT64 types map to Yellowbrick data types, either directly or with a cast.
The first row of the table covers the Parquet primitive type; the subsequent rows cover its annotated logical types.
| Parquet type | CHAR, VARCHAR | BOOLEAN | SMALLINT | INT | BIGINT | REAL | DOUBLE | DECIMAL | DATE | TIME | TIMESTAMP, TIMESTAMPTZ | UUID | IPV4, IPV6 | MACADDR, MACADDR8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| INT64 | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | No | No | No | No | No | No |
| INT/UINT(64, sign) | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes: INT(64) | Yes, with cast | Yes, with cast | Yes: UINT(64) | No | No | No | No | No | No |
| DECIMAL (1-18) | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes | No | No | No | No | No | No |
| TIME (MICROS, NANOS) | Yes, with cast | No | No | No | No | No | No | No | No | Yes | No | No | No | No |
| TIMESTAMP (UTC, unit) | Yes, with cast | No | No | No | No | No | No | No | Yes, with cast | Yes, with cast | Yes | No | No | No |
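To make the TIMESTAMP row concrete: a Parquet INT64 TIMESTAMP value that is adjusted to UTC is a count of milliseconds, microseconds, or nanoseconds since 1970-01-01 00:00:00 UTC. The Python sketch below decodes such a value and derives the date and time parts that the DATE and TIME casts would correspond to. It illustrates what the logical type encodes and is not a description of how Yellowbrick performs the load; `decode_timestamp` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

UNITS_PER_SECOND = {"MILLIS": 10**3, "MICROS": 10**6, "NANOS": 10**9}
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_timestamp(value: int, unit: str) -> datetime:
    """Hypothetical helper: convert an INT64 TIMESTAMP to a UTC datetime."""
    per_second = UNITS_PER_SECOND[unit]
    # Python datetimes have microsecond resolution, so NANOS values are
    # floored to the microsecond here.
    micros = value * 10**6 // per_second
    return EPOCH + timedelta(microseconds=micros)


ts = decode_timestamp(86_400_000_000, "MICROS")  # one day after the epoch
print(ts.isoformat())
print(ts.date())  # the part a cast to DATE would keep
print(ts.time())  # the part a cast to TIME would keep
```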
Mappings for Parquet FLOAT, DOUBLE, and INT96 Types
The following table shows which Parquet FLOAT, DOUBLE, and INT96 types map to Yellowbrick data types, either directly or with a cast.
| Parquet type | CHAR, VARCHAR | BOOLEAN | SMALLINT | INT | BIGINT | REAL | DOUBLE | DECIMAL | DATE | TIME | TIMESTAMP, TIMESTAMPTZ | UUID | IPV4, IPV6 | MACADDR, MACADDR8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FLOAT | Yes, with cast | No | No | No | No | Yes | Yes, with cast | No | No | No | No | No | No | No |
| DOUBLE | Yes, with cast | No | No | No | No | Yes, with cast | Yes | No | No | No | No | No | No | No |
| INT96 | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes | No | No | No | No | No | No |
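The difference between a direct mapping and a cast matters most when the target type is narrower. The Python sketch below shows why a DOUBLE to REAL cast can change a value, under the assumption that REAL is a 32-bit IEEE 754 float and DOUBLE a 64-bit one. `to_single_precision` is a hypothetical helper that only simulates the narrowing; it does not call any Yellowbrick code.

```python
import struct


def to_single_precision(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit IEEE 754 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


for original in (0.5, 0.1):
    narrowed = to_single_precision(original)
    # 0.5 is exactly representable in 32 bits; 0.1 is not.
    print(original, "->", narrowed, "(unchanged)" if narrowed == original else "(rounded)")
```

Values that are exactly representable in 32 bits survive the cast unchanged; others are rounded to the nearest representable value.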
Mappings for Parquet Byte Array Types
The following table shows which Parquet byte array types map to Yellowbrick data types, either directly or with a cast.
The first row of the table covers the Parquet primitive type; the subsequent rows cover its annotated logical types.
| Parquet type | CHAR, VARCHAR | BOOLEAN | SMALLINT | INT | BIGINT | REAL | DOUBLE | DECIMAL | DATE | TIME | TIMESTAMP, TIMESTAMPTZ | UUID | IPV4, IPV6 | MACADDR, MACADDR8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BYTE ARRAY | Yes | No | No | No | No | No | No | No | No | No | No | No | No | No |
| STRING/UTF-8 | Yes | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast |
| ENUM | Yes | No | No | No | No | No | No | No | No | No | No | No | No | No |
| DECIMAL(N) | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes | No | No | No | No | No | No |
| JSON | Yes | No | No | No | No | No | No | No | No | No | No | No | No | No |
| BSON | Yes | No | No | No | No | No | No | No | No | No | No | No | No | No |
Mappings for Parquet Fixed-Length Byte-Array Types
The following table shows which Parquet fixed-length byte-array types map to Yellowbrick data types, either directly or with a cast.
The first row of the table covers the Parquet primitive type; the subsequent rows cover its annotated logical types.
| Parquet type | CHAR, VARCHAR | BOOLEAN | SMALLINT | INT | BIGINT | REAL | DOUBLE | DECIMAL | DATE | TIME | TIMESTAMP, TIMESTAMPTZ | UUID | IPV4, IPV6 | MACADDR, MACADDR8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FIXED-LENGTH BYTE-ARRAY | Yes | No | No | No | No | No | No | No | No | No | No | No | No | No |
| 16/UUID | Yes, with cast | No | No | No | No | No | No | No | No | No | No | Yes | No | No |
| N/DECIMAL(N) | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes | No | No | No | No | No | No |
| 12/Interval | No | No | No | No | No | No | No | No | No | No | No | No | No | No |
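The precision ranges in the DECIMAL rows (1-9 for INT32, 1-18 for INT64) follow from how many decimal digits always fit in a signed two's-complement value of that width, and the same rule applies to a fixed-length byte array of N bytes. The Python sketch below computes that limit; `max_decimal_precision` is a hypothetical helper used only to illustrate the arithmetic.

```python
def max_decimal_precision(num_bytes: int) -> int:
    """Largest precision p such that 10**p - 1 fits in a signed integer
    stored in num_bytes bytes (two's complement)."""
    max_value = 2 ** (8 * num_bytes - 1) - 1
    precision = 0
    while 10 ** (precision + 1) - 1 <= max_value:
        precision += 1
    return precision


for num_bytes in (4, 8, 16):
    print(num_bytes, "bytes ->", max_decimal_precision(num_bytes), "digits")
```

A 4-byte value therefore holds up to 9 digits and an 8-byte value up to 18, which matches the INT32 and INT64 rows; a 16-byte fixed-length array holds up to 38.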