
Parquet Schema Mapping and Type Casting

This section describes how Parquet types map to Yellowbrick data types (the data types supported for column storage in Yellowbrick tables) and which of those mappings require a cast.

Mapping for Parquet Boolean Type

The Parquet boolean type maps directly to the Yellowbrick BOOLEAN data type. Parquet boolean values cannot be loaded into any other Yellowbrick data type.

Mappings for Parquet INT32 Types

The following table indicates which Parquet INT32 data types map to Yellowbrick data types, either directly ("Yes") or with casting ("Yes, with cast").

The first row in the table refers to the Parquet primitive type; the subsequent rows refer to annotated logical types.

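| Parquet type | CHAR, VARCHAR | BOOLEAN | SMALLINT | INT | BIGINT | REAL | DOUBLE | DECIMAL | DATE | TIME | TIMESTAMP, TIMESTAMPTZ | UUID | IPV4, IPV6 | MACADDR, MACADDR8 | BYTEA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| INT32 | Yes, with cast | No | Yes, with cast | Yes | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | No | No | No | No | No | No | No |
| INT/UINT (8/16/32, sign) | Yes, with cast | No | Yes: INT(8), UINT(8), INT(16) | Yes: UINT(16), INT(32) | Yes: UINT(32) | Yes, with cast | Yes, with cast | Yes, with cast | No | No | No | No | No | No | No |
| DECIMAL (1-9) | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes | No | No | No | No | No | No | No |
| DATE | Yes, with cast | No | No | No | No | No | No | No | Yes | No | Yes, with cast | No | No | No | No |
| TIME (MILLIS) | Yes, with cast | No | No | No | No | No | No | No | No | Yes | No | No | No | No | No |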

Mappings for Parquet INT64 Types

The following table indicates which Parquet INT64 data types map to Yellowbrick data types, either directly ("Yes") or with casting ("Yes, with cast").

The first row in the table refers to the Parquet primitive type; the subsequent rows refer to annotated logical types.

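| Parquet type | CHAR, VARCHAR | BOOLEAN | SMALLINT | INT | BIGINT | REAL | DOUBLE | DECIMAL | DATE | TIME | TIMESTAMP, TIMESTAMPTZ | UUID | IPV4, IPV6 | MACADDR, MACADDR8 | BYTEA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| INT64 | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | No | No | No | No | No | No | No |
| INT/UINT (64, sign) | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes: INT(64) | Yes, with cast | Yes, with cast | Yes: UINT(64) | No | No | No | No | No | No | No |
| DECIMAL (1-18) | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes | No | No | No | No | No | No | No |
| TIME (MICROS, NANOS) | Yes, with cast | No | No | No | No | No | No | No | No | Yes | No | No | No | No | No |
| TIMESTAMP (UTC, unit) | Yes, with cast | No | No | No | No | No | No | No | Yes, with cast | Yes, with cast | Yes | No | No | No | No |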
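
The direct (no-cast) targets in the INT/UINT rows of the INT32 and INT64 tables appear to follow a simple rule: each annotated integer type goes to the narrowest Yellowbrick integer type that can hold its full value range, and UINT(64), which exceeds a signed 64-bit range, goes to DECIMAL. The Python sketch below illustrates that reading. It assumes SMALLINT, INT, and BIGINT are 16-, 32-, and 64-bit signed integers, which this section does not state, and the function names are hypothetical.

```python
# Direct (no-cast) targets transcribed from the INT/UINT rows of the tables.
# Keys are (bit_width, signed).
DIRECT_TARGET = {
    (8, True): "SMALLINT",
    (8, False): "SMALLINT",
    (16, True): "SMALLINT",
    (16, False): "INT",
    (32, True): "INT",
    (32, False): "BIGINT",
    (64, True): "BIGINT",
    (64, False): "DECIMAL",
}

# ASSUMPTION: signed two's-complement widths of the Yellowbrick integer types.
TARGET_BITS = [("SMALLINT", 16), ("INT", 32), ("BIGINT", 64)]


def value_range(bit_width, signed):
    """Return (min, max) for an integer of the given width and signedness."""
    if signed:
        return -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
    return 0, 2 ** bit_width - 1


def narrowest_target(bit_width, signed):
    """Pick the narrowest signed integer type that holds the whole range."""
    lo, hi = value_range(bit_width, signed)
    for name, bits in TARGET_BITS:
        t_lo, t_hi = value_range(bits, True)
        if t_lo <= lo and hi <= t_hi:
            return name
    # Nothing fits (only UINT(64) reaches this point): fall back to DECIMAL.
    return "DECIMAL"


# The derived rule reproduces every direct mapping listed in the tables.
for key, expected in DIRECT_TARGET.items():
    assert narrowest_target(*key) == expected
```

Any other integer target in those rows is marked "Yes, with cast", which suggests the value range of the source may not fit the destination column.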

Mapping for Parquet FLOAT, DOUBLE, and INT96 Types

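The following table indicates which Parquet FLOAT, DOUBLE, and INT96 data types map to Yellowbrick data types, either directly ("Yes") or with casting ("Yes, with cast"). The INT96 mappings depend on whether the --int96-as-timestamp or the --no-int96-as-timestamp option is in effect.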

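| Parquet type | CHAR, VARCHAR | BOOLEAN | SMALLINT | INT | BIGINT | REAL | DOUBLE | DECIMAL | DATE | TIME | TIMESTAMP, TIMESTAMPTZ | UUID | IPV4, IPV6 | MACADDR, MACADDR8 | BYTEA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FLOAT | Yes, with cast | No | No | No | No | Yes | Yes, with cast | No | No | No | No | No | No | No | No |
| DOUBLE | Yes, with cast | No | No | No | No | Yes, with cast | Yes | No | No | No | No | No | No | No | No |
| INT96 (--int96-as-timestamp) | Yes, with cast | No | No | No | No | No | No | No | Yes, with cast | Yes, with cast | Yes | No | No | No | No |
| INT96 (--no-int96-as-timestamp) | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes | No | No | No | No | No | No | No |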
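
For context on the --int96-as-timestamp rows: INT96 timestamps written by tools such as Impala, Hive, and Spark conventionally pack 8 bytes of nanoseconds within the day followed by a 4-byte Julian day number, both little-endian. The Python sketch below decodes that layout. It is an illustration of the common convention only; that the Yellowbrick loader decodes INT96 in exactly this way is an assumption, and the function name is hypothetical.

```python
import struct
from datetime import datetime, timedelta

# Julian day number of 1970-01-01, the Unix epoch.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588


def int96_to_timestamp(raw: bytes) -> datetime:
    """Decode a 12-byte INT96 value laid out as nanos-of-day + Julian day."""
    if len(raw) != 12:
        raise ValueError("INT96 values are exactly 12 bytes")
    # "<qi": little-endian signed 64-bit nanoseconds, then signed 32-bit day.
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    # Python datetimes have microsecond resolution, so nanoseconds are truncated.
    micros = nanos_of_day // 1000
    return datetime(1970, 1, 1) + timedelta(days=days, microseconds=micros)


# One day after the epoch, at 01:00:00.
sample = struct.pack("<qi", 3_600_000_000_000, 2440589)
print(int96_to_timestamp(sample))
```

Under --no-int96-as-timestamp the same 12 bytes are treated as an integer instead, which is consistent with the numeric targets listed in that row.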

Mapping for Parquet Byte Array Types

The following table indicates which Parquet byte array data types map to Yellowbrick data types, either directly ("Yes") or with casting ("Yes, with cast").

The first row in the table refers to the Parquet primitive type; the subsequent rows refer to annotated logical types.

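| Parquet type | CHAR, VARCHAR | BOOLEAN | SMALLINT | INT | BIGINT | REAL | DOUBLE | DECIMAL | DATE | TIME | TIMESTAMP, TIMESTAMPTZ | UUID | IPV4, IPV6 | MACADDR, MACADDR8 | BYTEA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BYTE ARRAY | Yes | No | No | No | No | No | No | No | No | No | No | No | No | No | Yes |
| STRING/UTF-8 | Yes | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes |
| ENUM | Yes | No | No | No | No | No | No | No | No | No | No | No | No | No | Yes |
| DECIMAL(N) | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes | No | No | No | No | No | No | No |
| JSON | Yes | No | No | No | No | No | No | No | No | No | No | No | No | No | Yes |
| BSON | Yes | No | No | No | No | No | No | No | No | No | No | No | No | No | Yes |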

Note: When loading data from a Parquet byte array into a BYTEA or VARCHAR column, leading and trailing whitespace is preserved.

Mapping for Parquet Fixed-Length Byte-Array Types

The following table indicates which Parquet fixed-length byte-array data types map to Yellowbrick data types, either directly ("Yes") or with casting ("Yes, with cast").

The first row in the table refers to the Parquet primitive type; the subsequent rows refer to annotated logical types, written as byte length/logical type.

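| Parquet type | CHAR, VARCHAR | BOOLEAN | SMALLINT | INT | BIGINT | REAL | DOUBLE | DECIMAL | DATE | TIME | TIMESTAMP, TIMESTAMPTZ | UUID | IPV4, IPV6 | MACADDR, MACADDR8 | BYTEA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FIXED-LENGTH BYTE-ARRAY | Yes | No | No | No | No | No | No | No | No | No | No | No | No | No | Yes |
| 16/UUID | Yes, with cast | No | No | No | No | No | No | No | No | No | No | Yes | No | No | Yes |
| N/DECIMAL(N) | Yes, with cast | No | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes, with cast | Yes | No | No | No | No | No | No | No |
| 12/Interval | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No |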
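
The precision limits on the DECIMAL rows (1-9 for INT32, 1-18 for INT64, and a limit that depends on N for fixed-length byte arrays) follow from how many decimal digits a signed integer of a given byte width can always represent. The Python sketch below computes that limit and decodes a byte-array DECIMAL value, which Parquet stores as a big-endian two's-complement unscaled integer. The function names are hypothetical and the code illustrates the Parquet encoding, not Yellowbrick's implementation.

```python
from decimal import Decimal


def max_decimal_precision(byte_width: int) -> int:
    """Largest precision whose every value fits a signed integer of this width."""
    largest = 2 ** (8 * byte_width - 1) - 1
    # A signed integer with d digits in its maximum value can hold every
    # (d - 1)-digit number, but not every d-digit number.
    return len(str(largest)) - 1


def decode_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode a big-endian two's-complement unscaled value with the given scale."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


print(max_decimal_precision(4))   # INT32 width
print(max_decimal_precision(8))   # INT64 width
print(decode_decimal(b"\x04\xd2", 2))
```

The first two calls give 9 and 18, matching the DECIMAL (1-9) and DECIMAL (1-18) rows; the last call decodes the unscaled value 1234 with scale 2.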