Spark and ybrelay Glossary
- Apache Spark
- Open-source engine for cluster computing; a framework for solving analytics problems at large scale.
- Avro
- Row-based storage format with its data definition in JSON, and the data itself in binary format, making it compact and efficient.
- FIFOs
- Named pipes, as produced by the Linux mkfifo command.
- HDFS
- Open-source Apache Hadoop distributed file system; manages very large data sets stored across clusters of commodity hardware.
- keystore
- A Java keystore (JKS) file that contains certificate and public/private key information required to run Spark jobs with TLS enabled.
- Parquet
- Apache Parquet, an open-source column-oriented data storage format commonly used in Hadoop projects.
- Spark application
- An application that generically consumes any data Spark feeds it (in row form).
- Spark job
- A job that is submitted to Spark to handle large-scale data export or import.
- TLS
- Transport Layer Security, a communications protocol for authenticated and encrypted connections over a network. The terms TLS and SSL tend to be used interchangeably; the combined form SSL/TLS is also used.
- ybrelay
- Yellowbrick "relay" client tool that accepts incoming data in various formats from any external file system and calls ybload to bulk load it into tables.
- ybload
- Yellowbrick bulk load client tool.
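To make the FIFOs entry concrete, the following minimal Python sketch creates a named pipe and streams a few rows through it. It only illustrates how a named pipe behaves on Linux and is not a description of how ybrelay or ybload use FIFOs internally. The function name send_rows_through_fifo, the pipe name rows.fifo, and the comma-separated row layout are assumptions made for this example.

```python
import os
import tempfile
import threading


def send_rows_through_fifo(rows):
    """Write rows into a named pipe from one thread and read them back in another."""
    received = []
    with tempfile.TemporaryDirectory() as tmpdir:
        # Hypothetical pipe name, chosen only for this illustration.
        fifo_path = os.path.join(tmpdir, "rows.fifo")
        # Same effect as running `mkfifo rows.fifo` in a shell.
        os.mkfifo(fifo_path)

        def writer():
            # Opening for write blocks until a reader opens the other end.
            with open(fifo_path, "w") as pipe:
                for row in rows:
                    pipe.write(",".join(row) + "\n")

        thread = threading.Thread(target=writer)
        thread.start()

        # Opening for read blocks until the writer opens its end;
        # iteration stops at end-of-file, when the writer closes the pipe.
        with open(fifo_path) as pipe:
            for line in pipe:
                received.append(line.rstrip("\n").split(","))

        thread.join()
    return received


if __name__ == "__main__":
    print(send_rows_through_fifo([["1", "alpha"], ["2", "beta"]]))
```

The data never lands in a regular file: the producer and consumer are connected directly through the pipe, which is why a FIFO suits handing a stream of rows from one process to another.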