TLS Support for ybrelay and Spark

Three separate hosts are involved in a ybrelay setup that runs a Spark job:

  • Spark application platform (data source)
  • ybrelay server (where the ybrelay tool is installed, with connectivity in both directions)
  • Yellowbrick data warehouse (location of target database table)

The supported TLS configuration requires trust to be established along the entire path between the three hosts. This means that both the ybrelay server and the Yellowbrick data warehouse listen for requests over TLS. Also, if a ybrelay service endpoint is configured for TLS, it is TLS-only; it cannot also accept non-TLS connections.

The TLS configuration relies on keystores, which contain the required certificates and public/private key pairs that the hosts use to verify trust when they receive connection requests. A keystore holds the key material for the relay service itself, and it also supplies the identity of the data warehouse to ybload. See Creating a Java Keystore.

The ybrelay service is typically started by systemd, so the systemd configuration must pass the keystore to the service at startup. See Running the ybrelay-init Script.

TLS Requirements for Spark

Specify --cacert as a Spark application option to identify:

  • The certificate used between Spark and the ybrelay server.
  • The certificate used between Spark and the data warehouse.

Spark requires both certificates, except when the --disable-trust option is in use for testing. You can specify a keystore file as the value for the --cacert option.
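
To check a certificate file independently of Spark, a generic TLS client can attempt a handshake against each host. The following is a minimal sketch using only the Python standard library; it is not part of ybrelay or the Spark application. It assumes the certificates are available as a PEM file (Python's ssl module does not read Java keystores), and the function names build_context and probe are hypothetical helpers introduced for illustration. The disable_trust flag only mimics the general idea of skipping verification for testing; it does not claim to reproduce what --disable-trust does internally.

```python
import socket
import ssl
from typing import Optional


def build_context(cafile: Optional[str] = None,
                  disable_trust: bool = False) -> ssl.SSLContext:
    """Build a client-side TLS context.

    cafile: path to a PEM file with the trusted certificate(s);
            if None, the system default trust store is used.
    disable_trust: skip certificate verification (testing only).
    """
    if disable_trust:
        ctx = ssl.create_default_context()
        # check_hostname must be cleared before verify_mode can be CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        return ctx
    # Default context: certificate required, hostname checked.
    return ssl.create_default_context(cafile=cafile)


def probe(host: str, port: int, ctx: ssl.SSLContext,
          timeout: float = 5.0) -> Optional[str]:
    """Open a TLS connection and return the negotiated protocol version.

    Raises ssl.SSLCertVerificationError if the server certificate is not
    trusted by the context, and another ssl.SSLError or OSError if the
    endpoint does not complete a TLS handshake.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

For example, calling probe with the ybrelay server's host name and port and a context built from the certificate file should return a protocol string such as a TLS version when trust is set up correctly. The host, port, and file path are site-specific values that you supply; none are assumed here. Repeating the call against the data warehouse host checks the second certificate.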

See also Spark Application Options.

TLS Requirements for the ybrelay Server

Specify --keystore as a ybrelay service option to identify:

  • The private key for the ybrelay service (its own private key).
  • The certificate used between the ybrelay service and the data warehouse (to start the ybload session).

See Running the ybrelay-init Script and ybrelay Service Options.