Appearance
Encrypting Sensitive Data
You can use the ENCRYPT_KS
and DECRYPT_KS
SQL functions to protect access to sensitive character data, such as Personally Identifiable Information (PII). Encryption functions scramble the input data they receive, rendering it unreadable to any user or role who does not have both the privileges and the access key required to decrypt it. Authorized users can run the encryption functions against any valid SQL expression that evaluates to a character string.
You can also create tables with encrypted columns; in this case, those columns are encrypted automatically when data is loaded or inserted. You do not need to call the ENCRYPT_KS
function explicitly.
The Yellowbrick implementation of the ENCRYPT_KS
and DECRYPT_KS
functions contains code under the OpenSSL license.
Secrets Manager Configuration
The ENCRYPT_KS
and DECRYPT_KS
functions are backed by an integrated secrets manager (or key store). The key store functions differently on appliance and cloud platforms.
Appliance platforms
Before you can use the SQL encryption functions, you have to unlock the keystore file by running the following ybcli
command on the manager node:
YBCLI(12545) (PRIMARY - yb100-mgr0)> config keystore sql enable
Are you sure you want to configure keystore for SQL integration?
Response (yes/no): yes
...
This command sets up access to the vault. The system looks up keys in the vault when they are specified as arguments to encryption functions in SQL queries.
Other platforms
On other platforms, which use Kubernetes, the database will use Kubernetes secrets and no configuration should be necessary. Check that the secrets manager is correctly configured per the below instructions.
Check the configuration
The ybd_secrets_manager
parameter defines access to the secrets manager. For example:
premdb=> show ybd_secrets_manager;
ybd_secrets_manager
---------------------
k8ss
(1 row)
SQL Configuration
- Grant
CREATE
privilege per schema to users and roles who will be able to create, delete, and describe keys created in that schema. (This is the same type of privilege that you would grant to users who are going to create tables and other objects.) For example:
premdb=# grant create on schema public to yb100;
GRANT
- Create keys with the
CREATE KEY
SQL command, using a valid hexadecimal string to define a "secret." Keys belong to the database and schema where you create them. For example:
premdb=# CREATE KEY yb100key SECRET 'a1b2c3d4e5f0';
CREATE KEY
This command creates a key for the public schema in the premdb
database. The secret for the key is saved in the encryption keystore. (Authorized users pass in the key name, not its associated secret, when they run encryption functions.) Administrators can query the sys.key
view to return a list of keys that have been created per database and schema. For example:
premdb=# select * from sys.key;
key_id | name | schema_id | owner_id | creation_time
--------+----------+-----------+----------+-------------------------------
16480 | yb100key | 2200 | 16007 | 2019-11-07 15:26:30.577896-08
(1 row)
- Grant
ENCRYPT
and/orDECRYPT
access on keys to specific users and roles. For example:
premdb=# grant all on key yb100key to yb100;
GRANT
This example grants ENCRYPT
and DECRYPT
privileges to members of the role named yb100
.
- Test the functions by encrypting and then decrypting a column from a table:
premdb=# create table encrypted_names as select encrypt_ks(nickname, yb100key) enc from team;
SELECT 50
premdb=# select decrypt_ks(enc,yb100key) from encrypted_names;
decrypt_ks
------------
Gunners
Villains
Tykes
Blues
Rovers
...
For more details about the commands used in these steps, see:
Calling the Functions
The ENCRYPT_KS
and DECRYPT_KS
functions take the following arguments:
- An input expression (required), which must evaluate to a character string. For example, this could be a column name, the concatenation of values from two columns, or the result of another function.
- The name of the key (required). Users specify keys to ensure that the data is encrypted and decrypted consistently every time the same source data is accessed. Keys are mapped to secrets stored in the vault.
- One of three standard AES algorithms, all of which use Output Feedback Mode (OFB):
- AES with a 128-bit key (the default)
- AES with a 192-bit key
- AES with a 256-bit key
- An optional "initialization vector" provides an additional level of obfuscation. This vector provides an efficient way to re-scramble the encrypted data without having to rebuild the overall character mapping that is generated when a new key is used. Instead of using a different key for each function call, you can use the same key and adjust the vector. Taken together, the key and vector parameters function like a login credential with two levels of privacy and security.
Using ybunload
, you can unload data with a query that contains the ENCRYPT_KS
function. You can reload the data as is with ybload
, then decrypt it with a query or CTAS
statement that uses the DECRYPT_KS
function. (You cannot load and decrypt data in one step with ybload
.)
Creating Tables with Encrypted Columns
When you create a table with encrypted columns, data inserted or loaded into those columns is automatically encrypted using the ENCRYPT_KS
function. In turn, encrypted data is automatically decrypted when it is selected by users who have DECRYPT
privilege on the key (and SELECT
privilege on the table). Users writing queries against encrypted columns do not have to specify the DECRYPT_KS
function. Users who do not have DECRYPT
privilege on the key see the stored encrypted values.
The CREATE TABLE statement supports the ENCRYPTED
column constraint for VARCHAR
columns only. For example, the following table is defined with three encrypted VARCHAR
columns: weekly_wages
, dob
(date of birth), and cob
(country of birth):
premdb=# create table player(
playerid bigint not null,
teamid smallint not null,
seasonid smallint not null,
firstname varchar(30),
lastname varchar(30),
position char(1),
dob varchar(100) encrypted with(COLUMN_ENCRYPTION_KEY=playerkey,ENCRYPTION_TYPE=RANDOMIZED,ALGORITHM='AES_256_OFB'),
weekly_wages varchar(100) encrypted with(COLUMN_ENCRYPTION_KEY=playerkey,ENCRYPTION_TYPE=RANDOMIZED,ALGORITHM='AES_256_OFB'),
avg_mins_per_match double precision,
matches_played real,
cob varchar(100) encrypted with(COLUMN_ENCRYPTION_KEY=playerkey,ENCRYPTION_TYPE=RANDOMIZED,ALGORITHM='AES_256_OFB'));
CREATE TABLE
premdb=# \d player
Table "public.player"
Column | Type | Modifiers
--------------------+------------------------+---------------------------------------------------------------------------------------------------------------------------
playerid | bigint | not null
teamid | smallint | not null
seasonid | smallint | not null
firstname | character varying(30) |
lastname | character varying(30) |
position | character(1) |
dob | character varying(100) | encrypted with (column_encryption_key = premdb.public.playerkey, encryption_type = randomized, algorithm = 'aes_256_ofb')
weekly_wages | character varying(100) | encrypted with (column_encryption_key = premdb.public.playerkey, encryption_type = randomized, algorithm = 'aes_256_ofb')
avg_mins_per_match | double precision |
matches_played | real |
cob | character varying(100) | encrypted with (column_encryption_key = premdb.public.playerkey, encryption_type = randomized, algorithm = 'aes_256_ofb')
Distribution: Hash (playerid)
This table is then loaded from a CSV file with rows like these:
15,23,27,Virgil,Van Dijk,D,07-08-1991,180000,87.6345,37.4,Netherlands
16,24,27,David,Silva,M,01-08-1986,250000,79.6723,30.5,Spain
17,22,27,Jamie,Vardy,F,01-11-1987,80000,77.1986,31.6,England
...
User bobr
belongs to a role with SELECT
privilege on the table, but this role does not have DECRYPT
privilege on the key. Therefore, user bobr
sees rows like these:
premdb=> select * from player where playerid between 15 and 17;
-[ RECORD 1 ]------+-------------------------------------------------------------
playerid | 15
teamid | 23
seasonid | 27
firstname | Virgil
lastname | Van Dijk
position | D
dob | CVdbnbTmmeYaaA4SZfOV590jszv8Ooj09zGHsIlB7Q9WhjS2ywNMrg9JVA==
weekly_wages | CYJzBAQSvlufeMDyWtH2Yr5TA7Ao+ulFpP3eY2SwBSVpNb/DlQjn
avg_mins_per_match | 87.6345
matches_played | 37.4
cob | CYz7BPoHAjpNJF0E+O8K/gU8yhRj7oFgcCKvb37lG/LlSep/9M2UbFHHECo=
-[ RECORD 2 ]------+-------------------------------------------------------------
playerid | 16
teamid | 24
seasonid | 27
firstname | David
lastname | Silva
position | M
dob | CUdWko/n2n6SbaaVb39Ck6RYwPH9+intp+oP0p7Uq5M+mMIPQa6iV+wMzA==
weekly_wages | CXWWk/rZxpNJM5LtBfW32nx+eoQ/0X9NexX236RNAC3N4533++rD
avg_mins_per_match | 79.6723
matches_played | 30.5
cob | CXjnwR9g7+iFgQi8IfBsZRYpmj4+sgkmlvSHrZFdCnZmfG+q4YI=
-[ RECORD 3 ]------+-------------------------------------------------------------
playerid | 17
teamid | 22
seasonid | 27
firstname | Jamie
lastname | Vardy
position | F
dob | CeVFkbgdYq2ZOjW7s0E/RQSWlN8W+TsvwYVu8W5EDDPRuRFJS5Mwu7/QyQ==
weekly_wages | CRMmOw16X8foeWOsq0N+x7kTNXTTFsHo5aXXeH6yzhwmeWu1XMI=
avg_mins_per_match | 77.1986
matches_played | 31.6
cob | Cf9r4Q60m/5FtYWSa8jwu3164z0SoL8UDxuz0p3FPQaRQmTjU7oL7w==
If user bobr
is granted DECRYPT
privilege on the key (either directly or via a role in which he has membership), all of the column values are visible in clear text:
premdb=# grant decrypt on key playerkey to bobr;
GRANT
premdb=# \c premdb bobr
You are now connected to database "premdb" as user "bobr".
premdb=> select * from player where playerid between 15 and 17;
playerid | teamid | seasonid | firstname | lastname | position | dob | weekly_wages | avg_mins_per_match | matches_played | cob
----------+--------+----------+-----------+----------+----------+------------+--------------+--------------------+----------------+-------------
15 | 23 | 27 | Virgil | Van Dijk | D | 07-08-1991 | 180000 | 87.6345 | 37.4 | Netherlands
16 | 24 | 27 | David | Silva | M | 01-08-1986 | 250000 | 79.6723 | 30.5 | Spain
17 | 22 | 27 | Jamie | Vardy | F | 01-11-1987 | 80000 | 77.1986 | 31.6 | England
(3 rows)
For more details about creating tables with encrypted columns, see CREATE TABLE. For details about the specific algorithms used to encrypt data, see Encryption and Decryption Algorithms.