Is there an existing routine for encryption?

Is there a library or call that already defines either AES or similar strong encryption methods?

I was looking for a way to decrypt a Hive table using Spark and then load only certain tables via a protected notebook in Zeppelin or similar.

1 ACCEPTED SOLUTION

@devers

If you mean a way to decrypt a file that has been encrypted with HDFS encryption, then no. Encryption and decryption with HDFS at-rest encryption is more complex: the encrypted encryption key (EEK) is stored with the file, and you have to talk to the KMS to get the decrypted key, and so on. If you use HDFS encryption with Hive and Spark, all of that is taken care of for you.

If you want to generate a key pair and use it in both Hive and Spark to encrypt/decrypt data, that can be done, but the encryption and decryption would then be part of how you load and work with the data. You'd need to define a UDF for Hive to use for decryption so you could reference it in a select statement, and you'd need to use libraries in Scala or Python for Spark to decrypt the data. Both would need access to the keys for decryption, though, and that may be difficult to architect in a secure fashion.
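
To make the second option a bit more concrete, here is a minimal sketch of the kind of decryption routine a Hive UDF could wrap and Spark code could call. It is only an illustration under assumptions, not a recommended design: the class name `ColumnCrypto`, the method names, and the Base64(IV + ciphertext) layout are all hypothetical, and it uses a single shared AES key with AES-GCM from the JDK's standard `javax.crypto` package rather than a key pair. How the key reaches Hive and Spark is deliberately left out, since that is the hard part mentioned above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: AES-GCM encryption of a single column value. */
final class ColumnCrypto {
    private static final int IV_BYTES = 12;   // commonly recommended IV size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    private ColumnCrypto() {}

    /** Encrypts the value and returns Base64(IV || ciphertext+tag). */
    static String encrypt(String plaintext, byte[] key) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // a fresh random IV for every value
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                        new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            // Prepend the IV so decrypt() can recover it from the stored value.
            byte[] out = ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    /** Reverses encrypt(); fails if the key is wrong or the data was altered. */
    static String decrypt(String encoded, byte[] key) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            // The first IV_BYTES bytes of the stored value are the IV.
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                        new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] pt = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }
}
```

A Hive UDF would call something like `ColumnCrypto.decrypt` for each value of the encrypted column, and Spark code in Scala could call the same class from a JVM job. The key must be 16, 24, or 32 bytes for AES, and both engines would have to obtain it from somewhere you trust.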
