Created 03-09-2017 09:55 PM
I have a requirement where I need to encrypt certain sensitive data before landing/ingesting it into Hadoop. I want to understand how Hadoop processes this kind of encrypted data (whether in Hive, Pig, or any MapReduce job).
Do we need to write specific programs to read these encrypted files in Hadoop, or do we need to set any parameters on the Hive table or Pig session to read this kind of encrypted data?
Any ideas, thoughts, or suggestions?
Created 03-10-2017 01:31 AM
You might want to look at TDE (Transparent Data Encryption) in Hadoop:
https://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html
Created 03-10-2017 06:41 PM
Thanks. TDE encrypts at the directory (encryption zone) level, but I am looking for field-level encryption rather than encrypting the entire file. I know Ranger has a feature for this, but it only helps with Hive column-level masking when I query the table; when I look at the raw file, I can still see the sensitive data.
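One common way to keep sensitive fields out of the raw file is to transform just those fields before landing the data. A minimal sketch of deterministic tokenization with HMAC-SHA256 (note: this is one-way tokenization, not reversible encryption; the key, file layout, and field names below are placeholder assumptions, and a real key would come from a KMS, not be hard-coded):

```python
import csv
import hashlib
import hmac
import io

# Placeholder key -- in practice, fetch this from a KMS or secure keystore.
SECRET_KEY = b"replace-with-key-from-kms"

def tokenize(value: str) -> str:
    """Replace a sensitive value with a deterministic HMAC-SHA256 token.

    Deterministic tokens still support joins and GROUP BY in Hive,
    but the raw file no longer exposes the original value.
    """
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

def protect_record(row: dict, sensitive_fields: set) -> dict:
    """Tokenize only the sensitive fields, leaving the rest readable."""
    return {k: tokenize(v) if k in sensitive_fields else v
            for k, v in row.items()}

# Example: protect the ssn column of a CSV before landing it in HDFS.
raw = "id,name,ssn\n1,alice,111-22-3333\n2,bob,444-55-6666\n"
reader = csv.DictReader(io.StringIO(raw))
protected = [protect_record(row, {"ssn"}) for row in reader]
```

Because the tokens are plain hex strings, Hive and Pig can read the landed file with their ordinary delimited-text loaders; no special parameters are needed at query time.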
Created 03-13-2017 01:13 PM
You can use any kind of encryption, as long as you write your own SerDe to decrypt and process the data in Hive.
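For illustration, assuming you had implemented and packaged such a SerDe (the jar path, class name, and property name below are hypothetical), the Hive table would reference it roughly like this:

```sql
-- Hypothetical custom SerDe; you would implement and package it yourself.
ADD JAR /path/to/decrypting-serde.jar;

CREATE EXTERNAL TABLE customers (
  id BIGINT,
  name STRING,
  ssn STRING
)
ROW FORMAT SERDE 'com.example.hive.DecryptingSerDe'
WITH SERDEPROPERTIES ('encrypted.columns' = 'ssn')
LOCATION '/data/landing/customers';
```

With this approach the raw files in HDFS stay encrypted, and decryption happens inside the SerDe's deserialize path when Hive reads the rows.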