HDFS Data Protection?

avatar
Rising Star

Recently, I ran into a security question. When HDFS data is protected with KMS and Ranger, are the files stored in HDFS as plaintext or as ciphertext? In other words, if I uninstall KMS and the Ranger plugin, will these HDFS files be plain text?

1 ACCEPTED SOLUTION

avatar

@Hefei Li

The data is stored encrypted, and a copy of the encrypted data encryption key (EDEK) is stored with the file. No user will be able to read the contents of the O/S-level files unless they get the KMS to provide the decrypted data encryption key (DEK). The EDEK is stored with the file so that the KMS can determine which version of the key was used to encrypt the file and provide the appropriate DEK once the policy checks for access to the file have passed. At the HDFS layer, the user must have policy access to the KMS key to decrypt the file; the user will not be able to decrypt the file unless this policy check passes. If you uninstall Ranger and the KMS, you will start seeing errors in the HDFS logs when you try to access files in an encryption zone, because HDFS (the NameNode and its clients) will no longer be able to communicate with the KMS for keys or with Ranger for the key access policies.
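To make the EDEK/DEK relationship concrete, here is a minimal toy model in Python. It is only a sketch of the idea, not how HDFS or Ranger KMS is implemented: the `ToyKMS` class, the `write_file`/`read_file` helpers, and the per-key user allow-list (standing in for a Ranger policy) are all hypothetical, and the XOR keystream is an insecure stand-in for the real cipher.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (NOT secure): XOR data with a SHA-256 counter keystream.
    Applying it twice with the same key returns the original data."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class ToyKMS:
    """Hypothetical stand-in for the KMS: holds zone keys and checks access policy."""

    def __init__(self):
        self._zone_keys = {}  # key name -> secret zone key (never leaves the KMS)
        self._allowed = {}    # key name -> users allowed to decrypt (policy stand-in)

    def create_key(self, name, allowed_users):
        self._zone_keys[name] = secrets.token_bytes(32)
        self._allowed[name] = set(allowed_users)

    def generate_edek(self, name):
        # Create a fresh per-file DEK and hand back only its encrypted form.
        dek = secrets.token_bytes(32)
        return keystream_xor(self._zone_keys[name], dek)

    def decrypt_edek(self, name, edek, user):
        # The policy check happens before any key material is released.
        if user not in self._allowed.get(name, set()):
            raise PermissionError(f"{user} may not use key {name}")
        return keystream_xor(self._zone_keys[name], edek)


def write_file(kms, key_name, user, plaintext):
    edek = kms.generate_edek(key_name)
    dek = kms.decrypt_edek(key_name, edek, user)
    # Only the ciphertext and the EDEK are stored; the DEK itself is discarded.
    return {"key_name": key_name, "edek": edek,
            "ciphertext": keystream_xor(dek, plaintext)}


def read_file(kms, user, stored):
    dek = kms.decrypt_edek(stored["key_name"], stored["edek"], user)
    return keystream_xor(dek, stored["ciphertext"])
```

In this model, what sits "on disk" is only `ciphertext` plus `edek`. A reader who cannot reach the KMS, or who fails its policy check, has no way to recover the DEK, which mirrors why the files stay unreadable after the KMS is removed.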

10 REPLIES

avatar

Hi,

I'm not sure I understand your question. If you enable KMS, you will be able to create encryption zones (in other words, directories) in HDFS in which stored files are encrypted. If you then uninstall the KMS plugin, the encryption zones (and the files in them) will remain unchanged and the data will not be readable.

avatar
Rising Star

What I mean is: if I completely uninstall KMS and Ranger, will the files stored in HDFS be readable?

avatar

No. Unless you copy your data out of your encryption zones before uninstalling Ranger KMS, the encrypted data will not be readable. The reason is that each file stored in an encryption zone has its own DEK (Data Encryption Key), which is needed to decrypt the data and can only be obtained through Ranger KMS. In conclusion, your data will remain on HDFS, but it will remain encrypted.

avatar
Rising Star

@Pierre Villard

If I do not want to use KMS and Ranger, what do I need to do now to ensure that the HDFS data is readable?

Thanks for your reply.

avatar

If you are talking about data that you have already encrypted, you have to copy your data out of your encryption zones before uninstalling Ranger KMS.

If you don't have any encrypted data, you can use HDFS without any problem with or without Ranger KMS. Your data won't be encrypted and will be readable by any application on top of HDFS.

Note: having KMS installed does not mean your data is encrypted. You have to explicitly create encryption zones in HDFS to get data encrypted. All of your original data outside those zones is untouched and readable.
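As a rough illustration of the two workflows described above (creating an encryption zone, and copying data out of one before removing Ranger KMS), the Python sketch below only builds the command lines for review; it does not run them. The key name and paths are hypothetical placeholders, and the helper functions are not part of any Hadoop tooling. Creating a zone needs an empty directory and HDFS superuser rights, and the copy has to be run by a user whose KMS policy allows decrypting with the zone key.

```python
import shlex
from typing import List


def create_zone_commands(key_name: str, zone_path: str) -> List[List[str]]:
    """Commands to create a key and turn an empty directory into an encryption zone."""
    return [
        ["hadoop", "key", "create", key_name],
        ["hdfs", "dfs", "-mkdir", "-p", zone_path],
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", zone_path],
    ]


def evacuate_commands(zone_path: str, plain_path: str) -> List[List[str]]:
    """Commands to list the zones, then copy a zone's contents to an unencrypted path.
    Reading through the HDFS client decrypts, so the copy lands as plaintext."""
    return [
        ["hdfs", "crypto", "-listZones"],
        ["hdfs", "dfs", "-cp", zone_path, plain_path],
    ]


def render(commands: List[List[str]]) -> str:
    """Format the commands as shell lines so they can be reviewed before running."""
    return "\n".join(shlex.join(cmd) for cmd in commands)
```

For example, `print(render(evacuate_commands("/secure", "/plain_copy")))` shows the commands to review. It is worth checking that the copied files are readable before uninstalling anything, since the originals in the zone cannot be decrypted once the keys are gone.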

avatar
Master Guru

@Hefei Li - The data will be safe but not readable. You can only read it if you have access to the decryption key and the KMS is running.

avatar
Rising Star

If I do not want to use KMS and Ranger, what do I need to do now to ensure that the HDFS data is readable?

Thanks for your reply.

avatar
Rising Star

Thank you so much for your help.