12-21-2018 02:44 PM - last edited on 12-22-2018 06:50 AM by cjervis
We have HDFS encryption at rest enabled in our Kerberized cluster.
I am able to create an encryption zone and write data into it as the admin (who created the key and zone). Other users (not in the same LDAP group as that admin user) are not able to access it, even with file ACLs set to rwx, because they are not authorized for [DECRYPT_EEK].
1. When a user creates an encryption zone, other users in the same Unix group get access to [DECRYPT_EEK] by default. Is this true?
2. Usually a user can see encrypted data (the ciphertext, not the actual data) if he/she has read permission on it. But this is not the case with HDFS encryption at rest: unless the user is able to decrypt the data, they are not allowed to read it. Is that correct?
3. Is there a way to show/display the encrypted data (without decrypting it)? If so, how?
4. Where do GENERATE_EEK and GET_METADATA fit into this concept?
The whole concept of maintaining keys in KMS/KTS and encrypting the data based on them seems to be more about blocking access to the data than about the data being encrypted.
If someone could provide some clarity, it would be greatly beneficial. Thank you.
01-16-2019 04:11 PM
The KMS service within the Hadoop framework is responsible for handling key material. The KMS is not responsible for encrypting or decrypting data. The KTS is not connected to access control over data. All encrypted data handling occurs within the DFS client framework.
1.) You will need to review and understand the concepts laid out in our documentation and upstream related to securing the KMS. Cloudera ships a secure-by-default ACL configuration. New keys are not automatically allotted any access controls. No users are authorized to access new keys that have undefined access controls. The KMS ACL engine is designed to control key release, and it is not in any way connected to the underlying HDFS POSIX permissions. The ACL engine indirectly controls access to encrypted data by controlling access to key material.
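As a rough illustration of what a per-key access control looks like: in upstream Hadoop KMS, key-level ACLs are properties in kms-acls.xml named key.acl.<key-name>.<OPERATION>, and the value is a comma-separated user list, a space, then a comma-separated group list. The sketch below only prints such a property. The key name mykey and the group analysts are hypothetical, and how the file is actually managed depends on your deployment.

```shell
#!/bin/sh
# Print a kms-acls.xml property granting one key operation on one key.
# Usage: key_acl_property KEY OPERATION "users groups"
# The ACL value uses the Hadoop ACL format: "user1,user2 group1,group2".
key_acl_property() {
  key=$1
  op=$2
  acl=$3
  cat <<EOF
<property>
  <name>key.acl.${key}.${op}</name>
  <value>${acl}</value>
</property>
EOF
}

# Hypothetical example: allow members of group "analysts" to decrypt EEKs
# for the key "mykey". The leading space means "no individual users, groups only".
key_acl_property mykey DECRYPT_EEK " analysts"
```

Until an entry like this (or a matching default or whitelist ACL) exists for a key, a user's HDFS permissions on the files do not help, which matches the behaviour described in the question.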
2.) Your question here is somewhat unclear. HDFS encryption is transparent to the DFS client. If a user is authorized to perform DECRYPT_EEK operations, the client decrypts the data for them and they see the plaintext. Raw encrypted data is not normally visible to clients in the way I believe you are describing, except through the raw endpoint exposed to supergroup users.
3.) You can access the raw data endpoint as a superuser if you would like to verify that the data is encrypted. This is documented publicly both upstream and in our documentation.
hdfs dfs -ls /.reserved/raw/
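To compare the two views of the same file, something along these lines could be used. This is a sketch, not a supported tool: the path you pass in is whatever file you want to inspect inside your zone, the raw read must be run as an HDFS superuser, and the normal read needs DECRYPT_EEK on the zone key.

```shell
#!/bin/sh
# Map an ordinary HDFS path to its raw (undecrypted) counterpart.
raw_path() {
  printf '/.reserved/raw%s\n' "$1"
}

# Dump the first bytes of a file as stored on disk (ciphertext).
# Must be run as an HDFS superuser; other users cannot read /.reserved/raw.
show_ciphertext() {
  hdfs dfs -cat "$(raw_path "$1")" | head -c 64 | od -c
}

# Dump the first bytes as an authorized client sees them (plaintext).
# Requires DECRYPT_EEK on the zone key in the KMS ACLs.
show_plaintext() {
  hdfs dfs -cat "$1" | head -c 64 | od -c
}

# Only touch the cluster when a path is given and the hdfs CLI is available.
if [ -n "${1:-}" ] && command -v hdfs >/dev/null 2>&1; then
  show_ciphertext "$1"
  show_plaintext "$1"
fi
```

If the two dumps differ and the raw one looks like random bytes, the data is encrypted at rest.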
4.) The GENERATE_EEK operation is handled internally by the HDFS service user and is not normally exposed to operators.
If you are a Cloudera customer, you should reach out to your account team for additional training and details.