Created on 01-30-2016 04:51 AM - edited 09-16-2022 03:01 AM
I am new to the HDFS transparent encryption feature.
I am using Cloudera CDH 5.4.9 and trying to use HDFS encryption in the following manner.
# I am using the Java KeyStore KMS: I configured the KMS service and integrated HDFS to use it
# Create two users
useradd -m user1
passwd user1
useradd -m user2
passwd user2
# As user1, perform the following operations
# Create a key and create the user1 encryption zone
# Link the user1 zone to the created key
su user1
hadoop key create user1key1
hadoop fs -mkdir /tmp/user1zone1
su hdfs
hdfs crypto -createZone -keyName user1key1 -path /tmp/user1zone1
# verify zone is created
hdfs crypto -listZones
# create file with user1 credential and put into user1 encryption zone
echo "Hello World" > /tmp/helloWorld.txt
hadoop fs -put /tmp/helloWorld.txt /tmp/user1zone1
hadoop fs -cat /tmp/user1zone1/helloWorld.txt
Hello World
# Verify if file is encrypted
su hdfs
hadoop fs -cat /.reserved/raw/tmp/user1zone1/helloWorld.txt
T▒▒6▒5▒▒7̼[
# Now login as another user
su user2
hadoop fs -cat /tmp/user1zone1/helloWorld.txt
Hello World
If the encryption zone was created by user1, then how is another user, user2, able to view the encrypted data?
I might be missing something very basic here.
Can anyone shed some light on this?
Created 02-09-2016 01:40 PM
Vmshah,
Do both users belong to the same group?
drwxr-xr-x, grouped as d(rwx)(r-x)(r-x): according to the permissions set here, user1's group and others can read and execute the data.
If you want only user1 to read, write, and execute the data, then set the permissions accordingly (e.g., hadoop fs -chmod 700 /tmp/user1zone1/helloWorld.txt).
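To make the suggestion above concrete, here is a minimal sketch that locks down the zone directory itself rather than the single file, which is my own variation on the chmod example. It assumes it is run by the directory owner (user1) or an HDFS superuser. HADOOP_CMD is a hypothetical variable I introduced so that the commands are only printed by default; set HADOOP_CMD=hadoop to actually run them.

```shell
# Sketch only: restrict the encryption-zone directory to its owner.
# HADOOP_CMD is a hypothetical override; by default the commands are printed, not executed.
HADOOP_CMD="${HADOOP_CMD:-echo hadoop}"
ZONE="/tmp/user1zone1"

restrict_zone() {
  # Mode 700 on the directory: only the owner can list or traverse it.
  $HADOOP_CMD fs -chmod 700 "$ZONE"
  # List the directory entry itself (-d) to confirm the new mode.
  $HADOOP_CMD fs -ls -d "$ZONE"
}

restrict_zone
```

Keep in mind that an HDFS superuser is not subject to these permission checks, so this only keeps out ordinary users such as user2.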
Created 03-30-2016 12:09 PM
Hi,
Encryption at rest protects your data from an unauthorized user who has no read permission in HDFS, or no access to the cluster at all, and who tries to read the data directly from disk.
In your example, the directory /tmp/user1zone1 has read access for all cluster users, and hence user2 is allowed to read from it.
drwxr-xr-x - user1 supergroup 0 2016-02-10 02:42 /tmp/user1zone1
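If it helps to see how that listing is read, the small sketch below pulls the "others" permissions out of an ls-style mode string. The helper name others_perms is made up for illustration, and the check only covers the directory; it assumes the file inside is itself readable by others.

```shell
# Print the "others" triplet (characters 8-10) of an ls-style mode string.
# others_perms is a hypothetical helper, not part of any Hadoop tool.
others_perms() {
  printf '%s' "$1" | cut -c8-10
}

dir_mode="drwxr-xr-x"   # mode taken from the listing above
perms="$(others_perms "$dir_mode")"

case "$perms" in
  r?x) echo "others may list and traverse the directory ($perms)" ;;
  *)   echo "others are restricted ($perms)" ;;
esac
```

With a mode of drwx------ the triplet would be "---" and the second branch would be taken instead.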
Created 02-08-2016 08:58 PM
First of all, both users can access the file because you may not have set the permissions to restrict access to it. Don't confuse encryption with permissions. Your question is really about file-level permissions, and encryption covers many more use cases than permissions do.
When creating a new file in an encryption zone, the NameNode asks the KMS to generate a new encrypted data encryption key (EDEK), encrypted with the encryption zone’s key. The EDEK is then stored persistently as part of the file’s metadata on the NameNode.
When reading a file within an encryption zone, the NameNode provides the client with the file’s EDEK and the encryption zone key version used to encrypt the EDEK. The client then asks the KMS to decrypt the EDEK, which involves checking that the client has permission to access the encryption zone key version. Assuming that is successful, the client uses the DEK to decrypt the file’s contents.
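As a rough way to picture the read path described above, here is a toy model in shell: a user gets plaintext only if both the HDFS permission check and the KMS key permission check pass. Nothing here talks to a real cluster. The two lists and the function names are hypothetical, and the "*" value for the KMS side is an assumption (that no key ACL restricts who may decrypt with user1key1), which would be consistent with user2 seeing "Hello World".

```shell
# Toy model only: plaintext requires BOTH checks to pass.
# Both lists are hypothetical, space-separated user names; "*" means everyone.
HDFS_READERS="*"      # drwxr-xr-x lets every cluster user into the zone
KMS_DECRYPTERS="*"    # assumed: the KMS lets every user decrypt EDEKs for user1key1

in_list() {
  # $1 = user name, $2 = list of allowed users
  case " $2 " in
    *" * "*|*" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

can_read_plaintext() {
  # First gate: HDFS permissions. Second gate: KMS permission on the zone key.
  in_list "$1" "$HDFS_READERS" && in_list "$1" "$KMS_DECRYPTERS"
}

for u in user1 user2; do
  if can_read_plaintext "$u"; then
    echo "$u: plaintext"
  else
    echo "$u: denied"
  fi
done
```

In this model, setting KMS_DECRYPTERS="user1" makes user2 fail the second gate even though the HDFS permissions still let user2 in. On a real cluster the corresponding knob would be the KMS ACLs rather than a shell variable.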
Hope this answers your question!
Created 02-09-2016 01:31 PM
Thanks for the reply.
My confusion came from the following question.
Let's say user1 has put a file into the encryption zone:
hadoop fs -put /tmp/helloWorld.txt /tmp/user1zone1
drwxr-xr-x - user1 supergroup 0 2016-02-10 02:42 /tmp/user1zone1
Now let's say user2 executes the following command:
hadoop fs -cat /tmp/user1zone1/helloWorld.txt
No matter which user I use, I am able to read the content of the file.
Should user2 be able to see the original text contents?