HDFS encryption confusion

Contributor

I am new to the HDFS transparent encryption feature.


I am using Cloudera CDH 5.4.9 and trying to use HDFS encryption in the following manner.
# I am using a Java KeyStore, have configured the KMS service, and have integrated HDFS with this Java KeyStore KMS
# Create two users
useradd -m user1
passwd user1
useradd -m user2
passwd user2

# As user1, perform the following operations
# Create a key and create the user1 encryption zone
# Link the user1 zone to the created key
su user1
hadoop key create user1key1
hadoop fs -mkdir /tmp/user1zone1
su hdfs
hdfs crypto -createZone -keyName user1key1 -path /tmp/user1zone1

# verify zone is created
hdfs crypto -listZones

# Create a file with user1's credentials and put it into the user1 encryption zone
echo "Hello World" > /tmp/helloWorld.txt
hadoop fs -put /tmp/helloWorld.txt /tmp/user1zone1
hadoop fs -cat /tmp/user1zone1/helloWorld.txt
Hello World

# Verify that the file is encrypted
su hdfs
hadoop fs -cat /.reserved/raw/tmp/user1zone1/helloWorld.txt
T▒▒6▒5▒▒7̼[

# Now log in as another user
su user2
hadoop fs -cat /tmp/user1zone1/helloWorld.txt
Hello World


If the encryption zone was created by user1, how is another user (user2) able to view the encrypted data?
I might be missing something very basic here.

Can anyone shed some light on this?

2 ACCEPTED SOLUTIONS

Contributor

Vmshah,

 

Do both users belong to the same group?

 

d(rwx)(r-x)(r-x): according to the permissions set here, the owner (user1) has full access, and both the group and others can read and execute the data.

If you want only user1 to read, write, and execute the data, set the permissions accordingly (e.g. hadoop fs -chmod 700 /tmp/user1zone1/helloWorld.txt).
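
As a rough illustration of how the octal mode in the chmod example relates to the permission string above, here is a small shell sketch. The function name octal_to_symbolic is made up for this example and is not part of Hadoop; it only expands a three-digit octal mode into the rwx form that hadoop fs -ls prints.

```shell
# Hypothetical helper (not a Hadoop command): expand a three-digit octal
# mode such as 755 into its symbolic form (owner, group, others).
octal_to_symbolic() {
  mode="$1"
  out=""
  # Split the mode into single digits and map each one to its rwx triplet.
  for digit in $(printf '%s' "$mode" | sed 's/./& /g'); do
    case "$digit" in
      0) out="${out}---" ;;
      1) out="${out}--x" ;;
      2) out="${out}-w-" ;;
      3) out="${out}-wx" ;;
      4) out="${out}r--" ;;
      5) out="${out}r-x" ;;
      6) out="${out}rw-" ;;
      7) out="${out}rwx" ;;
      *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
  done
  printf '%s\n' "$out"
}

octal_to_symbolic 755   # rwxr-xr-x (the listing above, minus the leading "d")
octal_to_symbolic 700   # rwx------ (after the suggested chmod)
```

With 700, the group and others positions are all dashes, which is why user2 would no longer pass the HDFS permission check.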


Master Collaborator

Hi,

 

Encryption at rest protects your data from an unauthorized user who has no read permission in HDFS, or who has no access to the cluster at all and tries to read the data directly from disk.

In your example, the directory /tmp/user1zone1 grants read access to all cluster users, and hence user2 is allowed to read from it.

drwxr-xr-x - user1 supergroup 0 2016-02-10 02:42 /tmp/user1zone1


5 REPLIES

Contributor

First of all, both users can access the file because you may not have set the permissions for the two users appropriately for that file. Don't confuse encryption with permissions. The question you asked is about file-level permissions; encryption has many more use cases than permissions do.

 

When creating a new file in an encryption zone, the NameNode asks the KMS to generate a new EDEK encrypted with the encryption zone’s key. The EDEK is then stored persistently as part of the file’s metadata on the NameNode.

 

When reading a file within an encryption zone, the NameNode provides the client with the file’s EDEK and the encryption zone key version used to encrypt the EDEK. The client then asks the KMS to decrypt the EDEK, which involves checking that the client has permission to access the encryption zone key version. Assuming that is successful, the client uses the DEK to decrypt the file’s contents.
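
The read path above involves two independent checks: the ordinary HDFS permission check, and the KMS check on the encryption zone key. The following shell sketch is only a toy model of that logic, not Hadoop code. The function read_outcome and its list arguments are hypothetical, and the first call assumes the KMS is configured to let both users decrypt EDEKs for the key, which would match the behaviour seen in this thread.

```shell
# Toy model of the read path (hypothetical, not a Hadoop API).
# $1 = user requesting the file
# $2 = space-separated users who pass the HDFS permission check
# $3 = space-separated users the KMS allows to decrypt EDEKs for the zone key
read_outcome() {
  user="$1"
  hdfs_readers="$2"
  key_users="$3"

  # Step 1: HDFS permissions are checked before any EDEK is handed out.
  case " $hdfs_readers " in
    *" $user "*) ;;
    *) echo "denied by HDFS permissions"; return 0 ;;
  esac

  # Step 2: the KMS must agree to decrypt the EDEK for this user.
  case " $key_users " in
    *" $user "*) echo "plaintext" ;;
    *) echo "denied by KMS" ;;
  esac
}

read_outcome user2 "user1 user2" "user1 user2"   # plaintext
read_outcome user2 "user1 user2" "user1"         # denied by KMS
read_outcome user2 "user1" "user1 user2"         # denied by HDFS permissions
```

In a real cluster the second list would correspond to the KMS key ACLs (the DECRYPT_EEK operation), so restricting that ACL is a separate control from file permissions.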

 

Hope this answers your question!

Contributor

Thanks for the reply.

 

My confusion came from the following question.

Suppose user1 has put a file into the encryption zone:

 

hadoop fs -put /tmp/helloWorld.txt /tmp/user1zone1

drwxr-xr-x - user1 supergroup 0 2016-02-10 02:42 /tmp/user1zone1

 

Now suppose user2 executes the following command:

hadoop fs -cat /tmp/user1zone1/helloWorld.txt

 

No matter which user I use, I am able to read the content of the file.

Should user2 be able to see the original text contents?

 


Contributor
I understand that setting permissions will control read and write access.

But then how is encryption useful for preventing other users from reading your data?

I understand that if a user gets block-level access to the file, that user will not be able to read it.

As for protecting data from other users on the same system, so that they see only encrypted data, I am not sure whether there is a use case for that or not.
