04-04-2016 02:07 AM
I have been reading about encrypting data over the wire when reading from and writing to HDFS.
I found that to enable over-the-wire data encryption for HDFS, the user needs to do the following:
To enable encryption of data transferred between DataNodes and clients, and among DataNodes, proceed as follows:
| Enable Data Transfer Encryption | Check this field to enable wire encryption. |
| Data Transfer Encryption Algorithm | Optionally configure the algorithm used to encrypt data. |
| Hadoop RPC Protection | |
I am trying to understand whether it is possible to encrypt data over the network without using Kerberos.
Assume I do not have Kerberos enabled for the cluster or the client application, and I still enable Data Transfer Encryption by setting dfs.encrypt.data.transfer to true in hdfs-site.xml.
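To make the scenario concrete, this is roughly the hdfs-site.xml fragment I have in mind. It is only a sketch of my assumption: the algorithm property is optional, the value 3des is just an example, and I have not confirmed that these settings take effect on a cluster without Kerberos.

```xml
<!-- hdfs-site.xml (sketch; values are examples, not a verified working setup) -->
<configuration>
  <!-- Turns on encryption of block data sent between clients and DataNodes,
       and among DataNodes. -->
  <property>
    <name>dfs.encrypt.data.transfer</name>
    <value>true</value>
  </property>

  <!-- Optional: algorithm used for the data transfer encryption.
       3des is an example value chosen for illustration. -->
  <property>
    <name>dfs.encrypt.data.transfer.algorithm</name>
    <value>3des</value>
  </property>
</configuration>
```

Nothing in this fragment refers to a key file or keystore, which is the part I do not understand.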
The document does not clarify whether a key file is needed on the client machine. I am trying to understand how the client application knows how to encrypt and decrypt data without knowing a public key. I might be missing something here, but at least this part of the document does not mention generating a key, copying it to the client machine, or using it in some way when connecting to the cluster.
Can someone please point me to further details, in the documentation or elsewhere, on how to set up encryption over the wire step by step?