Created on 06-29-2016 01:54 PM - edited 08-17-2019 11:51 AM
Security is a key element when discussing Big Data, and data encryption is a common security requirement. By following the instructions below, you'll be able to set up transparent data encryption in HDFS on defined directories, otherwise known as encryption zones ("EZ").
Before starting this step-by-step tutorial, there are three HDP services that are essential (must be installed):
a) If using Oracle JDK, verify JCE is installed (OpenJDK has JCE installed by default)
If the server running Ranger KMS uses Oracle JDK, you must install JCE, which Ranger KMS needs in order to run. Instructions on installing JCE can be found here
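One way to check whether the unlimited-strength JCE policy is active is to ask the JVM for the maximum AES key length it allows. The sketch below assumes a JDK 8-era install where the `jrunscript` tool and its JavaScript engine are available; newer JDKs may not ship them. The function name `jce_unlimited` is made up for this example.

```shell
#!/usr/bin/env bash
# Prints "true" when the JVM allows AES keys of 256 bits or more, "false" otherwise.
# Assumption: jrunscript (bundled with older JDKs) is on the PATH of the Ranger KMS host.
jce_unlimited() {
  jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES") >= 256);'
}
```

Run `jce_unlimited` on the Ranger KMS host with the same JDK that Ranger KMS uses. A result of "false" on Oracle JDK suggests the JCE policy files still need to be installed.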
b) CPU Support for AES-NI optimization
AES-NI optimization requires an extended CPU instruction set for AES hardware acceleration.
There are several ways to check for this; for example:
cat /proc/cpuinfo | grep aes
Look for a 'flags' line in the output that includes 'aes'.
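If you want a yes/no answer instead of scanning the flags by eye, the check above can be wrapped in a small function. This is a minimal sketch for x86 Linux hosts; the function name `has_aes_flag` is invented for this example.

```shell
#!/usr/bin/env bash
# Report whether a cpuinfo-style file lists the 'aes' CPU flag.
# The file argument defaults to /proc/cpuinfo.
has_aes_flag() {
  local cpuinfo="${1:-/proc/cpuinfo}"
  # Take the first 'flags' line, then match 'aes' as a whole word so that
  # other flag names that merely contain 'aes' are not counted.
  if grep -m1 '^flags' "$cpuinfo" | grep -qw 'aes'; then
    echo "AES-NI supported"
  else
    echo "AES-NI not found"
  fi
}
```

Call `has_aes_flag` with no arguments on each host that will issue HDFS or MapReduce requests.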
c) Library Support for AES-NI optimization
You will need a version of the libcrypto.so library that supports hardware acceleration, such as OpenSSL 1.0.1e. (Many OS versions have an older version of the library that does not support AES-NI.)
A version of the libcrypto.so library with AES-NI support must be installed on HDFS cluster nodes and MapReduce client hosts -- that is, any host from which you issue HDFS or MapReduce requests. The following instructions describe how to install and configure the libcrypto.so library.
RHEL/CentOS 6.5 or later
On HDP cluster nodes, the installed version of libcrypto.so supports AES-NI, but you will need to make sure that the symbolic link exists:
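To see whether the symbolic link is in place, a small check like the one below can help. It is only a sketch: the default path /usr/lib64/libcrypto.so is an assumption about a typical RHEL/CentOS layout, and the function name `check_libcrypto_link` is invented for this example.

```shell
#!/usr/bin/env bash
# Report where a libcrypto.so symlink points, or that it is broken or missing.
# Assumption: /usr/lib64/libcrypto.so is the expected location on this OS.
check_libcrypto_link() {
  local link="${1:-/usr/lib64/libcrypto.so}"
  if [ -L "$link" ] && [ -e "$link" ]; then
    echo "ok: $link -> $(readlink "$link")"
  elif [ -L "$link" ]; then
    # The link exists but its target does not.
    echo "broken: $link -> $(readlink "$link")"
    return 1
  elif [ -e "$link" ]; then
    echo "ok: $link is a regular file"
  else
    echo "missing: $link"
    return 1
  fi
}
```

On a host with the Hadoop client installed, `hadoop checknative` additionally reports whether Hadoop was able to load the native openssl library.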
* To access Ranger KMS (Encryption), log in with the username "keyadmin"; the default password is also "keyadmin". Remember to change this password.
b) Choose Encryption > Key Manager
* In this tutorial, "hdptutorial" is the name of the HDP cluster. The name you see will differ, depending on your cluster name.
c) Choose Select Service > yourclustername_kms
d) Choose "Add New Key"
e) Create the new key
Length - either 128 or 256
* A length of 256 requires JCE to be installed on all hosts in the cluster: "The default key size is 128 bits. The optional -size parameter supports 256-bit keys, and requires the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy File on all hosts in the cluster. For installation information, see the Ambari Security Guide."
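The -size parameter in the quoted note belongs to the `hadoop key create` command, which can create the same kind of key from the command line instead of the Ranger UI. The wrapper below is a hedged sketch: the function name `create_ez_key` is invented here, and it assumes the client is already configured to use Ranger KMS as its key provider and that you run it as a user with key administration rights.

```shell
#!/usr/bin/env bash
# Create an encryption key with the 'hadoop key' CLI.
# Arguments: key name (choose your own) and key length in bits (default 128).
create_ez_key() {
  local name="$1" size="${2:-128}"
  # Only the two lengths offered in this tutorial are accepted.
  case "$size" in
    128|256) ;;
    *) echo "unsupported key length: $size (use 128 or 256)" >&2; return 2 ;;
  esac
  # 256-bit keys need the JCE Unlimited Strength policy files on all hosts.
  hadoop key create "$name" -size "$size"
}
```

For example, `create_ez_key my_zone_key 256` would request a 256-bit key named my_zone_key (a placeholder name). `hadoop key list` then shows the keys the provider knows about.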
Step 3: Add KMS Ranger Policies for encrypting directory
* You will now be able to read/write data to your encrypted directory /zone_encr. If you receive any errors, including "IOException:" when creating an encryption zone in Step 4 (b), take a look at the Ranger KMS server log at /var/log/ranger/kms/kms.log; there is usually a permission issue accessing the key.
* To find out more about how transparent data encryption in HDFS works, refer to the Hortonworks blog here
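The read/write check and the kms.log lookup mentioned above can be scripted roughly as follows. This is a sketch under assumptions: the function names and the test file name smoke_test.txt are invented, the grep patterns are a guess at what is worth looking for rather than an exact list of Ranger KMS messages, and the commands must be run as a user that your Ranger policies allow to use the zone's key.

```shell
#!/usr/bin/env bash
# Write a small file into the encryption zone and read it back.
# The zone path defaults to /zone_encr, the directory used in this tutorial.
zone_smoke_test() {
  local zone="${1:-/zone_encr}" tmp
  tmp=$(mktemp) || return 1
  echo "tde smoke test" > "$tmp"
  # -f overwrites a file left behind by an earlier run.
  hdfs dfs -put -f "$tmp" "$zone/smoke_test.txt" &&
    hdfs dfs -cat "$zone/smoke_test.txt"
  local rc=$?
  rm -f "$tmp"
  return $rc
}

# Show the most recent lines in the Ranger KMS log that look like errors.
# Arguments: log path (default is the one named above) and number of lines.
recent_kms_errors() {
  local log="${1:-/var/log/ranger/kms/kms.log}" count="${2:-20}"
  grep -iE 'exception|error|denied|not authorized' "$log" | tail -n "$count"
}
```

If `zone_smoke_test` fails, run `recent_kms_errors` on the Ranger KMS server and look for the user and key name involved; that usually points to the policy that needs adjusting.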