To enable HTTPS for WebHDFS, do the following:

 

Step 1: 

Get the keystore to use in HDFS configurations.

a) If the certificate will be signed by a CA, do the following:

1. Generate a keystore for each host. Make sure the common name portion of the certificate matches the hostname where the certificate will be deployed.
keytool -genkey -keyalg RSA -alias c6401 -keystore /tmp/keystore.jks -storepass bigdata -validity 360 -keysize 2048

2. Generate CSR from above keystore
keytool -certreq -alias c6401 -keyalg RSA -file /tmp/c6401.csr -keystore /tmp/keystore.jks -storepass bigdata

3. Get the signed certificate from the CA. In this example the file name is /tmp/c6401.crt.

4. Import the root certificate into the keystore first. (Skip this if it is already present.)
keytool -import -alias root -file /tmp/ca.crt -keystore /tmp/keystore.jks
Note: here ca.crt is the root certificate.

5. Repeat step 4 for the intermediate certificate, if there is one.

6. Import signed cert into JKS
keytool -import -alias c6401 -file /tmp/c6401.crt -keystore /tmp/keystore.jks -storepass bigdata

7. Import the root certificate into the truststore. (This creates a new truststore.jks.)
keytool -import -alias root -file /tmp/ca.crt -keystore /tmp/truststore.jks -storepass bigdata

8. Import the intermediate certificate (if there is one) into the truststore, as in step 7.
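
One way to confirm that step 6 attached the CA chain to the private key is to read the chain length keytool reports for the alias. The helper below is only a sketch: the function name chain_length is made up for illustration, and it assumes keytool prints its English-locale "Certificate chain length: N" line. A value of 1 usually means the entry still holds only the original self-signed certificate.

```shell
# Read `keytool -list -v` output on stdin and print the first reported
# certificate chain length (prints nothing if no such line is present).
chain_length() {
    sed -n 's/^Certificate chain length: \([0-9][0-9]*\).*$/\1/p' | head -n 1
}

# Example (not run here), using the alias and password from the steps above:
# keytool -list -v -keystore /tmp/keystore.jks -storepass bigdata -alias c6401 | chain_length
```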

OR,

b) If you plan to use a self-signed certificate, do the following:

1. Generate a keystore for each host. Make sure the common name portion of the certificate matches the hostname where the certificate will be deployed.
# keytool -genkey -keyalg RSA -alias c6401 -keystore /tmp/keystore.jks -storepass bigdata -validity 360 -keysize 2048

2. Generate the truststore.
Note: The truststore must contain the certificates of all servers. Use the commands below to export the certificate from each keystore and then import it into the truststore.
# keytool -export -file /tmp/c6401.crt -keystore /tmp/keystore.jks -storepass bigdata -alias c6401 -rfc
# keytool -import -alias c6401 -file /tmp/c6401.crt -keystore /tmp/truststore.jks -storepass bigdata
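
Because step 1 has to be repeated on every host, it can help to build the command per host with the certificate fields filled in, so keytool does not prompt for them. The following is a minimal sketch, not part of the original procedure: the function names are hypothetical, the OU, O and C values are placeholders, and the alias is assumed to be the short hostname. It only prints the command so it can be reviewed before being run on the target host.

```shell
# Build the distinguished name for a host; the CN must match the host's FQDN.
# The OU/O/C values are placeholders (assumptions), adjust them as needed.
keystore_dname() {
    printf 'CN=%s, OU=Hadoop, O=Example, C=US' "$1"
}

# Print (do not run) the keytool command for one host.
# Path, password, validity and key size mirror the values used in step 1.
print_genkey_cmd() {
    host="$1"
    alias_name="${host%%.*}"   # short hostname, e.g. c6401 from c6401.example.com
    printf 'keytool -genkey -keyalg RSA -alias %s -keystore /tmp/keystore.jks -storepass bigdata -keypass bigdata -validity 360 -keysize 2048 -dname "%s"\n' \
        "$alias_name" "$(keystore_dname "$host")"
}
```

For example, print_genkey_cmd c6401.example.com prints a command that uses -alias c6401 and a -dname beginning with CN=c6401.example.com.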

Step 2:

Import the truststore certificates into the Java truststore (cacerts or jssecacerts):

 

keytool -importkeystore \
-srckeystore /tmp/truststore.jks \
-destkeystore /usr/java/default/jre/lib/security/cacerts \
-deststorepass changeit \
-srcstorepass bigdata
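
To check that the import worked, the destination store can be queried for an alias that existed in /tmp/truststore.jks (root for the CA path, c6401 for the self-signed path). This is a sketch that assumes the cacerts path and changeit password shown above; store_has_alias is a hypothetical helper name.

```shell
# Succeed if the given alias exists in the keystore, fail otherwise.
# It also fails if keytool is not installed or the store cannot be opened.
# Arguments: keystore path, store password, alias.
store_has_alias() {
    keytool -list -keystore "$1" -storepass "$2" -alias "$3" >/dev/null 2>&1
}

# Example (not run here):
# store_has_alias /usr/java/default/jre/lib/security/cacerts changeit c6401 && echo "alias present"
```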

 

Step 3:

Log in to Ambari and add or update the following properties in core-site.xml:

hadoop.ssl.require.client.cert=false
hadoop.ssl.hostname.verifier=DEFAULT
hadoop.ssl.keystores.factory.class=org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory
hadoop.ssl.server.conf=ssl-server.xml
hadoop.ssl.client.conf=ssl-client.xml

Step 4:

Add or modify the following properties in hdfs-site.xml:

For non-HA cluster:
dfs.http.policy=HTTPS_ONLY
dfs.client.https.need-auth=false
dfs.datanode.https.address=0.0.0.0:50475
dfs.namenode.https-address=NN:50470
dfs.namenode.secondary.https-address=c6401-node3.coelab.cloudera.com:50091

Note: You can also set dfs.http.policy=HTTP_AND_HTTPS to serve both HTTP and HTTPS.

 

For HA-enabled clusters:
dfs.http.policy=HTTPS_ONLY
dfs.client.https.need-auth=false
dfs.datanode.https.address=0.0.0.0:50475
dfs.namenode.https-address.<nameservice>.nn1=c6401-node2.coelab.cloudera.com:50470
dfs.namenode.https-address.<nameservice>.nn2=c6401-node3.coelab.cloudera.com:50470
dfs.journalnode.https-address=0.0.0.0:8481

 

Step 5:

Update the following configurations under Advanced ssl-server (ssl-server.xml)

ssl.server.truststore.location=/tmp/truststore.jks
ssl.server.truststore.password=bigdata
ssl.server.truststore.type=jks
ssl.server.keystore.location=/tmp/keystore.jks
ssl.server.keystore.password=bigdata
ssl.server.keystore.keypassword=bigdata
ssl.server.keystore.type=jks

Step 6:

Update the following configurations under Advanced ssl-client (ssl-client.xml)

ssl.client.truststore.location=/tmp/truststore.jks
ssl.client.truststore.password=bigdata
ssl.client.truststore.type=jks
ssl.client.keystore.location=/tmp/keystore.jks
ssl.client.keystore.password=bigdata
ssl.client.keystore.keypassword=bigdata
ssl.client.keystore.type=jks

Step 7:

Restart HDFS service
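
Once HDFS has restarted, the effective policy can be read back on a cluster node with hdfs getconf. The sketch below is an illustration only: policy_serves_https is a hypothetical helper, and it assumes the node's client configuration has already been refreshed by Ambari.

```shell
# Succeed if the given dfs.http.policy value exposes HTTPS endpoints.
policy_serves_https() {
    case "$1" in
        HTTPS_ONLY|HTTP_AND_HTTPS) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (not run here): read the effective value from the local configuration.
# policy_serves_https "$(hdfs getconf -confKey dfs.http.policy)" && echo "HTTPS enabled"
```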

 

Step 8:

Import the CA root (and Intermediate, if any) to ambari-server truststore by running:

 

ambari-server setup-security

 

For self-signed certificates, make sure you import the NameNode certificate(s) into the ambari-server truststore.

Refer to Steps to set up Truststore for Ambari Server for more details.

 

Step 9:

Open the NameNode web UI over HTTPS on port 50470.
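
The same port also serves the WebHDFS REST API, so a command-line check is possible. The sketch below builds the URL; the function name and the example hostname are assumptions, and the curl line presumes /tmp/c6401.crt is the PEM certificate exported with -rfc in Step 1 (for a CA-signed setup, a PEM copy of the CA certificate would take its place).

```shell
# Build a WebHDFS REST URL served over HTTPS by the NameNode.
# Arguments: NameNode host, HTTPS port, absolute HDFS path, operation.
webhdfs_https_url() {
    printf 'https://%s:%s/webhdfs/v1%s?op=%s\n' "$1" "$2" "$3" "$4"
}

# Example (not run here): list the HDFS root, trusting the exported certificate.
# curl --cacert /tmp/c6401.crt "$(webhdfs_https_url c6401.example.com 50470 / LISTSTATUS)"
```

Hadoop clients can reach the same endpoint through the swebhdfs:// scheme, for example hdfs dfs -ls swebhdfs://<namenode-host>:50470/.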

 

Tips:

  • When you enable HTTPS for HDFS, the JournalNodes and NameNodes start in HTTPS mode; check the JournalNode and NameNode logs for any errors.
  • You can skip creating the truststore.jks file and use the Java truststore instead. However, make sure you import all required certificates into the Java truststore.

Comments

How do I create the keystore file? The note in step 4 says to create a keystore for each NameNode. Can you kindly provide the steps to create the keystore file?

Note: create a separate keystore file for each NAMENODE host, with the file named keystore.jks, and place it under /tmp/

Hi pappu,

I followed the steps you described for enabling HTTPS (SSL) on my Sandbox HDFS, and I am fairly sure everything matches your instructions. However, I am not able to access my sandbox HDFS over HTTPS (port 50470). I am using a self-signed certificate for the keystore. Can you kindly help me with this?

A few questions:

1) Is the sandbox case different from a real cluster?
2) Does anything need to be done on the host system (OS)?
3) Do any user permissions need to be set on the certificate folder and files? The folder owner:group is hdfs:hadoop, with permission 700 on the folder and 600 on the files.

After the setup, all cluster services remain up, but HTTPS access does not succeed.

Help will be appreciated.

Thanks

@Syed Jawad Gilani

Do you still have this issue? Sorry, I did not check your questions earlier.

It's a pleasure to receive your reply. I have applied SSL to all the services mentioned in your posts on my cluster. Now I am seeing a few service-related issues after enabling SSL. One of them is that we are not able to run a Hive MapReduce job. The jobs fail with the error javax.net.ssl.SSLException: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty. Can you suggest a solution for this?

@Syed Jawad Gilani

It is difficult to tell the reason from the above error message alone. Are you generating the keystore files with the latest Java and trying to use them with an older Java? Please check that.

This was a ton of help towards getting HTTPS hookup for me. I can connect to ports 50475, 8090, and 8044 but still having trouble with the datanode port 50470. Any thoughts?

Hi all,

I followed this guide to enable SSL for HDFS as the root user on my NameNode server. Then I copied the keystore and the truststore to my DataNode servers (as suggested here at step 4), because when I tried to start the HDFS service I received a FileNotFoundException on /tmp/keystore.jks. After copying the files to the DataNode servers I tried to start the HDFS service again, but when the DataNodes start I receive a "Permission Denied" error on the same file. Here at step 6 it suggests changing the permissions on the server key location, but when I execute

chgrp -R yarn:hadoop /tmp/

I receive following error

chgrp: invalid group: ‘yarn:hadoop’

any suggestions for me?

Thanks in advance!

@amarnath reddy pappu I've followed all the steps and enabled SSL with a self-signed certificate. Can you please confirm whether the WebHDFS URL below is the right way to access it? I am not getting any response when I use it.

https://hostname:50470/webhdfs/v1/test/testdata.txt?user.name=root&op=OPEN