enabling SSL/TLS for HDFS - running into issues

Expert Contributor

Hello - I have an 8-node HDP 2.5 cluster and I'm trying to enable SSL/TLS for HDFS, following this guide: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.3/bk_Security_Guide/content/ch_wire-https.html

I'm trying to create the host key with the following command:

keytool -keystore /etc/security/clientKeys/keystore.jks -genkey -alias nwk8

The client keystore /etc/security/clientKeys/keystore.jks is the default entry in /etc/hadoop/2.5.3.0-37/0/ssl-client.xml, but it is not available on the node.

I have some basic questions (I don't think I understand this yet): which .jks file should I use? Is that something I get from a CA? What if I use OpenSSL?

Any input on this would be appreciated.

Re: enabling SSL/TLS for HDFS - running into issues

Expert Contributor

@mqureshi, @Kuldeep Kulkarni, @Gerd Koenig, @Andrew Ryan - looping you in. Any ideas on this?

Re: enabling SSL/TLS for HDFS - running into issues

When you run the keytool command, the specified keystore file is created if it does not exist. However, the parent directory needs to exist already.

[root@c6401 ~]# ls -l /etc/security/clientKeys
ls: cannot access /etc/security/clientKeys: No such file or directory
[root@c6401 ~]# mkdir -p  /etc/security/clientKeys
[root@c6401 ~]# /usr/jdk64/jdk1.8.0_77/bin/keytool -genkey -keystore /etc/security/clientKeys/keystore.jks -alias nwk8
Enter keystore password:
Re-enter new password:
What is your first and last name?
  [Unknown]:  nwk8.example.com
What is the name of your organizational unit?
  [Unknown]:
What is the name of your organization?
  [Unknown]:
What is the name of your City or Locality?
  [Unknown]:
What is the name of your State or Province?
  [Unknown]:
What is the two-letter country code for this unit?
  [Unknown]:
Is CN=nwk8.example.com, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown correct?
  [no]:  yes


Enter key password for <nwk8>
	(RETURN if same as keystore password):
[root@c6401 ~]# ls -l /etc/security/clientKeys
total 4
-rw-r--r-- 1 root root 1311 May 11 19:52 keystore.jks
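
The interactive session above can also be scripted. The following is a minimal sketch, not the documented HDP procedure: the function name make_host_keystore is hypothetical, the RSA key size and validity period are assumptions, and passing the password on the command line is only acceptable for a quick test because it is visible in the process list.

```shell
# Hypothetical helper: create the parent directory, then generate a key pair
# non-interactively. Arguments: <keystore.jks> <alias> <host FQDN> <password>
make_host_keystore() {
    # keytool creates the keystore file but not its parent directory.
    mkdir -p "$(dirname "$1")" || return 1

    # -genkeypair is the newer name for -genkey; -dname answers the
    # "first and last name" prompt with the host's FQDN as the CN.
    keytool -genkeypair -keystore "$1" -alias "$2" \
        -keyalg RSA -keysize 2048 -validity 365 \
        -dname "CN=$3" -storepass "$4" -keypass "$4"
}
```

Usage, with the values from this thread: make_host_keystore /etc/security/clientKeys/keystore.jks nwk8 nwk8.example.com 'your-password'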

You can optionally use OpenSSL to generate the keys and certificates, but you may need to import them into a Java keystore before Hadoop can use them. It is unclear to me whether ssl.server.truststore.type can be set to anything other than JKS.
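
One common way to get an OpenSSL-generated key and certificate into a Java keystore is to bundle them as PKCS#12 and then import that bundle with keytool. The sketch below assumes PEM input files; the function name pem_to_jks and all file names are hypothetical, and the same password is reused for both stores purely to keep the example short.

```shell
# Hypothetical helper: OpenSSL key + certificate -> PKCS#12 -> JKS.
# Arguments: <key.pem> <cert.crt> <alias> <keystore.jks> <password>
pem_to_jks() {
    p12="${4%.jks}.p12"   # intermediate bundle next to the target keystore

    # Bundle the private key and its certificate under the given alias.
    openssl pkcs12 -export \
        -inkey "$1" -in "$2" -name "$3" \
        -out "$p12" -passout "pass:$5" || return 1

    # Copy the PKCS#12 entry into a JKS keystore.
    keytool -importkeystore -noprompt \
        -srckeystore "$p12" -srcstoretype PKCS12 -srcstorepass "$5" \
        -destkeystore "$4" -deststoretype JKS -deststorepass "$5"
}
```

Usage: pem_to_jks host.key host.crt nwk8 /etc/security/clientKeys/keystore.jks 'your-password'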

Re: enabling SSL/TLS for HDFS - running into issues

Expert Contributor

@amarnath reddy pappu - thanks.

A quick question regarding this step:

"3. Now get the signed cert from CA - file name is /tmp/c6401.crt"

How do I get the signed certificate?

Re: enabling SSL/TLS for HDFS - running into issues

New Contributor

@amarnath reddy pappu, @mqureshi, @Kuldeep Kulkarni, @Gerd Koenig, @Andrew Ryan

Attachments: smaple-mapreduce-job-error.txt, mapreduce-error-in-hive-wiht-beeline.txt

We have enabled SSL/TLS on our HDP cluster by following @amarnath reddy pappu's blog (https://community.hortonworks.com/articles/52875/enable-https-for-hdfs.html) and the HDP documentation. Almost all services now open on their defined HTTPS ports. The only issue we are currently facing is this:

MapReduce jobs are not launching. We use Hive through the beeline connector.

While executing a query, we receive this error: WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty (state=08S01,code=1)

We also tried a sample MapReduce job on its own, and that failed as well. The error is long, so I am attaching it here. I would appreciate your help. :)

Re: enabling SSL/TLS for HDFS - running into issues

Super Guru
@Karan Alang

Based on your question, let me explain the difference, because a few things are getting mixed up here.

First, your keystore.jks is a keystore file that stores your private/public key pairs. Think of a safe box where you keep the keys to different secret rooms: the safe box is your keystore, and the keys are stored inside it. You have generated a key with the alias nwk8 to be stored in this keystore file.

"The client key -> /etc/security/clientKeys/keystore.jks is the default entry in file -> /etc/hadoop/2.5.3.0-37/0/ssl-client.xml"

I am not sure I understand this, or where the file /etc/hadoop/2.5.3.0-37/0/ssl-client.xml suddenly comes from.

"Have some basic questions (since i dont think i understand this yet) - which .jks file should i use ? is that something i get from CA ? What if i use OpenSSL ?"

You can use the .jks you created in the step above. If your organization has an enterprise keystore (described in this link - Hadoop SSL keystore management factory), you should use that instead.

You don't get a keystore from your certificate authority; the certificate authority only gives you signed certificates. I am fairly sure you will either have an internal certificate authority or use OpenSSL. If you don't have an internal authority, just use OpenSSL to create your own authority and sign your certificate (ask your boss).

To get a signed certificate, you first create a certificate signing request (CSR) and send it to your certificate authority, which returns a signed certificate. That's it.
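
The request-and-sign round trip described above can be sketched with keytool and OpenSSL. This is an illustration under assumptions, not the procedure from the linked article: the function names, the "rootca" alias, the CA subject, and the 365-day validity are all hypothetical, and it assumes the key password equals the keystore password (otherwise keytool also needs -keypass).

```shell
# Create a throwaway root CA with OpenSSL (unencrypted key; test use only).
make_ca() {        # <ca.key> <ca.crt>
    openssl req -new -x509 -nodes -newkey rsa:2048 -days 365 \
        -subj "/CN=Example Internal Root CA" -keyout "$1" -out "$2"
}

# Export a CSR for the key pair already stored under <alias>.
make_csr() {       # <keystore.jks> <alias> <out.csr> <storepass>
    keytool -certreq -keystore "$1" -alias "$2" -file "$3" -storepass "$4"
}

# Sign the CSR with the CA; this is the step the CA performs for you.
sign_csr() {       # <in.csr> <ca.crt> <ca.key> <out.crt>
    openssl x509 -req -in "$1" -CA "$2" -CAkey "$3" -CAcreateserial \
        -out "$4" -days 365
}

# Import the CA root first, then the signed reply under the original alias,
# so keytool can build the chain from the reply up to the root.
import_signed() {  # <keystore.jks> <alias> <ca.crt> <signed.crt> <storepass>
    keytool -importcert -noprompt -keystore "$1" -alias rootca \
        -file "$3" -storepass "$5" &&
    keytool -importcert -noprompt -keystore "$1" -alias "$2" \
        -file "$4" -storepass "$5"
}
```

Called in that order (make_ca once, then make_csr, sign_csr, and import_signed per host), the keystore ends up holding the private key together with a certificate signed by your own authority.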

Re: enabling SSL/TLS for HDFS - running into issues

Expert Contributor

@mqureshi, @amarnath reddy pappu,

Thanks. I've used the steps in enable-https-for-hdfs and done the following on nwk6 (node 6, where the NameNode is installed):

1) Generated the jks file

2) Got the certificate signed (using OpenSSL)

3) Made the entries in core-site.xml and hdfs-site.xml

4) Updated the files ssl-server.xml and ssl-client.xml

5) Restarted the HDFS service

6) I have a question about this next step:

----------------------------------------------

Step7:

Make sure you import the CA root to Ambari-server by running "ambari-server setup-security"

-----------------------------------------

A couple of questions on this:

a) When I run ambari-server setup-security, I see the options given below. Do I use option 5, i.e. "Import certificate to truststore"?

b) Please note that the truststore and keystores were created on nwk6 (where the NameNode is installed), while Ambari is installed on nwk7. Do the keystore and truststore need to be copied onto nwk7, or re-created there?

[root@nwk2-bdp-hadoop-07 tmp]# ambari-server setup-security

Using python /usr/bin/python
Security setup options...
===========================================================================
Choose one of the following options:
[1] Enable HTTPS for Ambari server.
[2] Encrypt passwords stored in ambari.properties file.
[3] Setup Ambari kerberos JAAS configuration.
[4] Setup truststore.
[5] Import certificate to truststore.
===========================================================================
Enter choice, (1-5):

Please note: while the above steps (except for ambari-server setup-security) went through fine, the HTTPS URL for the NameNode UI (https://<nwk06>:50470) is not working.

Re: enabling SSL/TLS for HDFS - running into issues

Super Guru

Attachments: screen-shot-2017-05-15-at-51812-pm.png, screen-shot-2017-05-15-at-45432-pm.png

Step7:

Make sure you import the CA root to Ambari-server by running "ambari-server setup-security"

So your authority is OpenSSL. In that case, there should be an OpenSSL root certificate that was used to sign the certificate you have imported. You need that root certificate.

Let me elaborate on how this works. Say you go to your bank's website, "www.chase.com". How do you know that it really is chase.com? What if someone has hijacked the connection and rerouted you to their own server, and the site looks just like chase.com? You enter your username and password, get an error, and wonder what happened. Next thing you know, the attacker has your user ID and password and can use them on the real website to access your account.

So how do we solve this problem? You decide that you cannot simply trust chase.com when it says it is chase.com; you want someone you trust to certify that it is indeed chase.com. So you first decide to trust some authorities, such as VeriSign or Thawte, and they certify to you that the site you are visiting is indeed chase.com.

At this point you are probably wondering when you ever decided to trust VeriSign, Thawte, or any other authority. Well, check your browser. Under the advanced settings you should see "manage certificates" or something similar; look at the system root certificates in there. Most browsers ship with root certificates from the dominant players like VeriSign and Thawte.

When you visit chase.com over an SSL connection, chase.com presents a certificate, and your browser says, "Hold on, let me check against my root certificates whether your certificate was signed by an authority I trust." Once that is verified, your browser lets you through, and you usually see a green lock at the top. All of this happens behind the scenes. If you visit a website whose certificate is signed by an authority that you or your browser don't trust (this usually happens with internal websites), you get "this connection is not trusted" and an option to "proceed anyway".

The two screenshots show that chase.com is signed by VeriSign, an authority my browser trusts, and all the root certificates that are installed in my browser.

So, you need to import the root certificate of the OpenSSL authority that signed your certificate. Without it, just like in your browser, you will get an error similar to "this connection is not trusted".

a) When I run ambari-server setup-security, I see the options given below. Do I use option 5, i.e. "Import certificate to truststore"?

Yes, you need to import your OpenSSL root certificate into your truststore. Notice the name: it holds the authorities you trust. A truststore is a special kind of keystore that stores the root certificates of the authorities you have decided to trust. The browser screenshot I shared shows my browser's truststore.
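
For a plain Java truststore (such as the one a Hadoop ssl-client.xml points at), importing and then listing a root certificate can be sketched with keytool as below. This is a generic illustration, not what ambari-server setup-security runs internally; the function names, the "rootca" alias, and the file paths in the usage line are hypothetical.

```shell
# Add a CA root certificate to a truststore as a trusted entry.
trust_root_ca() {   # <truststore.jks> <ca.crt> <storepass>
    keytool -importcert -noprompt -trustcacerts \
        -keystore "$1" -alias rootca -file "$2" -storepass "$3"
}

# List the truststore to confirm it is not empty and holds the root.
list_trusted() {    # <truststore.jks> <storepass>
    keytool -list -keystore "$1" -storepass "$2"
}
```

Usage: trust_root_ca /path/to/truststore.jks /path/to/ca.crt 'your-password' followed by list_trusted /path/to/truststore.jks 'your-password'. The listing should show the rootca alias as a trusted certificate entry.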

b) Please note that the truststore and keystores were created on nwk6 (where the NameNode is installed), while Ambari is installed on nwk7. Do the keystore and truststore need to be copied onto nwk7, or re-created there?

First, this is something I have not done myself, but a certificate issued for nwk6 will not work for nwk7. Think back to when you created the certificate signing request: it asked you a series of questions, including the common name, which is the name of your server. A certificate is valid only for the server named in its signing request. I think you are trying to secure your NameNode and clients rather than Ambari (I may be wrong), so you might need a separate certificate for each server. Long story short: a certificate issued for one server will not work for any other server. Hope this helps.
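
To see which server a certificate was issued for, you can read its subject with OpenSSL and compare the common name with the host. The sketch below is deliberately simplified: the function names are hypothetical, it only looks at the CN, and it ignores subjectAltName entries, which TLS clients may also use when matching a hostname.

```shell
# Print the common name (CN) from a PEM certificate's subject line.
# Handles both "CN = host" and "/CN=host" output styles of openssl.
cert_cn() {             # <cert.crt>
    openssl x509 -in "$1" -noout -subject |
        sed -n 's|.*CN *= *\([^,/]*\).*|\1|p'
}

# Succeed only if the certificate's CN equals the given hostname.
cert_matches_host() {   # <cert.crt> <hostname>
    [ "$(cert_cn "$1")" = "$2" ]
}
```

Usage: cert_matches_host /tmp/c6401.crt "$(hostname -f)" && echo "CN matches this host". A certificate created for nwk6 would fail this check when run on nwk7.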

Re: enabling SSL/TLS for HDFS - running into issues

Expert Contributor

@mqureshi

Thanks for the detailed reply and explanation; that really helps clarify the concept.

However, a follow-up on this: I've configured SSL/TLS for HDFS. How do I test it and make sure SSL is implemented correctly for HDFS?

The HTTPS NameNode URL does not seem to be working; please see the attached screenshot.

Also attached are a screenshot of the HTTP NameNode URL and the configured values of dfs.https.port and dfs.namenode.https-address in hdfs-site.xml.

screen-shot-2017-05-15-at-35026-pm.png

screen-shot-2017-05-15-at-35101-pm.png

screen-shot-2017-05-15-at-35035-pm.png
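
One way to check the HTTPS endpoint from the command line is sketched below. It rests on assumptions not stated in the thread: the function name is hypothetical, and dfs.http.policy is a further hdfs-site.xml property (values HTTP_ONLY, HTTPS_ONLY, HTTP_AND_HTTPS) that may be worth checking, since the NameNode would not be expected to serve HTTPS while the policy is HTTP_ONLY.

```shell
# Hypothetical check of the NameNode HTTPS endpoint.
# Arguments: <namenode host> <https port> <CA root certificate>
check_namenode_https() {
    # 1. Print the configured web policy.
    hdfs getconf -confKey dfs.http.policy

    # 2. Attempt a TLS handshake, verifying the chain against the CA root.
    #    stdin is closed so s_client exits once the handshake is done.
    openssl s_client -connect "$1:$2" -CAfile "$3" </dev/null

    # 3. Request the UI over HTTPS and print only the HTTP status code.
    curl --cacert "$3" -sS -o /dev/null -w '%{http_code}\n' "https://$1:$2/"
}
```

Usage, with the port from this thread: check_namenode_https nwk6.example.com 50470 /path/to/ca.crt. A refused connection in step 2 suggests nothing is listening on the HTTPS port, while a verification error points at the certificate chain or truststore.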