If I am using self-signed certificates, where do I put the created my.truststore? Into $JAVA_HOME/jre/lib/security/my.truststore or $JAVA_HOME/jre/lib/security/jssecacerts?
It is not clear to me how it is going to be used and where I would need to specify it when configuring Levels 1-3 of TLS. It is also not clear to me whether both a keystore and a truststore are needed, or whether keystores are enough.
Any good introduction on the concepts of certificates, truststore, keystore?
Cloudera's documentation does have some basic background here:
For Cloudera Manager / Agent encryption:
To your specific questions:
my.truststore or jssecacerts?
By default, the JDK ships with a file that contains common public CA certificates for trust: $JAVA_HOME/jre/lib/security/cacerts
Java will use that file unless it is told (in a program or via the command line) to use another file.
If a $JAVA_HOME/jre/lib/security/jssecacerts file is present, that file is read instead of the "cacerts" file there.
If you want to use my.truststore (your custom file), then you need to tell the JVM to use that file. Cloudera Manager should have configuration options for most Java keystores, so that is one way to tell the JVM where to find the truststore. See this section of the JSSE reference for more information about the truststore:
1. If Cloudera Manager's (CM) truststore configuration for a service or role is specified, that is used.
2. If none is specified, Java uses "javax.net.ssl.trustStore" if it is set (at the command line, for instance).
3. If "javax.net.ssl.trustStore" is not set, then $JAVA_HOME/jre/lib/security/jssecacerts is used.
4. If $JAVA_HOME/jre/lib/security/jssecacerts does not exist, then the default $JAVA_HOME/jre/lib/security/cacerts is used.
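To make the fallback order in steps 2-4 concrete, here is a minimal sketch in shell. It assumes the JDK 8 layout, where both files live under jre/lib/security. The function name and its second argument (standing in for the javax.net.ssl.trustStore property) are made up for illustration, and step 1 (the CM setting) is not modelled.

```shell
# resolve_truststore JAVA_HOME [EXPLICIT_PATH]
# EXPLICIT_PATH stands in for -Djavax.net.ssl.trustStore (hypothetical argument).
resolve_truststore() {
    java_home=$1
    explicit=${2:-}
    sec_dir="$java_home/jre/lib/security"

    if [ -n "$explicit" ]; then
        # An explicitly configured truststore wins.
        printf '%s\n' "$explicit"
    elif [ -f "$sec_dir/jssecacerts" ]; then
        # jssecacerts, when present, is read instead of cacerts.
        printf '%s\n' "$sec_dir/jssecacerts"
    else
        # Fall back to the CA bundle shipped with the JDK.
        printf '%s\n' "$sec_dir/cacerts"
    fi
}
```

Calling resolve_truststore "$JAVA_HOME" prints the file a plain JVM would pick up. To point a JVM at a custom file instead, pass -Djavax.net.ssl.trustStore=/path/to/my.truststore (and -Djavax.net.ssl.trustStorePassword if the store needs one) on the java command line.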
When securing Cloudera Manager/Agent communication (I assume that is what you meant by levels 1-3):
1. no truststore is required for level 1 (TLS enabled, no certificate validation)
2. only an OpenSSL truststore is required for level 2 (TLS enabled; the agent needs to trust the signer of the Cloudera Manager server certificate)
3. Both of the above are required, and Cloudera Manager needs to be configured to read a truststore containing all the agents' public certificates. (CM validates the agents' certificates)
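On the agent side, the three levels build on each other. The sketch below prints the lines an agent's /etc/cloudera-scm-agent/config.ini would need at each level. The option names (use_tls, verify_cert_file, client_key_file, client_keypw_file, client_cert_file) are as I recall them from the agent configuration, so check them against the config.ini shipped with your version. The paths are the ones that appear in this thread, and the function name is made up.

```shell
# agent_tls_settings LEVEL
# Prints the agent config.ini lines for the given level (1, 2 or 3).
# Option names are from memory; paths are taken from this thread.
agent_tls_settings() {
    level=$1
    pki=/opt/cloudera/security/pki
    case $level in
        1|2|3) ;;
        *) echo "level must be 1, 2 or 3" >&2; return 1 ;;
    esac

    # Level 1: encrypt the connection, no certificate validation.
    echo "use_tls=1"

    # Level 2: the agent verifies the CM server certificate against the CA.
    if [ "$level" -ge 2 ]; then
        echo "verify_cert_file=$pki/rootca.cert.pem"
    fi

    # Level 3: the agent also presents its own certificate to CM.
    if [ "$level" -ge 3 ]; then
        echo "client_key_file=$pki/agent.key"
        echo "client_keypw_file=/etc/cloudera-scm-agent/agentkey.pw"
        echo "client_cert_file=$pki/agent.cert.pem"
    fi
}
```

Running agent_tls_settings 3 lists all five lines. The CM server side (keystore and truststore paths) is configured separately in the Cloudera Manager settings and is not covered by this sketch.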
I don't have a really good beginners guide to the topics handy at the moment, but you'll find some reasonable information here:
"3. Both of the above are required, and Cloudera Manager needs to be configured to read a truststore containing all the agents' public certificates. (CM validates the agents' certificates)"
The documentation (https://www.cloudera.com/documentation/enterprise/latest/topics/how_to_configure_cm_tls.html#concept... ) only says to put rootcert into jssecacerts, it does not say to put agent public keys there. Should the documentation be fixed?
The agents' certificates only need to be in the CM truststore for "level 3" CM/agent security. In that scenario, the agent presents its public certificate to CM and CM verifies it trusts the certificate. In other trust scenarios, only the root CA in the chain of trust needs to be in the truststore.
I have configured all three levels. But since the documentation did not say that I should import agent certificates, I did not. Probably as a result of that, most of the Cloudera Management Services are down. I'll try to add all the agent certificates and see if it helps. So to double check, for each agent certificate I should do:
keytool -importcert -alias md01.rcc.local-agent -keystore $JAVA_HOME/jre/lib/security/jssecacerts -file md01.rcc.local-agent.pem -storepass <..>
Does only the jssecacerts on the CM server need to have the agent certificates, or the one on every node? Do the nodes need the CM server certificate?
It seems we removed that section from CM 5.10. I'll test in the coming days to find out if the behavior has changed. The step to import the agent certificates is still in the 5.9 docs:
However, the 5.9 docs are not clear, either.
I'm going to go over the TLS documentation in coming weeks and fix problems and clarify the instructions.
For now, I recommend you import each of the agents' certificates into a single JKS file (after having imported your root CA first). Point CM's Path to Truststore configuration to that JKS file. After that, restart Cloudera Manager.
If you still have problems, we can help.
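As a rough sketch of that recommendation (root CA first, then each agent certificate, all into one JKS file), something like the following could be used. The function names, the rootca alias, the DRY_RUN switch and the STOREPASS variable are all made up for illustration. Only standard keytool options are used.

```shell
# run: executes a command, or just prints it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_truststore TRUSTSTORE ROOT_CA AGENT_CERT...
# STOREPASS is a hypothetical environment variable holding the store password.
build_truststore() {
    truststore=$1
    root_ca=$2
    shift 2

    # Import the root CA first.
    run keytool -importcert -noprompt -trustcacerts -alias rootca \
        -keystore "$truststore" -file "$root_ca" -storepass "$STOREPASS"

    for cert in "$@"; do
        # Alias is the file name without directory and .pem suffix,
        # e.g. md01.rcc.local-agent.pem -> md01.rcc.local-agent
        alias_name=$(basename "$cert" .pem)
        run keytool -importcert -noprompt -alias "$alias_name" \
            -keystore "$truststore" -file "$cert" -storepass "$STOREPASS"
    done
}
```

With DRY_RUN=1 set, the function only prints the keytool commands, which is a cheap way to review the aliases before touching the real file. Afterwards, keytool -list -keystore <file> shows what actually ended up in the truststore.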
I just tried importing all the agent certificates and (just in case, can it hurt?) the server certificate into the truststore and distributing it across the nodes, then restarting the server, the agents, and the Cloudera Management Services.
It did not solve my problems. Most of the Cloudera Management Services are still down.
Also note in the 5.9 link you just sent me:
"Important: Only perform this step if your Agent certificates have not been enabled for TLS Web Client Authentication. See Step 4 for instructions on how to examine Agent certificates."
By "this step" they mean importing agent certificates into the truststore.
But the latest instructions did require enabling TLS Web Client Authentication. In my agent certificates I have the corresponding line:
openssl x509 -text -in md01.rcc.local-agent.pem
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Server Authentication, TLS Web Client Authentication
X509v3 Subject Alternative Name:
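For checking many agent certificates, the inspection above can be scripted. "TLS Web Client Authentication" is how openssl prints the clientAuth extended key usage, which marks a certificate as allowed to act as a TLS client certificate. A minimal sketch, with a made-up function name, that reads the openssl text output on stdin:

```shell
# has_client_auth: reads `openssl x509 -text` output on stdin and succeeds
# if it mentions the TLS Web Client Authentication extended key usage.
has_client_auth() {
    grep -q 'TLS Web Client Authentication'
}
```

For example: openssl x509 -noout -text -in md01.rcc.local-agent.pem | has_client_auth && echo "clientAuth present"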
Any other ideas? What does enabling "TLS Web Server Authentication, TLS Web Client Authentication" do exactly? I still do not understand how agents are supposed to authenticate to CM if one does not put agent certificates into the truststore.
Might it be necessary to append root certificate to each agent certificate pem file before importing into the truststore?
Might there be some problems with permissions or ownership here? For example, I would assume that at least the keys should be readable only by u, maybe g, but certainly not o.
[root@md01 try2]# ls -l /opt/cloudera/security/pki
lrwxrwxrwx 1 cloudera-scm cloudera-scm 51 Feb 27 20:57 agent.cert.pem -> /opt/cloudera/security/pki/md01.rcc.local-agent.pem
lrwxrwxrwx 1 cloudera-scm cloudera-scm 51 Feb 27 20:57 agent.jks -> /opt/cloudera/security/pki/md01.rcc.local-agent.jks
lrwxrwxrwx 1 cloudera-scm cloudera-scm 51 Feb 28 14:44 agent.key -> /opt/cloudera/security/pki/md01.rcc.local-agent.key
-rw-r--r-- 1 cloudera-scm cloudera-scm 1241 Feb 28 23:40 md01.rcc.local-agent.csr
-rw-r--r-- 1 cloudera-scm cloudera-scm 4295 Feb 28 23:40 md01.rcc.local-agent.jks
-rw-r--r-- 1 cloudera-scm cloudera-scm 1991 Mar 1 00:05 md01.rcc.local-agent.key
-rw-r--r-- 1 cloudera-scm cloudera-scm 5008 Mar 1 00:05 md01.rcc.local-agent.p12
-rw-r--r-- 1 cloudera-scm cloudera-scm 8394 Feb 28 23:40 md01.rcc.local-agent.pem
-rw-r--r-- 1 cloudera-scm cloudera-scm 1195 Feb 28 23:45 md01.rcc.local-server.csr
-rw-r--r-- 1 cloudera-scm cloudera-scm 4263 Feb 28 23:45 md01.rcc.local-server.jks
-rw-r--r-- 1 cloudera-scm cloudera-scm 8232 Feb 28 23:45 md01.rcc.local-server.pem
-rw-r--r-- 1 cloudera-scm cloudera-scm 2175 Feb 28 23:40 rootca.cert.pem
[root@md01 try2]# ls -l /etc/cloudera-scm-agent/agentkey.pw
-r--r----- 1 root root 13 Mar 1 00:16 /etc/cloudera-scm-agent/agentkey.pw
(this file is supposed to contain keystore, not truststore password, right?)
[root@md01 try2]# ls -l $JAVA_HOME/jre/lib/security/jssecacerts
-rw-r--r-- 1 root root 126218 Mar 2 10:51 /usr/java/jdk1.8.0_121/jre/lib/security/jssecacerts
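In the listings above, the .key, .jks and .p12 files are mode -rw-r--r--, so any local user can read them. Whether that has anything to do with the Management Services being down is not established here, but it is easy to check. A small sketch (the function name is made up) that flags files readable by "other":

```shell
# world_readable FILE...
# Prints each regular file whose permission bits let "other" read it.
# Symlinks are followed, so agent.key is judged by the file it points to.
world_readable() {
    for f in "$@"; do
        find -L "$f" -prune -type f -perm -004 -print
    done
}
```

For example, world_readable /opt/cloudera/security/pki/*.key /opt/cloudera/security/pki/*.jks lists the offenders. Tightening them with chmod 600 or 640 only works if the owning user or group (cloudera-scm in the listing above) is the one that needs to read them.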
I do not remember if I mentioned it, but I do not have intermediate certificates. So in the latest instructions, whenever intermediate certificates were mentioned, I used the root certificate. Is that OK?
If you don't have intermediate certificates, don't worry about them. Take the instructions to mean "if you have intermediate certificates..."
When troubleshooting this sort of problem, it is important to define the problem very clearly. To do so, we need to understand which client/server communication is failing.
If the agents are not heartbeating to Cloudera Manager, then the cluster will appear in BAD health. To find out if they are heartbeating, go to the Cloudera Manager Hosts -> All Hosts page.
There, look at the "Last Heartbeat" column. If there are no values, or values greater than 15s, the agent's heartbeat is not being received.
If that is the case, go to the agents and review the /var/log/cloudera-scm-agent/cloudera-scm-agent.log file for exceptions regarding heartbeats. You can share them here for review.
Also, look at the /var/log/cloudera-scm-server/cloudera-scm-server.log on the Cloudera Manager host to see if there are exceptions or messages occurring at the same time as the heartbeat error in the agent log. Share them here if there are.
From there we can take a closer look at your agent configuration and certificates.
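To pull the relevant lines out of those two logs quickly, a filter like the one below can help. It is only a sketch: the function name is made up, and the pattern (exception, ssl, certificate) is a guess at a useful starting point, not a list of messages these logs are known to contain.

```shell
# recent_tls_errors LOGFILE [COUNT]
# Prints the last COUNT (default 20) lines that mention an exception,
# SSL, or a certificate, ignoring case.
recent_tls_errors() {
    logfile=$1
    count=${2:-20}
    grep -i -E 'exception|ssl|certificate' "$logfile" | tail -n "$count"
}
```

Run it against /var/log/cloudera-scm-agent/cloudera-scm-agent.log on an agent and /var/log/cloudera-scm-server/cloudera-scm-server.log on the Cloudera Manager host, then compare timestamps between the two outputs.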
"When troubleshooting this sort of problem, it is important to define the problem very clearly. To do so, we need to understand which client/server communication is failing."
Hadoop itself seems to be happy. None of Hadoop components report any problems. All the heartbeats are under 15s.
What is not working are Cloudera Management Services. All of them are in red or unknown state if you go into Cloudera Management Services. Also, on the CM front page there are two messages:
Request to the Service Monitor failed. This may cause slow page responses. View the status of the Service Monitor.
Request to the Host Monitor failed. This may cause slow page responses. View the status of the Host Monitor.