Ambari lost heartbeat during restart

Rising Star

Hi,

I am trying to set up SSL between different HDP components. For that I need to modify some files, i.e. hdfs-site.xml, core-site.xml, etc. So I stopped the HDFS, YARN, and MapReduce services from the Ambari dashboard and changed these config files from Ambari. When I try to restart the services, the restart fails with an error. I checked the ambari-server logs and found the following:

22 Feb 2017 06:50:03,692 ERROR [ambari-heartbeat-processor-0] HeartbeatProcessor:554 - Operation failed - may be retried. Service component host: DATANODE, host: slave1 Action id 120-0 and Task id 961
22 Feb 2017 06:50:03,701 ERROR [ambari-heartbeat-processor-0] HeartbeatProcessor:554 - Operation failed - may be retried. Service component host: NAMENODE, host: master Action id 120-0 and Task id 958
22 Feb 2017 06:50:03,715 ERROR [ambari-heartbeat-processor-0] HeartbeatProcessor:554 - Operation failed - may be retried. Service component host: DATANODE, host: slave2 Action id 120-0 and Task id 959

My ambari-agents are running on all nodes of the cluster (2 slave nodes and 1 NameNode). I have no idea why the server is not receiving heartbeats.

Any help would be appreciated.

Thanks

Rahul

1 ACCEPTED SOLUTION

Master Mentor

@rahul gulati

Can you please double-check that you followed the steps in these articles to enable SSL for the HDFS components? https://community.hortonworks.com/articles/52875/enable-https-for-hdfs.html

and

https://community.hortonworks.com/articles/52876/enable-https-for-yarn-and-mapreduce2.html

Also, are you able to run the HDFS service checks successfully from the Ambari UI?

12 REPLIES

Rising Star

@Jay SenSharma

I followed the link below to set up SSL for HDFS and YARN.

http://getthekt.com/2016/02/securing-hadoop-cluster-part-1-ssltls-for-hdfs-and-yarn/

I assume it is almost the same as the ones you mentioned.

After that I stopped HDFS, YARN, and MapReduce from Ambari and modified hdfs-site.xml, yarn-site.xml, and mapred-site.xml from Ambari itself. However, I modified ssl-server.xml and ssl-client.xml from the NameNode CLI under /etc/hadoop/conf, and then copied the SSL files from the master node to the slave nodes at the same path, /etc/hadoop/conf.

When I then try to restart HDFS, it gives the error above. I don't understand why the heartbeats have stopped although the ambari-agents are running on all nodes (slave1, slave2, and master).

Please let me know if there is any issue with the steps above.

I appreciate your help!

Thanks

Rising Star

@Jay SenSharma

I get this error now. I don't know why it is looking for /etc/security/serverKeys/keystore.jks, although I have already modified this property from Ambari. One more thing: since I tried changing these properties from the Hadoop CLI, should I modify them from the Ambari dashboard instead? I think that might be the issue, since it is looking at the default location.

2017-02-22 08:04:40,697 ERROR namenode.NameNode (NameNode.java:main(1759)) - Failed to start namenode.
java.io.FileNotFoundException: /etc/security/serverKeys/keystore.jks (No such file or directory)
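
The FileNotFoundException above suggests that the keystore path the NameNode actually reads differs from the one that was edited. A minimal Python sketch for checking this on a node, assuming the effective file is /etc/hadoop/conf/ssl-server.xml and that the standard Hadoop property names ssl.server.keystore.location and ssl.server.truststore.location are in use (the helper names are hypothetical, not part of Hadoop or Ambari):

```python
import os
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_path):
    """Return {name: value} for every <property> in a Hadoop-style XML config."""
    props = {}
    root = ET.parse(xml_path).getroot()
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def missing_store_files(xml_path,
                        keys=("ssl.server.keystore.location",
                              "ssl.server.truststore.location")):
    """Return {property: configured_path} for stores that are unset or absent on disk."""
    props = read_hadoop_properties(xml_path)
    missing = {}
    for key in keys:
        path = props.get(key)
        if not path or not os.path.isfile(path):
            missing[key] = path
    return missing


# Example (the path is an assumption about where the live config is written):
# print(missing_store_files("/etc/hadoop/conf/ssl-server.xml"))
```

Ambari typically rewrites the files under /etc/hadoop/conf from its own stored configuration when it restarts a service, so running a check like this after the restart attempt shows what the daemon really sees rather than what was edited by hand.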

Rising Star

@Jay SenSharma

I was able to resolve the issue. The problem was that I had modified ssl-server.xml and ssl-client.xml from the Hadoop CLI on the NameNode. I have now made the changes through Ambari and provided the correct paths to the keystore files, and all my services are running. 🙂 Thanks for the help.

One more thing: how can I test this SSL setup?

I am able to open the NameNode UI on port 50470 and the DataNode UI on port 50475, but the connection still shows as insecure. Any idea why?

Also, can we enable SSL for Ranger and Spark?

Thanks

Rising Star

@Jay SenSharma

Could you please answer the query mentioned above?

Thanks

Master Mentor

@rahul gulati

Regarding your query: "I am able to open NameNode UI at port 50470 and DataNode UI at 50475. But connection is still showing as insecure."

Can you please let us know where exactly you see that the connection is insecure? If it is a lab environment, we can add the debug option "-Djavax.net.debug=ssl" to HADOOP_OPTS. That will show more detailed information about the SSL communication, including whether the handshake is happening correctly.
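
As a lighter-weight complement to the JVM debug option, the handshake can also be checked from the client side. A minimal Python sketch, assuming the NameNode HTTPS endpoint is master:50470 as in this thread; the function names are hypothetical, and certificate verification is deliberately switched off, so this only shows whether TLS is negotiated, not whether the certificate is trusted:

```python
import socket
import ssl


def make_probe_context():
    """Build a TLS context that skips certificate checks (lab diagnostics only)."""
    ctx = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be set to CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def probe_tls(host, port, timeout=5.0):
    """Return (protocol_version, cipher_name) negotiated with host:port."""
    ctx = make_probe_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]


# Example (hostname and port taken from the thread):
# print(probe_tls("master", 50470))
```

If the call returns a protocol version and a cipher name, the port is serving TLS; an ssl.SSLError or a connection error would instead point at the listener or its configuration.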

Regarding enabling SSL for Ranger, you might want to refer to: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_Security_Guide/content/configure_non_amb...

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_Security_Guide/content/configure_ambari_r...

Rising Star

@Jay SenSharma

I see it at the top of the browser, in the address bar. The connection goes to ports 50470 and 50475 over https, so I don't understand why the address bar shows the connection as insecure. Does that matter here?

Thanks

not-secure.png

Master Mentor

@rahul gulati

"Not Secure" usually means that the certificate's identity cannot be verified. This typically happens when you use a self-signed certificate instead of a CA-signed one. If you want, you can still disable that warning:

For Chrome: https://developers.google.com/web/updates/2016/10/avoid-not-secure-warn

For Firefox: https://support.mozilla.org/t5/Fix-slowness-crashing-error/What-does-quot-Your-connection-is-not-sec...
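
To confirm from a script that the warning comes from a self-signed certificate rather than some other TLS problem, one option is to attempt a verified connection and inspect the failure. A minimal Python sketch; the function names are hypothetical, and the numeric codes 18 and 19 are OpenSSL's verification results for a self-signed leaf certificate and a self-signed certificate in the chain:

```python
import socket
import ssl

# OpenSSL X509 verification result codes that indicate a self-signed certificate.
SELF_SIGNED_CODES = {18, 19}


def is_self_signed_error(verify_code):
    """True if an OpenSSL verify code means 'self-signed certificate'."""
    return verify_code in SELF_SIGNED_CODES


def check_certificate(host, port, cafile=None, timeout=5.0):
    """Try a fully verified TLS connection and describe the outcome as a string.

    cafile may point to a PEM file to trust instead of the system CA store.
    Connection errors (OSError) are not caught and propagate to the caller.
    """
    ctx = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return "trusted"
    except ssl.SSLCertVerificationError as exc:
        if is_self_signed_error(exc.verify_code):
            return "self-signed: " + exc.verify_message
        return "untrusted: " + exc.verify_message


# Example (hostname and port taken from the thread):
# print(check_certificate("master", 50470))
```

Passing a PEM export of the server certificate as cafile should make the same call return "trusted", provided the name in the certificate matches the host being contacted.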

Rising Star

@Jay SenSharma

Yes, it is a self-signed certificate. I am now also seeing errors while installing Ranger because of the error below. I assume this is due to SSL, right?

java.lang.IllegalStateException: Can't get secure connection to master:50470/jmx?get=Hadoop:service=NameNode,name=FSNamesystem::tag.HAState. Truststore path or password is not set