Ambari lost heartbeat during restart

Rising Star

Hi,

I am trying to set up SSL between different HDP components. For that I need to modify some config files, e.g. hdfs-site.xml and core-site.xml. So I stopped the HDFS, YARN and MapReduce services from the Ambari dashboard and changed these config files through Ambari. When I try to restart the services, the restart fails with an error. I checked the ambari-server logs and found the following:

22 Feb 2017 06:50:03,692 ERROR [ambari-heartbeat-processor-0] HeartbeatProcessor:554 - Operation failed - may be retried. Service component host: DATANODE, host: slave1 Action id 120-0 and Task id 961
22 Feb 2017 06:50:03,701 ERROR [ambari-heartbeat-processor-0] HeartbeatProcessor:554 - Operation failed - may be retried. Service component host: NAMENODE, host: master Action id 120-0 and Task id 958
22 Feb 2017 06:50:03,715 ERROR [ambari-heartbeat-processor-0] HeartbeatProcessor:554 - Operation failed - may be retried. Service component host: DATANODE, host: slave2 Action id 120-0 and Task id 959
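
For anyone hitting the same errors, a small sketch for condensing these log entries into "component host task" triples may help to see which components failed. It assumes one log entry per line in exactly the format shown above; the function name `summarize_failures` is made up for illustration.

```shell
# Summarize "Operation failed" entries from an ambari-server log.
# Assumption: each entry is on its own line and follows the format
# "... Service component host: X, host: Y Action id ... Task id N".
summarize_failures() {
  # $1: path to the ambari-server log file
  grep 'Operation failed' "$1" \
    | sed -n 's/.*Service component host: \([A-Z_0-9]*\), host: \([^ ]*\) .*Task id \([0-9]*\).*/\1 \2 \3/p'
}
```

It could be called as `summarize_failures /var/log/ambari-server/ambari-server.log`, assuming the log lives at that path on the Ambari server host. The task ids can then be looked up in the Ambari UI operations view for the actual stderr of each failed start.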

My ambari-agents are running on all the nodes of the cluster (2 slave nodes and 1 NameNode). I have no idea why the server is not picking up the heartbeats.

Any help would be appreciated.

Thanks

Rahul

1 ACCEPTED SOLUTION

Master Mentor

@rahul gulati

Can you please double-check that you have followed the steps in these articles to enable SSL for the HDFS components? https://community.hortonworks.com/articles/52875/enable-https-for-hdfs.html

AND

https://community.hortonworks.com/articles/52876/enable-https-for-yarn-and-mapreduce2.html

Also, are you able to run the service checks for HDFS successfully from the Ambari UI?


12 REPLIES

Rising Star

@Jay SenSharma

I was going through the link mentioned below on setting up a CA-signed certificate. Here is the list of steps I plan to perform on the 3-node cluster (excluding the edge node). Please validate:

  1. keytool -genkey -keyalg RSA -alias c6401 -keystore /tmp/keystore.jks -storepass bigdata -validity 360 -keysize 2048 (to be generated on each node: master, slave1 and slave2)
  2. keytool -certreq -alias c6401 -keyalg RSA -file /tmp/c6401.csr -keystore /tmp/keystore.jks -storepass bigdata (CSR file to be generated from each node's keystore.jks)
  3. Now get the signed cert from the CA; the file name is /tmp/c6401.crt. (How do I get this certificate from the CA, for each node or for a single node?)
  4. Import the root cert into the JKS first (ignore if it is already present): keytool -import -alias root -file /tmp/ca.crt -keystore /tmp/keystore.jks (How do I get the root cert?)
  5. Repeat step 4 for the intermediate cert, if there is one.
  6. Import the signed cert into the JKS: keytool -import -alias c6401 -file /tmp/c6401.crt -keystore /tmp/keystore.jks -storepass bigdata (to be done on each node?)
  7. keytool -import -alias root -file /tmp/ca.crt -keystore /tmp/truststore.jks -storepass bigdata (to be done on each node?)

Kindly let me know if this is the correct approach, or if there is another link that covers a multi-node setup.
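
To make the "on each node" part of the steps above concrete, here is a rough sketch that only prints the keytool commands per host for review and does not run them. The host names come from the log in the question; using the host name as the alias and file name (instead of c6401), and the helper name `print_keytool_steps`, are assumptions for illustration. It does not cover how the CA actually signs the CSR.

```shell
# Print (not run) the per-node keytool commands from steps 1, 2, 4, 6 and 7.
# Assumptions: hosts are master/slave1/slave2, the alias equals the host name,
# and "bigdata" is only the placeholder password used in the post.
HOSTS="master slave1 slave2"
STOREPASS="bigdata"

print_keytool_steps() {
  host="$1"
  # Step 1: key pair in a per-node keystore
  echo "keytool -genkey -keyalg RSA -alias $host -keystore /tmp/keystore.jks -storepass $STOREPASS -validity 360 -keysize 2048"
  # Step 2: certificate signing request to send to the CA
  echo "keytool -certreq -alias $host -keyalg RSA -file /tmp/$host.csr -keystore /tmp/keystore.jks -storepass $STOREPASS"
  # Step 4: CA root cert into the keystore (before the signed cert)
  echo "keytool -import -alias root -file /tmp/ca.crt -keystore /tmp/keystore.jks -storepass $STOREPASS"
  # Step 6: signed cert returned by the CA
  echo "keytool -import -alias $host -file /tmp/$host.crt -keystore /tmp/keystore.jks -storepass $STOREPASS"
  # Step 7: CA root cert into the truststore
  echo "keytool -import -alias root -file /tmp/ca.crt -keystore /tmp/truststore.jks -storepass $STOREPASS"
}

for h in $HOSTS; do
  echo "# on $h"
  print_keytool_steps "$h"
done
```

Printing first makes it easy to check aliases and paths before anything is written to a keystore. Each node gets its own key pair and CSR, while the same ca.crt is imported everywhere.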

Thanks for great help!!

https://community.hortonworks.com/articles/52875/enable-https-for-hdfs.html

Super Guru
@rahul gulati

Check the following:

1. /etc/ambari-agent/conf/ambari-agent.ini has an entry pointing to the Ambari server -> server=<ambari_host>

2. Check whether iptables and SELinux are stopped and disabled.

3. Try restarting the agent: $ ambari-agent restart

4. Check whether there is an issue with the /etc/hosts file. A wrong or incorrect entry in /etc/hosts can cause this kind of problem.
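
The checks above could be scripted roughly like this and run on each agent host. This is a sketch under assumptions: it expects the server to be stored as a `hostname=` key under a `[server]` section of ambari-agent.ini, which may differ from your file, and both function names are invented for illustration.

```shell
# Read the Ambari server host from an agent config file.
# Assumption: the value is stored as "hostname=<host>" under "[server]".
agent_server_host() {
  # $1: path to ambari-agent.ini
  awk -F= '
    /^\[/ { in_server = ($0 == "[server]") }
    in_server && $1 ~ /^[ \t]*hostname[ \t]*$/ { gsub(/[ \t]/, "", $2); print $2; exit }
  ' "$1"
}

# Run the basic checks on one agent host.
check_agent_host() {
  ini="${1:-/etc/ambari-agent/conf/ambari-agent.ini}"
  server=$(agent_server_host "$ini")
  echo "agent points at: ${server:-<not found>}"
  # Does the server name resolve (via /etc/hosts or DNS)?
  getent hosts "$server" || echo "cannot resolve $server"
  # SELinux mode, only if the tool is installed
  command -v getenforce >/dev/null && getenforce
  return 0
}
```

Calling `check_agent_host` with no argument uses the default config path from item 1. It only reports; restarting the agent with `ambari-agent restart` is left as a manual step.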

Rising Star

@Sagar Shimpi

Thanks for replying. I have checked the ambari-agent.ini file on all hosts; it is the same everywhere and the ambari-server IP is correct. All my services were running before I set up SSL. I stopped HDFS, YARN and MapReduce and made the SSL-related changes mentioned above. Since then I have been unable to bring up the services because of the error mentioned in the question.

Thanks

Rahul