Member since: 10-20-2017
Posts: 6
Kudos Received: 1
Solutions: 0
01-29-2018
07:57 AM
Hi, one of the edge nodes in the cluster generates alerts once or twice a week because of a missed heartbeat from the agent.

INFO 2018-01-29 07:04:38,554 logger.py:71 - call returned (0, '')
INFO 2018-01-29 07:06:04,226 logger.py:71 - call[['test', '-w', '/']] {'sudo': True, 'timeout': 5}
INFO 2018-01-29 07:06:04,233 logger.py:71 - call returned (0, '')

As you can see, nothing is logged for about 1.5 minutes, and that triggers the Ambari alert for this edge node. How can I find out whether there was a connectivity issue between the server and the agent? Sometimes more than one heartbeat interval is missed.
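One way to see how often these silent periods occur is to scan the agent log for gaps between consecutive timestamps. The following is a minimal sketch, assuming every log line starts with a level followed by a timestamp in the format shown above. The function names and the 60-second threshold are illustrative choices, not anything Ambari provides. It only shows when the agent stopped logging, not why.

```python
import re
from datetime import datetime, timedelta

# Matches the start of lines such as:
# INFO 2018-01-29 07:04:38,554 logger.py:71 - call returned (0, '')
TIMESTAMP_RE = re.compile(r"^\w+ (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) ")


def parse_timestamp(line):
    """Return the datetime at the start of a log line, or None if there is none."""
    match = TIMESTAMP_RE.match(line)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return stamp + timedelta(milliseconds=int(match.group(2)))


def find_gaps(lines, threshold_seconds=60):
    """Return (previous, current, gap_seconds) for timestamps further apart than the threshold."""
    gaps = []
    previous = None
    for line in lines:
        current = parse_timestamp(line)
        if current is None:
            continue  # lines without a leading timestamp are skipped
        if previous is not None:
            gap = (current - previous).total_seconds()
            if gap > threshold_seconds:
                gaps.append((previous, current, gap))
        previous = current
    return gaps


# The three lines quoted in the post, used as sample input.
sample = [
    "INFO 2018-01-29 07:04:38,554 logger.py:71 - call returned (0, '')",
    "INFO 2018-01-29 07:06:04,226 logger.py:71 - call[['test', '-w', '/']] {'sudo': True, 'timeout': 5}",
    "INFO 2018-01-29 07:06:04,233 logger.py:71 - call returned (0, '')",
]
for start, end, gap in find_gaps(sample):
    print(f"{start} -> {end}: {gap:.1f}s without log output")
```

To run it against a real log, pass an open file object to find_gaps instead of the sample list. Comparing the reported gaps with the alert times on the server side may help separate a stalled agent from a network problem.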
Labels: Apache Ambari
01-15-2018
08:14 AM
Restarting HDFS, YARN and MapReduce after a configuration change (not related to the post) did the trick.
01-09-2018
03:34 PM
Hi, I have changed hadoop-env.sh and the advanced log4j settings through Ambari to change how hdfs-audit.log is rolled. I added a new RollingFileAppender configuration to log4j, shown below, so that I can replace the DailyRollingFileAppender with it.

#Added below
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.RFAAUDIT.MaxFileSize=100MB
log4j.appender.RFAAUDIT.MaxBackupIndex=9

Then I updated the advanced hadoop-env.sh to use the new appender, only replacing
-Dhdfs.audit.logger=INFO,DRFAAUDIT
with
-Dhdfs.audit.logger=INFO,RFAAUDIT

I have not touched the jobsummary logger:
-Dhadoop.mapreduce.jobsummary.logger=INFO,JSA

After restarting and monitoring the logs, I find no issue with the rolling of hdfs-audit.log. It rolls successfully after 100MB. However, /var/log/hadoop-yarn/yarn/hadoop-mapreduce.jobsummary.log is no longer being updated. It stopped updating when I made this change. I don't know how the audit log settings could have affected the mapreduce jobsummary logging.
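A quick sanity check for this kind of change is to confirm that every appender named in a -D...logger option is actually declared in the log4j properties. Below is a minimal sketch, assuming the relevant options have the form -D<name>.logger=LEVEL,APPENDER and that appenders are declared as log4j.appender.<NAME>=<class>. The helper names are hypothetical. It checks names only and does not by itself explain why a log stops updating.

```python
import re


def defined_appenders(log4j_text):
    """Return appender names declared as log4j.appender.<NAME>=<class>."""
    names = set()
    for line in log4j_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # skip comment lines
        match = re.match(r"log4j\.appender\.(\w+)\s*=", line)
        if match:
            names.add(match.group(1))
    return names


def referenced_appenders(env_text):
    """Return appender names used in -D<name>.logger=LEVEL,APPENDER options."""
    return set(re.findall(r"-D[\w.]+\.logger=\w+,(\w+)", env_text))


def missing_appenders(log4j_text, env_text):
    """Return appenders referenced in the environment options but not declared."""
    return referenced_appenders(env_text) - defined_appenders(log4j_text)


# Only the fragment quoted in the post; a real log4j file declares more appenders.
log4j_fragment = """
#Added below
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.RFAAUDIT.MaxFileSize=100MB
log4j.appender.RFAAUDIT.MaxBackupIndex=9
"""
env_options = (
    "-Dhdfs.audit.logger=INFO,RFAAUDIT "
    "-Dhadoop.mapreduce.jobsummary.logger=INFO,JSA"
)
print(sorted(missing_appenders(log4j_fragment, env_options)))
```

Run against the fragment alone, the check flags JSA simply because its declaration is not part of the quoted snippet. With the complete log4j properties and hadoop-env.sh contents as input, an empty result would rule out a misspelled or missing appender name.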
Labels: Apache Hadoop
11-06-2017
04:07 PM
Can you telnet to the other ZooKeeper nodes on port 2181? It looks like you tested telnet against localhost only, but you need to check the other hosts as well. I faced exactly the same error message and found that my hosts file was missing an entry for one of the ZooKeeper nodes.
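If telnet is not installed, the same check can be done with a few lines of Python. This is a minimal sketch. The function name is illustrative and the host names in the usage comment are placeholders, not hosts from the original thread. It first resolves the name, which would expose a missing hosts file entry, and then tries a TCP connection to the port.

```python
import socket


def check_port(host, port=2181, timeout=3.0):
    """Return (ok, detail) for a TCP connection attempt, similar to `telnet host port`."""
    try:
        # Resolution fails here if the name is missing from the hosts file and DNS.
        address = socket.gethostbyname(host)
    except OSError as error:
        return False, f"cannot resolve {host}: {error}"
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True, f"{host} ({address}) accepts connections on port {port}"
    except OSError as error:
        return False, f"{host} ({address}) port {port} unreachable: {error}"


# Example usage with placeholder host names (replace with the real ZooKeeper nodes):
# for host in ["zk1.example.com", "zk2.example.com", "zk3.example.com"]:
#     ok, detail = check_port(host)
#     print("OK  " if ok else "FAIL", detail)
```

Running it from each ZooKeeper node against all the others covers both directions, since a hosts file entry can be missing on only one machine.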
10-20-2017
08:06 AM
1 Kudo
Hi, I had a similar issue. I found the post below useful in my case: error-installing-standalon-ambari-server.html