How to repair an unhealthy NodeManager?
Labels: Apache Ambari, Apache YARN
Created 09-27-2016 09:34 AM
I restarted the YARN service, but only 4 NodeManagers started and 1 is unhealthy. When I check
/var/log/hadoop/yarn I don't find any log, so how do I repair the unhealthy NodeManager?
Created 09-27-2016 11:50 AM
I found the solution: go to
yarn.nodemanager.disk-health-checker.min-healthy-disks
change the value to 0, and restart YARN. After that it works.
Created 09-27-2016 09:35 AM
@Mourad Chahri Can you check whether there is enough disk space available on the node?
Created 09-27-2016 09:39 AM
@Sandeep Nemuri Yes, I have enough space on disk.
Created 09-27-2016 09:38 AM
Could you please check in Ambari what reason is given for the unhealthy node?
Created 09-27-2016 09:40 AM
Created 09-27-2016 09:43 AM
@Mourad Chahri Can you please restart only the unhealthy NodeManager and check whether it comes up correctly?
If it fails, please share the error message. You can find it in the Ambari start-service dialog window.
Please let me know if you have any questions about this. Happy to help.
Created 09-27-2016 09:48 AM
Yes, I can restart the unhealthy NodeManager. This is what I get in the log:
2016-09-27 09:44:32,687 - Group['hadoop'] {'ignore_failures': False}
2016-09-27 09:44:32,690 - Group['users'] {'ignore_failures': False}
2016-09-27 09:44:32,691 - User['hive'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
2016-09-27 09:44:32,692 - User['mapred'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
2016-09-27 09:44:32,693 - User['accumulo'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
2016-09-27 09:44:32,694 - User['hbase'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
2016-09-27 09:44:32,695 - User['ambari-qa'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['users']}
2016-09-27 09:44:32,696 - User['zookeeper'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
2016-09-27 09:44:32,697 - User['tez'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['users']}
2016-09-27 09:44:32,698 - User['hdfs'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
2016-09-27 09:44:32,699 - User['sqoop'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
2016-09-27 09:44:32,700 - User['hcat'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
2016-09-27 09:44:32,701 - User['yarn'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
2016-09-27 09:44:32,702 - User['ams'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
2016-09-27 09:44:32,703 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2016-09-27 09:44:32,734 - Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2016-09-27 09:44:32,741 - Skipping Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] due to not_if
2016-09-27 09:44:32,742 - Directory['/tmp/hbase-hbase'] {'owner': 'hbase', 'recursive': True, 'mode': 0775, 'cd_access': 'a'}
2016-09-27 09:44:32,757 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2016-09-27 09:44:32,759 - Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] {'not_if': '(test $(id -u hbase) -gt 1000) || (false)'}
2016-09-27 09:44:32,766 - Skipping Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] due to not_if
2016-09-27 09:44:32,767 - Group['hdfs'] {'ignore_failures': False}
2016-09-27 09:44:32,768 - User['hdfs'] {'ignore_failures': False, 'groups': ['hadoop', 'hdfs']}
2016-09-27 09:44:32,769 - Directory['/etc/hadoop'] {'mode': 0755}
2016-09-27 09:44:32,789 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2016-09-27 09:44:32,807 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}
2016-09-27 09:44:32,857 - Directory['/var/log/hadoop'] {'owner': 'root', 'mode': 0775, 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
2016-09-27 09:44:32,879 - Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root', 'recursive': True, 'cd_access': 'a'}
2016-09-27 09:44:32,880 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'recursive': True, 'cd_access': 'a'}
2016-09-27 09:44:32,888 - File['/usr/hdp/current/hadoop-client/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
2016-09-27 09:44:32,891 - File['/usr/hdp/current/hadoop-client/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'hdfs'}
2016-09-27 09:44:32,896 - File['/usr/hdp/current/hadoop-client/conf/log4j.properties'] {'content': ..., 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2016-09-27 09:44:32,909 - File['/usr/hdp/current/hadoop-client/conf/hadoop-metrics2.properties'] {'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs'}
2016-09-27 09:44:32,919 - File['/usr/hdp/current/hadoop-client/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
2016-09-27 09:44:32,921 - File['/usr/hdp/current/hadoop-client/conf/configuration.xsl'] {'owner': 'hdfs', 'group': 'hadoop'}
2016-09-27 09:44:32,929 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop'}
2016-09-27 09:44:32,941 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}
2016-09-27 09:44:33,397 - Execute['export HADOOP_LIBEXEC_DIR=/usr/hdp/current/hadoop-client/libexec && /usr/hdp/current/hadoop-yarn-nodemanager/sbin/yarn-daemon.sh --config /usr/hdp/current/hadoop-client/conf stop nodemanager'] {'user': 'yarn'}
2016-09-27 09:44:38,656 - Directory['/hadoop/yarn/local'] {'group': 'hadoop', 'recursive': True, 'cd_access': 'a', 'ignore_failures': True, 'mode': 0775, 'owner': 'yarn'}
2016-09-27 09:44:38,659 - Directory['/hadoop/yarn/log'] {'group': 'hadoop', 'recursive': True, 'cd_access': 'a', 'ignore_failures': True, 'mode': 0775, 'owner': 'yarn'}
2016-09-27 09:44:38,659 - Execute[('chown', '-R', 'yarn', '/hadoop/yarn/local/usercache/ambari-qa')] {'sudo': True, 'only_if': 'test -d /hadoop/yarn/local/usercache/ambari-qa'}
Created 09-27-2016 09:49 AM
2016-09-27 09:44:39,168 - File['/usr/hdp/current/hadoop-client/conf/mapred-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs'}
2016-09-27 09:44:39,172 - File['/usr/hdp/current/hadoop-client/conf/taskcontroller.cfg'] {'content': Template('taskcontroller.cfg.j2'), 'owner': 'hdfs'}
2016-09-27 09:44:39,179 - XmlConfig['mapred-site.xml'] {'owner': 'mapred', 'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'configuration_attributes': {}, 'configurations': ...}
2016-09-27 09:44:39,191 - Generating config: /usr/hdp/current/hadoop-client/conf/mapred-site.xml
2016-09-27 09:44:39,192 - File['/usr/hdp/current/hadoop-client/conf/mapred-site.xml'] {'owner': 'mapred', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2016-09-27 09:44:39,239 - Writing File['/usr/hdp/current/hadoop-client/conf/mapred-site.xml'] because contents don't match
2016-09-27 09:44:39,239 - Changing owner for /usr/hdp/current/hadoop-client/conf/mapred-site.xml from 508 to mapred
2016-09-27 09:44:39,240 - XmlConfig['capacity-scheduler.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'configuration_attributes': {}, 'configurations': ...}
2016-09-27 09:44:39,253 - Generating config: /usr/hdp/current/hadoop-client/conf/capacity-scheduler.xml
2016-09-27 09:44:39,253 - File['/usr/hdp/current/hadoop-client/conf/capacity-scheduler.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2016-09-27 09:44:39,269 - Changing owner for /usr/hdp/current/hadoop-client/conf/capacity-scheduler.xml from 508 to hdfs
2016-09-27 09:44:39,269 - XmlConfig['ssl-client.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'configuration_attributes': {}, 'configurations': ...}
2016-09-27 09:44:39,282 - Generating config: /usr/hdp/current/hadoop-client/conf/ssl-client.xml
2016-09-27 09:44:39,282 - File['/usr/hdp/current/hadoop-client/conf/ssl-client.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2016-09-27 09:44:39,290 - Writing File['/usr/hdp/current/hadoop-client/conf/ssl-client.xml'] because contents don't match
2016-09-27 09:44:39,290 - Directory['/usr/hdp/current/hadoop-client/conf/secure'] {'owner': 'root', 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
2016-09-27 09:44:39,312 - XmlConfig['ssl-client.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf/secure', 'configuration_attributes': {}, 'configurations': ...}
2016-09-27 09:44:39,325 - Generating config: /usr/hdp/current/hadoop-client/conf/secure/ssl-client.xml
2016-09-27 09:44:39,325 - File['/usr/hdp/current/hadoop-client/conf/secure/ssl-client.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2016-09-27 09:44:39,340 - Writing File['/usr/hdp/current/hadoop-client/conf/secure/ssl-client.xml'] because contents don't match
2016-09-27 09:44:39,341 - XmlConfig['ssl-server.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'configuration_attributes': {}, 'configurations': ...}
2016-09-27 09:44:39,354 - Generating config: /usr/hdp/current/hadoop-client/conf/ssl-server.xml
2016-09-27 09:44:39,354 - File['/usr/hdp/current/hadoop-client/conf/ssl-server.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2016-09-27 09:44:39,363 - Writing File['/usr/hdp/current/hadoop-client/conf/ssl-server.xml'] because contents don't match
2016-09-27 09:44:39,364 - File['/usr/hdp/current/hadoop-client/conf/ssl-client.xml.example'] {'owner': 'mapred', 'group': 'hadoop'}
2016-09-27 09:44:39,364 - File['/usr/hdp/current/hadoop-client/conf/ssl-server.xml.example'] {'owner': 'mapred', 'group': 'hadoop'}
2016-09-27 09:44:39,366 - File['/var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid'] {'action': ['delete'], 'not_if': 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps -p `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1'}
2016-09-27 09:44:39,373 - Execute['ulimit -c unlimited; export HADOOP_LIBEXEC_DIR=/usr/hdp/current/hadoop-client/libexec && /usr/hdp/current/hadoop-yarn-nodemanager/sbin/yarn-daemon.sh --config /usr/hdp/current/hadoop-client/conf start nodemanager'] {'not_if': 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps -p `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1', 'user': 'yarn'}
2016-09-27 09:44:40,596 - Execute['ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps -p `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1'] {'not_if': 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps -p `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1', 'tries': 5, 'user': 'yarn', 'try_sleep': 1}
2016-09-27 09:44:40,798 - Skipping Execute['ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps -p `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1'] due to not_if
Created 09-27-2016 10:07 AM
2016-09-27 09:44:38,711 - Directory['/var/run/hadoop-yarn'] {'owner': 'yarn', 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
2016-09-27 09:44:38,712 - Directory['/var/run/hadoop-yarn/yarn'] {'owner': 'yarn', 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
2016-09-27 09:44:38,713 - Directory['/var/log/hadoop-yarn/yarn'] {'owner': 'yarn', 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
2016-09-27 09:44:38,715 - Directory['/var/run/hadoop-mapreduce'] {'owner': 'mapred', 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
2016-09-27 09:44:38,717 - Directory['/var/run/hadoop-mapreduce/mapred'] {'owner': 'mapred', 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
2016-09-27 09:44:38,717 - Directory['/var/log/hadoop-mapreduce'] {'owner': 'mapred', 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
2016-09-27 09:44:38,718 - Directory['/var/log/hadoop-mapreduce/mapred'] {'owner': 'mapred', 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
2016-09-27 09:44:38,719 - Directory['/var/log/hadoop-yarn'] {'owner': 'yarn', 'ignore_failures': True, 'recursive': True, 'cd_access': 'a'}
2016-09-27 09:44:38,720 - XmlConfig['core-site.xml'] {'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'mode': 0644, 'configuration_attributes': {}, 'owner': 'hdfs', 'configurations': ...}
2016-09-27 09:44:38,752 - Generating config: /usr/hdp/current/hadoop-client/conf/core-site.xml
2016-09-27 09:44:38,752 - File['/usr/hdp/current/hadoop-client/conf/core-site.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': 0644, 'encoding': 'UTF-8'}
2016-09-27 09:44:38,779 - Writing File['/usr/hdp/current/hadoop-client/conf/core-site.xml'] because contents don't match
2016-09-27 09:44:38,780 - XmlConfig['hdfs-site.xml'] {'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'mode': 0644, 'configuration_attributes': {'final': {'dfs.datanode.data.dir': 'true'}}, 'owner': 'hdfs', 'configurations': ...}
2016-09-27 09:44:38,793 - Generating config: /usr/hdp/current/hadoop-client/conf/hdfs-site.xml
2016-09-27 09:44:38,793 - File['/usr/hdp/current/hadoop-client/conf/hdfs-site.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': 0644, 'encoding': 'UTF-8'}
2016-09-27 09:44:38,860 - Writing File['/usr/hdp/current/hadoop-client/conf/hdfs-site.xml'] because contents don't match
2016-09-27 09:44:38,861 - XmlConfig['mapred-site.xml'] {'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'mode': 0644, 'configuration_attributes': {}, 'owner': 'yarn', 'configurations': ...}
2016-09-27 09:44:38,874 - Generating config: /usr/hdp/current/hadoop-client/conf/mapred-site.xml
2016-09-27 09:44:38,874 - File['/usr/hdp/current/hadoop-client/conf/mapred-site.xml'] {'owner': 'yarn', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': 0644, 'encoding': 'UTF-8'}
2016-09-27 09:44:38,923 - Writing File['/usr/hdp/current/hadoop-client/conf/mapred-site.xml'] because contents don't match
2016-09-27 09:44:38,924 - Changing owner for /usr/hdp/current/hadoop-client/conf/mapred-site.xml from 501 to yarn
2016-09-27 09:44:38,924 - XmlConfig['yarn-site.xml'] {'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'mode': 0644, 'configuration_attributes': {}, 'owner': 'yarn', 'configurations': ...}
2016-09-27 09:44:38,937 - Generating config: /usr/hdp/current/hadoop-client/conf/yarn-site.xml
2016-09-27 09:44:38,937 - File['/usr/hdp/current/hadoop-client/conf/yarn-site.xml'] {'owner': 'yarn', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': 0644, 'encoding': 'UTF-8'}
2016-09-27 09:44:39,050 - Writing File['/usr/hdp/current/hadoop-client/conf/yarn-site.xml'] because contents don't match
2016-09-27 09:44:39,050 - XmlConfig['capacity-scheduler.xml'] {'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'mode': 0644, 'configuration_attributes': {}, 'owner': 'yarn', 'configurations': ...}
2016-09-27 09:44:39,063 - Generating config: /usr/hdp/current/hadoop-client/conf/capacity-scheduler.xml
2016-09-27 09:44:39,064 - File['/usr/hdp/current/hadoop-client/conf/capacity-scheduler.xml'] {'owner': 'yarn', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': 0644, 'encoding': 'UTF-8'}
2016-09-27 09:44:39,100 - Writing File['/usr/hdp/current/hadoop-client/conf/capacity-scheduler.xml'] because contents don't match
2016-09-27 09:44:39,101 - Changing owner for /usr/hdp/current/hadoop-client/conf/capacity-scheduler.xml from 506 to yarn
2016-09-27 09:44:39,101 - File['/etc/hadoop/conf/yarn.exclude'] {'owner': 'yarn', 'group': 'hadoop'}
2016-09-27 09:44:39,123 - File['/etc/security/limits.d/yarn.conf'] {'content': Template('yarn.conf.j2'), 'mode': 0644}
2016-09-27 09:44:39,127 - File['/etc/security/limits.d/mapreduce.conf'] {'content': Template('mapreduce.conf.j2'), 'mode': 0644}
2016-09-27 09:44:39,133 - File['/usr/hdp/current/hadoop-client/conf/yarn-env.sh'] {'content': InlineTemplate(...), 'owner': 'yarn', 'group': 'hadoop', 'mode': 0755}
2016-09-27 09:44:39,134 - Writing File['/usr/hdp/current/hadoop-client/conf/yarn-env.sh'] because contents don't match
2016-09-27 09:44:39,135 - File['/usr/hdp/current/hadoop-yarn-nodemanager/bin/container-executor'] {'group': 'hadoop', 'mode': 02050}
2016-09-27 09:44:39,143 - File['/usr/hdp/current/hadoop-client/conf/container-executor.cfg'] {'content': Template('container-executor.cfg.j2'), 'group': 'hadoop', 'mode': 0644}
2016-09-27 09:44:39,148 - Directory['/cgroups_test/cpu'] {'mode': 0755, 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
Created 09-27-2016 11:49 AM
@Mourad Chahri You can go to the ResourceManager UI. There you should see a Nodes link on the left side of the screen. If you click on it, you should see all of your NodeManagers, and the reason a node is listed as unhealthy may be shown there. It is most likely due to the YARN local dirs or log dirs: you may be hitting the disk threshold for them. There are a few parameters you can check for this.
yarn.nodemanager.disk-health-checker.min-healthy-disks
yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage
yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb
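To make the interplay of these three parameters concrete, here is a rough Python sketch of the kind of check they describe. It is not the NodeManager's actual code: the function names are made up for illustration, the default values (0.25, 90.0, 0) are the ones I believe Hadoop 2.x ships with and should be verified against your yarn-site.xml, and the real checker also looks at directory permissions and evaluates local dirs and log dirs separately.

```python
import shutil
from typing import Callable, Iterable, NamedTuple


class DiskUsage(NamedTuple):
    """Same shape as the tuple returned by shutil.disk_usage (all in bytes)."""
    total: int
    used: int
    free: int


def disk_is_healthy(usage, max_utilization_pct=90.0, min_free_space_mb=0):
    """A disk passes if it is not fuller than the utilization limit
    and still has at least the required free space."""
    utilization_pct = 100.0 * usage.used / usage.total
    free_mb = usage.free / (1024 * 1024)
    return utilization_pct <= max_utilization_pct and free_mb >= min_free_space_mb


def node_is_healthy(
    dirs: Iterable[str],
    min_healthy_disks: float = 0.25,       # assumed default, check your cluster
    max_utilization_pct: float = 90.0,     # assumed default, check your cluster
    min_free_space_mb: int = 0,            # assumed default, check your cluster
    usage_of: Callable[[str], DiskUsage] = shutil.disk_usage,
) -> bool:
    """The node passes if the fraction of healthy dirs reaches min_healthy_disks."""
    dirs = list(dirs)
    if not dirs:
        return False  # nothing configured, so nothing can be healthy
    healthy = sum(
        1
        for d in dirs
        if disk_is_healthy(usage_of(d), max_utilization_pct, min_free_space_mb)
    )
    return healthy / len(dirs) >= min_healthy_disks
```

On a NodeManager host you could call it with the directories from the restart log above, for example `node_is_healthy(["/hadoop/yarn/local", "/hadoop/yarn/log"])`. Note that with `min_healthy_disks=0` the last comparison is always true, which is why setting that parameter to 0 makes the node report healthy even when a disk is over the utilization threshold.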
Finally, if that does not reveal the issue, you should look in /var/log/hadoop-yarn/yarn. Your previous comment shows you were looking in /var/log/hadoop/yarn which is not where the NodeManager log is located.
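If you would rather not click through the UI, the same node list is available from the ResourceManager REST API at /ws/v1/cluster/nodes. Below is a minimal Python sketch that pulls it and prints the health report of each unhealthy node. The host name in the usage note is a placeholder, 8088 is only the usual default web port, and the helper names are mine, so adjust them to your cluster.

```python
import json
from urllib.request import urlopen


def unhealthy_nodes(payload):
    """Return (host, healthReport) pairs for nodes whose state is UNHEALTHY.

    `payload` is the parsed JSON body of GET /ws/v1/cluster/nodes.
    """
    nodes = (payload.get("nodes") or {}).get("node") or []
    return [
        (n.get("nodeHostName", n.get("id")), n.get("healthReport", ""))
        for n in nodes
        if n.get("state") == "UNHEALTHY"
    ]


def fetch_nodes(rm_url):
    """Fetch the node list from a ResourceManager base URL (no Kerberos/SPNEGO)."""
    with urlopen(rm_url.rstrip("/") + "/ws/v1/cluster/nodes", timeout=10) as resp:
        return json.load(resp)


def print_unhealthy(rm_url):
    """Print one line per unhealthy node with the reason the RM reports."""
    for host, report in unhealthy_nodes(fetch_nodes(rm_url)):
        print(f"{host}: {report}")
```

Usage would be something like `print_unhealthy("http://your-rm-host:8088")`. The health report text is normally where a bad local dir or log dir is named, so it is worth reading before changing any threshold.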
I hope this helps.