Unable to start Namenode - New Installation

Explorer

Please advise how I can get the NameNode started.

The cluster has 6 nodes: 1 edge node, 2 name nodes, and 3 data nodes.

stderr: /var/lib/ambari-agent/data/errors-125.txt

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 348, in <module>
    NameNode().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 375, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 90, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/ambari-agent/lib/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 175, in namenode
    create_log_dir=True
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 276, in service
    Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 262, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ;  /usr/hdp/2.6.5.0-292/hadoop/sbin/hadoop-daemon.sh --config /usr/hdp/2.6.5.0-292/hadoop/conf start namenode'' returned 1. starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-omiprihdp02ap.mufep.net.out
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

stdout: /var/lib/ambari-agent/data/output-125.txt

2018-10-21 10:07:42,380 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=2.6.5.0-292 -> 2.6.5.0-292
2018-10-21 10:07:42,393 - Using hadoop conf dir: /usr/hdp/2.6.5.0-292/hadoop/conf
2018-10-21 10:07:42,519 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=2.6.5.0-292 -> 2.6.5.0-292
2018-10-21 10:07:42,523 - Using hadoop conf dir: /usr/hdp/2.6.5.0-292/hadoop/conf
2018-10-21 10:07:42,524 - Group['hdfs'] {}
2018-10-21 10:07:42,525 - Group['hadoop'] {}
2018-10-21 10:07:42,525 - Group['users'] {}
2018-10-21 10:07:42,525 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop'], 'uid': None}
2018-10-21 10:07:42,526 - User['ams'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop'], 'uid': None}
2018-10-21 10:07:42,527 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'users'], 'uid': None}
2018-10-21 10:07:42,527 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hdfs'], 'uid': None}
2018-10-21 10:07:42,528 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop'], 'uid': None}
2018-10-21 10:07:42,528 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop'], 'uid': None}
2018-10-21 10:07:42,529 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2018-10-21 10:07:42,530 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 0'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2018-10-21 10:07:42,536 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 0'] due to not_if
2018-10-21 10:07:42,537 - Group['hdfs'] {}
2018-10-21 10:07:42,537 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': ['hdfs', u'hdfs']}
2018-10-21 10:07:42,538 - FS Type: 
2018-10-21 10:07:42,538 - Directory['/etc/hadoop'] {'mode': 0755}
2018-10-21 10:07:42,550 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2018-10-21 10:07:42,550 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 01777}
2018-10-21 10:07:42,563 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}
2018-10-21 10:07:42,573 - Skipping Execute[('setenforce', '0')] due to not_if
2018-10-21 10:07:42,574 - Directory['/var/log/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'hadoop', 'mode': 0775, 'cd_access': 'a'}
2018-10-21 10:07:42,576 - Directory['/var/run/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'root', 'cd_access': 'a'}
2018-10-21 10:07:42,576 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'create_parents': True, 'cd_access': 'a'}
2018-10-21 10:07:42,580 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
2018-10-21 10:07:42,582 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'hdfs'}
2018-10-21 10:07:42,587 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/log4j.properties'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2018-10-21 10:07:42,594 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/hadoop-metrics2.properties'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2018-10-21 10:07:42,595 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
2018-10-21 10:07:42,595 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/configuration.xsl'] {'owner': 'hdfs', 'group': 'hadoop'}
2018-10-21 10:07:42,599 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop', 'mode': 0644}
2018-10-21 10:07:42,603 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}
2018-10-21 10:07:42,834 - Using hadoop conf dir: /usr/hdp/2.6.5.0-292/hadoop/conf
2018-10-21 10:07:42,834 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=2.6.5.0-292 -> 2.6.5.0-292
2018-10-21 10:07:42,851 - Using hadoop conf dir: /usr/hdp/2.6.5.0-292/hadoop/conf
2018-10-21 10:07:42,863 - Directory['/etc/security/limits.d'] {'owner': 'root', 'create_parents': True, 'group': 'root'}
2018-10-21 10:07:42,867 - File['/etc/security/limits.d/hdfs.conf'] {'content': Template('hdfs.conf.j2'), 'owner': 'root', 'group': 'root', 'mode': 0644}
2018-10-21 10:07:42,867 - XmlConfig['hadoop-policy.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/2.6.5.0-292/hadoop/conf', 'configuration_attributes': {}, 'configurations': ...}
2018-10-21 10:07:42,874 - Generating config: /usr/hdp/2.6.5.0-292/hadoop/conf/hadoop-policy.xml
2018-10-21 10:07:42,874 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/hadoop-policy.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2018-10-21 10:07:42,881 - XmlConfig['ssl-client.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/2.6.5.0-292/hadoop/conf', 'configuration_attributes': {}, 'configurations': ...}
2018-10-21 10:07:42,886 - Generating config: /usr/hdp/2.6.5.0-292/hadoop/conf/ssl-client.xml
2018-10-21 10:07:42,886 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/ssl-client.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2018-10-21 10:07:42,891 - Directory['/usr/hdp/2.6.5.0-292/hadoop/conf/secure'] {'owner': 'root', 'create_parents': True, 'group': 'hadoop', 'cd_access': 'a'}
2018-10-21 10:07:42,891 - XmlConfig['ssl-client.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/2.6.5.0-292/hadoop/conf/secure', 'configuration_attributes': {}, 'configurations': ...}
2018-10-21 10:07:42,897 - Generating config: /usr/hdp/2.6.5.0-292/hadoop/conf/secure/ssl-client.xml
2018-10-21 10:07:42,897 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/secure/ssl-client.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2018-10-21 10:07:42,901 - XmlConfig['ssl-server.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/2.6.5.0-292/hadoop/conf', 'configuration_attributes': {}, 'configurations': ...}
2018-10-21 10:07:42,907 - Generating config: /usr/hdp/2.6.5.0-292/hadoop/conf/ssl-server.xml
2018-10-21 10:07:42,907 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/ssl-server.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2018-10-21 10:07:42,912 - XmlConfig['hdfs-site.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/2.6.5.0-292/hadoop/conf', 'configuration_attributes': {u'final': {u'dfs.support.append': u'true', u'dfs.datanode.data.dir': u'true', u'dfs.namenode.http-address': u'true', u'dfs.namenode.name.dir': u'true', u'dfs.webhdfs.enabled': u'true', u'dfs.datanode.failed.volumes.tolerated': u'true'}}, 'configurations': ...}
2018-10-21 10:07:42,918 - Generating config: /usr/hdp/2.6.5.0-292/hadoop/conf/hdfs-site.xml
2018-10-21 10:07:42,918 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/hdfs-site.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2018-10-21 10:07:42,950 - XmlConfig['core-site.xml'] {'group': 'hadoop', 'conf_dir': '/usr/hdp/2.6.5.0-292/hadoop/conf', 'mode': 0644, 'configuration_attributes': {u'final': {u'fs.defaultFS': u'true'}}, 'owner': 'hdfs', 'configurations': ...}
2018-10-21 10:07:42,956 - Generating config: /usr/hdp/2.6.5.0-292/hadoop/conf/core-site.xml
2018-10-21 10:07:42,956 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/core-site.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': 0644, 'encoding': 'UTF-8'}
2018-10-21 10:07:42,971 - Writing File['/usr/hdp/2.6.5.0-292/hadoop/conf/core-site.xml'] because contents don't match
2018-10-21 10:07:42,972 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/slaves'] {'content': Template('slaves.j2'), 'owner': 'hdfs'}
2018-10-21 10:07:42,972 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=2.6.5.0-292 -> 2.6.5.0-292
2018-10-21 10:07:42,977 - Directory['/grid/0/hadoop/hdfs/namenode'] {'owner': 'hdfs', 'group': 'hadoop', 'create_parents': True, 'mode': 0755, 'cd_access': 'a'}
2018-10-21 10:07:42,978 - Skipping setting up secure ZNode ACL for HFDS as it's supported only for NameNode HA mode.
2018-10-21 10:07:42,980 - Called service start with upgrade_type: None
2018-10-21 10:07:42,980 - Ranger Hdfs plugin is not enabled
2018-10-21 10:07:42,981 - File['/etc/hadoop/conf/dfs.exclude'] {'owner': 'hdfs', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
2018-10-21 10:07:42,982 - Writing File['/etc/hadoop/conf/dfs.exclude'] because it doesn't exist
2018-10-21 10:07:42,982 - Changing owner for /etc/hadoop/conf/dfs.exclude from 0 to hdfs
2018-10-21 10:07:42,982 - Changing group for /etc/hadoop/conf/dfs.exclude from 0 to hadoop
2018-10-21 10:07:42,982 - /grid/0/hadoop/hdfs/namenode/namenode-formatted/ exists. Namenode DFS already formatted
2018-10-21 10:07:42,982 - Directory['/grid/0/hadoop/hdfs/namenode/namenode-formatted/'] {'create_parents': True}
2018-10-21 10:07:42,982 - Options for start command are: 
2018-10-21 10:07:42,983 - Directory['/var/run/hadoop'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 0755}
2018-10-21 10:07:42,983 - Changing owner for /var/run/hadoop from 0 to hdfs
2018-10-21 10:07:42,983 - Changing group for /var/run/hadoop from 0 to hadoop
2018-10-21 10:07:42,983 - Directory['/var/run/hadoop/hdfs'] {'owner': 'hdfs', 'group': 'hadoop', 'create_parents': True}
2018-10-21 10:07:42,983 - Creating directory Directory['/var/run/hadoop/hdfs'] since it doesn't exist.
2018-10-21 10:07:42,983 - Changing owner for /var/run/hadoop/hdfs from 0 to hdfs
2018-10-21 10:07:42,983 - Changing group for /var/run/hadoop/hdfs from 0 to hadoop
2018-10-21 10:07:42,983 - Directory['/var/log/hadoop/hdfs'] {'owner': 'hdfs', 'group': 'hadoop', 'create_parents': True}
2018-10-21 10:07:42,983 - Creating directory Directory['/var/log/hadoop/hdfs'] since it doesn't exist.
2018-10-21 10:07:43,001 - Changing owner for /var/log/hadoop/hdfs from 0 to hdfs
2018-10-21 10:07:43,002 - Changing group for /var/log/hadoop/hdfs from 0 to hadoop
2018-10-21 10:07:43,002 - File['/var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid'] {'action': ['delete'], 'not_if': 'ambari-sudo.sh  -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid && ambari-sudo.sh  -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid'}
2018-10-21 10:07:43,008 - Execute['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ;  /usr/hdp/2.6.5.0-292/hadoop/sbin/hadoop-daemon.sh --config /usr/hdp/2.6.5.0-292/hadoop/conf start namenode''] {'environment': {'HADOOP_LIBEXEC_DIR': '/usr/hdp/2.6.5.0-292/hadoop/libexec'}, 'not_if': 'ambari-sudo.sh  -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid && ambari-sudo.sh  -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid'}
2018-10-21 10:07:47,161 - Execute['find /var/log/hadoop/hdfs -maxdepth 1 -type f -name '*' -exec echo '==> {} <==' \; -exec tail -n 40 {} \;'] {'logoutput': True, 'ignore_failures': True, 'user': 'hdfs'}
==> /var/log/hadoop/hdfs/hadoop-hdfs-namenode-omiprihdp02ap.mufep.net.out <==
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
ulimit -a for user hdfs
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 768540
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 128000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 65536
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
==> /var/log/hadoop/hdfs/gc.log-201810211007 <==
OpenJDK 64-Bit Server VM (25.191-b12) for linux-amd64 JRE (1.8.0_191-b12), built on Oct  9 2018 08:21:41 by "mockbuild" with gcc 4.8.5 20150623 (Red Hat 4.8.5-28)
Memory: 4k page, physical 197551312k(193685116k free), swap 16777212k(16777212k free)
CommandLine flags: -XX:CMSInitiatingOccupancyFraction=70 -XX:ErrorFile=/var/log/hadoop/hdfs/hs_err_pid%p.log -XX:InitialHeapSize=1073741824 -XX:MaxHeapSize=1073741824 -XX:MaxNewSize=134217728 -XX:MaxTenuringThreshold=6 -XX:NewSize=134217728 -XX:OldPLABSize=16 -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -XX:ParallelGCThreads=8 -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC 
2018-10-21T10:07:44.027+0200: 0.845: [GC (Allocation Failure) 2018-10-21T10:07:44.027+0200: 0.845: [ParNew: 104960K->9369K(118016K), 0.0093381 secs] 104960K->9369K(1035520K), 0.0094477 secs] [Times: user=0.03 sys=0.01, real=0.01 secs] 
2018-10-21T10:07:45.458+0200: 2.276: [GC (Allocation Failure) 2018-10-21T10:07:45.458+0200: 2.276: [ParNew: 114329K->7011K(118016K), 0.0160670 secs] 114329K->9901K(1035520K), 0.0161554 secs] [Times: user=0.07 sys=0.01, real=0.01 secs] 
Heap
 par new generation   total 118016K, used 63374K [0x00000000c0000000, 0x00000000c8000000, 0x00000000c8000000)
  eden space 104960K,  53% used [0x00000000c0000000, 0x00000000c370acc8, 0x00000000c6680000)
  from space 13056K,  53% used [0x00000000c6680000, 0x00000000c6d58e20, 0x00000000c7340000)
  to   space 13056K,   0% used [0x00000000c7340000, 0x00000000c7340000, 0x00000000c8000000)
 concurrent mark-sweep generation total 917504K, used 2890K [0x00000000c8000000, 0x0000000100000000, 0x0000000100000000)
 Metaspace       used 23023K, capacity 23310K, committed 23544K, reserved 1071104K
  class space    used 2612K, capacity 2689K, committed 2764K, reserved 1048576K
==> /var/log/hadoop/hdfs/hadoop-hdfs-namenode-omiprihdp02ap.mufep.net.log <==
2018-10-21 10:07:45,785 INFO  namenode.FSEditLog (JournalSet.java:selectInputStreams(274)) - Skipping jas JournalAndStream(mgr=FileJournalManager(root=/grid/0/hadoop/hdfs/namenode), stream=null) since it's disabled
2018-10-21 10:07:45,785 WARN  namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(726)) - Encountered exception loading fsimage
java.io.IOException: Gap in transactions. Expected to be able to read up until at least txid 90719 but unable to find any edit logs containing txid 90719
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.checkForGaps(FSEditLog.java:1660)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1618)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:661)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1077)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:724)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:697)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:761)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1001)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:985)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1710)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1778)
2018-10-21 10:07:45,787 INFO  mortbay.log (Slf4jLog.java:info(67)) - Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@omiprihdp02ap.mufep.net:50070
2018-10-21 10:07:45,888 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NameNode metrics system...
2018-10-21 10:07:45,889 INFO  impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2018-10-21 10:07:45,890 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NameNode metrics system stopped.
2018-10-21 10:07:45,890 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - NameNode metrics system shutdown complete.
2018-10-21 10:07:45,890 ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.
java.io.IOException: Gap in transactions. Expected to be able to read up until at least txid 90719 but unable to find any edit logs containing txid 90719
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.checkForGaps(FSEditLog.java:1660)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1618)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:661)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1077)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:724)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:697)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:761)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1001)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:985)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1710)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1778)
2018-10-21 10:07:45,891 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2018-10-21 10:07:45,893 INFO  namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at omiprihdp02ap.mufep.net/10.6.7.22
************************************************************/
2018-10-21 10:07:45,893 INFO  timeline.HadoopTimelineMetricsSink (AbstractTimelineMetricsSink.java:getCurrentCollectorHost(278)) - No live collector to send metrics to. Metrics to be sent will be discarded. This message will be skipped for the next 20 times.
==> /var/log/hadoop/hdfs/SecurityAuth.audit <==
==> /var/log/hadoop/hdfs/hdfs-audit.log <==

Command failed after 1 tries
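
The NameNode log above stops on `java.io.IOException: Gap in transactions`, reporting that no edit log containing txid 90719 could be found. Below is a minimal Python sketch of how one might look for such a gap from the file names in the `current` subdirectory of the NameNode directory (`/grid/0/hadoop/hdfs/namenode` in the output above). It assumes the usual `fsimage_<txid>`, `edits_<first>-<last>`, and `edits_inprogress_<first>` naming. The helper name `find_gaps` and the sample file names are hypothetical, and treating an in-progress segment as reaching the required txid is a simplification.

```python
import re

# Assumed file-name patterns inside <dfs.namenode.name.dir>/current
FSIMAGE = re.compile(r"^fsimage_(\d+)$")
FINALIZED = re.compile(r"^edits_(\d+)-(\d+)$")
IN_PROGRESS = re.compile(r"^edits_inprogress_(\d+)$")


def find_gaps(filenames, required_txid):
    """Return (first, last) txid ranges up to required_txid that no edit segment covers."""
    latest_image = 0
    ranges = []
    for name in filenames:
        m = FSIMAGE.match(name)
        if m:
            latest_image = max(latest_image, int(m.group(1)))
            continue
        m = FINALIZED.match(name)
        if m:
            ranges.append((int(m.group(1)), int(m.group(2))))
            continue
        m = IN_PROGRESS.match(name)
        if m:
            # Simplification: assume the open segment reaches required_txid.
            start = int(m.group(1))
            ranges.append((start, max(start, required_txid)))

    next_txid = latest_image + 1  # first transaction not already in the fsimage
    gaps = []
    for first, last in sorted(ranges):
        if next_txid > required_txid:
            break
        if last < next_txid:
            continue  # segment is already contained in the fsimage
        if first > next_txid:
            gaps.append((next_txid, min(first - 1, required_txid)))
        next_txid = last + 1
    if next_txid <= required_txid:
        gaps.append((next_txid, required_txid))
    return gaps


# Hypothetical directory listing; on a real host this could come from
# os.listdir("/grid/0/hadoop/hdfs/namenode/current").
sample = [
    "VERSION",
    "seen_txid",
    "fsimage_0000000000000090000",
    "edits_0000000000000090001-0000000000000090500",
    "edits_0000000000000090601-0000000000000090700",
]
print(find_gaps(sample, required_txid=90719))
```

An empty list would mean the file names alone show no gap up to the txid named in the error message. It says nothing about whether the files themselves are readable.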

When I try to start any service (iptables, SELinux, etc. have been disabled), I get:

Connection failed to http:IP:8042 (<urlopen error [Errno 111] Connection refused>)

I get this error for many ports on various hosts.

Mentor

@Sherrine Green Thompson
It looks like you have memory issues:

-XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node"

Check the NameNode Java heap size. Can you adjust it and report back?
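
To see what heap size the NameNode actually started with, the `CommandLine flags:` line in the GC log can be read directly. A small sketch, assuming the flags are given in bytes as they are in the posted `gc.log` (the function name `heap_sizes_mib` is made up for illustration). Note that `-XX:OnOutOfMemoryError=...` only registers a script to run if an OutOfMemoryError occurs, so on its own it does not show that one happened.

```python
import re


def heap_sizes_mib(command_line_flags):
    """Extract -XX:InitialHeapSize / -XX:MaxHeapSize (in bytes) and convert to MiB."""
    sizes = {}
    pattern = r"-XX:(InitialHeapSize|MaxHeapSize)=(\d+)"
    for name, value in re.findall(pattern, command_line_flags):
        sizes[name] = int(value) // (1024 * 1024)
    return sizes


# Excerpt of the flags from the gc.log quoted in the question.
flags = (
    "-XX:CMSInitiatingOccupancyFraction=70 "
    "-XX:InitialHeapSize=1073741824 -XX:MaxHeapSize=1073741824 "
    "-XX:MaxNewSize=134217728"
)
print(heap_sizes_mib(flags))  # {'InitialHeapSize': 1024, 'MaxHeapSize': 1024}
```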

Explorer

Thanks, Geoffrey.

I increased the memory, but I still get the following error when trying to start the services:

Connection failed to http:IP:8042 (<urlopen error [Errno 111] Connection refused>)
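
`[Errno 111] Connection refused` normally means nothing is listening on that host and port, which fits a service that has not started yet. A minimal sketch for checking this from another node, using only the Python standard library (the function name `port_is_open` is a hypothetical helper, and the host passed in would be one of your own nodes):

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


# Example call (replace the host with a real node name or IP):
# print(port_is_open("127.0.0.1", 8042))
```

If this returns False for a port whose service Ambari reports as stopped, the alert is likely a symptom of the service being down rather than a firewall problem.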

[Attached screenshot: 92947-screen-shot-2018-10-22-at-80924-pm.png]

Explorer

starting namenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-namenode-<fqdn>.out
/usr/hdp/2.6.5.0-292/hadoop/sbin/hadoop-daemon.sh: line 171: /var/run/hadoop/hadoop/hadoop-hadoop-namenode.pid: No such file or directory
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
[hadoop@omiprihdp02ap ~]$
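
The paths in this output differ from the ones in the Ambari run: here the daemon was started as user `hadoop`, while Ambari starts it as `hdfs`. Judging from the two sets of paths in this thread, the pid and log locations appear to be built from the name of the user running the script. The sketch below reproduces that pattern. It is an illustration inferred from the output, not a copy of `hadoop-daemon.sh`, and `daemon_paths` is a made-up name.

```python
def daemon_paths(user, command, hostname,
                 pid_prefix="/var/run/hadoop", log_prefix="/var/log/hadoop"):
    """Build the pid and .out paths following the pattern seen in this thread."""
    pid = f"{pid_prefix}/{user}/hadoop-{user}-{command}.pid"
    out = f"{log_prefix}/{user}/hadoop-{user}-{command}-{hostname}.out"
    return pid, out


host = "omiprihdp02ap.mufep.net"
print(daemon_paths("hdfs", "namenode", host)[0])    # path used by the Ambari run
print(daemon_paths("hadoop", "namenode", host)[0])  # path from the manual run
```

Under that assumption, the "No such file or directory" message would simply mean `/var/run/hadoop/hadoop` does not exist, because Ambari only created `/var/run/hadoop/hdfs`.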

Mentor

@Sherrine Green Thompson

What's your HDP version? Can you check and share the logs on the NameNode host in /var/log/hadoop/hdfs?
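
When sharing those logs, it can help to pull out just the WARN/ERROR entries and the exception line that follows each one. A rough sketch, assuming the log4j line format shown in the NameNode log earlier in this thread (the function name `problem_lines` is hypothetical):

```python
import re

# Matches lines such as "2018-10-21 10:07:45,890 ERROR namenode.NameNode ..."
LEVEL = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (WARN|ERROR|FATAL)\s")
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} ")


def problem_lines(lines):
    """Return a list of [log line, exception line (if any)] for WARN/ERROR/FATAL entries."""
    lines = [line.rstrip("\n") for line in lines]
    hits = []
    for i, line in enumerate(lines):
        if not LEVEL.match(line):
            continue
        entry = [line]
        following = lines[i + 1] if i + 1 < len(lines) else ""
        # Keep the next line only if it looks like an exception message,
        # not a new log entry or a stack frame.
        if following and not TIMESTAMP.match(following) and not following.startswith("\tat "):
            entry.append(following)
        hits.append(entry)
    return hits


# Lines copied from the log in the question.
sample = [
    "2018-10-21 10:07:45,785 WARN  namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(726)) - Encountered exception loading fsimage",
    "java.io.IOException: Gap in transactions. Expected to be able to read up until at least txid 90719 but unable to find any edit logs containing txid 90719",
    "\tat org.apache.hadoop.hdfs.server.namenode.FSEditLog.checkForGaps(FSEditLog.java:1660)",
    "2018-10-21 10:07:45,890 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - NameNode metrics system shutdown complete.",
    "2018-10-21 10:07:45,890 ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.",
]
for entry in problem_lines(sample):
    print("\n".join(entry))
```

On a real host the input could be an open file handle for the `.log` file in /var/log/hadoop/hdfs.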

Explorer

Hi Geoffrey, I reinstalled Ambari and HDFS, and that fixed the issue. Thank you.