Created 10-21-2018 10:08 PM
Please advise how I can get this started.
6 nodes in the cluster:
1 x Edge, 2 x Name, 3 x Data
stderr: /var/lib/ambari-agent/data/errors-125.txt
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 348, in <module>
    NameNode().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 375, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 90, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/ambari-agent/lib/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 175, in namenode
    create_log_dir=True
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 276, in service
    Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 262, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/2.6.5.0-292/hadoop/sbin/hadoop-daemon.sh --config /usr/hdp/2.6.5.0-292/hadoop/conf start namenode'' returned 1. starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-omiprihdp02ap.mufep.net.out
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
stdout: /var/lib/ambari-agent/data/output-125.txt
2018-10-21 10:07:42,380 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=2.6.5.0-292 -> 2.6.5.0-292
2018-10-21 10:07:42,393 - Using hadoop conf dir: /usr/hdp/2.6.5.0-292/hadoop/conf
2018-10-21 10:07:42,519 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=2.6.5.0-292 -> 2.6.5.0-292
2018-10-21 10:07:42,523 - Using hadoop conf dir: /usr/hdp/2.6.5.0-292/hadoop/conf
2018-10-21 10:07:42,524 - Group['hdfs'] {}
2018-10-21 10:07:42,525 - Group['hadoop'] {}
2018-10-21 10:07:42,525 - Group['users'] {}
2018-10-21 10:07:42,525 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop'], 'uid': None}
2018-10-21 10:07:42,526 - User['ams'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop'], 'uid': None}
2018-10-21 10:07:42,527 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'users'], 'uid': None}
2018-10-21 10:07:42,527 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hdfs'], 'uid': None}
2018-10-21 10:07:42,528 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop'], 'uid': None}
2018-10-21 10:07:42,528 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop'], 'uid': None}
2018-10-21 10:07:42,529 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2018-10-21 10:07:42,530 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 0'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2018-10-21 10:07:42,536 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 0'] due to not_if
2018-10-21 10:07:42,537 - Group['hdfs'] {}
2018-10-21 10:07:42,537 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': ['hdfs', u'hdfs']}
2018-10-21 10:07:42,538 - FS Type:
2018-10-21 10:07:42,538 - Directory['/etc/hadoop'] {'mode': 0755}
2018-10-21 10:07:42,550 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2018-10-21 10:07:42,550 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 01777}
2018-10-21 10:07:42,563 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}
2018-10-21 10:07:42,573 - Skipping Execute[('setenforce', '0')] due to not_if
2018-10-21 10:07:42,574 - Directory['/var/log/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'hadoop', 'mode': 0775, 'cd_access': 'a'}
2018-10-21 10:07:42,576 - Directory['/var/run/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'root', 'cd_access': 'a'}
2018-10-21 10:07:42,576 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'create_parents': True, 'cd_access': 'a'}
2018-10-21 10:07:42,580 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
2018-10-21 10:07:42,582 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'hdfs'}
2018-10-21 10:07:42,587 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/log4j.properties'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2018-10-21 10:07:42,594 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/hadoop-metrics2.properties'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2018-10-21 10:07:42,595 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
2018-10-21 10:07:42,595 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/configuration.xsl'] {'owner': 'hdfs', 'group': 'hadoop'}
2018-10-21 10:07:42,599 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop', 'mode': 0644}
2018-10-21 10:07:42,603 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}
2018-10-21 10:07:42,834 - Using hadoop conf dir: /usr/hdp/2.6.5.0-292/hadoop/conf
2018-10-21 10:07:42,834 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=2.6.5.0-292 -> 2.6.5.0-292
2018-10-21 10:07:42,851 - Using hadoop conf dir: /usr/hdp/2.6.5.0-292/hadoop/conf
2018-10-21 10:07:42,863 - Directory['/etc/security/limits.d'] {'owner': 'root', 'create_parents': True, 'group': 'root'}
2018-10-21 10:07:42,867 - File['/etc/security/limits.d/hdfs.conf'] {'content': Template('hdfs.conf.j2'), 'owner': 'root', 'group': 'root', 'mode': 0644}
2018-10-21 10:07:42,867 - XmlConfig['hadoop-policy.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/2.6.5.0-292/hadoop/conf', 'configuration_attributes': {}, 'configurations': ...}
2018-10-21 10:07:42,874 - Generating config: /usr/hdp/2.6.5.0-292/hadoop/conf/hadoop-policy.xml
2018-10-21 10:07:42,874 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/hadoop-policy.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2018-10-21 10:07:42,881 - XmlConfig['ssl-client.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/2.6.5.0-292/hadoop/conf', 'configuration_attributes': {}, 'configurations': ...}
2018-10-21 10:07:42,886 - Generating config: /usr/hdp/2.6.5.0-292/hadoop/conf/ssl-client.xml
2018-10-21 10:07:42,886 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/ssl-client.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2018-10-21 10:07:42,891 - Directory['/usr/hdp/2.6.5.0-292/hadoop/conf/secure'] {'owner': 'root', 'create_parents': True, 'group': 'hadoop', 'cd_access': 'a'}
2018-10-21 10:07:42,891 - XmlConfig['ssl-client.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/2.6.5.0-292/hadoop/conf/secure', 'configuration_attributes': {}, 'configurations': ...}
2018-10-21 10:07:42,897 - Generating config: /usr/hdp/2.6.5.0-292/hadoop/conf/secure/ssl-client.xml
2018-10-21 10:07:42,897 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/secure/ssl-client.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2018-10-21 10:07:42,901 - XmlConfig['ssl-server.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/2.6.5.0-292/hadoop/conf', 'configuration_attributes': {}, 'configurations': ...}
2018-10-21 10:07:42,907 - Generating config: /usr/hdp/2.6.5.0-292/hadoop/conf/ssl-server.xml
2018-10-21 10:07:42,907 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/ssl-server.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2018-10-21 10:07:42,912 - XmlConfig['hdfs-site.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/2.6.5.0-292/hadoop/conf', 'configuration_attributes': {u'final': {u'dfs.support.append': u'true', u'dfs.datanode.data.dir': u'true', u'dfs.namenode.http-address': u'true', u'dfs.namenode.name.dir': u'true', u'dfs.webhdfs.enabled': u'true', u'dfs.datanode.failed.volumes.tolerated': u'true'}}, 'configurations': ...}
2018-10-21 10:07:42,918 - Generating config: /usr/hdp/2.6.5.0-292/hadoop/conf/hdfs-site.xml
2018-10-21 10:07:42,918 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/hdfs-site.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
2018-10-21 10:07:42,950 - XmlConfig['core-site.xml'] {'group': 'hadoop', 'conf_dir': '/usr/hdp/2.6.5.0-292/hadoop/conf', 'mode': 0644, 'configuration_attributes': {u'final': {u'fs.defaultFS': u'true'}}, 'owner': 'hdfs', 'configurations': ...}
2018-10-21 10:07:42,956 - Generating config: /usr/hdp/2.6.5.0-292/hadoop/conf/core-site.xml
2018-10-21 10:07:42,956 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/core-site.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': 0644, 'encoding': 'UTF-8'}
2018-10-21 10:07:42,971 - Writing File['/usr/hdp/2.6.5.0-292/hadoop/conf/core-site.xml'] because contents don't match
2018-10-21 10:07:42,972 - File['/usr/hdp/2.6.5.0-292/hadoop/conf/slaves'] {'content': Template('slaves.j2'), 'owner': 'hdfs'}
2018-10-21 10:07:42,972 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=2.6.5.0-292 -> 2.6.5.0-292
2018-10-21 10:07:42,977 - Directory['/grid/0/hadoop/hdfs/namenode'] {'owner': 'hdfs', 'group': 'hadoop', 'create_parents': True, 'mode': 0755, 'cd_access': 'a'}
2018-10-21 10:07:42,978 - Skipping setting up secure ZNode ACL for HFDS as it's supported only for NameNode HA mode.
2018-10-21 10:07:42,980 - Called service start with upgrade_type: None
2018-10-21 10:07:42,980 - Ranger Hdfs plugin is not enabled
2018-10-21 10:07:42,981 - File['/etc/hadoop/conf/dfs.exclude'] {'owner': 'hdfs', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}
2018-10-21 10:07:42,982 - Writing File['/etc/hadoop/conf/dfs.exclude'] because it doesn't exist
2018-10-21 10:07:42,982 - Changing owner for /etc/hadoop/conf/dfs.exclude from 0 to hdfs
2018-10-21 10:07:42,982 - Changing group for /etc/hadoop/conf/dfs.exclude from 0 to hadoop
2018-10-21 10:07:42,982 - /grid/0/hadoop/hdfs/namenode/namenode-formatted/ exists. Namenode DFS already formatted
2018-10-21 10:07:42,982 - Directory['/grid/0/hadoop/hdfs/namenode/namenode-formatted/'] {'create_parents': True}
2018-10-21 10:07:42,982 - Options for start command are:
2018-10-21 10:07:42,983 - Directory['/var/run/hadoop'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 0755}
2018-10-21 10:07:42,983 - Changing owner for /var/run/hadoop from 0 to hdfs
2018-10-21 10:07:42,983 - Changing group for /var/run/hadoop from 0 to hadoop
2018-10-21 10:07:42,983 - Directory['/var/run/hadoop/hdfs'] {'owner': 'hdfs', 'group': 'hadoop', 'create_parents': True}
2018-10-21 10:07:42,983 - Creating directory Directory['/var/run/hadoop/hdfs'] since it doesn't exist.
2018-10-21 10:07:42,983 - Changing owner for /var/run/hadoop/hdfs from 0 to hdfs
2018-10-21 10:07:42,983 - Changing group for /var/run/hadoop/hdfs from 0 to hadoop
2018-10-21 10:07:42,983 - Directory['/var/log/hadoop/hdfs'] {'owner': 'hdfs', 'group': 'hadoop', 'create_parents': True}
2018-10-21 10:07:42,983 - Creating directory Directory['/var/log/hadoop/hdfs'] since it doesn't exist.
2018-10-21 10:07:43,001 - Changing owner for /var/log/hadoop/hdfs from 0 to hdfs
2018-10-21 10:07:43,002 - Changing group for /var/log/hadoop/hdfs from 0 to hadoop
2018-10-21 10:07:43,002 - File['/var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid'] {'action': ['delete'], 'not_if': 'ambari-sudo.sh -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid && ambari-sudo.sh -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid'}
2018-10-21 10:07:43,008 - Execute['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/2.6.5.0-292/hadoop/sbin/hadoop-daemon.sh --config /usr/hdp/2.6.5.0-292/hadoop/conf start namenode''] {'environment': {'HADOOP_LIBEXEC_DIR': '/usr/hdp/2.6.5.0-292/hadoop/libexec'}, 'not_if': 'ambari-sudo.sh -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid && ambari-sudo.sh -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid'}
2018-10-21 10:07:47,161 - Execute['find /var/log/hadoop/hdfs -maxdepth 1 -type f -name '*' -exec echo '==> {} <==' \; -exec tail -n 40 {} \;'] {'logoutput': True, 'ignore_failures': True, 'user': 'hdfs'}
==> /var/log/hadoop/hdfs/hadoop-hdfs-namenode-omiprihdp02ap.mufep.net.out <==
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
ulimit -a for user hdfs
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 768540
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 128000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 65536
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
==> /var/log/hadoop/hdfs/gc.log-201810211007 <==
OpenJDK 64-Bit Server VM (25.191-b12) for linux-amd64 JRE (1.8.0_191-b12), built on Oct 9 2018 08:21:41 by "mockbuild" with gcc 4.8.5 20150623 (Red Hat 4.8.5-28)
Memory: 4k page, physical 197551312k(193685116k free), swap 16777212k(16777212k free)
CommandLine flags: -XX:CMSInitiatingOccupancyFraction=70 -XX:ErrorFile=/var/log/hadoop/hdfs/hs_err_pid%p.log -XX:InitialHeapSize=1073741824 -XX:MaxHeapSize=1073741824 -XX:MaxNewSize=134217728 -XX:MaxTenuringThreshold=6 -XX:NewSize=134217728 -XX:OldPLABSize=16 -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -XX:ParallelGCThreads=8 -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
2018-10-21T10:07:44.027+0200: 0.845: [GC (Allocation Failure) 2018-10-21T10:07:44.027+0200: 0.845: [ParNew: 104960K->9369K(118016K), 0.0093381 secs] 104960K->9369K(1035520K), 0.0094477 secs] [Times: user=0.03 sys=0.01, real=0.01 secs]
2018-10-21T10:07:45.458+0200: 2.276: [GC (Allocation Failure) 2018-10-21T10:07:45.458+0200: 2.276: [ParNew: 114329K->7011K(118016K), 0.0160670 secs] 114329K->9901K(1035520K), 0.0161554 secs] [Times: user=0.07 sys=0.01, real=0.01 secs]
Heap
 par new generation total 118016K, used 63374K [0x00000000c0000000, 0x00000000c8000000, 0x00000000c8000000)
  eden space 104960K, 53% used [0x00000000c0000000, 0x00000000c370acc8, 0x00000000c6680000)
  from space 13056K, 53% used [0x00000000c6680000, 0x00000000c6d58e20, 0x00000000c7340000)
  to space 13056K, 0% used [0x00000000c7340000, 0x00000000c7340000, 0x00000000c8000000)
 concurrent mark-sweep generation total 917504K, used 2890K [0x00000000c8000000, 0x0000000100000000, 0x0000000100000000)
 Metaspace used 23023K, capacity 23310K, committed 23544K, reserved 1071104K
  class space used 2612K, capacity 2689K, committed 2764K, reserved 1048576K
==> /var/log/hadoop/hdfs/hadoop-hdfs-namenode-omiprihdp02ap.mufep.net.log <==
2018-10-21 10:07:45,785 INFO namenode.FSEditLog (JournalSet.java:selectInputStreams(274)) - Skipping jas JournalAndStream(mgr=FileJournalManager(root=/grid/0/hadoop/hdfs/namenode), stream=null) since it's disabled
2018-10-21 10:07:45,785 WARN namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(726)) - Encountered exception loading fsimage
java.io.IOException: Gap in transactions. Expected to be able to read up until at least txid 90719 but unable to find any edit logs containing txid 90719
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.checkForGaps(FSEditLog.java:1660)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1618)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:661)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1077)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:724)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:697)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:761)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1001)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:985)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1710)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1778)
2018-10-21 10:07:45,787 INFO mortbay.log (Slf4jLog.java:info(67)) - Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@omiprihdp02ap.mufep.net:50070
2018-10-21 10:07:45,888 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NameNode metrics system...
2018-10-21 10:07:45,889 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2018-10-21 10:07:45,890 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NameNode metrics system stopped.
2018-10-21 10:07:45,890 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - NameNode metrics system shutdown complete.
2018-10-21 10:07:45,890 ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.
java.io.IOException: Gap in transactions. Expected to be able to read up until at least txid 90719 but unable to find any edit logs containing txid 90719
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.checkForGaps(FSEditLog.java:1660)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1618)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:661)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1077)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:724)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:697)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:761)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1001)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:985)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1710)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1778)
2018-10-21 10:07:45,891 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2018-10-21 10:07:45,893 INFO namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at omiprihdp02ap.mufep.net/10.6.7.22
************************************************************/
2018-10-21 10:07:45,893 INFO timeline.HadoopTimelineMetricsSink (AbstractTimelineMetricsSink.java:getCurrentCollectorHost(278)) - No live collector to send metrics to. Metrics to be sent will be discarded. This message will be skipped for the next 20 times.
==> /var/log/hadoop/hdfs/SecurityAuth.audit <==
==> /var/log/hadoop/hdfs/hdfs-audit.log <==
Command failed after 1 tries
When I try to start anything, I get the following (iptables, SELinux, etc. have been disabled):
Connection failed to http:IP:8042 (<urlopen error [Errno 111] Connection refused>)
I get this error for many ports on various hosts.
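The NameNode log above stops on "Gap in transactions" at txid 90719, which suggests that the edit log segments in the name directory do not cover that transaction. Below is a minimal sketch for spotting such gaps from the file names alone. It assumes the usual naming in the name directory's `current` subdirectory (`fsimage_<txid>`, `edits_<first>-<last>`, `edits_inprogress_<first>`) and that `seen_txid` holds the last transaction ID the NameNode recorded. The function name `find_txid_gaps` is made up for illustration and is not part of Hadoop.

```python
import re

# Assumed file naming inside <dfs.namenode.name.dir>/current
FSIMAGE = re.compile(r"^fsimage_(\d+)$")            # checkpoint up to txid
FINALIZED = re.compile(r"^edits_(\d+)-(\d+)$")      # closed edit segment
IN_PROGRESS = re.compile(r"^edits_inprogress_(\d+)$")  # open edit segment


def find_txid_gaps(names, seen_txid=None):
    """Return (latest fsimage txid, list of (first_missing, last_missing)).

    names     -- iterable of file names from the 'current' directory
    seen_txid -- optional integer read from the seen_txid file
    """
    images, segments, open_starts = [], [], []
    for name in names:
        m = FSIMAGE.match(name)
        if m:
            images.append(int(m.group(1)))
            continue
        m = FINALIZED.match(name)
        if m:
            segments.append((int(m.group(1)), int(m.group(2))))
            continue
        m = IN_PROGRESS.match(name)
        if m:
            open_starts.append(int(m.group(1)))

    last_image = max(images, default=0)
    expected = last_image + 1  # first txid not covered by the checkpoint
    gaps = []
    for first, last in sorted(segments):
        if last < expected:
            continue  # segment is already covered by the fsimage
        if first > expected:
            gaps.append((expected, first - 1))
        expected = last + 1

    if open_starts:
        # An in-progress segment has no known end, so only a hole
        # in front of it can be reported.
        start = max(open_starts)
        if start > expected:
            gaps.append((expected, start - 1))
    elif seen_txid is not None and seen_txid >= expected:
        # No segment reaches the txid the NameNode expects to replay.
        gaps.append((expected, seen_txid))
    return last_image, gaps
```

On this cluster the file names would come from listing /grid/0/hadoop/hdfs/namenode/current (the name directory shown in the log), with `seen_txid` read from the file of the same name. The sketch only inspects file names; it does not read or validate the contents of the edit logs.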
Created 10-22-2018 08:04 AM
@Sherrine Green Thompson
It looks like you have memory issues:
-XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node"
Check the NameNode Java heap size. Can you adjust that and report back?
Created on 10-23-2018 04:06 AM - edited 08-17-2019 08:32 PM
Thanks, Geoffrey.
I increased the memory, but I still get the following error when trying to start the services:
Connection failed to http:IP:8042 (<urlopen error [Errno 111] Connection refused>)
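"Connection refused" (Errno 111) generally means nothing is listening on that port on the target host, so the service behind it is probably not running. Port 8042 is the default YARN NodeManager web UI port. A minimal sketch for checking which ports accept TCP connections, using only the Python standard library; the helper name `port_is_open` and the host name in the usage note are placeholders, not values from this cluster.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError (including
        # ConnectionRefusedError and timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("datanode1.example.com", 8042)` returning False while the host itself is reachable would suggest the NodeManager is not running there, which is a reason to check that service's own log rather than the firewall.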
Created 10-23-2018 05:31 AM
starting namenode, logging to /var/log/hadoop/hadoop/hadoop-hadoop-namenode-<fqdn>.out
/usr/hdp/2.6.5.0-292/hadoop/sbin/hadoop-daemon.sh: line 171: /var/run/hadoop/hadoop/hadoop-hadoop-namenode.pid: No such file or directory
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
[hadoop@omiprihdp02ap ~]$
Created 10-23-2018 12:02 PM
What's your HDP version? Can you check and share the logs on the NameNode host in /var/log/hadoop/hdfs?
Created 10-29-2018 02:29 PM
Hi Geoffrey - I reinstalled Ambari and HDFS and that fixed the issue. Thank you.