Member since: 08-02-2018
Posts: 46
Kudos Received: 1
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 4111 | 08-09-2018 03:05 AM |
02-01-2019
10:58 AM
@Geoffrey Shelton Okot @Jean-François Vandemoortele Has this issue been resolved? If yes, can you please suggest the steps? Thanks in advance.
11-20-2018
02:45 PM
Thank you... it works for me.
11-20-2018
02:24 PM
If I want to give others access to the Resource Manager web UI (port 8088), do we need to add /etc/hosts entries on their laptops?
11-20-2018
11:18 AM
Hi everyone, I have created a 6-node cluster (3 masters and 3 workers) on AWS using private IPs and a VPN; no public IPs are assigned to the instances. I can ping the instances by their private IPs only when I connect through the VPN. I have changed the hostnames and added all the private IPs and aliases to the /etc/hosts file. Ambari is on one of the master nodes, say "master01.abc.com", but when I try to access Ambari at "master01.abc.com:8080" I cannot reach it; when I use the private IP instead of master01.abc.com, I can. My question is: how can I access Ambari through "master01.abc.com:8080"? Please help me resolve this. Thanks in advance.
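A minimal sketch of the usual fix, assuming no DNS is reachable over the VPN: every client machine that should reach Ambari by hostname needs its own /etc/hosts entry for the Ambari host (the IP below is a placeholder; use master01's real private IP).

# On each client laptop / VPN host:
echo '10.0.0.11  master01.abc.com  master01' | sudo tee -a /etc/hosts
# Then, while connected to the VPN, browse to http://master01.abc.com:8080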
Labels: Apache Ambari
10-25-2018
02:55 PM
Thank you so much @Akhil S Naik. I changed the mpack from HDF 3.1 to HDF 3.2, and my NiFi installation completed without upgrading HDP.
10-25-2018
10:13 AM
Thank you @Akhil S Naik. I have a doubt: I've upgraded to Ambari 2.7. 1) Do I need to upgrade HDP if I want to install HDF 3.2, or can I proceed with the HDF installation directly without upgrading HDP? Please advise on what is possible. Thanks in advance.
10-24-2018
07:54 PM
Hi everyone, I have a 4-node cluster with HDP installed. I am trying to install NiFi using Ambari and it is throwing an error. Steps followed: 1) I upgraded Ambari; the current versions are Ambari 2.7, HDP 2.6.1.0, and HDF 3.1.2.0. I upgraded Ambari using this link: https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.2.0/installing-hdf-on-hdp/content/hdf-upgrade-ambari-and-hdp.html 2) I installed the M-pack using ambari-server, and I can now see NiFi in my Ambari "Add Services" list. The problem comes when I try to add the NiFi service on one of the nodes in the cluster: it throws the error attached here.
std_err:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/NIFI/1.0.0/package/scripts/nifi.py", line 231, in <module>
Master().execute()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
method(env)
File "/var/lib/ambari-agent/cache/common-services/NIFI/1.0.0/package/scripts/nifi.py", line 56, in install
import params
File "/var/lib/ambari-agent/cache/common-services/NIFI/1.0.0/package/scripts/params.py", line 284, in <module>
for host in config['clusterHostInfo']['zookeeper_hosts']:
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/config_dictionary.py", line 73, in __getattr__
raise Fail("Configuration parameter '" + self.name + "' was not found in configurations dictionary!")
resource_management.core.exceptions.Fail: Configuration parameter 'zookeeper_hosts' was not found in configurations dictionary!
std_out:
2018-10-24 19:33:00,075 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=None -> 2.6
2018-10-24 19:33:00,089 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2018-10-24 19:33:00,096 - Group['livy'] {}
2018-10-24 19:33:00,097 - Group['spark'] {}
2018-10-24 19:33:00,097 - Group['hdfs'] {}
2018-10-24 19:33:00,098 - Group['zeppelin'] {}
2018-10-24 19:33:00,098 - Group['hadoop'] {}
2018-10-24 19:33:00,099 - Group['nifi'] {}
2018-10-24 19:33:00,099 - Group['users'] {}
2018-10-24 19:33:00,099 - Group['knox'] {}
2018-10-24 19:33:00,101 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-10-24 19:33:00,108 - User['infra-solr'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-10-24 19:33:00,110 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-10-24 19:33:00,112 - User['ams'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-10-24 19:33:00,117 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'users'], 'uid': None}
2018-10-24 19:33:00,119 - User['zeppelin'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['zeppelin', 'hadoop'], 'uid': None}
2018-10-24 19:33:00,121 - User['nifi'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['nifi'], 'uid': None}
2018-10-24 19:33:00,123 - User['livy'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['livy', 'hadoop'], 'uid': None}
2018-10-24 19:33:00,128 - User['spark'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['spark', 'hadoop'], 'uid': None}
2018-10-24 19:33:00,131 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'users'], 'uid': None}
2018-10-24 19:33:00,132 - User['kafka'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-10-24 19:33:00,138 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hdfs', 'hadoop'], 'uid': None}
2018-10-24 19:33:00,140 - User['sqoop'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-10-24 19:33:00,142 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-10-24 19:33:00,147 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-10-24 19:33:00,149 - User['hbase'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-10-24 19:33:00,151 - User['knox'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'knox'], 'uid': None}
2018-10-24 19:33:00,153 - User['hcat'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-10-24 19:33:00,158 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2018-10-24 19:33:00,160 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 0'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2018-10-24 19:33:00,167 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 0'] due to not_if
2018-10-24 19:33:00,168 - Directory['/tmp/hbase-hbase'] {'owner': 'hbase', 'create_parents': True, 'mode': 0775, 'cd_access': 'a'}
2018-10-24 19:33:00,169 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2018-10-24 19:33:00,171 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2018-10-24 19:33:00,172 - call['/var/lib/ambari-agent/tmp/changeUid.sh hbase'] {}
2018-10-24 19:33:00,187 - call returned (0, '1015')
2018-10-24 19:33:00,188 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase 1015'] {'not_if': '(test $(id -u hbase) -gt 1000) || (false)'}
2018-10-24 19:33:00,195 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase 1015'] due to not_if
2018-10-24 19:33:00,195 - Group['hdfs'] {}
2018-10-24 19:33:00,196 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': ['hdfs', 'hadoop', u'hdfs']}
2018-10-24 19:33:00,197 - FS Type: HDFS
2018-10-24 19:33:00,197 - Directory['/etc/hadoop'] {'mode': 0755}
2018-10-24 19:33:00,236 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'root', 'group': 'hadoop'}
2018-10-24 19:33:00,239 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 01777}
2018-10-24 19:33:00,272 - Repository['HDP-UTILS-2.6.1.0-129'] {'append_to_file': False, 'base_url': 'http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.21/repos/centos7', 'action': ['create'], 'components': [u'HDP-UTILS', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP-2.6.1.0-129', 'mirror_list': ''}
2018-10-24 19:33:00,290 - File['/etc/yum.repos.d/HDP-2.6.1.0-129.repo'] {'content': '[HDP-UTILS-2.6.1.0-129]\nname=HDP-UTILS-2.6.1.0-129\nbaseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.21/repos/centos7\n\npath=/\nenabled=1\ngpgcheck=0'}
2018-10-24 19:33:00,291 - Writing File['/etc/yum.repos.d/HDP-2.6.1.0-129.repo'] because contents don't match
2018-10-24 19:33:00,291 - Repository['HDP-2.6.1.0-129'] {'append_to_file': True, 'base_url': 'http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.1.0', 'action': ['create'], 'components': [u'HDP', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP-2.6.1.0-129', 'mirror_list': ''}
2018-10-24 19:33:00,301 - File['/etc/yum.repos.d/HDP-2.6.1.0-129.repo'] {'content': '[HDP-UTILS-2.6.1.0-129]\nname=HDP-UTILS-2.6.1.0-129\nbaseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.21/repos/centos7\n\npath=/\nenabled=1\ngpgcheck=0\n[HDP-2.6.1.0-129]\nname=HDP-2.6.1.0-129\nbaseurl=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.1.0\n\npath=/\nenabled=1\ngpgcheck=0'}
2018-10-24 19:33:00,301 - Writing File['/etc/yum.repos.d/HDP-2.6.1.0-129.repo'] because contents don't match
2018-10-24 19:33:00,302 - Repository['HDF-2.6.1.0-129'] {'append_to_file': True, 'base_url': 'http://public-repo-1.hortonworks.com/HDF/centos7/3.x/updates/3.1.2.0', 'action': ['create'], 'components': [u'HDF', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP-2.6.1.0-129', 'mirror_list': ''}
2018-10-24 19:33:00,311 - File['/etc/yum.repos.d/HDP-2.6.1.0-129.repo'] {'content': '[HDP-UTILS-2.6.1.0-129]\nname=HDP-UTILS-2.6.1.0-129\nbaseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.21/repos/centos7\n\npath=/\nenabled=1\ngpgcheck=0\n[HDP-2.6.1.0-129]\nname=HDP-2.6.1.0-129\nbaseurl=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.1.0\n\npath=/\nenabled=1\ngpgcheck=0\n[HDF-2.6.1.0-129]\nname=HDF-2.6.1.0-129\nbaseurl=http://public-repo-1.hortonworks.com/HDF/centos7/3.x/updates/3.1.2.0\n\npath=/\nenabled=1\ngpgcheck=0'}
2018-10-24 19:33:00,311 - Writing File['/etc/yum.repos.d/HDP-2.6.1.0-129.repo'] because contents don't match
2018-10-24 19:33:00,312 - Package['unzip'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2018-10-24 19:33:00,530 - Skipping installation of existing package unzip
2018-10-24 19:33:00,534 - Package['curl'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2018-10-24 19:33:00,558 - Skipping installation of existing package curl
2018-10-24 19:33:00,558 - Package['hdp-select'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2018-10-24 19:33:00,587 - Skipping installation of existing package hdp-select
2018-10-24 19:33:00,601 - The repository with version 2.6.1.0-129 for this command has been marked as resolved. It will be used to report the version of the component which was installed
2018-10-24 19:33:00,621 - Skipping stack-select on NIFI because it does not exist in the stack-select package structure.
2018-10-24 19:33:01,101 - Stack Feature Version Info: Cluster Stack=2.6, Command Stack=None, Command Version=None -> 2.6
2018-10-24 19:33:01,151 - The repository with version 2.6.1.0-129 for this command has been marked as resolved. It will be used to report the version of the component which was installed
2018-10-24 19:33:01,201 - Skipping stack-select on NIFI because it does not exist in the stack-select package structure.
Command failed after 1 tries
Please help me troubleshoot this issue. Thanks in advance.
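What eventually resolved this for me (see my later reply above) was moving from the HDF 3.1 mpack to the HDF 3.2 mpack, since the 3.1 mpack's params.py appears to be incompatible with the configuration dictionary Ambari 2.7 passes. A hedged sketch of the swap; the mpack file name and path are placeholders, and the real download URL should come from the HDF 3.2 release notes:

# Remove the old management pack, install the HDF 3.2 one, then restart Ambari:
ambari-server uninstall-mpack --mpack-name=hdf-ambari-mpack --verbose
ambari-server install-mpack --mpack=/tmp/hdf-ambari-mpack-3.2.x.y-z.tar.gz --verbose
ambari-server restart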
09-10-2018
05:35 AM
Hi, I am unable to connect from beeline to the Hive metastore. I am attaching the error that was thrown. Please help me resolve this issue. Thanks in advance.
[centos@e1 ~]$ beeline
Beeline version 1.2.1.spark2 by Apache Hive
beeline> !connect jdbc:hive2://server1.abc.com:10000
Connecting to jdbc:hive2://server1.abc.com:10000
Enter username for jdbc:hive2://server1.abc.com:10000: hive
Enter password for jdbc:hive2://server1.abc.com:10000: ********
2018-09-10 12:26:25 INFO Utils:310 - Supplied authorities: server1.abc.com:10000
2018-09-10 12:26:25 INFO Utils:397 - Resolved authority: server1.abc.com:10000
2018-09-10 12:26:25 INFO HiveConnection:203 - Will try to open client transport with JDBC Uri: jdbc:hive2://server1.abc.com:10000
2018-09-10 12:26:25 ERROR HiveConnection:593 - Error opening session
org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null, configuration:{use:database=default})
at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:156)
at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:143)
at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:583)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:192)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:142)
at org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:207)
at org.apache.hive.beeline.Commands.connect(Commands.java:1149)
at org.apache.hive.beeline.Commands.connect(Commands.java:1070)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:970)
at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:813)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:771)
at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:484)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:467)
Error: Could not establish connection to jdbc:hive2://server1.abc.com:10000: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null, configuration:{use:database=default}) (state=08S01,code=0)
0: jdbc:hive2://server1.abc.com:10000 (closed)>
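A hedged reading of this error: "Required field 'client_protocol' is unset" is the classic Thrift protocol-version mismatch between the beeline client and HiveServer2, and the banner above shows this beeline came from Spark ("1.2.1.spark2") rather than from the Hive installation. A minimal sketch, assuming an HDP-style layout, is to invoke the cluster's own client instead:

# Use the Hive-shipped beeline so client and server Thrift versions match
# (path assumes an HDP install; adjust for your distribution):
/usr/hdp/current/hive-client/bin/beeline -u 'jdbc:hive2://server1.abc.com:10000'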
Labels: Hive, HiveOnSpark
08-19-2018
12:22 AM
Please tell me how to get the logs.
08-18-2018
09:50 AM
Hi everyone, I have a 6-node cluster (2 masters, 3 workers, and 1 edge node). I can access the HBase shell on 5 nodes, but I get the following error on the edge node when I run hbase shell. Please help me resolve this issue. Thanks in advance.
[root@edge01 ~]$ hbase shell
2018-08-18 16:37:24,202 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2018-08-18 16:37:42,355 ERROR [main] zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts
2018-08-18 16:37:42,356 WARN [main] zookeeper.ZKUtil: hconnection-0x10b1a7510x0, quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode (/hbase/hbaseid)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:220)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:419)
at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:919)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:657)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:450)
at org.jruby.javasupport.JavaMethod.invokeStaticDirect(JavaMethod.java:362)
at org.jruby.java.invokers.StaticMethodInvoker.call(StaticMethodInvoker.java:58)
at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:312)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:169)
at org.jruby.ast.CallOneArgNode.interpret(CallOneArgNode.java:57)
at org.jruby.ast.InstAsgnNode.interpret(InstAsgnNode.java:95)
at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
at org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:74)
at org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:169)
at org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:191)
at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:302)
at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:144)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:148)
at org.jruby.RubyClass.newInstance(RubyClass.java:822)
at org.jruby.RubyClass$i$newInstance.call(RubyClass$i$newInstance.gen:65535)
at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodZeroOrNBlock.call(JavaMethod.java:249)
at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:292)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:135)
at opt.cloudera.parcels.CDH_minus_5_dot_12_dot_0_minus_1_dot_cdh5_dot_12_dot_0_dot_p0_dot_29.lib.hbase.bin.$_dot_dot_.bin.hirb.__file__(/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hbase/bin/../bin/hirb.rb:131)
at opt.cloudera.parcels.CDH_minus_5_dot_12_dot_0_minus_1_dot_cdh5_dot_12_dot_0_dot_p0_dot_29.lib.hbase.bin.$_dot_dot_.bin.hirb.load(/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hbase/bin/../bin/hirb.rb)
at org.jruby.Ruby.runScript(Ruby.java:697)
at org.jruby.Ruby.runScript(Ruby.java:690)
at org.jruby.Ruby.runNormally(Ruby.java:597)
at org.jruby.Ruby.runFromMain(Ruby.java:446)
at org.jruby.Main.doRunFromMain(Main.java:369)
at org.jruby.Main.internalRun(Main.java:258)
at org.jruby.Main.run(Main.java:224)
at org.jruby.Main.run(Main.java:208)
at org.jruby.Main.main(Main.java:188)
2018-08-18 16:37:42,359 ERROR [main] zookeeper.ZooKeeperWatcher: hconnection-0x10b1a7510x0, quorum=localhost:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:220)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:419)
at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:919)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:657)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:450)
at org.jruby.javasupport.JavaMethod.invokeStaticDirect(JavaMethod.java:362)
at org.jruby.java.invokers.StaticMethodInvoker.call(StaticMethodInvoker.java:58)
at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:312)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:169)
at org.jruby.ast.CallOneArgNode.interpret(CallOneArgNode.java:57)
at org.jruby.ast.InstAsgnNode.interpret(InstAsgnNode.java:95)
at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
at org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:74)
at org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:169)
at org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:191)
at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:302)
at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:144)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:148)
at org.jruby.RubyClass.newInstance(RubyClass.java:822)
at org.jruby.RubyClass$i$newInstance.call(RubyClass$i$newInstance.gen:65535)
at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodZeroOrNBlock.call(JavaMethod.java:249)
at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:292)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:135)
at opt.cloudera.parcels.CDH_minus_5_dot_12_dot_0_minus_1_dot_cdh5_dot_12_dot_0_dot_p0_dot_29.lib.hbase.bin.$_dot_dot_.bin.hirb.__file__(/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hbase/bin/../bin/hirb.rb:131)
at opt.cloudera.parcels.CDH_minus_5_dot_12_dot_0_minus_1_dot_cdh5_dot_12_dot_0_dot_p0_dot_29.lib.hbase.bin.$_dot_dot_.bin.hirb.load(/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hbase/bin/../bin/hirb.rb)
at org.jruby.Ruby.runScript(Ruby.java:697)
at org.jruby.Ruby.runScript(Ruby.java:690)
at org.jruby.Ruby.runNormally(Ruby.java:597)
at org.jruby.Ruby.runFromMain(Ruby.java:446)
at org.jruby.Main.doRunFromMain(Main.java:369)
at org.jruby.Main.internalRun(Main.java:258)
at org.jruby.Main.run(Main.java:224)
at org.jruby.Main.run(Main.java:208)
at org.jruby.Main.main(Main.java:188)
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.0-cdh5.12.0, rUnknown, Thu Jun 29 04:38:21 PDT 2017
hbase(main):001:0>
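A hedged observation: the ZKUtil warning above shows the shell using quorum=localhost:2181, so the edge node's HBase client configuration is missing the real ZooKeeper quorum. A minimal check, assuming the default CDH client config path:

# See what quorum the edge node's client config actually points at:
grep -A1 'hbase.zookeeper.quorum' /etc/hbase/conf/hbase-site.xml
# If it is absent or 'localhost', deploy the cluster's HBase client
# configuration to this host (e.g. add an HBase gateway role in Cloudera
# Manager and redeploy client configuration), then retry hbase shell.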
08-18-2018
09:41 AM
Can you please tell me the step-by-step procedure? Thanks in advance.
08-03-2018
10:41 AM
Hi everyone, I am having an issue with my App Timeline Server: when I restart it, it comes up, but after some time it goes down again, and in the Resource Manager UI the number of pending apps keeps increasing. I am attaching a screenshot and the log file. Can you please help me solve this?
2018-08-03 10:26:38,430 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore failed in state INITED; cause: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /hadoop/yarn/timeline/leveldb-timeline-store/starttime-ldb/001547.sst: Too many open files
org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /hadoop/yarn/timeline/leveldb-timeline-store/starttime-ldb/001547.sst: Too many open files
at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.serviceInit(RollingLevelDBTimelineStore.java:324)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.serviceInit(EntityGroupFSTimelineStore.java:151)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:104)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:168)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:178)
2018-08-03 10:26:38,504 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service EntityGroupFSTimelineStore failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /hadoop/yarn/timeline/leveldb-timeline-store/starttime-ldb/001547.sst: Too many open files
org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /hadoop/yarn/timeline/leveldb-timeline-store/starttime-ldb/001547.sst: Too many open files
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.serviceInit(EntityGroupFSTimelineStore.java:151)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:104)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:168)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:178)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /hadoop/yarn/timeline/leveldb-timeline-store/starttime-ldb/001547.sst: Too many open files
at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.serviceInit(RollingLevelDBTimelineStore.java:324)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
... 7 more
2018-08-03 10:26:38,508 INFO timeline.EntityGroupFSTimelineStore (EntityGroupFSTimelineStore.java:serviceStop(297)) - Stopping EntityGroupFSTimelineStore
2018-08-03 10:26:38,509 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /hadoop/yarn/timeline/leveldb-timeline-store/starttime-ldb/001547.sst: Too many open files
org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /hadoop/yarn/timeline/leveldb-timeline-store/starttime-ldb/001547.sst: Too many open files
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.serviceInit(EntityGroupFSTimelineStore.java:151)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:104)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:168)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:178)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /hadoop/yarn/timeline/leveldb-timeline-store/starttime-ldb/001547.sst: Too many open files
at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.serviceInit(RollingLevelDBTimelineStore.java:324)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
... 7 more
2018-08-03 10:26:38,510 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping ApplicationHistoryServer metrics system...
2018-08-03 10:26:38,511 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2018-08-03 10:26:38,513 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - ApplicationHistoryServer metrics system stopped.
2018-08-03 10:26:38,513 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - ApplicationHistoryServer metrics system shutdown complete.
2018-08-03 10:26:38,513 FATAL applicationhistoryservice.ApplicationHistoryServer (ApplicationHistoryServer.java:launchAppHistoryServer(171)) - Error starting ApplicationHistoryServer
org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /hadoop/yarn/timeline/leveldb-timeline-store/starttime-ldb/001547.sst: Too many open files
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.serviceInit(EntityGroupFSTimelineStore.java:151)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:104)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:168)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:178)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /hadoop/yarn/timeline/leveldb-timeline-store/starttime-ldb/001547.sst: Too many open files
at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.serviceInit(RollingLevelDBTimelineStore.java:324)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
... 7 more
2018-08-03 10:26:38,515 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status -1
2018-08-03 10:26:38,515 INFO timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:run(416)) - Closing HadoopTimelineMetricSink. Flushing metrics to collector...
2018-08-03 10:26:38,561 INFO applicationhistoryservice.ApplicationHistoryServer (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ApplicationHistoryServer at ip
Thanks in advance.
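A hedged sketch for the "Too many open files" cause above: the LevelDB timeline store opens many .sst files, and the yarn user's open-file limit can be too low. The values below are illustrative, not prescriptive:

# On the App Timeline Server host, check the yarn user's current limit:
sudo -u yarn bash -c 'ulimit -n'
# Raise the nofile limit (illustrative value), then restart the ATS:
echo 'yarn - nofile 32768' | sudo tee -a /etc/security/limits.conf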
Labels: Apache YARN
08-02-2018
04:50 AM
Hi everyone, I have a 5-node CDH cluster. In my cluster I am observing that the NodeManagers are restarting continuously. I am not sure what is going on; I am attaching the stdout, stderr, and role log. Can you please help me?
Stderr:
+ exec /opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hadoop-yarn/bin/yarn nodemanager
Aug 02, 2018 11:30:32 AM com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
WARNING: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
Aug 02, 2018 11:30:32 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.server.nodemanager.webapp.NMWebServices as a root resource class
Aug 02, 2018 11:30:32 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
Aug 02, 2018 11:30:32 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.server.nodemanager.webapp.JAXBContextResolver as a provider class
Aug 02, 2018 11:30:32 AM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.9 09/02/2011 11:17 AM'
Aug 02, 2018 11:30:32 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.server.nodemanager.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
Aug 02, 2018 11:30:32 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
Aug 02, 2018 11:30:33 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.server.nodemanager.webapp.NMWebServices to GuiceManagedComponentProvider with the scope "Singleton"
Role log:
11:48:42.105 AM INFO ContainerManagerImpl
Start request for container_1533205969497_0410_01_000001 by user dr.who
11:48:42.105 AM INFO ContainerManagerImpl
Creating a new application reference for app application_1533205969497_0410
11:48:42.105 AM INFO Application
Application application_1533205969497_0410 transitioned from NEW to INITING
11:48:42.106 AM INFO NMAuditLogger
USER=dr.who IP=172.31.24.227 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1533205969497_0410 CONTAINERID=container_1533205969497_0410_01_000001
11:48:42.108 AM INFO AppLogAggregatorImpl
rollingMonitorInterval is set as -1. The log rolling monitoring interval is disabled. The logs will be aggregated after this application is finished.
11:48:42.125 AM INFO Application
Adding container_1533205969497_0410_01_000001 to application application_1533205969497_0410
11:48:42.125 AM INFO Application
Application application_1533205969497_0410 transitioned from INITING to RUNNING
11:48:42.125 AM INFO Container
Container container_1533205969497_0410_01_000001 transitioned from NEW to LOCALIZED
11:48:42.125 AM INFO AuxServices
Got event CONTAINER_INIT for appId application_1533205969497_0410
11:48:42.125 AM INFO YarnShuffleService
Initializing container container_1533205969497_0410_01_000001
11:48:42.144 AM INFO Container
Container container_1533205969497_0410_01_000001 transitioned from LOCALIZED to RUNNING
11:48:42.147 AM INFO DefaultContainerExecutor
launchContainer: [bash, /data0/yarn/nm/usercache/dr.who/appcache/application_1533205969497_0410/container_1533205969497_0410_01_000001/default_container_executor.sh]
11:48:42.162 AM WARN DefaultContainerExecutor
Exit code from container container_1533205969497_0410_01_000001 is : 143
11:48:42.164 AM INFO Container
Container container_1533205969497_0410_01_000001 transitioned from RUNNING to EXITED_WITH_FAILURE
11:48:42.164 AM INFO ContainerLaunch
Cleaning up container container_1533205969497_0410_01_000001
11:48:42.181 AM INFO DefaultContainerExecutor
Deleting absolute path : /data0/yarn/nm/usercache/dr.who/appcache/application_1533205969497_0410/container_1533205969497_0410_01_000001
11:48:42.182 AM WARN NMAuditLogger
USER=dr.who OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE APPID=application_1533205969497_0410 CONTAINERID=container_1533205969497_0410_01_000001
11:48:42.182 AM INFO Container
Container container_1533205969497_0410_01_000001 transitioned from EXITED_WITH_FAILURE to DONE
11:48:42.182 AM INFO Application
Removing container_1533205969497_0410_01_000001 from application application_1533205969497_0410
11:48:42.182 AM INFO AppLogAggregatorImpl
Considering container container_1533205969497_0410_01_000001 for log-aggregation
11:48:42.182 AM INFO AuxServices
Got event CONTAINER_STOP for appId application_1533205969497_0410
11:48:42.182 AM INFO YarnShuffleService
Stopping container container_1533205969497_0410_01_000001
11:48:43.185 AM INFO NodeStatusUpdaterImpl
Removed completed containers from NM context: [container_1533205969497_0410_01_000001]
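A hedged note on the log above: the containers are being launched by user "dr.who", which is Hadoop's default identity for unauthenticated HTTP requests; on clusters exposed to the internet this frequently means jobs are being submitted through the open ResourceManager REST API rather than by your own workloads. A quick check:

# List running applications and the users who submitted them:
yarn application -list -appStates RUNNING
# If unknown dr.who apps appear, restrict network access to the RM (8088)
# and consider enabling authentication before re-opening it.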
Labels: YARN
07-26-2018
11:11 AM
Since it is a production cluster, does restarting the HiveServer2 affect other services?
07-26-2018
09:29 AM
Hi everyone, in my cluster I am getting an alert that the HiveServer2 process connection failed, but HiveServer2 is running. Please find the log below.
Connection failed on host abc.covert.com:10000 (Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/alerts/alert_hive_thrift_port.py", line 211, in execute
ldap_password=ldap_password)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/hive_check.py", line 79, in check_thrift_port_sasl
timeout=check_command_timeout)
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 297, in _call
raise ExecuteTimeoutException(err_msg)
ExecuteTimeoutException: Execution of 'ambari-sudo.sh su ambari-qa -l -s /bin/bash -c 'export PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/var/lib/ambari-agent:/bin/:/usr/bin/:/usr/lib/hive/bin/:/usr/sbin/'"'"' ; ! beeline -u '"'"'jdbc:hive2://abc.covert.com:10000/;transportMode=binary;principal=hive/_HOST@COVERT.NET'"'"' -e '"'"''"'"' 2>&1| awk '"'"'{print}'"'"'|grep -i -e '"'"'Connection refused'"'"' -e '"'"'Invalid URL'"'"''' was killed due timeout after 60 seconds
)
Can you please help me get rid of this? Thanks in advance.
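A hedged reading: the alert command itself was "killed due timeout after 60 seconds", so the check (a beeline connection) is timing out rather than failing outright; HiveServer2 may simply be slow to answer, for example under load or during long GC pauses. A minimal way to confirm, reusing the exact connection string from the alert:

# Time the same probe the alert runs (host/principal copied from the log):
time beeline -u 'jdbc:hive2://abc.covert.com:10000/;transportMode=binary;principal=hive/_HOST@COVERT.NET' -e ''
# If it routinely takes close to or over 60s, raise the alert's check command
# timeout in the Ambari UI (Alerts -> HiveServer2 Process -> Edit), or
# investigate what is slowing HiveServer2 down.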
Labels: Apache Ambari, Apache Hive
07-24-2018
01:13 PM
Hi everyone, we are planning to upgrade our Kerberized production cluster from HDP 2.6 to HDP 3.0. Can you please tell me the step-by-step procedure and best practices, since I am doing this for the first time? Thanks in advance.
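Not a full runbook (the official HDP 2.6-to-3.0 upgrade guide is authoritative), but a hedged sketch of the standard pre-upgrade safeguards usually recommended before starting:

# Checkpoint HDFS before the upgrade (run as the hdfs superuser; on a
# Kerberized cluster, kinit with the hdfs keytab first):
sudo -u hdfs hdfs dfsadmin -safemode enter
sudo -u hdfs hdfs dfsadmin -saveNamespace
sudo -u hdfs hdfs dfsadmin -safemode leave
# Also back up the Ambari, Hive metastore, Ranger, and Oozie databases and
# take a fresh copy of the NameNode metadata directories before proceeding.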
07-16-2018
09:45 AM
Hi everyone, I want to get the metrics of all services in my cluster. 1) If the HDFS service goes down, I want to know how long it was down. For example, in an HDP cluster we only get the current service uptime (e.g., "NameNode uptime is 25 days"), and if I restart the service, the uptime is calculated from that point onwards. But I need to know how long the service was down and how long it has been up and running. This applies not only to HDFS; I also need to generate the report for all the services (YARN, HBase, Knox, etc.). Can you please guide me on how to get these uptimes and downtimes for all services (see the sketch below)? Thanks in advance.
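One hedged approach, assuming Ambari manages the cluster: Ambari keeps a history of alert state changes, so the intervals between CRITICAL and OK transitions of each service's process alert approximate its downtime windows. The host, cluster name, credentials, and predicate fields below are placeholders; check the alert_history resource in your Ambari version:

# Pull NameNode process alert history from the Ambari REST API and compute
# CRITICAL -> OK intervals offline (e.g. with a small script):
curl -u admin:admin \
  'http://ambari-host:8080/api/v1/clusters/MyCluster/alert_history?AlertHistory/definition_name=namenode_process&fields=*'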
Labels: Apache Ambari, Apache Hadoop
07-11-2018
11:12 AM
1 Kudo
Please follow the link below: https://community.hortonworks.com/articles/16763/cheat-sheet-and-tips-for-a-custom-install-of-horto.html
07-11-2018
10:24 AM
How can I find the HDFS uptime and downtime for the last 6 months? I found only the current uptime of HDFS (NameNode) on the Ambari UI, and as soon as I restart the service, the uptime is calculated from that point onwards. Can you please tell me how to find the uptime and downtime of HDFS over the last 6 months? Thanks in advance.
Labels: Apache Ambari, Apache Hadoop
07-10-2018
12:20 PM
I found the information below in gateway.log. Can anyone help me resolve this?
2018-07-10 09:55:07,535 ERROR hadoop.gateway (DefaultTopologyService.java:loadTopology(111)) - Failed to load topology /usr/hdp/2.6.1.0-129/knox/bin/../conf/topologies/sample.xml, retrying after 50ms: org.xml.sax.SAXParseException; lineNumber: 41; columnNumber: 76; The reference to entity "ServiceAccounts" must end with the ';' delimiter.
2018-07-10 09:55:07,588 ERROR digester3.Digester (Digester.java:fatalError(1541)) - Parse Fatal Error at line 41 column 76: The reference to entity "ServiceAccounts" must end with the ';' delimiter.
org.xml.sax.SAXParseException; lineNumber: 41; columnNumber: 76; The reference to entity "ServiceAccounts" must end with the ';' delimiter.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:441)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:368)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1437)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(XMLDocumentFragmentScannerImpl.java:1850)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3067)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:649)
    at org.apache.commons.digester3.Digester.parse(Digester.java:1642)
    at org.apache.commons.digester3.Digester.parse(Digester.java:1701)
    at org.apache.hadoop.gateway.services.topology.impl.DefaultTopologyService.loadTopologyAttempt(DefaultTopologyService.java:124)
    at org.apache.hadoop.gateway.services.topology.impl.DefaultTopologyService.loadTopology(DefaultTopologyService.java:100)
    at org.apache.hadoop.gateway.services.topology.impl.DefaultTopologyService.loadTopologies(DefaultTopologyService.java:235)
    at org.apache.hadoop.gateway.services.topology.impl.DefaultTopologyService.reloadTopologies(DefaultTopologyService.java:320)
    at org.apache.hadoop.gateway.GatewayServer.start(GatewayServer.java:422)
    at org.apache.hadoop.gateway.GatewayServer.startGateway(GatewayServer.java:295)
    at org.apache.hadoop.gateway.GatewayServer.main(GatewayServer.java:148)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.gateway.launcher.Invoker.invokeMainMethod(Invoker.java:70)
    at org.apache.hadoop.gateway.launcher.Invoker.invoke(Invoker.java:39)
    at org.apache.hadoop.gateway.launcher.Command.run(Command.java:99)
    at org.apache.hadoop.gateway.launcher.Launcher.run(Launcher.java:69)
    at org.apache.hadoop.gateway.launcher.Launcher.main(Launcher.java:46)
2018-07-10 09:55:07,589 ERROR digester3.Digester (Digester.java:parse(1652)) - An error occurred while parsing XML from '(already loaded from stream)', see nested exceptions
org.xml.sax.SAXParseException; lineNumber: 41; columnNumber: 76; The reference to entity "ServiceAccounts" must end with the ';' delimiter.
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1239)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:649)
    at org.apache.commons.digester3.Digester.parse(Digester.java:1642)
    at org.apache.commons.digester3.Digester.parse(Digester.java:1701)
    at org.apache.hadoop.gateway.services.topology.impl.DefaultTopologyService.loadTopologyAttempt(DefaultTopologyService.java:124)
    at org.apache.hadoop.gateway.services.topology.impl.DefaultTopologyService.loadTopology(DefaultTopologyService.java:100)
    at org.apache.hadoop.gateway.services.topology.impl.DefaultTopologyService.loadTopologies(DefaultTopologyService.java:235)
    at org.apache.hadoop.gateway.services.topology.impl.DefaultTopologyService.reloadTopologies(DefaultTopologyService.java:320)
    at org.apache.hadoop.gateway.GatewayServer.start(GatewayServer.java:422)
    at org.apache.hadoop.gateway.GatewayServer.startGateway(GatewayServer.java:295)
    at org.apache.hadoop.gateway.GatewayServer.main(GatewayServer.java:148)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.gateway.launcher.Invoker.invokeMainMethod(Invoker.java:70)
    at org.apache.hadoop.gateway.launcher.Invoker.invoke(Invoker.java:39)
    at org.apache.hadoop.gateway.launcher.Command.run(Command.java:99)
2018-07-10 09:55:07,592 ERROR hadoop.gateway (DefaultTopologyService.java:loadTopologies(252)) - Failed to load topology /usr/hdp/2.6.1.0-129/knox/bin/../conf/topologies/sample.xml: org.xml.sax.SAXParseException; lineNumber: 41; columnNumber: 76; The reference to entity "ServiceAccounts" must end with the ';' delimiter.
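A hedged diagnosis: "The reference to entity "ServiceAccounts" must end with the ';' delimiter" is the XML parser complaining about a raw '&' in the topology file, at line 41, column 76 of sample.xml per the log. In XML, a bare ampersand must be escaped as &amp;:

# Inspect the offending line of the topology (path taken from the log):
sed -n '41p' /usr/hdp/2.6.1.0-129/knox/conf/topologies/sample.xml
# Escape any bare '&' inside values, e.g. a parameter string like
#   ...?user=x&ServiceAccounts=y   should read   ...?user=x&amp;ServiceAccounts=y
# then save the file; Knox reloads topologies automatically.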
07-10-2018
10:27 AM
@Felix Albani @Sindhu @Jay Kumar SenSharma @Geoffrey Shelton Okot Hi, the Knox gateway is going down on a daily basis. I found only knox-gc.log with today's timestamp in the /var/log/knox/ directory. If you think it is due to an allocation failure, can you please tell me how much memory I need to allocate for the Knox gateway? In my gateway.sh file I found the line below; do I need to change the values here or somewhere else, and how much memory should I give to resolve this issue?
APP_MEM_OPTS="-Xmx5g -XX:NewSize=3G -XX:MaxNewSize=3G -verbose:gc -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -Xloggc:/var/log/knox/knox-gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps"
Thanks in advance.
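A hedged note on sizing, since no single number fits every deployment: "GC (Allocation Failure)" entries are ordinary young-generation collections and are not errors by themselves, so the heap only needs changing if knox-gc.log shows long pauses, back-to-back full GCs, or OutOfMemoryError. APP_MEM_OPTS in gateway.sh (as quoted above) is the place to adjust; the values below are purely illustrative:

# Example sizing tweak (illustrative, not a recommendation):
APP_MEM_OPTS="-Xmx5g -XX:NewSize=2G -XX:MaxNewSize=2G -verbose:gc -Xloggc:/var/log/knox/knox-gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps"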
07-05-2018
10:06 AM
Hi, in my cluster the Knox gateway was down. I found one issue in knox-gc.log; please find the error below.
2018-07-05T01:55:29.588+0000: 314055.583: [GC (Allocation Failure) 2018-07-05T01:55:29.887+0000: 314055.882: [ParNew: 2532112K->10894K(2831168K), 0.0440397 secs] 2532112K->10894K(2837312K), 0.3432294 secs] [Times: user=0.03 sys=0.00, real=0.35 secs]
Please help me figure out how to solve this.
Tags: Knox, knox-gateway, logs
Labels: Apache Knox
07-02-2018
12:18 PM
In my cluster the two NameNodes keep going down. Assume server1 has the active NameNode and server2 has the standby. Sometimes the active NameNode goes down and the standby takes over as active; sometimes the standby goes down. How do I find the corrupted JournalNode, where do I get the JournalNode data (fsimage, edit logs) from, and where do I need to put it? See the sketch below.
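A hedged sketch of the usual way to locate and repair a lagging or corrupt JournalNode (/hadoop/hdfs/journal is an assumed default; the real location is the value of dfs.journalnode.edits.dir in hdfs-site.xml, and <nameservice> is a placeholder):

# Find the edits directory and compare the newest segments across all JNs:
hdfs getconf -confKey dfs.journalnode.edits.dir
ls -lt /hadoop/hdfs/journal/<nameservice>/current | head
# If one JN lags behind or has corrupt segments: stop that JournalNode, move
# its 'current' directory aside, rsync the directory from a healthy JN, then
# restart the JournalNode and watch the NameNode logs for successful writes.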
07-02-2018
11:14 AM
Hi everyone, I have a 6-node cluster and my standby NameNode is going down continuously, but when I start it, it comes up without any issue. I need to fix this permanently; can you please help? Please find the log below.
2018-07-01 22:44:01,939 INFO authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(137)) - Authorization successful for nn/server2.covert.com@COVERTHADOOP.NET (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol
2018-07-01 22:44:01,948 WARN ipc.Server (Server.java:processResponse(1273)) - IPC Server handler 11 on 8020, call org.apache.hadoop.ha.HAServiceProtocol.getServiceStatus from IP:8258 Call#4620178 Retry#0: output error
2018-07-01 22:44:01,949 INFO ipc.Server (Server.java:run(2402)) - IPC Server handler 11 on 8020 caught an exception
java.nio.channels.ClosedChannelException
    at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:270)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:461)
    at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2909)
    at org.apache.hadoop.ipc.Server.access$2100(Server.java:138)
    at org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:1223)
    at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1295)
    at org.apache.hadoop.ipc.Server$Connection.sendResponse(Server.java:2266)
    at org.apache.hadoop.ipc.Server$Connection.access$400(Server.java:1375)
    at org.apache.hadoop.ipc.Server$Call.sendResponse(Server.java:734)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2391)
2018-07-01 22:44:01,948 INFO authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(137)) - Authorization successful for hbase/server4.covert.com@COVERTHADOOP.NET (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol
2018-07-01 22:44:01,963 INFO authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(137)) - Authorization successful for hbase/server5.covert.com@COVERTHADOOP.NET (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol
2018-07-01 22:44:01,993 INFO namenode.FSEditLog (FSEditLog.java:printStatistics(771)) - Number of transactions: 43 Total time for transactions(ms): 22 Number of transactions batched in Syncs: 0 Number of syncs: 42 SyncTimes(ms): 907 357
2018-07-01 22:44:02,144 WARN client.QuorumJournalManager (IPCLoggerChannel.java:call(388)) - Remote journal IP:8485 failed to write txns 157817808-157817808. Will try to write to this JN again after the next log roll.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): IPC's epoch 518 is less than the last promised epoch 519
    at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:428)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:456)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:351)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:152)
    at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
    at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
    at org.apache.hadoop.ipc.Client.call(Client.java:1498)
    at org.apache.hadoop.ipc.Client.call(Client.java:1398)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy11.journal(Unknown Source)
    at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolTranslatorPB.journal(QJournalProtocolTranslatorPB.java:167)
    at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:385)
    at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:378)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2018-07-01 22:44:02,169 WARN client.QuorumJournalManager (IPCLoggerChannel.java:call(388)) - Remote journal IP1:8485 failed to write txns 157817808-157817808. Will try to write to this JN again after the next log roll.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): IPC's epoch 518 is less than the last promised epoch 519
    at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:428)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:456)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:351)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:152)
    at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
    at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
    at org.apache.hadoop.ipc.Client.call(Client.java:1498)
    at org.apache.hadoop.ipc.Client.call(Client.java:1398)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy11.journal(Unknown Source)
    at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolTranslatorPB.journal(QJournalProtocolTranslatorPB.java:167)
    at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:385)
    at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:378)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2018-07-01 22:44:02,177 WARN client.QuorumJournalManager (IPCLoggerChannel.java:call(388)) - Remote journal IP2:8485 failed to write txns 157817808-157817808. Will try to write to this JN again after the next log roll.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): IPC's epoch 518 is less than the last promised epoch 519
    at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:428)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:456)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:351)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:152)
    at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
    at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
    at java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:422) at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345) at
org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554) at
org.apache.hadoop.ipc.Client.call(Client.java:1498) at
org.apache.hadoop.ipc.Client.call(Client.java:1398) at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) at
com.sun.proxy.$Proxy11.journal(Unknown Source) at
org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolTranslatorPB.journal(QJournalProtocolTranslatorPB.java:167) at
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:385) at
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:378) at
java.util.concurrent.FutureTask.run(FutureTask.java:266) at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 2018-07-01 22:44:02,182 FATAL
namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error:
flush failed for required journal (JournalAndStream(mgr=QJM to [IP1:8485, IP2:8485,
IP:8485], stream=QuorumOutputStream starting at txid 157817766)) org.apache.hadoop.hdfs.qjournal.client.QuorumException:
Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown: IP2:8485: IPC's epoch 518 is
less than the last promised epoch 519 at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:428) at
org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:456) at
org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:351) at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:152) at
org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158) at
org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421) at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) at
java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:422) at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345) IP:8485: IPC's epoch 518 is
less than the last promised epoch 519 at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:428) at
org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:456) at
org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:351) at
org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:152) at
org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158) at
org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421) at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) at
java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345) IP1:8485: IPC's epoch 518 is
less than the last promised epoch 519 at
org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:428) at
org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:456) at
org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:351) at
org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:152) at
org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158) at
org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421) at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) at
java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:422) at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345) at
org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81) at
org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223) at
org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142) at
org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107) at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113) at
org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107) at
org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533) at
org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393) at
org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57) at
org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529) at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:707) at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:641) at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2691) at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2556) at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:736) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:408) at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) at
java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:422) at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345) 2018-07-01 22:44:02,182 WARN
client.QuorumJournalManager (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream
starting at txid 157817766 2018-07-01 22:44:02,199 INFO
util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1 2018-07-01 22:44:02,239 INFO
provider.AuditProviderFactory (AuditProviderFactory.java:run(516)) - ==>
JVMShutdownHook.run() 2018-07-01 22:44:02,239 INFO
provider.AuditProviderFactory (AuditProviderFactory.java:run(517)) -
JVMShutdownHook: Signalling async audit cleanup to start. 2018-07-01 22:44:02,239 INFO
provider.AuditProviderFactory (AuditProviderFactory.java:run(521)) -
JVMShutdownHook: Waiting up to 30 seconds for audit cleanup to finish. 2018-07-01 22:44:02,245 INFO
provider.AuditProviderFactory (AuditProviderFactory.java:run(492)) -
RangerAsyncAuditCleanup: Starting cleanup 2018-07-01 22:44:02,251 INFO
provider.BaseAuditHandler (BaseAuditHandler.java:logStatus(310)) - Audit Status
Log: name=hdfs.async.multi_dest.batch.hdfs, interval=03:01.906 minutes,
events=114, succcessCount=114, totalEvents=3188810, totalSuccessCount=3188810 2018-07-01 22:44:02,251 INFO
destination.HDFSAuditDestination (HDFSAuditDestination.java:logJSON(179)) -
Flushing HDFS audit. Event Size:30 2018-07-01 22:44:02,252 INFO
queue.AuditBatchQueue (AuditBatchQueue.java:runLogAudit(347)) - Exiting
consumerThread. Queue=hdfs.async.multi_dest.batch,
dest=hdfs.async.multi_dest.batch.hdfs 2018-07-01 22:44:02,252 INFO
queue.AuditBatchQueue (AuditBatchQueue.java:runLogAudit(351)) - Calling to stop
consumer. name=hdfs.async.multi_dest.batch,
consumer.name=hdfs.async.multi_dest.batch.hdfs 2018-07-01 22:44:03,967 INFO
BlockStateChange (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(395))
- chooseUnderReplicatedBlocks selected 12 blocks at priority level 2; Total=12
Reset bookmarks? false 2018-07-01 22:44:03,967 INFO
BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1647)) -
BLOCK* neededReplications = 3922, pendingReplications = 0. 2018-07-01 22:44:03,967 INFO
blockmanagement.BlockManager
(BlockManager.java:computeReplicationWorkForBlocks(1654)) - Blocks chosen but
could not be replicated = 12; of which 12 have no target, 0 have no source, 0
are UC, 0 are abandoned, 0 already have enough replicas. 2018-07-01 22:44:04,580 INFO
ipc.Server (Server.java:saslProcess(1573)) - Auth successful for
nn/server2.covert.com@COVERTHADOOP.NET (auth:KERBEROS) 2018-07-01 22:44:04,609 INFO
authorize.ServiceAuthorizationManager
(ServiceAuthorizationManager.java:authorize(137)) - Authorization successful
for nn/server2.covert.com@COVERTHADOOP.NET (auth:KERBEROS) for
protocol=interface org.apache.hadoop.ha.HAServiceProtocol 2018-07-01 22:44:04,797 INFO
ipc.Server (Server.java:saslProcess(1573)) - Auth successful for
nn/server2.covert.com@COVERTHADOOP.NET (auth:KERBEROS) 2018-07-01 22:44:04,817 INFO
authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(137))
- Authorization successful for nn/server2.covert.com@COVERTHADOOP.NET
(auth:KERBEROS) for protocol=interface org.apache.hadoop.ha.HAServiceProtocol 2018-07-01 22:44:04,826 INFO
namenode.FSNamesystem (FSNamesystem.java:stopActiveServices(1272)) - Stopping
services started for active state 2018-07-01 22:44:04,826 ERROR
delegation.AbstractDelegationTokenSecretManager
(AbstractDelegationTokenSecretManager.java:run(659)) - ExpiredTokenRemover
received java.lang.InterruptedException: sleep interrupted 2018-07-01 22:44:04,832 INFO
namenode.FSNamesystem (FSNamesystem.java:run(5115)) - LazyPersistFileScrubber
was interrupted, exiting 2018-07-01 22:44:04,843 INFO
namenode.FSNamesystem (FSNamesystem.java:run(5029)) - NameNodeEditLogRoller was
interrupted, exiting 2018-07-01 22:44:08,757 ERROR
impl.CloudSolrClient (CloudSolrClient.java:requestWithRetryOnStaleState(903)) -
Request to collection ranger_audits failed due to (403)
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at
http://server2.covert.com:8983/solr/ranger_audits_shard1_replica1: Expected
mime type application/octet-stream but got text/html. <html> <head> <meta
http-equiv="Content-Type" content="text/html;
charset=UTF-8"/> <title>Error 403 GSSException:
Failure unspecified at GSS-API level (Mechanism level: Request is a replay
(34))</title> </head> <body><h2>HTTP
ERROR 403</h2> <p>Problem accessing
/solr/ranger_audits_shard1_replica1/update. Reason: <pre> GSSException:
Failure unspecified at GSS-API level (Mechanism level: Request is a replay
(34))</pre></p><hr><i><small>Powered by
Jetty://</small></i><hr/>
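A hedged reading of the log above (my note, not part of the original thread): "IPC's epoch 518 is less than the last promised epoch 519" means the JournalNodes have already promised a newer epoch to the other NameNode, i.e. a failover has taken place (commonly after a long JVM pause or a ZKFC-triggered fencing), so this NameNode is fenced and exits with status 1 by design. A minimal diagnostic sketch, assuming default HDP log paths and that nn1/nn2 are the NameNode IDs from dfs.ha.namenodes.<nameservice> (both are assumptions, not values taken from this cluster):

# Which NameNode is active now? nn1/nn2 are assumed IDs; check dfs.ha.namenodes.* first
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Look for a long JVM pause just before 22:44:02 (path assumes HDP defaults)
grep -iE "pause|slept" /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log | tail -n 20

If the standby took over cleanly, restarting the exited NameNode should bring it back as the new standby.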
... View more
Labels:
- Apache Ambari
- Apache Hadoop
07-02-2018
10:49 AM
Hi everyone, I have a 6-node cluster with HA enabled, and one of the HBase Masters went down. When I tried to restart the HBase Master, it came back up without any issue, so I would like to understand why it went down in the first place. Can you please tell me how to solve this? Please find the log details below:
2018-07-02 06:13:14,943 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=2.19 MB, freeSize=2.08 GB, max=2.08 GB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0, evictions=238700, evicted=0, evictedPerRun=0.0
2018-07-02 06:18:14,943 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=2.19 MB, freeSize=2.08 GB, max=2.08 GB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0, evictions=238730, evicted=0, evictedPerRun=0.0
2018-07-02 06:23:14,943 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=2.19 MB, freeSize=2.08 GB, max=2.08 GB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0, evictions=238760, evicted=0, evictedPerRun=0.0
2018-07-02 06:28:14,943 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=2.19 MB, freeSize=2.08 GB, max=2.08 GB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0, evictions=238790, evicted=0, evictedPerRun=0.0
2018-07-02 06:33:14,943 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=2.19 MB, freeSize=2.08 GB, max=2.08 GB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0, evictions=238820, evicted=0, evictedPerRun=0.0
2018-07-02 06:38:14,943 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=2.19 MB, freeSize=2.08 GB, max=2.08 GB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0, evictions=238850, evicted=0, evictedPerRun=0.0
2018-07-02 06:42:03,335 INFO [master/server3.covert.com/IP:16000-SendThread(server3.covert.com:2181)] zookeeper.ClientCnxn: Client session timed out, have not heard from server in 26680ms for sessionid 0x363c8a174c329a7, closing socket connection and attempting reconnect
2018-07-02 06:42:03,469 INFO [main-SendThread(server3.covert.com:2181)] zookeeper.ClientCnxn: Client session timed out, have not heard from server in 26676ms for sessionid 0x163cb19387303c0, closing socket connection and attempting reconnect
2018-07-02 06:42:03,954 INFO [timeline] timeline.HadoopTimelineMetricsSink: Unable to connect to collector, http://server4.covert.com:6188/ws/v1/timeline/metrics This exceptions will be ignored for next 100 times
2018-07-02 06:42:03,955 WARN [timeline] timeline.HadoopTimelineMetricsSink: Unable to send metrics to collector by address:http://server4.covert.com:6188/ws/v1/timeline/metrics
2018-07-02 06:42:04,008 INFO [main-SendThread(server2.covert.com:2181)] client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism.
2018-07-02 06:42:04,009 INFO [main-SendThread(server2.covert.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server server2.covert.com/IP1:2181. Will attempt to SASL-authenticate using Login Context section 'Client'
2018-07-02 06:42:04,036 INFO [master/server3.covert.com/IP:16000-SendThread(server2.covert.com:2181)] client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism.
2018-07-02 06:42:04,037 INFO [master/server3.covert.com/IP:16000-SendThread(server2.covert.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server server2.covert.com/IP1:2181. Will attempt to SASL-authenticate using Login Context section 'Client'
2018-07-02 06:42:35,085 WARN [master/server3.covert.com/IP:16000] util.Sleeper: We slept 30621ms instead of 3000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
2018-07-02 06:42:38,645 INFO [main-SendThread(server2.covert.com:2181)] zookeeper.ClientCnxn: Socket connection established to server2.covert.com/IP1:2181, initiating session
2018-07-02 06:42:38,647 INFO [main-SendThread(server2.covert.com:2181)] zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x163cb19387303c0 has expired, closing socket connection
2018-07-02 06:42:38,648 FATAL [main-EventThread] master.HMaster: master:16000-0x163cb19387303c0, quorum=server3.covert.com:2181,server1.covert.com:2181,server2.covert.com:2181, baseZNode=/hbase-secure master:16000-0x163cb19387303c0 received expired from ZooKeeper, aborting
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:592)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:524)
    at org.apache.hadoop.hbase.zookeeper.PendingWatcher.process(PendingWatcher.java:40)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
2018-07-02 06:42:38,650 INFO [main-EventThread] regionserver.HRegionServer: STOPPED: master:16000-0x163cb19387303c0, quorum=server3.covert.com:2181,server1.covert.com:2181,server2.covert.com:2181, baseZNode=/hbase-secure master:16000-0x163cb19387303c0 received expired from ZooKeeper, aborting
2018-07-02 06:42:38,650 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2018-07-02 06:42:38,650 INFO [master/server3.covert.com/IP:16000] regionserver.HRegionServer: Stopping infoServer
2018-07-02 06:42:38,663 INFO [master/server3.covert.com/IP:16000] mortbay.log: Stopped SelectChannelConnector@0.0.0.0:16010
2018-07-02 06:42:38,669 INFO [master/server3.covert.com/IP:16000-SendThread(server2.covert.com:2181)] zookeeper.ClientCnxn: Socket connection established to server2.covert.com/IP1:2181, initiating session
2018-07-02 06:42:38,671 INFO [master/server3.covert.com/IP:16000-SendThread(server2.covert.com:2181)] zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x363c8a174c329a7 has expired, closing socket connection
2018-07-02 06:42:38,671 WARN [master/server3.covert.com/IP:16000-EventThread] client.ConnectionManager$HConnectionImplementation: This client just lost it's session with ZooKeeper, closing it. It will be recreated next time someone needs it
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:592)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:524)
    at org.apache.hadoop.hbase.zookeeper.PendingWatcher.process(PendingWatcher.java:40)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
2018-07-02 06:42:38,671 INFO [master/server3.covert.com/IP:16000-EventThread] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x363c8a174c329a7
2018-07-02 06:42:38,671 INFO [master/server3.covert.com/IP:16000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2018-07-02 06:42:38,766 INFO [master/server3.covert.com/IP:16000] regionserver.HRegionServer: stopping server server3.covert.com,16000,1528124894011
2018-07-02 06:42:38,769 INFO [master/server3.covert.com/IP:16000] regionserver.HRegionServer: stopping server server3.covert.com,16000,1528124894011; all regions closed.
2018-07-02 06:42:38,769 INFO [master/server3.covert.com/IP:16000] hbase.ChoreService: Chore service for: server3.covert.com,16000,1528124894011 had [] on shutdown
2018-07-02 06:42:38,769 WARN [master/server3.covert.com/IP:16000] zookeeper.ZKUtil: master:16000-0x163cb19387303c0, quorum=server3.covert.com:2181,server1.covert.com:2181,server2.covert.com:2181, baseZNode=/hbase-secure Unable to get data of znode /hbase-secure/master
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:622)
    at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:148)
    at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:267)
    at org.apache.hadoop.hbase.master.HMaster.stopServiceThreads(HMaster.java:1249)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1118)
    at java.lang.Thread.run(Thread.java:745)
2018-07-02 06:42:38,770 ERROR [master/server3.covert.com/IP:16000] zookeeper.ZooKeeperWatcher: master:16000-0x163cb19387303c0, quorum=server3.covert.com:2181,server1.covert.com:2181,server2.covert.com:2181, baseZNode=/hbase-secure Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:622)
    at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:148)
    at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:267)
    at org.apache.hadoop.hbase.master.HMaster.stopServiceThreads(HMaster.java:1249)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1118)
    at java.lang.Thread.run(Thread.java:745)
2018-07-02 06:42:38,770 ERROR [master/server3.covert.com/IP:16000] master.ActiveMasterManager: master:16000-0x163cb19387303c0, quorum=server3.covert.com:2181,server1.covert.com:2181,server2.covert.com:2181, baseZNode=/hbase-secure Error deleting our own master address node
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:622)
    at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:148)
    at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:267)
    at org.apache.hadoop.hbase.master.HMaster.stopServiceThreads(HMaster.java:1249)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1118)
    at java.lang.Thread.run(Thread.java:745)
2018-07-02 06:42:38,770 INFO [master/server3.covert.com/IP:16000] ipc.RpcServer: Stopping server on 16000
2018-07-02 06:42:38,770 INFO [master/server3.covert.com/IP:16000] token.AuthenticationTokenSecretManager: Stopping leader election, because: SecretManager stopping
2018-07-02 06:42:38,770 INFO [RpcServer.listener,port=16000] ipc.RpcServer: RpcServer.listener,port=16000: stopping
2018-07-02 06:42:38,771 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2018-07-02 06:42:38,777 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2018-07-02 06:42:38,787 WARN [master/server3.covert.com/IP:16000] regionserver.HRegionServer: Failed deleting my ephemeral node
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase-secure/rs/server3.covert.com,16000,1528124894011
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:178)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1222)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1211)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.deleteMyEphemeralNode(HRegionServer.java:1528)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1126)
    at java.lang.Thread.run(Thread.java:745)
2018-07-02 06:42:38,791 INFO [master/server3.covert.com/IP:16000] regionserver.HRegionServer: stopping server server3.covert.com,16000,1528124894011; zookeeper connection closed.
2018-07-02 06:42:38,791 INFO [master/server3.covert.com/IP:16000] regionserver.HRegionServer: master/server3.covert.com/IP:16000 exiting
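A hedged note on the log above (my observation, not part of the original post): the warning "We slept 30621ms instead of 3000ms" points to a roughly 30-second JVM pause on the master; that exceeded the ZooKeeper session timeout, the sessions expired, and the HMaster aborted by design, which is consistent with it restarting cleanly afterwards. A quick check, assuming default HDP log and conf paths (an assumption, not taken from this cluster):

# How often do long pauses happen on this master? (path assumes HDP defaults)
grep "We slept" /var/log/hbase/hbase-hbase-master-*.log | tail -n 20

# What ZooKeeper session timeout is configured? (standard HBase property name)
grep -A1 "zookeeper.session.timeout" /etc/hbase/conf/hbase-site.xml

If such pauses recur, GC tuning of the master heap (or a larger session timeout) is the usual remedy.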
... View more
Labels:
- Apache Ambari
- Apache HBase
06-29-2018
12:25 PM
Hi, in my cluster the standby NameNode keeps going down. Can you please tell me where I can find the logs for this issue?
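A hedged pointer, assuming default HDP locations (an assumption, not taken from this cluster): the standby NameNode writes its log on its own host, typically under /var/log/hadoop/hdfs. For example:

# On the standby NameNode host (paths assume HDP defaults)
ls -lt /var/log/hadoop/hdfs/ | head
tail -n 200 /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log

The matching .out file in the same directory captures stdout/stderr from the last start and is worth checking too.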
... View more
Labels:
- Apache Hadoop
06-28-2018
11:31 AM
sudo netstat -tnlpa | grep 85623
tcp6       0      0 :::9000      :::*      LISTEN      85623/./portainer
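Reading the output above: port 9000 is held by a Portainer process (PID 85623), not by a Hadoop service. If the goal is to free the port, a sketch, assuming that Portainer instance is safe to stop:

sudo kill 85623                    # PID taken from the netstat output above
sudo netstat -tnlp | grep 9000     # verify nothing is listening any more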
... View more
06-28-2018
11:23 AM
ps -ef | grep 9000 output:
~]$ ps -ef | grep 9000
myad+ 79718 76644  0 11:16 pts/0 00:00:00 grep --color=auto 9000
root  85623     1  0 Jun27 ?     00:00:03 ./portainer -p :9000 --data /home/myadmin/portainer_data
... View more