Created 10-25-2018 04:36 AM
After a fresh installation I got the following critical error alert in YARN:
Title: ATSv2 HBase Application
Response:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/alerts/alert_ats_hbase.py", line 183, in execute
    ats_hbase_app_info = make_valid_json(output)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/alerts/alert_ats_hbase.py", line 226, in make_valid_json
    raise Fail("Couldn't validate the received output for JSON parsing.")
Fail: Couldn't validate the received output for JSON parsing.
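From the traceback, the alert script is trying to parse the output of an ats-hbase status check as JSON and failing. Running the status check by hand as the ATS service user shows what the script is likely receiving. A minimal sketch, assuming a kerberized cluster and the default yarn-ats user; the keytab path and principal below are placeholders, not values from this thread:
$ su - yarn-ats
$ kinit -kt /etc/security/keytabs/yarn-ats.hbase-client.headless.keytab yarn-ats-hbase@EXAMPLE.COM
$ /usr/hdp/current/hadoop-yarn-client/bin/yarn app -status ats-hbase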
Created 06-03-2019 01:38 AM
HDP 3.1, Ambari 2.7, Debian 9
2019-05-31 07:36:53,863 INFO zookeeper.ReadOnlyZKClient (ReadOnlyZKClient.java:run(315)) - 0x4d5650ae no activities for 60000 ms, close active connection. Will reconnect next time when there are new requests.
2019-05-31 07:37:53,542 INFO storage.HBaseTimelineReaderImpl (HBaseTimelineReaderImpl.java:run(170)) - Running HBase liveness monitor
2019-05-31 07:37:53,544 WARN storage.HBaseTimelineReaderImpl (HBaseTimelineReaderImpl.java:run(183)) - Got failure attempting to read from timeline storage, assuming HBase down
java.io.UncheckedIOException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location for replica 0
at org.apache.hadoop.hbase.client.ResultScanner$1.hasNext(ResultScanner.java:55)
at org.apache.hadoop.yarn.server.timelineservice.storage.reader.TimelineEntityReader.readEntities(TimelineEntityReader.java:283)
at org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl$HBaseMonitor.run(HBaseTimelineReaderImpl.java:174)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location for replica 0
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:332)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:153)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:269)
at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:437)
at org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:312)
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:597)
at org.apache.hadoop.hbase.client.ResultScanner$1.hasNext(ResultScanner.java:53)
... 9 more
Caused by: java.io.IOException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /atsv2-hbase-secure/meta-region-server
at org.apache.hadoop.hbase.client.ConnectionImplementation.get(ConnectionImplementation.java:2002)
at org.apache.hadoop.hbase.client.ConnectionImplementation.locateMeta(ConnectionImplementation.java:762)
at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:729)
at org.apache.hadoop.hbase.client.ConnectionImplementation.relocateRegion(ConnectionImplementation.java:707)
at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:911)
at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:732)
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:325)
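The NoNodeException at the bottom means the /atsv2-hbase-secure/meta-region-server znode simply doesn't exist in ZooKeeper, i.e. the embedded ats-hbase master never got far enough to register its meta region. A quick way to confirm is to look at the znode directly; a rough sketch, assuming the stock HDP client location (replace the ZooKeeper host with one of yours):
$ /usr/hdp/current/zookeeper-client/bin/zkCli.sh -server zk-host.example.com:2181
ls /atsv2-hbase-secure
ls /atsv2-hbase-secure/meta-region-server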
Created 06-03-2019 10:27 AM
Create a separate queue for ats-hbase. ats-hbase can't work properly because it isn't getting enough resources in its current queue.
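A quick way to see whether the queue ats-hbase landed in actually has capacity is to check it from the CLI; a rough sketch, assuming it is sitting in the default queue (a dedicated queue, e.g. one named yarn-system, is often recommended for it, but the queue names here are just examples):
$ /usr/hdp/current/hadoop-yarn-client/bin/yarn queue -status default
$ /usr/hdp/current/hadoop-yarn-client/bin/yarn application -list -appStates ACCEPTED,RUNNING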
Created 11-01-2020 10:26 PM
Hi all,
I'm not sure whether this issue is considered solved, but in case it helps, here is how we resolved it.
We ran into the same error after removing several nodes from our kerberized cluster (Ambari 2.7.4 and HDP 3.1.4).
$ /usr/hdp/current/hadoop-yarn-client/bin/yarn app -status ats-hbase
20/11/02 07:04:39 INFO client.AHSProxy: Connecting to Application History server at XXXXX/YYY.YYY.YY.YY:10200
20/11/02 07:04:39 INFO client.AHSProxy: Connecting to Application History server at XXXXX/YYY.YYY.YY.YY:10200
ats-hbase Failed : HTTP error code : 500
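The CLI status call goes through the YARN services REST API, so on a 500 it can help to hit the endpoint directly and look at the error body; a rough sketch, assuming SPNEGO auth and the default ResourceManager web port (host and port are placeholders):
$ curl --negotiate -u : -s "http://rm-host.example.com:8088/app/v1/services/ats-hbase"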
Following this thread, we carefully checked the YARN configuration to make sure all the resource settings were correctly sized for the remaining nodes.
After that, we destroyed the YARN app:
$ yarn app -destroy ats-hbase
20/11/02 07:06:13 INFO client.AHSProxy: Connecting to Application History server at XXXXX/YYY.YYY.YY.YY:10200
20/11/02 07:06:13 INFO client.AHSProxy: Connecting to Application History server at XXXXX/YYY.YYY.YY.YY:10200
20/11/02 07:06:14 INFO client.ApiServiceClient: Successfully destroyed service ats-hbase
$ /usr/hdp/current/hadoop-yarn-client/bin/yarn app -status ats-hbase
20/11/02 07:06:19 INFO client.AHSProxy: Connecting to Application History server at XXXXX/YYY.YYY.YY.YY:10200
20/11/02 07:06:19 INFO client.AHSProxy: Connecting to Application History server at XXXXX/YYY.YYY.YY.YY:10200
Service ats-hbase not found
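If destroying the service alone doesn't clear things up, stale state can also linger in HDFS and ZooKeeper. The paths below are only illustrative; they assume the default yarn-ats user home and the secure znode seen in the logs earlier in this thread, so double-check them against your cluster before deleting anything:
$ hdfs dfs -ls /user/yarn-ats/.yarn/services/ats-hbase
$ hdfs dfs -rm -R -skipTrash /user/yarn-ats/.yarn/services/ats-hbase
$ /usr/hdp/current/zookeeper-client/bin/zkCli.sh -server zk-host.example.com:2181
rmr /atsv2-hbase-secure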
Then we restarted the whole YARN service from Ambari, and now everything is running fine:
$ /usr/hdp/current/hadoop-yarn-client/bin/yarn app -status ats-hbase
20/11/02 07:09:02 INFO client.AHSProxy: Connecting to Application History server at XXXXX/YYY.YYY.YY.YY:10200
20/11/02 07:09:02 INFO client.AHSProxy: Connecting to Application History server at XXXXX/YYY.YYY.YY.YY:10200
{"name":"ats-hbase","id":"application_1604297264331_0001","artifact":{"id":"/hdp/apps/3.1.4.0-315/hbase/rm2/hbase.tar.gz","type":"TARBALL"},"lifetime":-1,"components":[{"name":"master","dependencies":[],"artifact":{"id":"/hdp/apps/3.1.4.0-315/hbase/rm2/hbase.tar.gz","type":"TARBALL"},"resource":{"cpus":1,"memory":"4096","additional":{}},"state":"STABLE","configuration":{"properties":{"yarn.service.container-failure.retry.max":"10","yarn.service.framework.path":"/hdp/apps/3.1.4.0-315/yarn/rm2/service-dep.tar.gz"},"env":{"HBASE_LOG_PREFIX":"hbase-$HBASE_IDENT_STRING-master-$HOSTNAME","HBASE_LOGFILE":"$HBASE_LOG_PREFIX.log","HBASE_MASTER_OPTS":"-Xms3276m -Xmx3276m -Djava.security.auth.login.config=/usr/hdp/3.1.4.0-315/hadoop/conf/embedded-yarn-ats-hbase/yarn_hbase_master_jaas.conf",
[...]
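As a side note, the status output comes back as one long line of JSON, so piping it through a formatter makes it much easier to read; a minimal sketch, assuming Python is available on the client host:
$ /usr/hdp/current/hadoop-yarn-client/bin/yarn app -status ats-hbase | tail -n 1 | python -m json.tool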