Created 11-01-2018 10:57 AM
Hello,
I installed a new (not an update) HDP 3.0.1 and seem to have many issues with the timeline server.
1) The first odd thing is that the YARN tab in Ambari keeps showing this error:
ATSv2 HBase Application The HBase application reported a 'STARTED' state. Check took 2.125s
2) The second issue seems to be with Oozie. When I run a job, it starts but then stalls, with the following log line repeated hundreds of times:
2018-11-01 11:15:37,842 INFO [Thread-82] org.apache.hadoop.yarn.event.AsyncDispatcher: Waiting for AsyncDispatcher to drain. Thread state is :WAITING
Then it fails with:
2018-11-01 11:15:37,888 ERROR [Job ATS Event Dispatcher] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Exception while publishing configs on JOB_SUBMITTED Event for the job : job_1541066376053_0066
org.apache.hadoop.yarn.exceptions.YarnException: Failed while publishing entity
    at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$TimelineEntityDispatcher.dispatchEntities(TimelineV2ClientImpl.java:548)
    at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.putEntities(TimelineV2ClientImpl.java:149)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.publishConfigsOnJobSubmittedEvent(JobHistoryEventHandler.java:1254)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.processEventForNewTimelineService(JobHistoryEventHandler.java:1414)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleTimelineEvent(JobHistoryEventHandler.java:742)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.access$1200(JobHistoryEventHandler.java:93)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$ForwardingEventHandler.handle(JobHistoryEventHandler.java:1795)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$ForwardingEventHandler.handle(JobHistoryEventHandler.java:1791)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
3) In hadoop-yarn-timelineserver-${hostname}.log
I see, repeated many times:
2018-11-01 11:32:47,715 WARN timeline.EntityGroupFSTimelineStore (LogInfo.java:doParse(208)) - Error putting entity: dag_1541066376053_0144_2 (TEZ_DAG_ID): 6
4) In hadoop-yarn-timelinereader-${hostname}.log
I see, repeated many times:
Thu Nov 01 11:34:10 CET 2018, RpcRetryingCaller{globalStartTime=1541068444076, pause=1000, maxAttempts=4}, java.net.ConnectException: Call to /192.168.x.x:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /192.168.x.x:17020
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:145)
    at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
    ... 3 more
Caused by: java.net.ConnectException: Call to /192.168.x.x:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /192.168.x.x:17020
    at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:165)
Indeed, nothing is listening on port 17020 on 192.168.x.x.
5) I cannot find a process named ats-hbase on any server. This might be the cause of everything else.
The queue is set to yarn_hbase_system_service_queue_name=default, and that queue has no limit that would prevent HBase from starting.
I am sure that something is very wrong here, and any help would be appreciated.
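For anyone who wants to reproduce the check from point 4 (whether anything accepts connections on port 17020), here is a minimal sketch in Python. The function name `is_port_open` and the host value are illustrative assumptions and are not part of HDP or Ambari; only the port number 17020 comes from the log above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError, timeout)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: replace the host with the address shown in your timeline reader log.
    host, port = "127.0.0.1", 17020
    state = "open" if is_port_open(host, port) else "closed or unreachable"
    print(f"{host}:{port} is {state}")
```

A closed result here would be consistent with the "Connection refused" errors in the timeline reader log, but it does not by itself say why the ATS HBase instance is not running.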
Created 01-11-2019 05:26 AM
In addition to these steps:
Created 01-11-2019 06:26 PM
Let me clarify: the YARN service does start and everything seems to be working, but this critical alert keeps appearing in the Ambari UI.
Created on 12-25-2019 05:33 PM - edited 12-25-2019 05:35 PM
Thanks @guillaume_roger, that worked for me.
Created 01-11-2019 07:11 PM
My final solution was to install HBase and use the real HBase as storage for both ATS and Ambari Metrics. The error cleared.
Created 01-11-2019 10:44 PM
Thanks. Based on your feedback, I followed the instructions on this page to use the main HBase service, and the alert is gone.
Created 06-10-2020 06:20 PM
Excuse me, how did you solve it?