Cannot start YARN Timeline Service V2.0 Reader

Hello,

Recently my YARN Timeline Service Reader stopped starting. It fails with the following error:

2018-12-08 12:59:18,852 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=6, started=4859 ms ago, cancelled=false, msg=Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=examples.foodscience-01.de,17020,1543619998977, seqNum=-1
2018-12-08 12:59:22,895 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=7, retries=7, started=8902 ms ago, cancelled=false, msg=Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=examples.foodscience-01.de,17020,1543619998977, seqNum=-1
2018-12-08 12:59:32,955 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=8, retries=8, started=18962 ms ago, cancelled=false, msg=Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=examples.foodscience-01.de,17020,1543619998977, seqNum=-1
2018-12-08 12:59:42,965 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=9, retries=9, started=28972 ms ago, cancelled=false, msg=Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=examples.foodscience-01.de,17020,1543619998977, seqNum=-1
2018-12-08 12:59:53,064 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=10, retries=10, started=39071 ms ago, cancelled=false, msg=Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=examples.foodscience-01.de,17020,1543619998977, seqNum=-1
2018-12-08 13:00:03,101 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=11, retries=11, started=49108 ms ago, cancelled=false, msg=Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.th

It seems that HBase has a problem, although I am not using the HBase service in Ambari.
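
To narrow this down, a quick check like the sketch below can confirm whether anything is listening on the host and port named in the log. `port_is_open` is a helper name made up for this illustration, and the host and port are simply the values the log prints. It only tests TCP reachability and says nothing about whether HBase itself is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration only. A refused connection,
    a timeout, or a DNS failure all count as "not open".
    """
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError, socket.timeout and socket.gaierror
        # are all subclasses of OSError.
        return False
```

Calling `port_is_open("examples.foodscience-01.de", 17020)` from the Timeline Reader host should return `False` as long as the "Connection refused" condition from the log persists.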

Then I checked the following log file, hadoop-yarn-timelinereader-foodscience-01.log:

Caused by: java.net.ConnectException: Call to examples.foodscience-01.de/163.49.39.115:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020
    at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:165)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:95)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:410)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:406)
    at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:103)
    at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:118)
    at org.apache.hadoop.hbase.ipc.BufferCallBeforeInitHandler.userEventTriggered(BufferCallBeforeInitHandler.java:92)
    at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:329)
    at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:315)
    at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:307)
    at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.userEventTriggered(DefaultChannelPipeline.java:1377)
    at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:329)
    at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:315)
    at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireUserEventTriggered(DefaultChannelPipeline.java:929)
    at org.apache.hadoop.hbase.ipc.NettyRpcConnection.failInit(NettyRpcConnection.java:179)
    at org.apache.hadoop.hbase.ipc.NettyRpcConnection.access$500(NettyRpcConnection.java:71)
    at org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:269)
    at org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:263)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:327)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:343)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    ... 1 more
Caused by: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: examples.foodscience-01.de/163.49.39.115:17020
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hbase.thirdparty.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:323)
    at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340)
    ... 7 more
Caused by: java.net.ConnectException: Connection refused
    ... 11 more
2018-12-06 13:03:33,051 INFO  zookeeper.ReadOnlyZKClient (ReadOnlyZKClient.java:run(315)) - 0x4d465b11 no activities for 60000 ms, close active connection. Will reconnect next time when there are new requests.
2018-12-06 13:03:57,614 INFO  storage.HBaseTimelineReaderImpl (HBaseTimelineReaderImpl.java:run(170)) - Running HBase liveness monitor
2018-12-06 13:04:24,100 ERROR reader.TimelineReaderServer (LogAdapter.java:error(75)) - RECEIVED SIGNAL 15: SIGTERM
2018-12-06 13:04:24,116 INFO  handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.w.WebAppContext@12299890{/,null,UNAVAILABLE}{/timeline}
2018-12-06 13:04:24,125 INFO  server.AbstractConnector (AbstractConnector.java:doStop(318)) - Stopped ServerConnector@328af33d{HTTP/1.1,[http/1.1]}{0.0.0.0:8198}
2018-12-06 13:04:24,128 INFO  handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.s.ServletContextHandler@7d3e8655{/static,jar:file:/usr/hdp/3.0.0.0-1634/hadoop-yarn/hadoop-yarn-common-3.1.0.3.0.0.0-1634.jar!/webapps/static,UNAVAILABLE}
2018-12-06 13:04:24,128 INFO  handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.s.ServletContextHandler@7dfd3c81{/logs,file:///var/log/hadoop-yarn/yarn/,UNAVAILABLE}
2018-12-06 13:04:24,142 INFO  storage.HBaseTimelineReaderImpl (HBaseTimelineReaderImpl.java:serviceStop(108)) - closing the hbase Connection
2018-12-06 13:04:24,143 INFO  zookeeper.ReadOnlyZKClient (ReadOnlyZKClient.java:close(342)) - Close zookeeper connection 0x4d465b11 to examples.foodscience-01.de:2181,examples.foodscience-02.de:2181,examples.foodscience-03.de:2181
2018-12-06 13:04:24,143 WARN  storage.HBaseTimelineReaderImpl (HBaseTimelineReaderImpl.java:run(183)) - Got failure attempting to read from timeline storage, assuming HBase down
java.io.UncheckedIOException: java.io.InterruptedIOException
    at org.apache.hadoop.hbase.client.ResultScanner$1.hasNext(ResultScanner.java:55)
    at org.apache.hadoop.yarn.server.timelineservice.storage.reader.TimelineEntityReader.readEntities(TimelineEntityReader.java:283)
    at org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl$HBaseMonitor.run(HBaseTimelineReaderImpl.java:174)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.InterruptedIOException
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:246)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:269)
    at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:437)
    at org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:312)
    at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:597)
    at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:834)
    at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:732)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:325)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:153)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:269)
    at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:437)
    at org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:312)
    at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:597)
    at org.apache.hadoop.hbase.client.ResultScanner$1.hasNext(ResultScanner.java:53)
    ... 9 more
2018-12-06 13:04:24,153 INFO  zookeeper.ReadOnlyZKClient (ReadOnlyZKClient.java:close(342)) - Close zookeeper connection 0x5b7a5baa to examples.foodscience-01.de:2181,examples.foodscience-02.de:2181,examples.foodscience-03.de:2181
2018-12-06 13:04:24,155 INFO  reader.TimelineReaderServer (LogAdapter.java:info(51)) - SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down TimelineReaderServer at examples.foodscience-01.de/163.49.39.115

I don't know why this error appears when starting the Timeline Service. How can it be fixed?
