Timeline Service V2.0 Reader not starting

Contributor

Everything in my HDP is working except Timeline Service V2.0 Reader. When I try to start it I get the following error log:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 108, in <module>
    ApplicationTimelineReader().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 51, in start
    hbase(action='start')
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/hbase_service.py", line 80, in hbase
    createTables()
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/hbase_service.py", line 147, in createTables
    logoutput=True)
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
    returns=self.resource.returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 308, in _call
    raise ExecuteTimeoutException(err_msg)
resource_management.core.exceptions.ExecuteTimeoutException: Execution of 'ambari-sudo.sh su yarn-ats -l -s /bin/bash -c 'export  PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/texlive/2016/bin/x86_64-linux:/usr/lib64/qt-3.3/bin:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/maven/bin:/root/bin:/opt/maven/bin:/opt/maven/bin:/var/lib/ambari-agent'"'"' ; sleep 10;export HBASE_CLASSPATH_PREFIX=/usr/hdp/3.0.0.0-1634/hadoop-yarn/timelineservice/*; /usr/hdp/3.0.0.0-1634/hbase/bin/hbase --config /usr/hdp/3.0.0.0-1634/hadoop/conf/embedded-yarn-ats-hbase org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator -Dhbase.client.retries.number=35 -create -s'' was killed due timeout after 300 seconds
1 ACCEPTED SOLUTION

Hi @Daniel Zafar,

Can you please go to the host where Timeline Reader is installed, install the HBase Client on that host, and let me know the result?

Looking at the code base (the missing file is in hadoop-yarn-server-timelineservice-hbase-client), I have a strong feeling this can fix the issue.

Code reference : https://github.com/hortonworks/hadoop-release/blob/HDP-3.0.0.0-1634-tag/hadoop-yarn-project/hadoop-y...
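A quick way to check whether a jar from that module is visible on the host is sketched below. It is only an illustration: the jar name pattern is an assumption derived from the module name above, and the directory is the one from HBASE_CLASSPATH_PREFIX in the error message. The helper name `find_hbase_client_jars` is hypothetical.

```python
import glob
import os


def find_hbase_client_jars(timelineservice_dir):
    """Return sorted jar paths whose name suggests the ATSv2 HBase client module."""
    # Assumed naming pattern, derived from the module name
    # hadoop-yarn-server-timelineservice-hbase-client; adjust if your
    # distribution names the jar differently.
    pattern = os.path.join(
        timelineservice_dir,
        "hadoop-yarn-server-timelineservice-hbase-client*.jar",
    )
    return sorted(glob.glob(pattern))
```

On the affected host you could call `find_hbase_client_jars("/usr/hdp/3.0.0.0-1634/hadoop-yarn/timelineservice")`; an empty list would suggest the client jar is not where the schema creator looks for it.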

Please log in and accept the answer if you find it helpful 🙂

18 REPLIES

Contributor

@Akhil S Naik That worked! Nice job!

New Contributor

Hello @Akhil S Naik, @Daniel Zafar

I am running on HDP-3.0.1 and Yarn 3.1.0 and ran into the same issue which says resource_management.core.exceptions.ExecuteTimeoutException: Execution of 'ambari-sudo.sh su yarn-ats -l -s /bin/bash -c 'export PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/var/lib/ambari-agent:/var/lib/ambari-agent'"'"' ; sleep 10;export HBASE_CLASSPATH_PREFIX=/usr/hdp/3.0.1.0-187/hadoop-yarn/timelineservice/*; /usr/hdp/3.0.1.0-187/hbase/bin/hbase --config /usr/hdp/3.0.1.0-187/hadoop/conf/embedded-yarn-ats-hbase org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator -Dhbase.client.retries.number=35 -create -s'' was killed due timeout after 300 seconds

Also, can you please specify how to install the HBase client here? Should it be installed as part of the HBase service through Ambari, or just the HBase client using the CLI? Any guidance would be appreciated.

Expert Contributor

I was running into this exact same problem. Here is how I installed the HBase Client via the Ambari UI:

1. In the Ambari UI, go to Hosts, then click the host you want to install the HBase Client component on.

2. In the list of components, you will have the option to add more:

[Screenshot: 110111-1564178193294.png]

3. From here I installed the HBase Client.

4. Then I stopped and restarted the cluster via the Ambari UI. (I got a notification of stale configs, though I am not sure whether that was my problem all along.)
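
For anyone who prefers to script the install instead of clicking through the UI, the steps above can be sketched against the Ambari REST API. This is a minimal sketch, not a tested procedure for this cluster: the base URL, cluster name, and host name are placeholders, the helper names are hypothetical, and it only builds the requests so you can review them before sending.

```python
import json
import urllib.request


def plan_component_install(base_url, cluster, host, component="HBASE_CLIENT"):
    """Build the two REST calls that add a component to a host and install it."""
    url = (
        f"{base_url}/api/v1/clusters/{cluster}"
        f"/hosts/{host}/host_components/{component}"
    )
    install_body = {
        "RequestInfo": {"context": f"Install {component}"},
        "Body": {"HostRoles": {"state": "INSTALLED"}},
    }
    return [
        # Step 1: register the component on the host.
        {"method": "POST", "url": url, "body": None},
        # Step 2: ask Ambari to move it to the INSTALLED state.
        {"method": "PUT", "url": url, "body": install_body},
    ]


def to_request(step, auth_header):
    """Turn one planned step into a urllib Request (not sent here)."""
    data = None
    if step["body"] is not None:
        data = json.dumps(step["body"]).encode("utf-8")
    headers = {
        # Ambari rejects modifying calls that lack this header.
        "X-Requested-By": "ambari",
        "Authorization": auth_header,
    }
    return urllib.request.Request(
        step["url"], data=data, headers=headers, method=step["method"]
    )
```

Each request can then be sent with `urllib.request.urlopen(to_request(step, auth_header))`. Restarting the affected services afterwards, as in step 4, still has to be done separately.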


One odd thing: I did not change any configs or install anything new on the host nodes between the last restart and hitting this error, and until now everything appeared to be working fine. @Akhil S Naik, can you think of any reason why this would only be happening now?

Contributor

I do not have any class-not-found issues myself, but I do see:

client.ConnectionImplementation: Retrieve cluster id failed

It may be related, so I am posting the log:

2018-10-19 22:48:33,152 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01] zookeeper.ZooKeeper: Client environment:java.library.path=:/usr/hdp/3.0.1.0-187/hadoop/lib/native/Linux-amd64-64:/usr/hdp/3.0.1.0-187/hadoop/lib/native
2018-10-19 22:48:33,152 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01] zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
2018-10-19 22:48:33,152 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01] zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2018-10-19 22:48:33,152 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01] zookeeper.ZooKeeper: Client environment:os.name=Linux
2018-10-19 22:48:33,152 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01] zookeeper.ZooKeeper: Client environment:os.arch=amd64
2018-10-19 22:48:33,152 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01] zookeeper.ZooKeeper: Client environment:os.version=3.10.0-514.21.1.el7.x86_64
2018-10-19 22:48:33,152 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01] zookeeper.ZooKeeper: Client environment:user.name=yarn-ats
2018-10-19 22:48:33,152 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01] zookeeper.ZooKeeper: Client environment:user.home=/home/yarn-ats
2018-10-19 22:48:33,152 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01] zookeeper.ZooKeeper: Client environment:user.dir=/home/yarn-ats
2018-10-19 22:48:33,154 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01] zookeeper.ZooKeeper: Initiating client connection, connectString=emltgh01.emtst.lpemrz.com:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$13/966280619@62068d2d
2018-10-19 22:48:33,171 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01-SendThread(emltgh01.emtst.lpemrz.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server emltgh01.emtst.lpemrz.com/10.10.13.100:2181. Will not attempt to authenticate using SASL (unknown error)
2018-10-19 22:48:33,174 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01-SendThread(emltgh01.emtst.lpemrz.com:2181)] zookeeper.ClientCnxn: Socket connection established, initiating session, client: /10.10.13.100:38167, server: emltgh01.emtst.lpemrz.com/10.10.13.100:2181
2018-10-19 22:48:33,179 INFO  [ReadOnlyZKClient-emltgh01.emtst.lpemrz.com:2181@0x38102d01-SendThread(emltgh01.emtst.lpemrz.com:2181)] zookeeper.ClientCnxn: Session establishment complete on server emltgh01.emtst.lpemrz.com/10.10.13.100:2181, sessionid = 0x1668d9e70b70055, negotiated timeout = 40000
2018-10-19 22:48:33,188 WARN  [main] client.ConnectionImplementation: Retrieve cluster id failed
java.util.concurrent.ExecutionException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /atsv2-hbase-unsecure/hbaseid
	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
	at org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:527)
	at org.apache.hadoop.hbase.client.ConnectionImplementation.<init>(ConnectionImplementation.java:287)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:219)
	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:114)
	at org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator.createAllTables(TimelineSchemaCreator.java:301)
	at org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator.createAllSchemas(TimelineSchemaCreator.java:277)
	at org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator.main(TimelineSchemaCreator.java:146)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /atsv2-hbase-unsecure/hbaseid
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$ZKTask$1.exec(ReadOnlyZKClient.java:168)
	at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:323)
	at java.lang.Thread.run(Thread.java:745)
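
The NoNode error in the trace above says that the znode /atsv2-hbase-unsecure/hbaseid does not exist in ZooKeeper. When the same log is long, a small helper can pull out every znode reported missing. This is only a sketch for reading logs; the function name `missing_znodes` is hypothetical and the pattern is based on the message format shown above.

```python
import re

# Matches the ZooKeeper message format seen in the log above.
NO_NODE_RE = re.compile(r"KeeperErrorCode = NoNode for (\S+)")


def missing_znodes(log_text):
    """Return the distinct znode paths reported as NoNode, in first-seen order."""
    seen = []
    for path in NO_NODE_RE.findall(log_text):
        if path not in seen:
            seen.append(path)
    return seen
```

Passing the log text above to `missing_znodes` would list the hbaseid path once, even though the exception and its cause both mention it.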

Rising Star

@Marek Martofel Following the steps in the link below fixed this issue:

Enable System Service Mode On an Upgraded Cluster

Contributor
Thanks CIBI, it worked for me.

Contributor

@ccibi75 Thanks for the solution to the Timeline Service V2.0 start issue in HDP 3.x. It worked!

Contributor

Thanks, ccibi75

This worked for me.

Contributor

Many thanks, Cibi! It works now. Indeed, the yarn-system queue had 0% capacity and is_hbase_system_service_launch was false.
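
For others checking the same two settings, here is a minimal sketch that flags them from exported configuration dictionaries. Several things are assumptions: that the queue sits directly under root (so the property is yarn.scheduler.capacity.root.yarn-system.capacity), that is_hbase_system_service_launch lives in the yarn-hbase-env config, and the helper name `check_ats_hbase_settings` itself.

```python
def check_ats_hbase_settings(capacity_scheduler, yarn_hbase_env,
                             queue_path="root.yarn-system"):
    """Return a list of problems found in the two settings mentioned above."""
    problems = []

    # Assumed property name: yarn.scheduler.capacity.<queue-path>.capacity
    key = f"yarn.scheduler.capacity.{queue_path}.capacity"
    raw = capacity_scheduler.get(key)
    try:
        capacity = float(raw)
    except (TypeError, ValueError):
        capacity = None

    if capacity is None:
        problems.append(f"{key} is not set or not numeric")
    elif capacity <= 0:
        problems.append(f"{key} is {raw}; the queue has no capacity")

    # Assumed location of the flag: the yarn-hbase-env config type.
    launch = str(yarn_hbase_env.get("is_hbase_system_service_launch", "")).lower()
    if launch != "true":
        problems.append("is_hbase_system_service_launch is not true")

    return problems
```

An empty list means neither of the two conditions described in this thread was detected; it does not prove the rest of the system service setup is correct.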