Created 08-14-2018 07:06 PM
Hi,
I am using HDP 3.0 and an Ambari 2.7 blueprint to install my cluster. YARN's Timeline Service v2 reader cannot create its ZooKeeper node:
at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$ZKTask$1.exec(ReadOnlyZKClient.java:168)
at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:323)
at java.lang.Thread.run(Thread.java:745)
2018-08-14 17:59:55,827 INFO [main] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=6, started=4142 ms ago, cancelled=false, msg=org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /atsv2-hbase-unsecure/meta-region-server, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at null
I observed that YARN is using Timeline Service 1.5 together with the Timeline Service v2 reader. I am not sure if this is expected. My blueprint uses:
{ "name": "APP_TIMELINE_SERVER" },
Adding the HBase client as suggested in this post did not help. Any ideas? Thanks.
Created 08-15-2018 12:26 AM
Also, please find the following properties in your blueprint file and change their values to a path that has the proper permissions, especially inside "yarn-hbase-env":
hbase_java_io_tmpdir
yarn_hbase_java_io_tmpdir
Created 08-14-2018 11:47 PM
Looks like this is also caused by noexec /tmp like this post:
Suppressed: java.lang.UnsatisfiedLinkError: /tmp/liborg_apache_hbase_thirdparty_netty_transport_native_epoll_x86_644869269367588546881.so: /tmp/liborg_apache_hbase_thirdparty_netty_transport_native_epoll_x86_644869269367588546881.so: failed to map segment from shared object: Operation not permitted
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
at java.lang.Runtime.load0(Runtime.java:809)
at java.lang.System.load(System.java:1086)
at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
... 27 more
Not sure what setting can be used to make YARN not use Netty.
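To confirm whether a directory really sits on a noexec mount, the mount table can be inspected. The following is a minimal sketch, assuming a Linux host where /proc/mounts is readable; the helper names `mount_options` and `is_noexec` are made up for illustration and are not part of HDP or Ambari.

```python
import os


def mount_options(path, mounts_file="/proc/mounts"):
    """Return the mount options of the filesystem that contains `path`."""
    path = os.path.realpath(path)
    best_mount, best_opts = "", []
    with open(mounts_file) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) < 4:
                continue  # skip malformed lines
            mount_point, opts = fields[1], fields[3].split(",")
            prefix = mount_point.rstrip("/") + "/"
            # The longest mount point that contains `path` wins; on ties the
            # later entry wins, since it is mounted on top of the earlier one.
            if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
                best_mount, best_opts = mount_point, opts
    return best_opts


def is_noexec(path, mounts_file="/proc/mounts"):
    """True if `path` lives on a filesystem mounted with the noexec option."""
    return "noexec" in mount_options(path, mounts_file)
```

Calling `is_noexec("/tmp")` on the affected host should return True if the UnsatisfiedLinkError above is indeed caused by the mount options, and a candidate replacement directory can be checked the same way before it is put into the configuration.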
Created 08-15-2018 12:09 AM
In your blueprint you have added "APP_TIMELINE_SERVER", so can you please check whether you have also added "TIMELINE_READER", as follows?
{ "name" : "APP_TIMELINE_SERVER" },
{ "name" : "TIMELINE_READER" },
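For larger blueprints this check can be scripted. Here is a small sketch, assuming the usual blueprint layout with a top-level "host_groups" list whose entries each carry a "components" list; the function name `missing_components` and the sample blueprint are illustrative only.

```python
def missing_components(blueprint,
                       required=("APP_TIMELINE_SERVER", "TIMELINE_READER")):
    """Return the required component names that no host group declares."""
    present = {
        comp.get("name")
        for group in blueprint.get("host_groups", [])
        for comp in group.get("components", [])
    }
    return [name for name in required if name not in present]


# Hypothetical blueprint fragment that declares only one of the two components.
sample_blueprint = {
    "host_groups": [
        {
            "name": "master",
            "components": [{"name": "APP_TIMELINE_SERVER"}],
        }
    ]
}
```

With a real blueprint, load the file with `json.load` and pass the resulting dict to `missing_components`; an empty list means both components are declared somewhere in the blueprint.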
Created 09-26-2018 09:43 PM
I also hit the same issue, but modifying hbase_java_io_tmpdir did not work for me...
Any other suggestions?
Created 08-15-2018 05:10 PM
Thanks @Jay Kumar SenSharma
This setting worked:
"yarn-hbase-env": {
  "properties": {
    "hbase_java_io_tmpdir": "/u01/tmp"
  }
}
This setting made the TS reader v2 start. Progress!
However, after some time it stopped again due to the issue described in this post. The reason is that the YARN HBase region server cannot start (though the YARN HBase master starts). I guess it is still related to the region server using /tmp.
In the Ambari UI, I cannot find a place to set yarn_hbase_java_io_tmpdir. Any ideas?
Created 08-15-2018 11:15 PM
Good to know that with the help of "hbase_java_io_tmpdir" your issue is partially solved.
Regarding the "yarn_hbase_java_io_tmpdir" property, you can find it inside "Advanced yarn-hbase-env", as follows:
Ambari UI --> Yarn --> Configs --> Advanced --> "Advanced yarn-hbase-env" --> "hbase-env template"
export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC -XX:ErrorFile=$HBASE_LOG_DIR/hs_err_pid%p.log -Djava.io.tmpdir={{yarn_hbase_java_io_tmpdir}}"
So just replace {{yarn_hbase_java_io_tmpdir}} with your desired value. The default value of "yarn_hbase_java_io_tmpdir" is calculated as follows:
# grep 'yarn_hbase_java_io_tmpdir' /var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/params_linux.py
yarn_hbase_java_io_tmpdir = default("/configurations/yarn-hbase-env/hbase_java_io_tmpdir", "/tmp")
Created 08-16-2018 03:49 AM
It turned out that hbase_java_io_tmpdir is enough. The HBase region server failed to start because our security settings disabled the yarn-ats user, whose UID was outside the allowed range. After creating this user with a UID in the desired range, the TS reader v2 worked.
Created 08-16-2018 04:20 AM
Correct, we do not need to worry about the "yarn_hbase_java_io_tmpdir" property, as it is controlled by the "hbase_java_io_tmpdir" property. If we do not set hbase_java_io_tmpdir ourselves, the default value for both properties will be "/tmp":
yarn_hbase_java_io_tmpdir = default("/configurations/yarn-hbase-env/hbase_java_io_tmpdir", "/tmp")
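The fallback behaviour of that line can be illustrated with a simplified stand-in. This is only a sketch of the lookup-with-fallback idea, not Ambari's actual `default` implementation: the real helper reads the command configuration implicitly, whereas this hypothetical version takes the configuration dict as an explicit first argument.

```python
def default(config, path, fallback):
    """Walk a '/'-separated path through nested dicts; return `fallback`
    as soon as any key along the way is missing."""
    node = config
    for key in path.strip("/").split("/"):
        if not isinstance(node, dict) or key not in node:
            return fallback
        node = node[key]
    return node


PATH = "/configurations/yarn-hbase-env/hbase_java_io_tmpdir"

# Property not set in the blueprint: the lookup falls back to "/tmp".
unset_config = {"configurations": {"yarn-hbase-env": {}}}

# Property set as in the working blueprint from this thread.
set_config = {
    "configurations": {
        "yarn-hbase-env": {"hbase_java_io_tmpdir": "/u01/tmp"}
    }
}
```

Here `default(unset_config, PATH, "/tmp")` yields "/tmp" while `default(set_config, PATH, "/tmp")` yields "/u01/tmp", which is why setting hbase_java_io_tmpdir alone is sufficient.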
As your reader seems to be working fine now, it would be great if you can mark this thread as answered, so that other HCC users can learn about these properties and quickly browse the answers.
Created 09-26-2018 09:26 AM
I also hit the same issue, but modifying hbase_java_io_tmpdir did not work for me...
Any other suggestions?
Created 12-04-2018 12:42 AM
Not sure if this is still relevant, but I had a similar issue (NoNode for /atsv2-hbase-secure/master, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at null). I fixed it by changing the 'zookeeper.znode.parent' value in the YARN HBase site configuration to a non-existent ZooKeeper path and then restarting YARN via Ambari.
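Before and after changing it, it can help to confirm which znode parent the timeline reader's HBase client configuration actually contains. Below is a minimal sketch, assuming the standard Hadoop-style XML layout of hbase-site.xml (a `<configuration>` root with `<property>` entries holding `<name>` and `<value>`); the function name and the sample XML are illustrative.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text, name):
    """Return the value of property `name` from Hadoop-style XML, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Hypothetical file content; the znode path matches the one in the error above.
sample_site_xml = """
<configuration>
  <property>
    <name>zookeeper.znode.parent</name>
    <value>/atsv2-hbase-secure</value>
  </property>
</configuration>
"""
```

On a real host, read the contents of the embedded ATS HBase hbase-site.xml that the timeline reader logs at startup and pass them in place of `sample_site_xml`; the returned path should match the znode that actually exists in ZooKeeper.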
Created 02-27-2019 01:41 PM
This works, great workaround.
Created 05-02-2019 11:18 PM
I've tried this and it doesn't seem to work. No matter what I change it to, I get errors in the log indicating that it can't find the znodes, e.g.:
2019-05-01 11:11:28,748 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server myserver.com/xx.xx.xxx.xxx:2181. Will not attempt to authenticate using SASL (unknown error)
2019-05-01 11:11:28,757 INFO zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established, initiating session, client: /10.87.130.196:51436, server:myserver.com/xx.xx.xxx.xxx:2181
2019-05-01 11:11:28,772 INFO zookeeper.ClientCnxn (ClientCnxn.java:onConnected(1279)) - Session establishment complete on server myserver.com/xx.xx.xxx.xxx:2181, sessionid = 0x36a72f6d9090007, negotiated timeout = 60000
2019-05-01 11:11:28,792 WARN client.ConnectionImplementation (ConnectionImplementation.java:retrieveClusterId(528)) - Retrieve cluster id failed
java.util.concurrent.ExecutionException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /atsv2-hbase-unsecure/hbaseid
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:526)
at org.apache.hadoop.hbase.client.ConnectionImplementation.<init>(ConnectionImplementation.java:286)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:219)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:114)
at org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl.serviceInit(HBaseTimelineReaderImpl.java:88)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.hadoop.yarn.server.timelineservice.reader.TimelineReaderServer.serviceInit(TimelineReaderServer.java:92)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.hadoop.yarn.server.timelineservice.reader.TimelineReaderServer.startTimelineReaderServer(TimelineReaderServer.java:233)
at org.apache.hadoop.yarn.server.timelineservice.reader.TimelineReaderServer.main(TimelineReaderServer.java:246)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /atsv2-hbase-unsecure/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$ZKTask$1.exec(ReadOnlyZKClient.java:164)
at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:321)
at java.lang.Thread.run(Thread.java:745)
2019-05-01 11:11:28,982 INFO common.HBaseTimelineStorageUtils (HBaseTimelineStorageUtils.java:getTimelineServiceHBaseConf(65)) - Using hbase configuration at file:///usr/hdp/3.1.0.0-78/hadoop/conf/embedded-yarn-ats-hbase/hbase-site.xml
2019-05-01 11:11:28,984 INFO zookeeper.ReadOnlyZKClient (ReadOnlyZKClient.java:<init>(130)) - Start read only zookeeper connection 0x534a5a98 to myserver.com:2181,myserver2.com:2181,myserver3.com:2181, session timeout 90000 ms, retries 6, retry interval 1000 ms, keep alive 60000 ms