HDP3.0: timeline service V2 reader cannot create zookeeper nodes.

Contributor

Hi,

I am using HDP 3.0 and an Ambari 2.7 blueprint to install my cluster. YARN's Timeline Service v2 reader cannot create its ZooKeeper node:

        at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$ZKTask$1.exec(ReadOnlyZKClient.java:168)
        at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:323)
        at java.lang.Thread.run(Thread.java:745)
2018-08-14 17:59:55,827 INFO [main] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=6, started=4142 ms ago, cancelled=false, msg=org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /atsv2-hbase-unsecure/meta-region-server, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at null

I observed that YARN is running both Timeline Service 1.5 and the Timeline Service v2 reader. I am not sure if this is expected. My blueprint contains:

{
    "name": "APP_TIMELINE_SERVER"
},

Adding the HBase client, as suggested in this post, did not help. Any ideas? Thanks.

1 ACCEPTED SOLUTION

Super Mentor

@Lian Jiang

Please also find the properties below in your blueprint file and change their values to a path that has the proper permissions, especially inside "yarn-hbase-env":

hbase_java_io_tmpdir
yarn_hbase_java_io_tmpdir


13 REPLIES

Contributor

It looks like this is also caused by /tmp being mounted noexec, as in this post:







Suppressed: java.lang.UnsatisfiedLinkError: /tmp/liborg_apache_hbase_thirdparty_netty_transport_native_epoll_x86_644869269367588546881.so: /tmp/liborg_apache_hbase_thirdparty_netty_transport_native_epoll_x86_644869269367588546881.so: failed to map segment from shared object: Operation not permitted
        at java.lang.ClassLoader$NativeLibrary.load(Native Method)
        at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
        at java.lang.Runtime.load0(Runtime.java:809)
        at java.lang.System.load(System.java:1086)
        at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:36)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
        at java.security.AccessController.doPrivileged(Native Method)
        at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
        at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
        ... 27 more

Not sure what setting can be used to make YARN not use Netty.
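
For anyone wanting to confirm that a directory such as /tmp really sits on a noexec mount, here is a minimal Python sketch. It assumes the Linux /proc/mounts format; the function names `mount_options` and `is_noexec` are made up for illustration, and symlinks in the path are not resolved.

```python
def mount_options(path, mounts_text):
    """Return the mount options of the filesystem that contains `path`.

    `mounts_text` is text in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    target = path.rstrip("/") + "/"
    best_mount, best_opts = "", []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, opts = fields[1], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        # The longest matching mount point wins; on ties, later entries
        # override earlier ones because they shadow them.
        if target.startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_opts = mount_point, opts
    return best_opts


def is_noexec(path, mounts_text):
    """True if the filesystem holding `path` is mounted with noexec."""
    return "noexec" in mount_options(path, mounts_text)
```

On a cluster node this could be called as `is_noexec("/tmp", open("/proc/mounts").read())`. A True result would be consistent with the "failed to map segment from shared object" error above.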

Super Mentor

@Lian Jiang

In your blueprint you have added "APP_TIMELINE_SERVER". Can you please check whether you have also added "TIMELINE_READER", as follows?

        {
          "name" : "APP_TIMELINE_SERVER"
        },
        {
          "name" : "TIMELINE_READER"
        },
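
A quick way to check a blueprint for both components is sketched below in Python. It assumes the usual blueprint layout of a top-level "host_groups" list whose entries each carry a "components" list; the helper name `missing_components` is hypothetical.

```python
import json

REQUIRED = {"APP_TIMELINE_SERVER", "TIMELINE_READER"}


def missing_components(blueprint, required=REQUIRED):
    """Return the required component names that no host group declares."""
    declared = {
        component.get("name")
        for group in blueprint.get("host_groups", [])
        for component in group.get("components", [])
    }
    return sorted(set(required) - declared)
```

Loading the blueprint with `json.load` and passing it to `missing_components` should return an empty list when both components are present.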


New Contributor

I hit the same issue, but modifying hbase_java_io_tmpdir does not work for me.

Any other suggestions?

Contributor

Thanks @Jay Kumar SenSharma

This setting worked:

"yarn-hbase-env": {
  "properties": {
    "hbase_java_io_tmpdir": "/u01/tmp"
  }
}

This setting made the Timeline Service v2 reader start. Progress!

However, after some time, it stopped again due to the issue described in this post. The reason is that the YARN HBase region server cannot start (though the YARN HBase master does). I guess it is still related to the region server using /tmp.

In the Ambari UI, I cannot find a place to set yarn_hbase_java_io_tmpdir. Any ideas?

Super Mentor

@Lian Jiang

Good to know that "hbase_java_io_tmpdir" partially solved your issue.

Regarding the "yarn_hbase_java_io_tmpdir" property, you can find it inside "Advanced yarn-hbase-env", as follows:

Ambari UI --> Yarn --> Configs --> Advanced --> "Advanced yarn-hbase-env" --> "hbase-env template"

      export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC -XX:ErrorFile=$HBASE_LOG_DIR/hs_err_pid%p.log -Djava.io.tmpdir={{yarn_hbase_java_io_tmpdir}}"

So just replace {{yarn_hbase_java_io_tmpdir}} with your desired value. The default value of "yarn_hbase_java_io_tmpdir" is calculated as follows:

# grep 'yarn_hbase_java_io_tmpdir' /var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/params_linux.py
yarn_hbase_java_io_tmpdir = default("/configurations/yarn-hbase-env/hbase_java_io_tmpdir", "/tmp")
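
To illustrate how that line resolves, here is a small Python stand-in for the lookup. It is only a sketch: Ambari's real `default` helper reads the command's configuration itself, whereas this hypothetical version takes the configuration dictionary as an explicit third argument.

```python
def default(path, fallback, config):
    """Walk a '/'-separated path through nested dicts.

    Returns the value found at the end of the path, or `fallback`
    if any key along the way is missing.
    """
    node = config
    for key in path.strip("/").split("/"):
        if not isinstance(node, dict) or key not in node:
            return fallback
        node = node[key]
    return node


KEY = "/configurations/yarn-hbase-env/hbase_java_io_tmpdir"

# Property not set in the blueprint: falls back to /tmp.
unset = {"configurations": {"yarn-hbase-env": {}}}
print(default(KEY, "/tmp", unset))

# Property set in the blueprint: the configured path is used.
configured = {
    "configurations": {"yarn-hbase-env": {"hbase_java_io_tmpdir": "/u01/tmp"}}
}
print(default(KEY, "/tmp", configured))
```

Under these assumptions, setting hbase_java_io_tmpdir in yarn-hbase-env is what changes the value substituted into the hbase-env template.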

Contributor

It turned out that hbase_java_io_tmpdir is enough. The HBase region server failed to start because our security settings disabled the yarn-ats user, whose uid fell outside the permitted range. After creating this user with a uid in the desired range, the Timeline Service v2 reader worked.
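
For others who suspect the same cause, a minimal Python sketch for checking a local account's uid against a range follows. The range bounds are site-specific and are not given in this thread, so `low` and `high` are placeholders the caller must supply; the helper names are hypothetical.

```python
import pwd


def uid_in_range(uid, low, high):
    """True if uid lies within the inclusive range [low, high]."""
    return low <= uid <= high


def check_user(name, low, high):
    """Return (uid, ok) for an account, or (None, False) if it is missing."""
    try:
        uid = pwd.getpwnam(name).pw_uid
    except KeyError:
        # getpwnam raises KeyError when the account does not exist.
        return None, False
    return uid, uid_in_range(uid, low, high)
```

For example, `check_user("yarn-ats", low, high)` with the bounds your security policy enforces.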

Super Mentor

@Lian Jiang

Correct, we do not need to worry about the "yarn_hbase_java_io_tmpdir" property, as it is controlled by the "hbase_java_io_tmpdir" property. If we do not set hbase_java_io_tmpdir ourselves, the default value for both properties will be "/tmp":

yarn_hbase_java_io_tmpdir = default("/configurations/yarn-hbase-env/hbase_java_io_tmpdir", "/tmp")

As your reader seems to be working fine now, it would be great if you could mark this thread as answered, so that other HCC users can learn about these properties and quickly browse the answers.

New Contributor

Not sure if this is still relevant, but I had a similar issue (NoNode for /atsv2-hbase-secure/master, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at null). I fixed it by changing the 'zookeeper.znode.parent' value in the YARN HBase site configuration to a non-existent ZooKeeper path and then restarting YARN via Ambari.

Explorer

This works, great workaround.

New Contributor

I've tried this and it doesn't seem to work. No matter what I change it to, I get errors in the log indicating it can't find znodes, e.g.:

2019-05-01 11:11:28,748 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server myserver.com/xx.xx.xxx.xxx:2181. Will not attempt to authenticate using SASL (unknown error)
2019-05-01 11:11:28,757 INFO zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established, initiating session, client: /10.87.130.196:51436, server:myserver.com/xx.xx.xxx.xxx:2181
2019-05-01 11:11:28,772 INFO zookeeper.ClientCnxn (ClientCnxn.java:onConnected(1279)) - Session establishment complete on server myserver.com/xx.xx.xxx.xxx:2181, sessionid = 0x36a72f6d9090007, negotiated timeout = 60000
2019-05-01 11:11:28,792 WARN client.ConnectionImplementation (ConnectionImplementation.java:retrieveClusterId(528)) - Retrieve cluster id failed
java.util.concurrent.ExecutionException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /atsv2-hbase-unsecure/hbaseid
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
        at org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:526)
        at org.apache.hadoop.hbase.client.ConnectionImplementation.<init>(ConnectionImplementation.java:286)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:219)
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:114)
        at org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl.serviceInit(HBaseTimelineReaderImpl.java:88)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.timelineservice.reader.TimelineReaderServer.serviceInit(TimelineReaderServer.java:92)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.timelineservice.reader.TimelineReaderServer.startTimelineReaderServer(TimelineReaderServer.java:233)
        at org.apache.hadoop.yarn.server.timelineservice.reader.TimelineReaderServer.main(TimelineReaderServer.java:246)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /atsv2-hbase-unsecure/hbaseid
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$ZKTask$1.exec(ReadOnlyZKClient.java:164)
        at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:321)
        at java.lang.Thread.run(Thread.java:745)
2019-05-01 11:11:28,982 INFO common.HBaseTimelineStorageUtils (HBaseTimelineStorageUtils.java:getTimelineServiceHBaseConf(65)) - Using hbase configuration at file:///usr/hdp/3.1.0.0-78/hadoop/conf/embedded-yarn-ats-hbase/hbase-site.xml
2019-05-01 11:11:28,984 INFO zookeeper.ReadOnlyZKClient (ReadOnlyZKClient.java:<init>(130)) - Start read only zookeeper connection 0x534a5a98 to myserver.com:2181,myserver2.com:2181,myserver3.com:2181, session timeout 90000 ms, retries 6, retry interval 1000 ms, keep alive 60000 ms
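
One thing that may be worth checking is which 'zookeeper.znode.parent' value the reader actually loads from the hbase-site.xml named in the log. The Python sketch below reads a property from a Hadoop-style configuration file; the helper name `read_property` is hypothetical, and it assumes the standard layout of `<property>` elements with `<name>` and `<value>` children.

```python
import xml.etree.ElementTree as ET


def read_property(xml_text, name):
    """Return the value of a Hadoop-style <property> by name, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None
```

For example, read the file at /usr/hdp/3.1.0.0-78/hadoop/conf/embedded-yarn-ats-hbase/hbase-site.xml and call `read_property(text, "zookeeper.znode.parent")`. If it still returns /atsv2-hbase-unsecure after the change in Ambari, the new value may not have reached the file the reader uses.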
