Resourcemanagers(HA) don't start

After enabling Kerberos on the cluster (upgraded to HDP 2.5), everything was working fine. Then I installed Zeppelin, which asked me to restart a few components. After the restart, neither of the ResourceManagers will start. The ResourceManager log shows:

2016-12-15 10:15:08,735 INFO zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.io.tmpdir=/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
2016-12-15 10:15:08,735 INFO zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.compiler=<NA>
2016-12-15 10:15:08,735 INFO zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:os.name=Linux
2016-12-15 10:15:08,735 INFO zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:os.arch=amd64
2016-12-15 10:15:08,735 INFO zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:os.version=2.6.32-504.8.1.el6.x86_64
2016-12-15 10:15:08,735 INFO zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:user.name=yarn
2016-12-15 10:15:08,735 INFO zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:user.home=/home/yarn
2016-12-15 10:15:08,736 INFO zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:user.dir=/usr/hdp/2.5.0.0-1245/hadoop-yarn
2016-12-15 10:15:08,736 INFO zookeeper.ZooKeeper (ZooKeeper.java:<init>(438)) - Initiating client connection, connectString=xxx.com:2181,yyy.com:2181,zzz.com:2181 sessionTimeout=10000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@62ef27a8
2016-12-15 10:15:08,752 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server yyy.com/IP:2181. Will not attempt to authenticate using SASL (unknown error)
2016-12-15 10:15:08,757 INFO zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established to yyy.com/IP:2181, initiating session
2016-12-15 10:15:08,768 INFO zookeeper.ClientCnxn (ClientCnxn.java:onConnected(1279)) - Session establishment complete on server yyy.com/IP:2181, sessionid = 0x3590197ed680104, negotiated timeout = 10000
2016-12-15 10:15:08,784 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService failed in state INITED; cause: java.io.IOException: Couldn't create /yarn-leader-election
java.io.IOException: Couldn't create /yarn-leader-election
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:350)
    at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:96)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:152)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:281)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1228)
Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /yarn-leader-election
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
    at org.apache.hadoop.ha.ActiveStandbyElector$3.run(ActiveStandbyElector.java:1000)
    at org.apache.hadoop.ha.ActiveStandbyElector$3.run(ActiveStandbyElector.java:997)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1041)
    at org.apache.hadoop.ha.ActiveStandbyElector.createWithRetries(ActiveStandbyElector.java:997)
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:344)
    ... 9 more

Accepted Solutions

Re: Resourcemanagers(HA) don't start

It looks like your RM doesn't have create permission on the root znode (/), so it can't create /yarn-leader-election.

Please make sure the ACL on / allows the RM to create child znodes.
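
As a rough illustration of how the log above points at the root znode, here is a small Python sketch that pulls the denied path out of a NoAuth message and reports its parent. It assumes the failing operation is a create (as the ZooKeeper.create frame in the stack trace suggests), in which case the ACL to inspect is the parent's. The function name znodes_to_check is made up for this example.

```python
import posixpath
import re

# Matches the ZooKeeper client message seen in the ResourceManager log.
NOAUTH_RE = re.compile(r"KeeperErrorCode = NoAuth for (\S+)")

def znodes_to_check(log_text):
    """Return (denied_znode, parent_znode) pairs for each NoAuth error found."""
    pairs = []
    for match in NOAUTH_RE.finditer(log_text):
        znode = match.group(1)
        # For a failed create, the permission check happens on the parent znode.
        pairs.append((znode, posixpath.dirname(znode)))
    return pairs

sample = ("Caused by: org.apache.zookeeper.KeeperException$NoAuthException: "
          "KeeperErrorCode = NoAuth for /yarn-leader-election")
for znode, parent in znodes_to_check(sample):
    print(f"create of {znode} denied; check the ACL on {parent}")
```

For the log in the question this reports / as the znode to inspect, for example with getAcl / in zkCli.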

Re: Resourcemanagers(HA) don't start

Yes, that was the issue.

I had changed the ACL on / to r instead of cdrwa, which was causing the issue. As soon as I changed it back to cdrwa, the ResourceManagers started.

Thanks a lot :)
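
For anyone wondering why r versus cdrwa makes the difference: each letter is one ZooKeeper permission (create, delete, read, write, admin), and creating a child znode needs the create bit on the parent. The sketch below is only a simplified model of that check, not the ZooKeeper client itself. The bit values mirror ZooDefs.Perms; the helper names are invented for this example. On a real cluster the ACL can be viewed with getAcl / and changed with setAcl / in zkCli, using whatever scheme and id your setup requires (the thread does not say which were used).

```python
# ZooKeeper permission bits (values match org.apache.zookeeper.ZooDefs.Perms).
PERMS = {"r": 1, "w": 2, "c": 4, "d": 8, "a": 16}

def parse_perms(letters):
    """Turn a zkCli-style permission string such as 'cdrwa' into a bitmask."""
    mask = 0
    for letter in letters:
        mask |= PERMS[letter]
    return mask

def can_create_child(parent_perms):
    """Creating a child znode requires the CREATE bit in the parent's ACL."""
    return bool(parse_perms(parent_perms) & PERMS["c"])

for acl in ("r", "cdrwa"):
    print(acl, "-> can create /yarn-leader-election under /:", can_create_child(acl))
```

With r only, the check fails, which corresponds to the NoAuth error in the log; with cdrwa it passes.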