Created on 02-06-2016 08:59 PM
Problem:
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'yarn resourcemanager -format-state-store' returned 255. 15/10/26 16:11:16 INFO resourcemanager.ResourceManager: STARTUP_MSG:
15/10/26 16:11:17 INFO recovery.ZKRMStateStore: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$VerifyActiveStatusThread thread interrupted! Exiting!
15/10/26 16:11:17 INFO zookeeper.ZooKeeper: Session: 0x150a4b3429b0002 closed
15/10/26 16:11:17 FATAL resourcemanager.ResourceManager: Error starting ResourceManager
org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /rmstore/ZKRMStateRoot/RMAppRoot
at org.apache.zookeeper.KeeperException.create(KeeperException.java:125)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.recursiveDeleteWithRetriesHelper(ZKRMStateStore.java:1049)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.recursiveDeleteWithRetriesHelper(ZKRMStateStore.java:1045)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.access$500(ZKRMStateStore.java:89)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$10.run(ZKRMStateStore.java:1032)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$10.run(ZKRMStateStore.java:1029)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1104)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1125)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.deleteWithRetries(ZKRMStateStore.java:1029)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.deleteStore(ZKRMStateStore.java:825)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.deleteRMStateStore(ResourceManager.java:1267)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1190)
15/10/26 16:11:17 INFO zookeeper.ClientCnxn: EventThread shut down
15/10/26 16:11:17 INFO resourcemanager.ResourceManager: SHUTDOWN_MSG:
Solution:
Error details:
FATAL resourcemanager.ResourceManager: Error starting ResourceManager
org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /rmstore/ZKRMStateRoot/RMAppRoot
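To illustrate what this exception means: a plain ZooKeeper delete only removes a znode that has no children, so a recursive delete has to remove the children first. The sketch below is a toy in-memory model, not the ZooKeeper client API; `ToyZnodeTree` and `NotEmptyError` are hypothetical names made up for this example, and it does not try to explain why the ResourceManager's own recursive delete failed here.

```python
class NotEmptyError(Exception):
    """Stand-in for ZooKeeper's NotEmptyException (toy model only)."""


class ToyZnodeTree:
    """In-memory tree of znode paths. This is NOT a ZooKeeper client."""

    def __init__(self):
        # Maps each node path to the set of its child names.
        self.children = {"/": set()}

    def create(self, path):
        # Create the node and any missing parents.
        parent = "/"
        for part in path.strip("/").split("/"):
            node = parent.rstrip("/") + "/" + part
            if node not in self.children:
                self.children[node] = set()
                self.children[parent].add(part)
            parent = node

    def delete(self, path):
        # Like a plain delete: refuse to remove a node that has children.
        if self.children[path]:
            raise NotEmptyError("Directory not empty for " + path)
        parent, _, name = path.rpartition("/")
        self.children[parent or "/"].discard(name)
        del self.children[path]

    def delete_recursive(self, path):
        # Post-order: remove every child first, then the node itself.
        for name in sorted(self.children[path]):
            self.delete_recursive(path.rstrip("/") + "/" + name)
        self.delete(path)


tree = ToyZnodeTree()
root = "/rmstore/ZKRMStateRoot/RMAppRoot"
tree.create(root + "/application_1445593412630_0001")
try:
    tree.delete(root)          # fails: the node still has a child
except NotEmptyError as err:
    print(err)
tree.delete_recursive(root)    # children first, then the node
```

In this model the plain delete fails in the same way as the log above, while the children-first delete succeeds.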
The znode named in the error still had children. In my case, all the application data was sitting under that location:
[zk: localhost:2181(CONNECTED) 2] ls /rmstore/ZKRMStateRoot/RMAppRoot
[application_1445593412630_0002, application_1445593412630_0001, application_1445366030467_0002, application_1445366030467_0001, application_1445366030467_0004, application_1445366030467_0003, application_1445593412630_0006, application_1445366030467_0005, application_1445593412630_0005, application_1445593412630_0004, application_1445593412630_0003, application_1445173693339_0006, application_1445173693339_0005, application_1445173693339_0004, application_1445173693339_0003, application_1445173693339_0002, application_1445173693339_0001, application_1445394313024_0004, application_1445394313024_0003, application_1445394313024_0002, application_1445394313024_0001, application_1445394313024_0008, application_1445394313024_0007, application_1445394313024_0006, application_1445394313024_0005]
[zk: localhost:2181(CONNECTED) 3] quit
Quitting...
[zk: localhost:2181(CONNECTED) 3] rmr /rmstore/ZKRMStateRoot/RMAppRoot
[zk: localhost:2181(CONNECTED) 4] ls /rmstore/ZKRMStateRoot/RMAppRoot
Node does not exist: /rmstore/ZKRMStateRoot/RMAppRoot
After restarting YARN, the znode was recreated, now empty:
[zk: localhost:2181(CONNECTED) 6] ls /rmstore/ZKRMStateRoot/RMAppRoot
[]
[zk: localhost:2181(CONNECTED) 7]
[zk: localhost:2181(CONNECTED) 7] ls /rmstore/ZKRMStateRoot
[AMRMTokenSecretManagerRoot, RMAppRoot, EpochNode, RMDTSecretManagerRoot, RMVersionNode]
[zk: localhost:2181(CONNECTED) 8]
You can try this on a non-production cluster. Removing RMAppRoot discards the stored application state listed above, so if you are not sure, or this is production, open a support ticket and consult support before doing it.
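If you want to script the cleanup instead of typing it into the zkCli prompt, a minimal sketch could assemble the command line as below. It only builds and prints the command and does not run anything. `build_zkcli_delete` is a hypothetical helper, the zkCli.sh path is an assumption about where the client is installed, and the version switch reflects that `rmr` is the recursive delete in ZooKeeper 3.4 while 3.5 and later provide `deleteall`.

```python
def build_zkcli_delete(zkcli_path, server, znode, zk_version):
    """Return the argv list for a recursive znode delete via zkCli.

    zk_version is a (major, minor) tuple, for example (3, 4).
    """
    # Guard against an empty or root path so nothing broad is removed.
    if not znode.startswith("/") or znode == "/":
        raise ValueError("refusing to delete: " + repr(znode))
    # rmr for 3.4 and earlier, deleteall for 3.5 and later.
    command = "deleteall" if tuple(zk_version) >= (3, 5) else "rmr"
    return [zkcli_path, "-server", server, command, znode]


# Assumed install path; adjust for your distribution.
argv = build_zkcli_delete(
    "/usr/hdp/current/zookeeper-client/bin/zkCli.sh",
    "localhost:2181",
    "/rmstore/ZKRMStateRoot/RMAppRoot",
    (3, 4),
)
print(" ".join(argv))
```

Review the printed command before running it by hand, and stop the ResourceManager first so it is not writing to the state store during the delete.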