Unable to start Node manager

Contributor

Hi,

I am unable to start the NodeManager on one node and get the error below. Any help in resolving this is much appreciated.

Error starting NodeManager
org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.: /var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/000042.sst
	at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:181)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:245)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:562)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:609)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.: /var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/000042.sst
	at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
	at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
	at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
	at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.openDatabase(NMLeveldbStateStoreService.java:950)
	at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:937)
	at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:210)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	... 5 more

 

Accepted Solutions

Re: Unable to start Node manager

Cloudera Employee

Hi,

The error message confirms that the LevelDB database holding the YARN NodeManager state store is corrupt:

org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.: /var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/000042.sst

The solution is to clean up and recreate the state store database:

1. In Cloudera Manager (CM), make sure the affected NodeManager is stopped.
2. Back up the contents of /var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state to a different directory.
3. Delete all the contents of /var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state.
4. Start the affected NodeManager.
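
If you would rather script steps 2 and 3 than do them by hand, here is a rough sketch in Python. It is only an illustration: the function name backup_and_clear and the backup location are my own assumptions, not anything shipped with CM or Hadoop. Run it only while the NodeManager is stopped, as a user that can read and write the state directory.

```python
import shutil
import time
from pathlib import Path


def backup_and_clear(state_dir: Path, backup_root: Path) -> Path:
    """Copy state_dir to a timestamped directory under backup_root,
    then delete everything inside state_dir (the directory itself is kept)."""
    if not state_dir.is_dir():
        raise FileNotFoundError(f"state store directory not found: {state_dir}")

    # Step 2: back up the current contents. copytree refuses to overwrite
    # an existing destination, so an older backup is never clobbered.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup_dir = backup_root / f"{state_dir.name}-{stamp}"
    shutil.copytree(state_dir, backup_dir)

    # Step 3: remove the contents but keep the directory, so its
    # ownership and permissions stay as they were.
    for entry in state_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()

    return backup_dir


# Example call (the backup path is a hypothetical choice, adjust as needed):
# backup_and_clear(
#     Path("/var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state"),
#     Path("/root/yarn-nm-state-backup"),
# )
```

The sketch empties the directory instead of removing it, so there is nothing to recreate or re-own before starting the NodeManager again in step 4.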
