Created 11-15-2016 02:33 PM
Created 11-15-2016 02:39 PM
Can you please share logs?
Created 11-15-2016 03:24 PM
https://drive.google.com/file/d/0B8mfisW8zEfLdmxqSXktZWtWcjA/view?usp=sharing
https://drive.google.com/file/d/0B8mfisW8zEfLMzNkQ2Q0SE4zS0U/view?usp=sharing
I have shared the YARN ResourceManager logs. Do you need any other logs?
Created 11-15-2016 05:03 PM
From your logs, it looks like you have some corrupted files. Double-check with this command:
hdfs fsck /
This answer explains how to check for and fix corrupt HDFS files: http://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files/19216037#19216037
Created 11-16-2016 02:14 PM
No corrupt blocks.
Status: HEALTHY
Total size: 2286120271 B
Total dirs: 1826
Total files: 3378
Total symlinks: 0 (Files currently being written: 9)
Total blocks (validated): 2905 (avg. block size 786960 B) (Total open file blocks (not validated): 3)
Minimally replicated blocks: 2905 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 2905 (100.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 13902 (82.71554 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Wed Nov 16 14:10:34 UTC 2016 in 2531 milliseconds
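For anyone who wants to check an fsck summary like the one above from a script, here is a minimal Python sketch. The function name `parse_fsck_summary` and the choice of fields are only illustrative assumptions, and the sample text is a shortened copy of the output posted in this thread, not a complete fsck report.

```python
import re

def parse_fsck_summary(text):
    """Pull a few counters out of the summary printed by `hdfs fsck /`.

    The field list is an illustrative subset, not the full report.
    """
    fields = {
        "status": r"Status:\s*(\w+)",
        "corrupt_blocks": r"Corrupt blocks:\s*(\d+)",
        "under_replicated": r"Under-replicated blocks:\s*(\d+)",
        "replication_factor": r"Default replication factor:\s*(\d+)",
        "data_nodes": r"Number of data-nodes:\s*(\d+)",
    }
    result = {}
    for name, pattern in fields.items():
        match = re.search(pattern, text)
        if match:
            value = match.group(1)
            # Counters become ints; the status word stays a string.
            result[name] = int(value) if value.isdigit() else value
    return result

# Shortened sample taken from the fsck output in this thread.
sample = """Status: HEALTHY
Under-replicated blocks: 2905 (100.0 %)
Default replication factor: 3
Corrupt blocks: 0
Number of data-nodes: 1"""

summary = parse_fsck_summary(sample)
print(summary)
```

With these numbers the corrupt-block count is zero, so HDFS corruption is not the cause here. The under-replication is what you would expect when the default replication factor is 3 and only one DataNode is available.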
Created 11-16-2016 04:17 PM
Try restarting ambari-agent, or provide the yarn-nodemanager log: /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-xxx.com.log
Created 11-17-2016 12:10 AM
Created 11-17-2016 03:32 AM
If yarn.nodemanager.recovery.enabled is set to true, please set it to false and try starting the NodeManager. This should get you unblocked. The NM is failing during recovery, when it opens its LevelDB state store:
2016-11-16 23:49:26,076 INFO recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:initStorage(927)) - Using state database at /var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state for recovery
2016-11-16 23:49:26,299 INFO recovery.NMLeveldbStateStoreService$LeveldbLogger (NMLeveldbStateStoreService.java:log(969)) - Recovering log #362
2016-11-16 23:49:26,306 INFO recovery.NMLeveldbStateStoreService$LeveldbLogger (NMLeveldbStateStoreService.java:log(969)) - Delete type=3 #361
2016-11-16 23:49:26,306 INFO recovery.NMLeveldbStateStoreService$LeveldbLogger (NMLeveldbStateStoreService.java:log(969)) - Delete type=0 #362
2016-11-16 23:49:26,681 INFO service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService failed in state INITED; cause: org.iq80.leveldb.DBException: Invalid argument: not an sstable (bad magic number)
org.iq80.leveldb.DBException: Invalid argument: not an sstable (bad magic number)
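To confirm what the flag is currently set to, a small Python sketch like the one below can read it out of a Hadoop-style configuration file. The helper name `get_property` and the sample XML are assumptions for illustration; on a real node you would read the text from your own yarn-site.xml, whose location depends on the installation. On an Ambari-managed cluster the value should be changed through the YARN configuration in Ambari rather than by editing the file by hand, since Ambari is likely to overwrite manual edits.

```python
import xml.etree.ElementTree as ET

def get_property(xml_text, name):
    """Return the value of a Hadoop-style <property> entry, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None

# Illustrative fragment only; a real yarn-site.xml holds many more properties.
sample_yarn_site = """<configuration>
  <property>
    <name>yarn.nodemanager.recovery.enabled</name>
    <value>false</value>
  </property>
</configuration>"""

recovery = get_property(sample_yarn_site, "yarn.nodemanager.recovery.enabled")
print(recovery)
```

If this prints true on your node, that matches the situation described above, and switching it to false and restarting the NodeManager is the step to try.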
Created 11-17-2016 02:28 PM
Problem solved. The NodeManager is up after setting this flag to false. Thank you, Mugdha.
Created 08-29-2023 12:56 PM
I am also getting the error "localhost: ERROR: Cannot set priority of nodemanager process 41819". Can someone help me with this?