NodeManager fails to start

Hello folks!

I have a 9-machine cluster running CDH 6.2 (on-premises): 3 masters, 1 edge node, and 5 workers.

I cannot bring up 2 of the 5 NodeManagers on the workers. Three of them are fine, but two of them produce the attached log, which shows no error, only a warning with a "NullPointerException".

When I start the NodeManager from Cloudera Manager, it doesn't fail, but I get two alerts:

- NodeManager cannot connect to ResourceManager

- ResourceManager could not connect to the web server of NodeManager

Also, I can't access the /jmx endpoint of the server. And when I start the NodeManager from Cloudera Manager, CPU usage goes to 100%.

On those 2 workers, the RegionServer and DataNode are working fine; the problem is only with the NodeManager.

Any suggestions, please?

1 ACCEPTED SOLUTION

Hi guys!

We finally solved the problem. To fix it, we moved the contents of the path configured as "yarn.nodemanager.recovery.dir" aside (e.g. mv yarn-rm-recovery yarn-rm-recovery-backup), created yarn-rm-recovery again, and gave yarn:hadoop ownership of the new folder.

After that, we could start the NodeManager with no errors.
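
For anyone landing here later, the steps above can be sketched roughly as the shell function below. This is only a sketch under assumptions: the function name reset_recovery_dir and the "-backup" suffix handling are hypothetical, the real path is whatever yarn.nodemanager.recovery.dir points to on the affected host, and it assumes the NodeManager role is stopped and the commands are run by a user allowed to chown (typically root).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): move a NodeManager
# recovery directory aside and recreate it empty with the given owner.

reset_recovery_dir() {
  local dir="${1%/}"   # directory configured as yarn.nodemanager.recovery.dir
  local owner="$2"     # owner:group for the new directory, e.g. yarn:hadoop
  local backup="${dir}-backup"

  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  if [ -e "$backup" ]; then
    # Refuse to overwrite an earlier backup.
    echo "backup already exists: $backup" >&2
    return 1
  fi

  mv "$dir" "$backup" || return 1   # keep the old recovery state as a backup
  mkdir "$dir" || return 1          # recreate an empty recovery directory
  chown "$owner" "$dir"             # hand the new directory to the YARN user
}
```

An example call, with the path left as a placeholder, would be: reset_recovery_dir /path/to/yarn-rm-recovery yarn:hadoop. The old directory is kept as a backup rather than deleted, so it can be inspected or restored if needed.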

 

Thanks all!

11 REPLIES

That is great, thank you for sharing the solution!

Best regards,

Miklos