Created 05-26-2022 09:39 AM
Hello folks!
I have a cluster of 9 machines running CDH 6.2 (on-premises): 3 masters, 1 edge node and 5 workers.
I am not able to bring up 2 of the 5 NodeManagers on the workers. Three of them are fine, but two of them produce the following log (attached), which shows no error, only a warning with a "NullPointerException".
When I start the NodeManager from Cloudera Manager, it doesn't fail, but I get two alerts, as follows:
- NodeManager can not connect to ResourceManager
- ResourceManager could not connect to Web Server of NodeManager
Also, I can't access /jmx on the server. And when I start the NodeManager from Cloudera Manager, CPU usage goes to 100%.
On those 2 workers, the RegionServer and DataNode are working fine; the problem is only with the NodeManager.
Any suggestions, please?
Created 06-02-2022 08:47 AM
Hi guys!
We finally solved the problem. To fix it, we moved all the content of the path configured in "yarn.nodemanager.recovery.dir" to another location (e.g. mv yarn-rm-recovery yarn-rm-recovery-backup), created yarn-rm-recovery again, and granted permission on the folder to yarn:hadoop.
After that, we could start the NodeManager with no errors.
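
For anyone who wants to script those steps, here is a rough sketch of what we did, not an official procedure. The function name reset_recovery_dir and the "-backup" suffix are just illustrative; the directory argument is whatever path your own "yarn.nodemanager.recovery.dir" points to. I am assuming the NodeManager role is stopped first, and that you run it as a user allowed to chown (usually root).

```shell
#!/bin/sh
# Hypothetical helper: move a NodeManager recovery directory aside
# and recreate it empty, keeping the old content as a backup.
#   $1: path configured in yarn.nodemanager.recovery.dir
#   $2: optional owner:group for the new directory (e.g. yarn:hadoop)
reset_recovery_dir() {
    dir=$1
    owner=${2:-}
    backup="${dir}-backup"

    # Refuse to run if the directory is missing.
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # Refuse to overwrite an earlier backup.
    if [ -e "$backup" ]; then
        echo "backup already exists: $backup" >&2
        return 1
    fi

    # Keep the old state instead of deleting it, then start from empty.
    mv "$dir" "$backup" || return 1
    mkdir "$dir" || return 1

    # Ownership change is skipped when no owner is given.
    if [ -n "$owner" ]; then
        chown "$owner" "$dir" || return 1
    fi
}
```

Usage would be something like: reset_recovery_dir <your recovery dir> yarn:hadoop, and then start the NodeManager again from Cloudera Manager. Keeping the backup means you can move the old content back if it turns out not to be the cause.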
Thanks all!
Created 06-03-2022 12:48 AM
That is great, thank you for sharing the solution!
Best regards
Miklos