I am seeing some odd behaviour from the NodeManagers and the ResourceManager:
I have about 9-10 NodeManagers in my cluster, and HA mode is enabled for the ResourceManager (RM).
Of the two nodes the RM runs on, whenever the active RM is on node1, my NodeManagers keep exiting. This behavior is rare when the RM is active on node2.
Given this, I would simply remove the RM from node1 and install it on another node, but as soon as I do that, my Oozie jobs start getting killed with
Job tracker is not whitelisted on oozie server. Is that because my job history server is installed on node1, or is it something else?
Please suggest. I don't see any out-of-memory errors in the logs, so I'm not sure what is wrong with the RM running on node1. I do have a lot of other services such as Hue, Oozie, Sentry, Hive, and Sqoop running on node1, none of which show any problems.