NodeManagers go down after a few minutes in HDPCA AWS Instance for no reason

Hi guys,

I'm playing around with the HDPCA 2.3 AWS instance and I run into issues when adding node1.

I just installed the clients, and for no apparent reason Ambari reports all 3 NodeManagers as down.

When I restart them, they are reported as "running" for a few minutes and then turn red again.

yarn node -list says all three are running.

Same for the ResourceManager Web UI.

The alert is about the NodeManager web service on port 8042.
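
One way to see what the alert sees is to query the NodeManager web service on port 8042 directly. The sketch below is only an illustration, not what Ambari itself runs. It assumes the NodeManager REST resource /ws/v1/node/info is reachable over plain HTTP without authentication, and the function names are made up for this example.

```python
import json
import urllib.error
import urllib.request


def nodemanager_info_url(host, port=8042):
    """Build the URL of the NodeManager REST 'node info' resource."""
    return f"http://{host}:{port}/ws/v1/node/info"


def check_nodemanager(host, port=8042, timeout=5.0):
    """Return (ok, detail) for one NodeManager web service.

    ok is True only if the endpoint answers with HTTP 200 and valid JSON.
    """
    url = nodemanager_info_url(host, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # Covers refused connections, timeouts, HTTP errors and bad JSON.
        return False, f"{url}: {exc}"
    # The response is expected to wrap its fields in a "nodeInfo" object.
    info = body.get("nodeInfo", {}) if isinstance(body, dict) else {}
    return True, f"{url}: nodeHealthy={info.get('nodeHealthy')}"
```

Calling check_nodemanager("node1") every minute or so while the alert flips from green to red should show whether the web service really stops answering or whether only the alert is wrong.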

After trying this with a new instance and having the same problem, I started my very own HDP installation on 6 vanilla CentOS instances. At some point, I had the same issues.

I have no idea what the reason might be or where to look for a deeper analysis.

Any help would be much appreciated.

Thanks and bye,

Chris

11 Replies

Hi @Vinicius Higa Murakami,

Unfortunately it wasn't that easy... the next time I started my environment, I had these strange errors again.

But after quite a few desperate hours of trial and error, I figured it out.

Whenever I started a brand new image, everything was fine. I didn't have any errors.

When I started the shut-down image the next day, it was broken.

It occurred to me that cleanly stopping the HDP processes in Ambari, followed by a service ambari-agent stop and a service ambari-server stop, would be a nicer approach, and that helped.

When the processes are terminated correctly, the restart happens without any errors.

When I just shut down the AWS instance, it breaks.

That's it, plain and simple.
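
For anyone who wants to script the clean shutdown, here is a rough sketch under a few assumptions. I stopped the services in Ambari itself; the REST call below (a PUT that sets the desired state of all services to INSTALLED) is an alternative I am assuming here, and the host, cluster name and credentials are placeholders. Ambari processes that request asynchronously, so the services should be confirmed stopped before the agent and server are shut down. The function names are made up for this example.

```python
import base64
import json
import subprocess
import urllib.request


def build_stop_all_request(ambari_host, cluster, user, password, port=8080):
    """Build (but do not send) an Ambari 'stop all services' request."""
    url = f"http://{ambari_host}:{port}/api/v1/clusters/{cluster}/services"
    body = json.dumps({
        "RequestInfo": {"context": "Stop all services before shutdown"},
        # Desired state INSTALLED means "installed but not running".
        "Body": {"ServiceInfo": {"state": "INSTALLED"}},
    }).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "X-Requested-By": "ambari",  # Ambari rejects writes without it
        "Authorization": f"Basic {token}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


def daemon_stop_commands():
    """The two commands from the post: agent first, then server."""
    return [
        ["service", "ambari-agent", "stop"],
        ["service", "ambari-server", "stop"],
    ]


def run_shutdown(dry_run=True):
    """Print the stop commands and run them unless dry_run is set."""
    handled = []
    for cmd in daemon_stop_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # needs root on the node
        handled.append(cmd)
    return handled
```

The request object can be sent with urllib.request.urlopen(req) once the placeholders are filled in. run_shutdown() defaults to a dry run so nothing is stopped by accident; only after Ambari shows all services as stopped would I call run_shutdown(dry_run=False) and then stop the AWS instance.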

Well... the good thing is: I learned a lot 😉

Thanks for your help!

Chris

Good one! Gotcha 🙂

I didn't know about this either.
Keep it up with your studies on HDPCA 😄