YARN - Zookeeper failing a few moments after restart


Good morning guys, thanks in advance for your help!

I have a project that fails. I'm trying to restart all the services manually but haven't been able to.
I have a few questions and I'd really appreciate if you can give me some guidance because at this moment I'm kinda stuck.


1. How do I check what services need to be "up and running" before restarting the next one? Is there any place where I can see the dependency?
2. Do I need 2 ZooKeeper servers up and running? The first one is running on localhost but the 2nd one runs on a different machine. If I actually need them both, how can I check what went wrong on the second one?

[Attached screenshot: 110104-ambarierrors.png]

1 ACCEPTED SOLUTION


@Ray Teruya

OutOfMemoryError is a subclass of java.lang.VirtualMachineError; the JVM throws it when it cannot get the memory resources it needs. More specifically, the "GC overhead limit exceeded" form of the error occurs when the JVM spends too much time performing garbage collection and is able to reclaim very little heap space.

[Attached screenshot: 110185-1564828294767.png]

According to the Java docs, by default the JVM is configured to throw this error if the Java process spends more than 98% of its time doing GC and recovers less than 2% of the heap in each run. In other words, the application has exhausted nearly all the available memory, and the garbage collector has repeatedly tried to free it and failed.

In this situation, users experience extreme slowness of the application. Operations that usually complete in milliseconds take much longer, because the CPU is spending almost all of its capacity on garbage collection and has little left for any other work.
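To make the 98% / 2% rule above concrete, here is a minimal sketch of the arithmetic. It is only an illustration: the function name and its parameters are hypothetical, not a JVM or Ambari API, and the real JVM applies its own internal accounting.

```python
# Illustrative only: gc_overhead_exceeded and its parameters are hypothetical
# names for this sketch, not part of any JVM, Hadoop, or Ambari API.

def gc_overhead_exceeded(gc_time_ms, elapsed_ms, heap_before_mb, heap_after_mb,
                         heap_max_mb, time_limit_pct=98.0, heap_free_limit_pct=2.0):
    """Return True when both conditions from the rule hold:
    more than time_limit_pct of the elapsed time was spent in GC, and
    less than heap_free_limit_pct of the heap was recovered."""
    if elapsed_ms <= 0 or heap_max_mb <= 0:
        raise ValueError("elapsed_ms and heap_max_mb must be positive")
    # Share of the observed window that went to garbage collection.
    gc_time_pct = 100.0 * gc_time_ms / elapsed_ms
    # Share of the maximum heap that the collection actually freed.
    recovered_pct = 100.0 * (heap_before_mb - heap_after_mb) / heap_max_mb
    return gc_time_pct > time_limit_pct and recovered_pct < heap_free_limit_pct


# 1024 MB heap: 9.9 s of a 10 s window spent in GC, only 10 MB freed.
print(gc_overhead_exceeded(9900, 10000, 1020, 1010, 1024))  # True
# Same heap, but GC took 0.5 s and freed 600 MB.
print(gc_overhead_exceeded(500, 10000, 1000, 400, 1024))    # False
```

In the first case 99% of the time goes to GC and under 1% of the heap comes back, which is the pattern described above; in the second the collector is doing useful work and the rule does not apply.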

Solution:


On HDP 3.x and 2.6.x, depending on the memory available to the cluster, check and increase the setting shown in the screenshot below:

[Attached screenshot: 110193-1564829131326.png]

You could set it to 2048 MB.

HTH


10 REPLIES


@ray_teruya 

If you found this answer addressed your question, please take a moment to log in and click the "kudos" link on the answer.
 
That would help other Community users find the solution quickly for these kinds of errors.