Created 10-14-2015 05:34 AM
Ambari (2.1.1) is configured to run with 12 GB of RAM in a small 4-node cluster. It runs fine for a period of time, then freezes up and just hangs.
Here are some of the errors seen in the log files -
ambari-server.log
12 Oct 2015 13:04:56,303 ERROR [qtp-client-18920] MetricsPropertyProvider:183 - Error getting timeline metrics. Can not connect to collector, socket error.
12 Oct 2015 13:19:16,643 ERROR [qtp-client-18897] MetricsPropertyProvider:183 - Error getting timeline metrics. Can not connect to collector, socket error.
12 Oct 2015 16:02:46,153 WARN [qtp-client-19308] nio:726 - handle failed
13 Oct 2015 02:19:55,555 WARN [Timer-0] ThreadPoolAsynchronousRunner:608 - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@4214238c -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
ambari-server.out
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3175"
Exception in thread "alert-event-bus-3179"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3179"
Exception in thread "alert-event-bus-3178"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3178"
Exception in thread "alert-event-bus-3180"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3180"
Exception in thread "alert-event-bus-3181"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3181"
Exception in thread "alert-event-bus-3182"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3182"
Exception in thread "alert-event-bus-3183"
Exception in thread "alert-event-bus-3184"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3183"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3184"
Thread Dump shows threads in following states (count - state)
3 java.lang.Thread.State: BLOCKED (on object monitor)
18 java.lang.Thread.State: RUNNABLE
9 java.lang.Thread.State: TIMED_WAITING (on object monitor)
15 java.lang.Thread.State: TIMED_WAITING (parking)
4 java.lang.Thread.State: WAITING (on object monitor)
10 java.lang.Thread.State: WAITING (parking)
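For anyone who wants to produce the same kind of summary, a per-state count like the one above can be pulled out of a saved thread dump with standard text tools. This is only a rough sketch: the function name and the dump file path are made up for illustration, and it assumes the dump is plain jstack-style text.

```shell
# Count threads per state in a saved thread dump (jstack-style text).
# $1 = path to the dump file (hypothetical, e.g. /tmp/ambari-jstack.txt)
count_thread_states() {
    # Keep only the "java.lang.Thread.State: ..." part of each matching line,
    # then group identical states and list the most common first.
    grep -o 'java\.lang\.Thread\.State: .*' "$1" | sort | uniq -c | sort -rn
}
```

Calling `count_thread_states /tmp/ambari-jstack.txt` prints one line per state with its count in front.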
An Impetus consultant working on the effort noticed that there were too many open connections to the Postgres DB.
Any ideas are appreciated.
Created 10-15-2015 01:19 AM
If you are seeing a deadlock on Ambari 2.1.2, it would be due to Ambari Views instance creation.
Created 10-14-2015 07:14 AM
I have seen a similar issue a couple weeks ago. Ambari was running fine, but after some time I had to restart ambari-server because Ambari Metrics was spamming the ambari log.
If you have Ambari Metrics installed and enabled, could you please stop the service, restart ambari-server, and see if the problem still occurs? Also make sure your Ambari Metrics Service is configured correctly, especially the heap (usually too low by default). Check this link for heap tuning: https://cwiki.apache.org/confluence/display/AMBARI/Configurations+-+Tuning
Created 10-14-2015 08:47 AM
You need to increase the memory settings for Ambari. I ran into this a while back with certain views.
I added/adjusted the following in:
/var/lib/ambari-server/ambari-env.sh
For "AMBARI_JVM_ARGS"
-Xmx4G -XX:MaxPermSize=512m
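As a rough illustration of what that change can look like in the file, here is a sketch. It assumes AMBARI_JVM_ARGS is set with an export line in ambari-env.sh; the exact contents of that line differ between installs, so treat this as a shape to adapt rather than a line to paste.

```shell
# Excerpt of /var/lib/ambari-server/ambari-env.sh (sketch only).
# Only the two flags from this post are shown; keep whatever other flags
# your existing AMBARI_JVM_ARGS line already carries. If the line already
# has an -Xmx value, adjust it in place rather than adding a second one.
export AMBARI_JVM_ARGS="$AMBARI_JVM_ARGS -Xmx4G -XX:MaxPermSize=512m"

# The new settings only apply after a restart:
#   ambari-server restart
```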
Created 10-14-2015 12:48 PM
FWIW, PermGen space was removed in Java 8, so that last parameter will generate a warning there.
Created 10-14-2015 01:29 PM
Using views requires increasing both the Xmx and MaxPermSize. The documentation that mentions this is located here: http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.0/bk_ambari_views_guide/content/ch_using_ambar.... If you hit this deadlock again, please capture the jstack output for the Ambari Server process and work with support to see what the issue is.
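A minimal sketch of capturing that jstack output is below. The function name, the output file names, and the idea of taking a few dumps some seconds apart are my own choices for illustration. The PID file path is an assumption about a default install, so check where yours lives. jstack generally needs to run as the same user that owns the Ambari Server process.

```shell
# Capture one or more thread dumps from the Ambari Server JVM.
# $1 = PID file (assumed default below), $2 = output directory,
# $3 = number of dumps, $4 = seconds to wait between dumps.
# JSTACK can be set to the full path of jstack if it is not on PATH.
capture_dumps() {
    pid_file="${1:-/var/run/ambari-server/ambari-server.pid}"
    out_dir="${2:-/tmp}"
    count="${3:-3}"
    interval="${4:-10}"
    pid="$(cat "$pid_file")" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        # -l adds lock information, which helps when looking at deadlocks.
        "${JSTACK:-jstack}" -l "$pid" > "$out_dir/ambari-jstack-$i.txt"
        [ "$i" -lt "$count" ] && sleep "$interval"
        i=$((i + 1))
    done
}
```

Running `capture_dumps` with no arguments would write three dumps to /tmp, ten seconds apart.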
Created 06-02-2016 12:28 PM
A. Run Ambari Metrics in distributed mode rather than embedded mode. If you are running with more than 3 nodes, I strongly suggest running in distributed mode and writing the hbase.rootdir contents to HDFS directly, rather than to the local disk of a single node. This applies to already installed and running IOP clusters.
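For reference, the switch to distributed mode comes down to a few properties in the Ambari Metrics configs. The sketch below sets them from the command line and rests on several assumptions: that the configs.sh helper bundled with Ambari Server is at the path shown and accepts `set <host> <cluster> <config-type> <key> <value>`, that default admin credentials are in use, and that the property names are `timeline.metrics.service.operation.mode` (ams-site) plus `hbase.cluster.distributed` and `hbase.rootdir` (ams-hbase-site). The function name and all three arguments are placeholders. The same values can be set in the Ambari web UI instead.

```shell
# Path to Ambari's bundled config helper (assumed location; verify on your install).
CONFIGS_SH="${CONFIGS_SH:-/var/lib/ambari-server/resources/scripts/configs.sh}"

# Switch Ambari Metrics to distributed mode with its HBase root on HDFS.
# $1 = Ambari Server host, $2 = cluster name,
# $3 = HDFS URI for the AMS HBase root directory (you choose this).
set_ams_distributed() {
    "$CONFIGS_SH" set "$1" "$2" ams-site timeline.metrics.service.operation.mode distributed &&
    "$CONFIGS_SH" set "$1" "$2" ams-hbase-site hbase.cluster.distributed true &&
    "$CONFIGS_SH" set "$1" "$2" ams-hbase-site hbase.rootdir "$3"
}
```

After changing these, the Ambari Metrics service needs a restart for the new mode to take effect.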