11-25-2019
09:03 AM
Hi @AstroPratik,

First, so that we can provide the best help, we need to make sure we have the right information about the issue you are observing. My guess is that you are seeing the same health alert in Cloudera Manager, but we also need to confirm that you are seeing the same messages in the agent log. If so, please follow the instructions to provide a thread dump via the SIGQUIT signal.

The instructions I provided for the "kill -SIGQUIT" command only work in Cloudera Manager 5.x. If you are using CM 6, you can use the following:

    kill -SIGQUIT $(systemctl show -p MainPID cloudera-scm-agent.service 2>/dev/null | cut -d= -f2)

If you do run the kill -SIGQUIT command, make sure to run it a couple of times so we can compare snapshots, AND make sure you capture the thread dump while the problem is occurring.

NOTE: After reviewing the previous party's thread dump, it appears that a thread that is spawned to collect information for a diagnostic bundle is slow in processing; the thread that uploads service and host information to the Host Monitor and Service Monitor servers also seems to be slow. Since obtaining a diagnostic bundle is something that does not happen often, it is likely that the bundle creation is triggering the old event. There are a number of possible causes of "firehose" trouble, though, so it is important that we understand the facts of your situation before drawing any conclusions.
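If it helps, here is a rough sketch of how the repeated SIGQUIT runs on CM 6 could be scripted so the snapshots are evenly spaced. The function names (parse_main_pid, dump_agent_threads) and the default count and interval are my own placeholders, not anything shipped with Cloudera Manager, and "kill -s QUIT" is just the portable spelling of "kill -SIGQUIT". Run it as a user that is allowed to signal the agent, and only while the problem is actually happening.

```shell
#!/usr/bin/env bash
# Sketch only: send SIGQUIT to the CM 6 agent several times, spaced apart,
# so that consecutive thread dumps can be compared.

# Extract the PID from a "MainPID=<pid>" line, which is the format
# printed by `systemctl show -p MainPID <unit>`.
parse_main_pid() {
  cut -d= -f2
}

# Hypothetical helper: send COUNT signals, INTERVAL seconds apart.
# Defaults (3 dumps, 10 seconds) are arbitrary assumptions.
dump_agent_threads() {
  count="${1:-3}"
  interval="${2:-10}"
  pid="$(systemctl show -p MainPID cloudera-scm-agent.service 2>/dev/null | parse_main_pid)"

  # systemd reports MainPID=0 when the service has no running main process.
  if [ -z "$pid" ] || [ "$pid" = "0" ]; then
    echo "cloudera-scm-agent does not appear to be running" >&2
    return 1
  fi

  i=1
  while [ "$i" -le "$count" ]; do
    echo "sending SIGQUIT $i/$count to PID $pid at $(date '+%Y-%m-%d %H:%M:%S')"
    kill -s QUIT "$pid" || return 1
    # Wait between dumps, but not after the last one.
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
}

# Example (uncomment to run): three dumps, ten seconds apart.
# dump_agent_threads 3 10
```

Keeping the timestamps printed by the script makes it easier to match each dump against the health alert times in Cloudera Manager.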