Support Questions

Find answers, ask questions, and share your expertise

Cloudera Manager agent bad healthy

avatar
Expert Contributor

my  CDH Cluster has been online more than half year,  all the machines and service is normal.

 

but recently , there is one host agent always under bad healthy, and Cloudera Manager will show the red color on this host.

 

then I check the agent log as below:

[16/Nov/2019 02:20:11 +0000] 1142 MainThread agent WARNING Long HB processing time: 9.52771306038
[16/Nov/2019 02:20:52 +0000] 1142 MainThread agent WARNING Long HB processing time: 34.8044610023
[16/Nov/2019 02:20:52 +0000] 1142 MainThread agent WARNING Delayed HB: 19s since last
[16/Nov/2019 02:21:14 +0000] 1142 MainThread agent WARNING Long HB processing time: 22.3401939869
[16/Nov/2019 02:21:14 +0000] 1142 MainThread agent WARNING Delayed HB: 7s since last
[16/Nov/2019 02:21:51 +0000] 1142 MainThread agent WARNING Long HB processing time: 36.7459609509
[16/Nov/2019 02:21:51 +0000] 1142 MainThread agent WARNING Delayed HB: 21s since last
[16/Nov/2019 02:22:24 +0000] 1142 MainThread agent WARNING Long HB processing time: 33.0430369377
[16/Nov/2019 02:22:24 +0000] 1142 MainThread agent WARNING Delayed HB: 18s since last
[16/Nov/2019 02:23:04 +0000] 1142 MainThread agent WARNING Long HB processing time: 40.0112349987
[16/Nov/2019 02:23:04 +0000] 1142 MainThread agent WARNING Delayed HB: 25s since last
[16/Nov/2019 02:23:35 +0000] 1142 MainThread agent WARNING Long HB processing time: 30.6218628883
[16/Nov/2019 02:23:35 +0000] 1142 MainThread agent WARNING Delayed HB: 15s since last
[16/Nov/2019 02:24:17 +0000] 1142 MainThread agent WARNING Long HB processing time: 42.0491158962
[16/Nov/2019 02:24:17 +0000] 1142 MainThread agent WARNING Delayed HB: 27s since last
[16/Nov/2019 02:24:55 +0000] 1142 MainThread heartbeat_tracker INFO HB stats (seconds): num:17 LIFE_MIN:0.01 min:0.02 mean:0.18 max:0.53 LIFE_MAX:1.67
[16/Nov/2019 02:24:56 +0000] 1142 MainThread agent WARNING Long HB processing time: 38.8551709652
[16/Nov/2019 02:24:56 +0000] 1142 MainThread agent WARNING Delayed HB: 23s since last
[16/Nov/2019 02:25:21 +0000] 1142 MainThread agent WARNING Long HB processing time: 25.7106430531
[16/Nov/2019 02:25:21 +0000] 1142 MainThread agent WARNING Delayed HB: 10s since last
[16/Nov/2019 02:29:03 +0000] 1142 MainThread agent WARNING Long HB processing time: 169.729107141
[16/Nov/2019 02:29:03 +0000] 1142 MainThread agent WARNING Delayed HB: 154s since last
[16/Nov/2019 02:33:56 +0000] 1142 MainThread agent WARNING Long HB processing time: 292.624284029
[16/Nov/2019 02:33:56 +0000] 1142 MainThread agent WARNING Delayed HB: 277s since last

 

I am going to check the server cpu load and other things, it seems nothing happened. then I restart agent, but it will happen hours later or one day later , so my questions is:

1) how this error happened ?

 

thanks

 

 

 

1 ACCEPTED SOLUTION

avatar
Guru

Hi @iamfromsky ,

 

Thanks for keeping us posted about the issue. 

 

Li Wang, Technical Solution Manager


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Learn more about the Cloudera Community:

Terms of Service

Community Guidelines

How to use the forum

View solution in original post

17 REPLIES 17

avatar
Expert Contributor

basically these node machines just install DataNode, impala, node manager, no others. 

 

I have changed impala log to WARN level, if this issue still happened  then I will change other DataNode and node manager log level. hope can find out root cause .

 

anything good and bad news will feed back to you , thanks.

avatar
Expert Contributor

after change Impala log level to WARN, the agent connectivity  issue happened frequency has been reduced. but still two servers happened, after stop agent and delete impala log, these the issue on these two servers hasn't happened again.

 

I will continue monitor the agent issue, and feedback to you.

 

 

avatar
Guru

Hi @iamfromsky ,

 

Thanks for keeping us posted about the issue. 

 

Li Wang, Technical Solution Manager


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Learn more about the Cloudera Community:

Terms of Service

Community Guidelines

How to use the forum

avatar
Expert Contributor
after monitor agent status more than ten days, I think this issue has been resolved by your solution. it seems this issue caused by impala logs, since I just have changed the impala log level to WARN, then it haven't happened again. thanks.

avatar
Guru

Hi @iamfromsky ,

 

Thanks for the update and glad to hear the issue was resolved. 

 

Thanks,

Li

Li Wang, Technical Solution Manager


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Learn more about the Cloudera Community:

Terms of Service

Community Guidelines

How to use the forum

avatar
New Contributor

@iamfromsky @lwang "change Impala log level to WARN"
what is the meaning of this can you please elaborate it?

avatar
Community Manager

Hi @Sonu, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.



Regards,

Vidya Sargur,
Community Manager


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Learn more about the Cloudera Community:

avatar
Explorer

jus restart impala role and fix this,bucause there have manny tcp conntion

WX20230803-174634.png