Any suggestions for monitoring YARN jobs?

Contributor

Hi,

Sometimes our YARN jobs take too long to complete and hold up other pending jobs. This is a problem because our jobs are very time-sensitive.

Can someone please suggest best practices for monitoring our YARN jobs?

 

Thanks,


Master Guru

@ryu Check out this link: https://docs.cloudera.com/documentation/enterprise/latest/topics/cm_dg_yarn_applications.html


Cheers!

Contributor

Thanks @GangWar for the response.

I believe I have looked at this page before.

Maybe I missed it in the link you provided, but I am looking for a way to monitor YARN jobs and send an alert when a job has been running too long. A daily report on the status of each job, including how long each one took to run, would also be useful.

 

 

Master Collaborator

Hi @ryu,

A Cloudera Manager trigger is what you need.
You can create one here: CM -> YARN -> Status -> Create trigger -> Edit manually

 

Examples:

1) Alert if more than 50 applications are in the pending state

 

Expression:

IF (select total_apps_pending_across_yarn_pools WHERE entityName=$SERVICENAME and LAST(total_apps_pending_across_yarn_pools) > 50) DO health:concerning


Metric Evaluation Window: 10 minutes

 

2) Alert if the application failure rate (total_apps_failed_rate_across_yarn_pools) exceeds 5

 

Expression:

IF (select total_apps_failed_rate_across_yarn_pools WHERE entityName=$SERVICENAME and LAST(total_apps_failed_rate_across_yarn_pools) > 5) DO health:concerning

 

Here is the documentation about CM triggers:

http://www.cloudera.com/documentation/enterprise/latest/topics/cm_dg_triggers.html

 

Here is the documentation about CM reports:
https://docs.cloudera.com/documentation/enterprise/latest/topics/cm_dg_reports.html

Contributor

Thanks @jagadeesan for your response.

 

Currently we do not have any vendor support, and I believe we would need a Cloudera subscription in order to use Cloudera Manager.

 

Is there a good solution that does not require a paid subscription?

 

Thanks,

Master Collaborator

Hi @ryu, in that case you might need to build custom in-house monitoring scripts using the YARN APIs, or use other tools such as Prometheus and Grafana, for your use case. Please also refer to the links below for more insights:
https://www.programmersought.com/article/61565532790/
http://rokroskar.github.io/monitoring-spark-on-hadoop-with-prometheus-and-grafana.html

https://www.linkedin.com/pulse/how-monitor-yarn-application-via-restful-api-wayne-zhu/
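
As a rough starting point for such an in-house script, here is a minimal Python sketch that polls the ResourceManager REST API (`/ws/v1/cluster/apps`), flags applications that have been running longer than a threshold, and builds a simple duration report for applications that finished in the last 24 hours. The ResourceManager address, the 60-minute threshold, and all function names are assumptions for illustration. It also assumes an unsecured cluster (no Kerberos/SPNEGO on the web endpoint), and it only prints its results, so you would need to plug in your own alerting (email, chat webhook, etc.).

```python
import json
import time
import urllib.parse
import urllib.request

# Hypothetical values: replace with your own ResourceManager address and limit.
RM_URL = "http://resourcemanager.example.com:8088"
MAX_MINUTES = 60


def extract_apps(payload):
    """Return the list of apps from a /ws/v1/cluster/apps response.

    The ResourceManager returns {"apps": null} when nothing matches,
    so both levels are guarded.
    """
    apps = payload.get("apps") or {}
    return apps.get("app") or []


def fetch_apps(rm_url, **params):
    """Query the ResourceManager REST API (assumes no authentication)."""
    query = urllib.parse.urlencode(params)
    url = f"{rm_url}/ws/v1/cluster/apps?{query}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return extract_apps(json.load(resp))


def long_running(apps, max_minutes):
    """Keep only apps whose elapsedTime (milliseconds) exceeds the limit."""
    limit_ms = max_minutes * 60 * 1000
    return [a for a in apps if a.get("elapsedTime", 0) > limit_ms]


def duration_report(apps):
    """Build one text line per app, longest-running first."""
    ordered = sorted(apps, key=lambda a: a.get("elapsedTime", 0), reverse=True)
    lines = []
    for a in ordered:
        minutes = a.get("elapsedTime", 0) / 60000
        lines.append(
            f"{a['id']}  {a.get('name', '')}  "
            f"{a.get('finalStatus', '')}  {minutes:.1f} min"
        )
    return lines


def check_once():
    """One monitoring pass: alert on slow running apps, then print a daily report."""
    running = fetch_apps(RM_URL, states="RUNNING")
    for app in long_running(running, MAX_MINUTES):
        # Replace print with your own alerting mechanism.
        print(f"ALERT: {app['id']} ({app.get('name', '')}) exceeds {MAX_MINUTES} min")

    day_ago_ms = int((time.time() - 24 * 3600) * 1000)
    finished = fetch_apps(RM_URL, states="FINISHED,FAILED,KILLED",
                          finishedTimeBegin=day_ago_ms)
    print("\n".join(duration_report(finished)))
```

You could call `check_once()` from cron every few minutes for the alerting part and once a day for the report. The filtering and reporting functions take already-parsed JSON, so they can be tested without a live cluster.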