Created 02-02-2021 04:18 PM
Hi,
Sometimes our YARN jobs take too long to complete and hold up other pending jobs. This is a problem because our jobs are very time-sensitive.
Can someone please suggest best practices for monitoring our YARN jobs?
Thanks,
Created 02-03-2021 10:12 AM
@ryu Check out this link : https://docs.cloudera.com/documentation/enterprise/latest/topics/cm_dg_yarn_applications.html
Created 02-03-2021 03:54 PM
Thanks @GangWar for the response.
I believe I have checked this site out before.
Maybe I missed it in the link you provided, but I am looking for a way to monitor YARN jobs and send an alert when a job runs too long. A report on the status of each job would also help, for example how long each job took to run that day.
Created 02-03-2021 10:09 PM
Hi @ryu,
A Cloudera Manager trigger is what you need.
You can create it here: CM -> YARN -> Status -> Create trigger -> Edit manually
Examples:
1) It will alert if there are more than 50 applications in the pending state
Expression:
IF (select total_apps_pending_across_yarn_pools WHERE entityName=$SERVICENAME and LAST(total_apps_pending_across_yarn_pools) > 50) DO health:concerning
Metric Evaluation Window: 10 minutes
2) It will alert if the application failure rate exceeds 5
Expression:
IF (select total_apps_failed_rate_across_yarn_pools WHERE entityName=$SERVICENAME and LAST(total_apps_failed_rate_across_yarn_pools) > 5) DO health:concerning
Here is the documentation about CM triggers:
http://www.cloudera.com/documentation/enterprise/latest/topics/cm_dg_triggers.html
Here is the documentation about CM reports:
https://docs.cloudera.com/documentation/enterprise/latest/topics/cm_dg_reports.html
Created 02-04-2021 09:33 AM
Thanks @jagadeesan for your response.
We currently do not have any vendor support, and I believe Cloudera Manager requires a Cloudera subscription.
Is there a good solution that does not require a paid subscription?
Thanks,
Created 04-10-2021 12:53 AM
Hi @ryu, in that case you may need to build custom in-house monitoring scripts using the YARN REST APIs, or use other tools such as Prometheus and Grafana, for your use case. Please also refer to the links below for more insights:
https://www.programmersought.com/article/61565532790/
http://rokroskar.github.io/monitoring-spark-on-hadoop-with-prometheus-and-grafana.html
https://www.linkedin.com/pulse/how-monitor-yarn-application-via-restful-api-wayne-zhu/
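As a rough starting point for the in-house script approach, here is a minimal Python sketch that polls the YARN ResourceManager REST API (`/ws/v1/cluster/apps`) and flags applications whose `elapsedTime` exceeds a threshold. The host name, port, one-hour threshold, and all function names are assumptions for illustration only; adjust them to your cluster, and add authentication if your ResourceManager requires it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: address of your ResourceManager web UI; replace with your own.
RM_URL = "http://resourcemanager.example.com:8088"
# Hypothetical threshold: treat anything running longer than one hour as "too long".
MAX_RUNTIME_MS = 60 * 60 * 1000


def extract_apps(payload):
    """Return the list of application dicts from a /ws/v1/cluster/apps response."""
    # The ResourceManager returns {"apps": null} when nothing matches the query.
    return (payload.get("apps") or {}).get("app") or []


def fetch_apps(rm_url, **params):
    """Query the ResourceManager, e.g. fetch_apps(RM_URL, states="RUNNING")."""
    url = f"{rm_url}/ws/v1/cluster/apps"
    if params:
        url += "?" + urlencode(params)
    with urlopen(url, timeout=30) as resp:
        return extract_apps(json.load(resp))


def find_long_running(apps, threshold_ms):
    """Keep only the applications whose elapsedTime (in ms) exceeds the threshold."""
    return [a for a in apps if a.get("elapsedTime", 0) > threshold_ms]


def format_report(apps):
    """Build a plain-text report: one tab-separated line per application."""
    lines = []
    for a in apps:
        minutes = a.get("elapsedTime", 0) / 60000
        lines.append(
            f"{a.get('id', '')}\t{a.get('name', '')}\t"
            f"{a.get('state', '')}\t{minutes:.1f} min"
        )
    return "\n".join(lines)
```

Run from cron, something like `find_long_running(fetch_apps(RM_URL, states="RUNNING"), MAX_RUNTIME_MS)` gives you the list to alert on; how the alert is sent (mail, chat webhook, and so on) is left to you. For a daily summary, `fetch_apps(RM_URL, finishedTimeBegin=<epoch ms at midnight>)` followed by `format_report(...)` should list each job that finished that day with its run time.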