Our YARN jobs are very time sensitive, but some of them occasionally take too long to complete and hold up other pending jobs.
Can someone please suggest best practices for monitoring our YARN jobs?
Thanks @GangWar for the response.
I believe I have checked this site out before.
Maybe I missed it in the link you provided, but I am looking for a way to monitor YARN jobs and send an alert when a job runs too long, or to send a report on the status of each job, for example how long each job took to run that day.
A Cloudera Manager trigger is what you need.
You can create one here: CM -> YARN -> Status -> Create trigger -> Edit manually
1) It will alert if there are more than 50 applications in the pending state
IF (select total_apps_pending_across_yarn_pools WHERE entityName=$SERVICENAME and LAST(total_apps_pending_across_yarn_pools) > 50) DO health:concerning
Metric Evaluation Window: 10 minutes
2) It will alert if the application failure rate exceeds 5
IF (select total_apps_failed_rate_across_yarn_pools WHERE entityName=$SERVICENAME and LAST(total_apps_failed_rate_across_yarn_pools) > 5) DO health:concerning
Here is the documentation about CM triggers:
Here is the documentation about CM reports:
Thanks @jagadeesan for your response.
We currently do not have any vendor support, and I believe we need to be a Cloudera customer in order to use Cloudera Manager.
Is there a good solution that does not require a paid subscription?
Hi @ryu, in that case you might need to build custom in-house monitoring scripts using the YARN REST APIs, or use other tools such as Prometheus and Grafana, for your use case. Please also refer to the links below for more insights.
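As a rough starting point for such a script, here is a minimal sketch that polls the YARN ResourceManager REST API (`/ws/v1/cluster/apps`) and flags applications that have been running too long. The ResourceManager address, the 60-minute threshold, and the helper names (`fetch_apps`, `long_running`, `format_report`, `main`) are assumptions made for illustration. It also assumes the ResourceManager web endpoint is reachable over plain HTTP without Kerberos/SPNEGO authentication, so adjust it for a secured cluster.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumption: replace with the address of your own ResourceManager web UI.
RM_URL = "http://resourcemanager.example.com:8088"


def fetch_apps(rm_url, **params):
    """Query the ResourceManager Cluster Applications API and return a list of app dicts."""
    query = urllib.parse.urlencode(params)
    url = f"{rm_url}/ws/v1/cluster/apps?{query}"
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    # Guard against an empty or null "apps" value when nothing matches.
    return (body.get("apps") or {}).get("app", [])


def long_running(apps, max_minutes):
    """Keep only apps whose elapsedTime (reported in milliseconds) exceeds max_minutes."""
    limit_ms = max_minutes * 60 * 1000
    return [a for a in apps if a.get("elapsedTime", 0) > limit_ms]


def format_report(apps):
    """Build a tab-separated report, longest-running application first."""
    lines = []
    for a in sorted(apps, key=lambda x: x.get("elapsedTime", 0), reverse=True):
        minutes = a.get("elapsedTime", 0) / 60000
        status = a.get("finalStatus") or a.get("state")
        lines.append(
            f"{a.get('id')}\t{a.get('name')}\t{a.get('queue')}\t{status}\t{minutes:.1f} min"
        )
    return "\n".join(lines)


def main():
    # Alert: applications that have been RUNNING for more than 60 minutes.
    slow = long_running(fetch_apps(RM_URL, states="RUNNING"), max_minutes=60)
    if slow:
        print("Applications running longer than 60 minutes:")
        print(format_report(slow))

    # Daily report: applications that finished within the last 24 hours.
    since_ms = int((time.time() - 24 * 3600) * 1000)
    finished = fetch_apps(RM_URL, finishedTimeBegin=since_ms)
    print("Applications finished in the last 24 hours:")
    print(format_report(finished))
```

Calling `main()` from a cron job every few minutes, and piping the output to mail or a chat webhook, would cover both the "running too long" alert and the daily duration report. For dashboards and history, the same data could instead be exported to Prometheus and charted in Grafana.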