
Yarn

Hi,

 

Is there a way to find or analyze the containers that failed for a given job? Failed containers do not occur only in failed jobs; they can also occur in a job that completed successfully after a failed container was relaunched on retry.

 

The following tsquery returns every job in which at least one task (map or reduce) failed:

select num_failed_tasks from YARN_APPLICATIONS where service_name = "yarn" and num_failed_tasks >= 1.0

 

However, the total number of failed tasks from the above tsquery does not match the result of the tsquery below, which returns all containers that failed on the respective NodeManagers:

 

select counter_delta(containers_failed_rate)
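
To compare the two results over the same time window, both tsqueries could be submitted through the Cloudera Manager API. Below is a minimal sketch that only builds the request URLs, assuming the API's `/timeseries` endpoint with `query`, `from`, and `to` parameters. The host name, port, API version, and time window are placeholders, and whether both queries are accepted by that endpoint on a given Cloudera Manager version is not verified here.

```python
from urllib.parse import urlencode

# Placeholder base URL: host, port, and API version are assumptions.
CM_BASE = "http://cm-host.example.com:7180/api/v19"

# The two tsqueries from this post, unchanged.
TASKS_QUERY = (
    'select num_failed_tasks from YARN_APPLICATIONS '
    'where service_name = "yarn" and num_failed_tasks >= 1.0'
)
CONTAINERS_QUERY = "select counter_delta(containers_failed_rate)"


def timeseries_url(query, start, end, base=CM_BASE):
    """Build a GET URL for the timeseries endpoint over a fixed window."""
    params = urlencode({"query": query, "from": start, "to": end})
    return f"{base}/timeseries?{params}"


if __name__ == "__main__":
    # Example window only; use the same window for both queries.
    window = ("2016-11-10T00:00:00.000Z", "2016-11-11T00:00:00.000Z")
    for q in (TASKS_QUERY, CONTAINERS_QUERY):
        print(timeseries_url(q, *window))
```

Using one window for both requests at least rules out a time-range difference as the reason the totals do not match.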

 

Please let me know if there is any other way to analyze the failed containers of a job.
