Detecting YARN job failure

I am trying to run a YARN job from a shell script. If the job fails, my script should change its control flow accordingly. Is there a way to get the application ID of the submitted job, check its status, and proceed from there?

1 ACCEPTED SOLUTION

Master Guru

@ARUN

You can redirect the console output to a file, grep the application ID from that file, and then use the yarn command to get the job's status.

#Run job

[hdfs@prodnode1 ~]$ /usr/hdp/current/hadoop-client/bin/hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.7.1.2.4.2.0-258.jar pi 10 10 >/tmp/op 2>&1 &
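
If the script has nothing else to do while the job runs, a simpler variant may be to run the submit command in the foreground, keep the same log file, and branch on the exit status. This is only a sketch: `run_logged` is a hypothetical helper name, and whether the exit status actually reflects job failure depends on the driver class in your jar, so treat that as an assumption to verify.

```shell
#!/usr/bin/env bash
# run_logged LOG_FILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND in the foreground, sends both stdout and
# stderr to LOG_FILE, and returns COMMAND's exit status unchanged.
run_logged() {
  local log_file="$1"
  shift
  "$@" > "$log_file" 2>&1
}

# Usage sketch (assumes the driver exits non-zero when the job fails):
#   if run_logged /tmp/op <your hadoop jar command>; then
#     echo "job succeeded"
#   else
#     echo "job failed, see /tmp/op" >&2
#   fi
```

The log file is still written, so the application ID can be pulled from it afterwards in the same way as in the background case.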

#Grep Application ID

[hdfs@prodnode1 ~]$ grep 'Submitted application' /tmp/op |rev|cut -d' ' -f1|rev
application_1478509018160_0003
[hdfs@prodnode1 ~]$
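
Inside a script, the same extraction can be wrapped in a small function so the ID lands in a variable. The sketch below matches the `application_<digits>_<digits>` pattern directly instead of cutting the last word of the line; `extract_app_id` is a hypothetical name, and it assumes the client log contains the ID as in the output above.

```shell
#!/usr/bin/env bash
# extract_app_id LOG_FILE
# Hypothetical helper: prints the first YARN application ID found in LOG_FILE,
# or nothing if the file contains no ID yet.
extract_app_id() {
  local log_file="$1"
  # -o prints only the matched text; -E enables extended regular expressions.
  grep -oE 'application_[0-9]+_[0-9]+' "$log_file" | head -n 1
}

# Usage sketch:
#   app_id=$(extract_app_id /tmp/op)
#   [ -n "$app_id" ] || { echo "no application ID in log yet" >&2; exit 1; }
```

When the job was started in the background, the ID may not have been written yet at the moment the script looks for it, so an empty result is worth checking for and retrying after a short sleep.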

#Get status

[hdfs@prodnode1 ~]$ yarn application -status application_1478509018160_0003
16/11/11 13:06:07 INFO impl.TimelineClientImpl: Timeline service address: http://prodnode3.openstacklocal:8188/ws/v1/timeline/
16/11/11 13:06:07 INFO client.RMProxy: Connecting to ResourceManager at prodnode3.openstacklocal/172.26.74.211:8050
Application Report :
Application-Id : application_1478509018160_0003
Application-Name : QuasiMonteCarlo
Application-Type : MAPREDUCE
User : hdfs
Queue : default
Start-Time : 1478869426329
Finish-Time : 1478869463505
Progress : 100%
State : FINISHED
Final-State : SUCCEEDED
Tracking-URL : http://prodnode3.openstacklocal:19888/jobhistory/job/job_1478509018160_0003
RPC Port : 42357
AM Host : prodnode1.openstacklocal
Aggregate Resource Allocation : 129970 MB-seconds, 228 vcore-seconds
Log Aggregation Status : SUCCEEDED
Diagnostics :
[hdfs@prodnode1 ~]$
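
To drive control flow from this report, a script can poll it until `State` is terminal and then test `Final-State`. The following is a minimal sketch, not a definitive implementation: `report_field` and `wait_for_app` are hypothetical names, the field layout is assumed to match the report above, and the `return 2` branch assumes `yarn application -status` exits non-zero when it cannot produce a report.

```shell
#!/usr/bin/env bash
# report_field NAME
# Reads a `yarn application -status` report on stdin and prints the value of
# the field called NAME (for example "State" or "Final-State").
report_field() {
  awk -F' : ' -v key="$1" '
    {
      name = $1
      gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim indentation around the name
    }
    name == key {
      value = $2
      gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim whitespace around the value
      print value
      exit
    }'
}

# wait_for_app APP_ID [INTERVAL_SECONDS]
# Polls until State is FINISHED, FAILED, or KILLED.
# Returns 0 only if Final-State is SUCCEEDED, 1 for any other final state,
# and 2 if the status command itself fails (assumed non-zero exit).
wait_for_app() {
  local app_id="$1" interval="${2:-10}" report state
  while :; do
    report=$(yarn application -status "$app_id" 2>/dev/null) || return 2
    state=$(printf '%s\n' "$report" | report_field State)
    case "$state" in
      FINISHED|FAILED|KILLED) break ;;
    esac
    sleep "$interval"
  done
  [ "$(printf '%s\n' "$report" | report_field Final-State)" = "SUCCEEDED" ]
}
```

With the application ID from the grep step in a variable, the script can then branch with something like `if wait_for_app "$app_id"; then ...; else ...; fi`. Comparing the trimmed field name exactly keeps `State` from matching the `Final-State` line.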

Hope this information helps! 🙂
