Created 08-26-2016 09:40 AM
We have a daily batch Hive load running on Hive (Tez), scheduled through IBM Tivoli. In the screenshot below we can see two types of kill:
1) map and reducer 100% completed
2) map and reducer not started
Can anyone explain the difference between the two?
When a DAG gets killed really depends. Progress is calculated from the tasks' perspective, but other things can still fail. For example, if the DAG finishes all tasks but fails to commit the output, you get the first case; if the DAG fails to initialize its vertices before scheduling any task, you get the second case.
Sorry for the late reply; the weekend just ended here in the US. To find the root cause, you need to understand why the job fails. Please share some more information about the job (query, logs, cluster info) so that we can diagnose it.
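For collecting the logs, a minimal sketch, assuming YARN log aggregation is enabled on the cluster: the Hive console output prints the YARN application id, and `yarn logs -applicationId <id>` fetches the aggregated container logs for it. The helper name `yarn_logs_command` and the sample id are made up for illustration.

```python
import re

# Matches a YARN application id of the form application_<clusterTimestamp>_<sequence>.
APP_ID_RE = re.compile(r"application_\d+_\d+")


def yarn_logs_command(console_output):
    """Return the `yarn logs` argument list for the first application id found.

    Returns None when the text contains no application id.
    """
    match = APP_ID_RE.search(console_output)
    if match is None:
        return None
    return ["yarn", "logs", "-applicationId", match.group(0)]


if __name__ == "__main__":
    # Hypothetical console line; paste the real Hive output here instead.
    sample = "Status: Running (application id: application_1234567890123_0001)"
    print(" ".join(yarn_logs_command(sample)))
```

The function only builds the command; run it on a node that has the `yarn` client configured for the cluster.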
Thank you so much @zyang, I really appreciate your work. I have the info below regarding the vertex failure.
Exact failure occurred while running:
hive -f /data/edw/prod/nazhdp/c720/bin/wrk/PID_Gen/pre_stage_to_delta.sql
Please look into the above sql script
Log :
Loading data to table db_c720_stg.event_int_key
Table db_c720_stg.event_int_key stats: [numFiles=33, numRows=523, totalSize=2033867, rawDataSize=7853]
MapReduce Jobs Launched:
Job 0: HDFS Read: 0 HDFS Write: 0 SUCCESS
Job 1: HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
Time taken: 36.036 seconds
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
Warning: Map Join MAPJOIN[bigTable=?] in task 'Map 2' is a cross product
Query ID = abic720prod_20160831050606_407ea339-96de-4b5b-85bb-1f808d0d7354
Total jobs = 1
Launching Job 1 out of 1
Status: Running (application id: application_1443521267046_245061)
Map 1: -/- Map 2: -/- Map 3: -/- Map 4: -/- Map 5: -/- Map 6: -/- Reducer 7: 0/1
Status: Failed
Vertex failed, vertexName=Map 5, vertexId=vertex_1443521267046_245061_1_03, diagnostics=[Vertex Input: a initializer failed.]
Vertex killed, vertexName=Map 4, vertexId=vertex_1443521267046_245061_1_04, diagnostics=[Vertex received Kill in INITED state.]
Vertex killed, vertexName=Map 1, vertexId=vertex_1443521267046_245061_1_06, diagnostics=[Vertex received Kill in INITED state.]
Vertex killed, vertexName=Map 3, vertexId=vertex_1443521267046_245061_1_05, diagnostics=[Vertex received Kill in INITED state.]
Vertex killed, vertexName=Reducer 7, vertexId=vertex_1443521267046_245061_1_01, diagnostics=[Vertex received Kill in INITED state.]
Vertex killed, vertexName=Map 6, vertexId=vertex_1443521267046_245061_1_02, diagnostics=[Vertex received Kill in INITED state.]
Vertex killed, vertexName=Map 2, vertexId=vertex_1443521267046_245061_1_00, diagnostics=[Vertex received Kill in INITED state.]
DAG failed due to vertex failure. failedVertices:1 killedVertices:6
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask
c720DlyEvt.sh returned 2 for last run. Script Aborted.
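In the log above, only Map 5 is reported as failed; the other six vertices are reported as killed while still in the INITED state, so the Map 5 diagnostics are the place to start looking. A minimal sketch for pulling that split out of a longer log, assuming the diagnostics lines have the same shape as the ones pasted here (the function name `summarize_vertices` is made up for illustration):

```python
import re

# One "Vertex failed" / "Vertex killed" line as printed in the Hive console output.
# Assumption: the diagnostics text itself contains no "]" character.
VERTEX_RE = re.compile(
    r"Vertex (failed|killed), vertexName=([^,]+), vertexId=([^,]+), "
    r"diagnostics=\[(.*?)\]"
)


def summarize_vertices(log_text):
    """Group (vertex name, diagnostics) pairs by outcome: 'failed' or 'killed'."""
    summary = {"failed": [], "killed": []}
    for outcome, name, _vertex_id, diagnostics in VERTEX_RE.findall(log_text):
        summary[outcome].append((name, diagnostics))
    return summary


if __name__ == "__main__":
    # Two lines copied from the log in this thread.
    log = (
        "Vertex failed, vertexName=Map 5, vertexId=vertex_1443521267046_245061_1_03, "
        "diagnostics=[Vertex Input: a initializer failed.]\n"
        "Vertex killed, vertexName=Map 4, vertexId=vertex_1443521267046_245061_1_04, "
        "diagnostics=[Vertex received Kill in INITED state.]\n"
    )
    result = summarize_vertices(log)
    print("failed:", result["failed"])
    print("killed:", result["killed"])
```

The entries under "failed" are the ones worth investigating first; the "killed" entries here only record that the vertex received a kill before it started.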