
Question about DAG flow and Hive (Tez) batch load failures (kills) in jobs scheduled with the IBM Tivoli scheduler

Rising Star

We run a daily batch load on Hive (Tez), scheduled through IBM Tivoli. In the screenshot below we can see two types of kill:

1) map and reducer are 100% completed

2) map and reducer have not started

Can anyone explain the difference between the two?

dag-failure.png

5 REPLIES

Explorer

The point at which a DAG gets killed varies. Progress is calculated from the task perspective, but other things can still fail. For example, if the DAG finishes all tasks but fails to commit the output, you get the first case; if the DAG fails to initialize its vertices before scheduling any tasks, you get the second case.
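To make the two cases concrete, here is a minimal Python sketch that labels a kill from a Hive-on-Tez console progress line. It assumes that an entry of `-/-` means the vertex was never initialized and that `done/total` counts finished tasks; the function name `classify_kill` and the three labels are hypothetical, chosen only for illustration.

```python
import re

# One entry per vertex, e.g. "Map 1: -/-", "Reducer 7: 0/1", "Map 1: 2(+1)/4".
# The optional "(+n)" / "(+n,-m)" part (running / failed attempts) is skipped.
ENTRY = re.compile(
    r"(?P<name>(?:Map|Reducer) \d+): "
    r"(?P<done>-|\d+)(?:\(\+\d+(?:,-\d+)?\))?/(?P<total>-|\d+)"
)

def classify_kill(progress_line):
    """Return a rough label for the state the DAG was in when it was killed."""
    vertices = [(m.group("done"), m.group("total"))
                for m in ENTRY.finditer(progress_line)]
    if not vertices:
        return "unknown"
    # Assumption: "-" as the total means the vertex was never initialized.
    if any(total == "-" for _, total in vertices):
        return "before-start"
    # Every vertex reports all of its tasks as finished.
    if all(done == total for done, total in vertices):
        return "after-completion"
    return "mid-run"

print(classify_kill("Map 1: -/- Map 2: -/- Reducer 7: 0/1"))
print(classify_kill("Map 1: 4/4 Reducer 2: 1/1"))
```

A "before-start" result points at vertex initialization (inputs, splits, permissions), while "after-completion" points at whatever runs after the tasks, such as the output commit.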

Rising Star

Thank you so much, @zyang.

Can you suggest a way to avoid these failures, please?

Rising Star

@zyang

Hi, could you update your answer, please? I just want to avoid this situation.

Explorer

Sorry for the late reply; the weekend just ended here in the US. To find the root cause, you need to understand why the job fails. Please share more information about the job (such as the query, logs, and cluster info) so that we can diagnose it.

Rising Star

Thank you so much, @zyang. I really appreciate your work. Here is the information I have about the vertex failure:

Exact failure occurred while running: hive -f /data/edw/prod/nazhdp/c720/bin/wrk/PID_Gen/pre_stage_to_delta.sql

Please look into the above SQL script.

Log:

Loading data to table db_c720_stg.event_int_key

Table db_c720_stg.event_int_key stats: [numFiles=33, numRows=523, totalSize=2033867, rawDataSize=7853]

MapReduce Jobs Launched:

Job 0: HDFS Read: 0 HDFS Write: 0 SUCCESS

Job 1: HDFS Read: 0 HDFS Write: 0 SUCCESS

Total MapReduce CPU Time Spent: 0 msec

OK

Time taken: 36.036 seconds

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties

Warning: Map Join MAPJOIN[42][bigTable=?] in task 'Map 2' is a cross product

Query ID = abic720prod_20160831050606_407ea339-96de-4b5b-85bb-1f808d0d7354

Total jobs = 1

Launching Job 1 out of 1

Status: Running (application id: application_1443521267046_245061)

Map 1: -/- Map 2: -/- Map 3: -/- Map 4: -/- Map 5: -/- Map 6: -/- Reducer 7: 0/1

Status: Failed

Vertex failed, vertexName=Map 5, vertexId=vertex_1443521267046_245061_1_03, diagnostics=[Vertex Input: a initializer failed.]

Vertex killed, vertexName=Map 4, vertexId=vertex_1443521267046_245061_1_04, diagnostics=[Vertex received Kill in INITED state.]

Vertex killed, vertexName=Map 1, vertexId=vertex_1443521267046_245061_1_06, diagnostics=[Vertex received Kill in INITED state.]

Vertex killed, vertexName=Map 3, vertexId=vertex_1443521267046_245061_1_05, diagnostics=[Vertex received Kill in INITED state.]

Vertex killed, vertexName=Reducer 7, vertexId=vertex_1443521267046_245061_1_01, diagnostics=[Vertex received Kill in INITED state.]

Vertex killed, vertexName=Map 6, vertexId=vertex_1443521267046_245061_1_02, diagnostics=[Vertex received Kill in INITED state.]

Vertex killed, vertexName=Map 2, vertexId=vertex_1443521267046_245061_1_00, diagnostics=[Vertex received Kill in INITED state.]

DAG failed due to vertex failure. failedVertices:1 killedVertices:6

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask

c720DlyEvt.sh returned 2 for last run. Script Aborted.
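In a log like this, only one vertex actually failed (Map 5); the other six were killed as a consequence. A small Python sketch such as the one below can separate the two groups so that the first failure stands out. It assumes the `Vertex failed` / `Vertex killed` line format shown above and diagnostics text that contains no nested `]`; the helper name `split_vertices` is hypothetical.

```python
import re

# Matches lines such as:
#   Vertex failed, vertexName=Map 5, vertexId=vertex_..._03, diagnostics=[...]
VERTEX = re.compile(
    r"Vertex (failed|killed), vertexName=([^,]+), "
    r"vertexId=(\S+?), diagnostics=\[(.*?)\]"
)

def split_vertices(log_text):
    """Return (failed, killed) lists of (vertex name, diagnostics) tuples."""
    failed, killed = [], []
    for status, name, _vertex_id, diag in VERTEX.findall(log_text):
        target = failed if status == "failed" else killed
        target.append((name, diag))
    return failed, killed

sample = (
    "Vertex failed, vertexName=Map 5, "
    "vertexId=vertex_1443521267046_245061_1_03, "
    "diagnostics=[Vertex Input: a initializer failed.]\n"
    "Vertex killed, vertexName=Map 4, "
    "vertexId=vertex_1443521267046_245061_1_04, "
    "diagnostics=[Vertex received Kill in INITED state.]\n"
)

failed, killed = split_vertices(sample)
for name, diag in failed:
    print("root-cause candidate:", name, "->", diag)
print("killed as a side effect:", [name for name, _ in killed])
```

The failed vertex is the one to investigate; the console output alone does not say why its input initializer failed, so the full application logs for that application id are still needed.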