@kerra Thanks for the response. I do recreate the table before each batch of operations, and the issue is reproducible. Running compaction after every insert would be quite impractical. I have raised https://issues.apache.org/jira/browse/TEZ-3814 in hopes of getting a solution.
The MAP phase for inserts into a bucketed table randomly fails with the error:

"Vertex <vertex_id> [Map 1] failed as task <task_id> failed after vertex succeeded.] DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0"

The task fails because all of its attempts fail with:

"<attempt_id> being failed for too many output errors. failureFraction=0.2, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0"

This happens more often if the table is ACID-enabled and a delete operation is performed before the inserts.

I have tried the following:
- Changed tez.am.launch.cmd-opts, tez.task.launch.cmd-opts and hive.tez.java.opts to use parallel GC
- tez.runtime.shuffle.max.allowed.failed.fetch.fraction = 0.95
- tez.runtime.shuffle.failed.check.since-last.completion = false
- tez.runtime.shuffle.fetch.buffer.percent = 0.1
- tez.runtime.shuffle.memory.limit.percent = 0.25
- tez.runtime.shuffle.ssl.enable = false
- Deleted ".../usercache/<user>/filecache" and ".../usercache/<user>/appcache"

Please advise on a possible solution, and on whether anyone else has been able to successfully run a large number of inserts on a bucketed table via Tez.

@Namit Maheshwari @kerra @Deepesh @Ram Baskaran @Sindhu
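For reference, the shuffle-related overrides above can be applied at the Hive session level before running the workload. Below is a rough sketch of how I set them, together with the general shape of the failing workload (the table name, columns, and bucket count here are placeholders, not my actual schema):

```sql
-- Session-level Tez shuffle overrides (values as listed above)
SET tez.runtime.shuffle.max.allowed.failed.fetch.fraction=0.95;
SET tez.runtime.shuffle.failed.check.since-last.completion=false;
SET tez.runtime.shuffle.fetch.buffer.percent=0.1;
SET tez.runtime.shuffle.memory.limit.percent=0.25;
SET tez.runtime.shuffle.ssl.enable=false;

-- Approximate shape of the failing workload: an ACID bucketed table,
-- a delete, then many inserts (names and values are placeholders)
CREATE TABLE t (id INT, val STRING)
  CLUSTERED BY (id) INTO 4 BUCKETS
  STORED AS ORC
  TBLPROPERTIES ('transactional'='true');

DELETE FROM t WHERE id = 1;
INSERT INTO t VALUES (1, 'a');
-- ... repeated for a large number of inserts
```

The failures appear during the MAP phase of the inserts regardless of which of these overrides are in effect.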