
Inserts into a bucketed table fail randomly with Hive on Tez



The map phase of inserts into a bucketed table fails randomly with the error "Vertex <vertex_id> [Map 1] failed as task <task_id> failed after vertex succeeded.]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0".

The task fails because every one of its attempts is failed with "<attempt_id> being failed for too many output errors. failureFraction=0.2, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0"
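
Reading the numbers in that message, only the fraction limit appears to be exceeded (0.2 against 0.1), while the report count (1 against 10) and the read-error timespan (0 against 300 seconds) are within bounds. The Python sketch below reproduces that comparison. It assumes an attempt is failed as soon as any one limit is exceeded, and the exact comparison operators are also an assumption; both are inferred from the message, not taken from the Tez source. The consumer count of 5 is a hypothetical value chosen so that 1 failed report yields a fraction of 0.2.

```python
def exceeded_limits(unique_failed_reports, consumer_tasks, read_error_timespan_sec,
                    max_fraction=0.1, max_failures=10, max_read_error_sec=300):
    """Return the names of the limits that the reported output failures exceed.

    Defaults mirror the values printed in the error message. The rule
    "any exceeded limit fails the attempt" is an assumption.
    """
    failure_fraction = unique_failed_reports / consumer_tasks
    exceeded = []
    if failure_fraction > max_fraction:
        exceeded.append("MAX_ALLOWED_OUTPUT_FAILURES_FRACTION")
    if unique_failed_reports >= max_failures:
        exceeded.append("MAX_ALLOWED_OUTPUT_FAILURES")
    if read_error_timespan_sec >= max_read_error_sec:
        exceeded.append("MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC")
    return exceeded


# Values from the error message; consumer_tasks=5 is hypothetical (1 / 5 = 0.2).
print(exceeded_limits(unique_failed_reports=1, consumer_tasks=5,
                      read_error_timespan_sec=0))
```

Under these assumptions, a single failed output report is enough to fail the attempt whenever the number of consumers is small.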

This happens more often if the table is ACID enabled and a delete operation is performed before the inserts.

I have tried the following:

  • Changed tez.am.launch.cmd-opts, tez.task.launch.cmd-opts and hive.tez.java.opts to use parallel GC.
  • tez.runtime.shuffle.max.allowed.failed.fetch.fraction = 0.95
  • tez.runtime.shuffle.failed.check.since-last.completion=false
  • tez.runtime.shuffle.fetch.buffer.percent = 0.1
  • tez.runtime.shuffle.memory.limit.percent = 0.25
  • tez.runtime.shuffle.ssl.enable=false
  • Deleted ".../usercache/<user>/filecache" and ".../usercache/<user>/appcache"
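
For anyone trying to reproduce this, the shuffle settings above can be turned into session-level SET statements with a small Python sketch like the one below. It assumes these properties are accepted per session in your Hive client, which may depend on the cluster configuration. The helper name to_set_statements is hypothetical, and the GC options are left out because their exact values are not given above.

```python
# Shuffle settings exactly as listed above (values as strings).
SHUFFLE_SETTINGS = {
    "tez.runtime.shuffle.max.allowed.failed.fetch.fraction": "0.95",
    "tez.runtime.shuffle.failed.check.since-last.completion": "false",
    "tez.runtime.shuffle.fetch.buffer.percent": "0.1",
    "tez.runtime.shuffle.memory.limit.percent": "0.25",
    "tez.runtime.shuffle.ssl.enable": "false",
}


def to_set_statements(settings):
    """Render each key/value pair as a Hive 'SET key=value;' statement."""
    return [f"SET {key}={value};" for key, value in settings.items()]


# Print the statements so they can be pasted ahead of the INSERT queries.
for statement in to_set_statements(SHUFFLE_SETTINGS):
    print(statement)
```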

Please advise on what a solution might be, and let me know if anyone else has been able to run a large number of inserts on a bucketed table via Tez successfully.

@Namit Maheshwari @kerra @Deepesh @Ram Baskaran @Sindhu

2 REPLIES

Re: Inserts into a bucketed table fail randomly with Hive on Tez


Hi Anant,

One quick thing to check is whether there are any delta files in the table's HDFS folders. If there are, please run compaction and make sure they are converted to base files before trying again.
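
As a rough illustration of that check, the Python sketch below separates delta directories from base directories in a listing of the table folder and builds a major-compaction statement. It assumes the usual ACID directory prefixes (base_, delta_, delete_delta_). The directory names, the table name db.tbl, and both helper names are hypothetical.

```python
def pending_deltas(dir_names):
    """Return the directory names that look like uncompacted ACID deltas."""
    return [name for name in dir_names
            if name.startswith(("delta_", "delete_delta_"))]


def major_compaction_sql(table):
    """Build the Hive statement that requests a major compaction of a table."""
    return f"ALTER TABLE {table} COMPACT 'major';"


# Hypothetical listing of a table (or partition) directory on HDFS.
listing = ["base_0000005",
           "delta_0000006_0000006_0000",
           "delete_delta_0000007_0000007_0000"]

if pending_deltas(listing):
    # Deltas are present, so request compaction before retrying the inserts.
    print(major_compaction_sql("db.tbl"))
```

Compaction requests are queued and run asynchronously, so it is worth confirming with SHOW COMPACTIONS that the request has finished before retrying.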

Hope that helps.

Thanks,

Kiran


Re: Inserts into a bucketed table fail randomly with Hive on Tez


@kerra Thanks for the response. I recreate the table before each batch of operations, and the failure is still reproducible. Running compaction after every insert would be quite impractical. I have raised https://issues.apache.org/jira/browse/TEZ-3814 in the hope of getting a solution.
