
Tez with Transaction with Bucketing

Contributor

When a table is partitioned and bucketed and has transactions enabled, Tez launches only 2 map tasks, while MR still launches 72 tasks (the table is about 17 GB). If transactions are not enabled, the query launches the correct number of Tez tasks. If you have any hints on why this may occur, please share.
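
For reference, a table of the kind described above would look roughly like the sketch below. The table name, columns, partition key, and bucket count are hypothetical, since the post does not include the DDL. The ORC format, the bucketing clause, and the `transactional` property are what Hive ACID tables generally require.

```sql
-- Hypothetical DDL: all names and the bucket count are placeholders, not taken from the post.
CREATE TABLE events_acid (
  event_id BIGINT,
  payload  STRING
)
PARTITIONED BY (event_date STRING)          -- partitioned
CLUSTERED BY (event_id) INTO 8 BUCKETS      -- bucketed
STORED AS ORC                               -- ACID tables are stored as ORC
TBLPROPERTIES ('transactional' = 'true');   -- transactions enabled
```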

Accepted Solutions

Re: Tez with Transaction with Bucketing

Contributor

After running a major compaction on each partition of the tables (HDP 2.2.4.2-2), we got the right number of Tez mappers. So this appears to be a bug related to compaction.
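
A minimal sketch of what a per-partition major compaction looks like in HiveQL. The table name `events_acid` and the partition value are hypothetical; the statements themselves (`ALTER TABLE ... COMPACT`, `SHOW COMPACTIONS`) are standard Hive commands.

```sql
-- Hypothetical table and partition names; repeat the ALTER TABLE once per partition.
SHOW PARTITIONS events_acid;

-- Queue a major compaction for one partition.
ALTER TABLE events_acid PARTITION (event_date = '2015-06-01') COMPACT 'major';

-- The request is queued and processed in the background;
-- check its state before re-running the query on Tez.
SHOW COMPACTIONS;
```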

Alex,

Yes, they are both submitting to the same queue.

ACID transactions are considered broken until further notice. I am looking for more details on why they are broken. If you have details, please send me a note.

6 REPLIES

Re: Tez with Transaction with Bucketing

@pbalasundaram@hortonworks.com

What version of Hive and HDP are you using?

Re: Tez with Transaction with Bucketing

Are the Tez jobs submitting to the same queue as MR jobs? (hive.server2.tez.default.queues, hive.server2.tez.sessions.per.default.queue)

How do the Tez container settings compare with the general YARN container settings? (tez.am.resource.memory.mb, tez.am.java.opts, hive.tez.container.size, hive.tez.java.opts)
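
One way to gather the values asked about above is to print them from a Hive session, since `SET <property>;` without a value shows the current setting. This is only a sketch: `tez.queue.name`, `mapreduce.job.queuename`, and the two `mapreduce.map.*` properties are additions for comparison and were not named in the thread.

```sql
-- Queue-related settings named in the question
SET hive.server2.tez.default.queues;
SET hive.server2.tez.sessions.per.default.queue;

-- Assumed additions: the queue each engine actually submits to
SET tez.queue.name;
SET mapreduce.job.queuename;

-- Tez AM and container sizing named in the question
SET tez.am.resource.memory.mb;
SET tez.am.java.opts;
SET hive.tez.container.size;
SET hive.tez.java.opts;

-- Assumed additions: MR map container sizing, for comparison
SET mapreduce.map.memory.mb;
SET mapreduce.map.java.opts;
```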

Re: Tez with Transaction with Bucketing

@pbalasundaram@hortonworks.com, @Ryan Templeton and @ravi@hortonworks.com found the same issue. I think @ravi is working with engineering on a fix or workaround.

Re: Tez with Transaction with Bucketing

Master Collaborator

A few things to note:

  • If the data was inserted into the table using Hive streaming, there will be a lot of small files. These are merged into larger files only during compaction. If you haven't enabled the compactor in the metastore, they will not be compacted automatically, and you will need to request a compaction explicitly.
  • The recommendation for customers on HDP 2.2 is not to deploy Transactions in production. There are issues with bucketing going awry, which could lead to data corruption. Fixes for these are expected in the upcoming HDP 2.3 releases, so stay tuned.

Bottom line: we should refrain from deploying Transactions on HDP 2.2 releases.
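
To check whether the compactor mentioned in the first point is likely to be running, something like the following can help. It is a sketch under one important assumption: the values printed are the ones visible to the current session, while the settings that actually govern compaction live in the metastore's own configuration and may differ.

```sql
-- Automatic compaction needs the initiator enabled and at least one worker thread
-- on the metastore; the values shown here are only what this session sees.
SET hive.compactor.initiator.on;
SET hive.compactor.worker.threads;

-- Lists queued, running, and recently handled compaction requests, if any.
SHOW COMPACTIONS;
```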
