Hive ignores hive.auto.convert.join.noconditionaltask

Expert Contributor

Hello everybody,

we are seeing strange Hive behavior (we are using HDP 2.3.2). Hive seems to ignore the hive.auto.convert.join.noconditionaltask.size parameter: it converts all joins to MapJoin, even though our queries contain several joins on very large tables (several TB each). We have hive.auto.convert.join.noconditionaltask set to true and hive.auto.convert.join.noconditionaltask.size set to about 1.5 GB. We use Tez as the execution engine, and the tables are stored as ORC.
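For reference, a minimal sketch of how the settings described above can be inspected and set in a Hive session. The exact byte value is an assumption, since the post only says "about 1.5 GB".

```sql
-- Print the values currently in effect for this session.
SET hive.auto.convert.join;
SET hive.auto.convert.join.noconditionaltask;
SET hive.auto.convert.join.noconditionaltask.size;

-- The size threshold is expressed in bytes.
-- 1610612736 = 1.5 * 1024^3 (assumed value; the post gives only an approximation).
SET hive.auto.convert.join.noconditionaltask=true;
SET hive.auto.convert.join.noconditionaltask.size=1610612736;
```

Running `SET` with only the property name prints the current value, which helps confirm that the session actually uses the intended configuration.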

Does anybody have any idea what could cause this behavior?

Thanks,

Marco

1 ACCEPTED SOLUTION

Master Guru

Can you share the query and indicate which tables are big? Is CBO enabled? And did you run ANALYZE on the tables to provide statistics to the optimizer? Without statistics it is essentially guessing, and with WHERE conditions and deep joins it is bound to make bad decisions. It should still make some basic assumptions from the raw table size, so this is a bit odd. Either way, if you haven't done so yet, please run ANALYZE and ANALYZE for columns on your tables and try again.

https://cwiki.apache.org/confluence/display/Hive/StatsDev

https://cwiki.apache.org/confluence/display/Hive/Column+Statistics+in+Hive
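A minimal sketch of the statistics step suggested above. The names `big_table`, `small_table`, and `id` are hypothetical placeholders, not taken from the original query, and should be replaced with the tables and columns actually involved in the joins.

```sql
-- Make sure the cost-based optimizer is on and is allowed to read column statistics.
SET hive.cbo.enable=true;
SET hive.stats.fetch.column.stats=true;

-- Table-level statistics (row count, data size) for each joined table.
ANALYZE TABLE big_table COMPUTE STATISTICS;
ANALYZE TABLE small_table COMPUTE STATISTICS;

-- Column-level statistics (distinct values, nulls, min/max) for each joined table.
ANALYZE TABLE big_table COMPUTE STATISTICS FOR COLUMNS;
ANALYZE TABLE small_table COMPUTE STATISTICS FOR COLUMNS;

-- Check what the optimizer now sees (numRows, rawDataSize, totalSize).
DESCRIBE FORMATTED big_table;

-- Inspect the plan to see which join strategy was chosen for a join.
EXPLAIN
SELECT b.id
FROM big_table b
JOIN small_table s ON b.id = s.id;
```

For partitioned tables the ANALYZE statements also take a PARTITION clause. Comparing the EXPLAIN output before and after gathering statistics shows whether the join is still planned as a map join.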


Expert Contributor

I can't share the query since it's rather complex (about 1500 lines). We actually haven't run ANALYZE for the columns yet. We'll try it as soon as possible and let you know. Thank you for your answer.