Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Hive ignores hive.auto.convert.join.noconditionaltask

Expert Contributor

Hello everybody,

We are facing strange Hive behavior (we are using HDP 2.3.2). It seems that Hive ignores the hive.auto.convert.join.noconditionaltask.size parameter: it converts all joins to MapJoins, even though our queries contain several joins on very large tables (several TB). We have hive.auto.convert.join.noconditionaltask set to true and hive.auto.convert.join.noconditionaltask.size set to about 1.5 GB. We use Tez as the execution engine, and the tables are stored as ORC.
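For reference, a minimal sketch of how that configuration could be inspected and set in a Hive session. The exact byte value is an assumption (the size parameter is given in bytes, and "about 1.5 GB" is taken here as 1.5 x 1024^3); the cluster's actual value may differ.

```sql
-- Print the current values of the map-join conversion settings.
SET hive.auto.convert.join;
SET hive.auto.convert.join.noconditionaltask;
SET hive.auto.convert.join.noconditionaltask.size;

-- The configuration described above, expressed as session settings.
SET hive.auto.convert.join.noconditionaltask=true;
-- Size is in bytes; 1610612736 = 1.5 * 1024^3 (assumed value).
SET hive.auto.convert.join.noconditionaltask.size=1610612736;
```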

Does anybody know the reason for this behavior?

Thanks,

Marco

1 ACCEPTED SOLUTION

Master Guru

Can you share the query and indicate which tables are big? Is CBO enabled? And did you run ANALYZE on the tables to provide statistics to the optimizer? Without statistics the optimizer is essentially guessing, and with WHERE conditions and deep joins it is bound to make bad decisions. It should still make some basic assumptions from the raw table size, so this is a bit weird. Either way, please run ANALYZE and ANALYZE for columns on your tables and try again if you haven't done so yet.

https://cwiki.apache.org/confluence/display/Hive/StatsDev

https://cwiki.apache.org/confluence/display/Hive/Column+Statistics+in+Hive
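A minimal sketch of the checks suggested above, in HiveQL. The table names big_fact and small_dim and the join column id are hypothetical placeholders, not taken from the original query; substitute the tables that take part in the joins.

```sql
-- Check whether the cost-based optimizer and statistics usage are enabled.
SET hive.cbo.enable;
SET hive.stats.fetch.column.stats;

-- Gather basic table statistics (row count, size) for a hypothetical table.
ANALYZE TABLE big_fact COMPUTE STATISTICS;

-- Gather column statistics; with no column list, recent Hive versions
-- (0.14 and later) compute them for all columns.
ANALYZE TABLE big_fact COMPUTE STATISTICS FOR COLUMNS;

-- Inspect the plan afterwards to see which joins are still map joins.
EXPLAIN
SELECT f.id
FROM big_fact f
JOIN small_dim d ON f.id = d.id;
```

For partitioned tables, the ANALYZE statements would also need a PARTITION clause. In the EXPLAIN output, joins that were converted should appear as a Map Join Operator, which makes it possible to compare the plan before and after gathering statistics.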

2 REPLIES

Expert Contributor

I can't share the query since it's rather complex (about 1500 lines). Actually, we haven't run ANALYZE for the columns yet. We'll try it as soon as possible and let you know. Thank you for your answer.