Bucket Map Join not getting called even after setting hive.optimize.bucketmapjoin to true

Hi,

I'm trying out bucket map join. In theory, it is an optimized map join in which each mapper receives only the buckets of the small table that it needs, rather than the whole small table.

I have created two tables, user_plays_buck and small_user_subscription_buck. Both have 16 buckets and are bucketed on the same key, which is also the join key.

Even with bucket map join enabled, only a plain map join is used.

ast-bucket-map-join.txt contains the AST generated when I attempt a bucket map join.

ast-map-join.txt contains the AST generated when I attempt a map join.

ddl-small-user-subscription-buck.txt contains the DDL of the small table.

ddl-user-plays-buck.txt contains the DDL of the bigger table.

small-user-subscription-buck-files.txt lists the files present in the smaller table.

Any insight on this will be highly appreciated.

Regards,

Amit

Re: Bucket Map Join not getting called even after setting hive.optimize.bucketmapjoin to true

@Amit Ranjan

Did you try the following steps in order?

create table abc (col0 string, col1 string, col2 string, col3 string, col4 string, col5 string, col6 string)
clustered by (col0) into 16 buckets;

create table xyz (col0 string, col1 string, col2 string, col3 string, col4 string, col5 string, col6 string)
clustered by (col0) into 16 buckets;

-- enable this before loading, so the inserts write 16 bucket files per table
set hive.enforce.bucketing = true;

-- complete each insert with a SELECT from your source data
insert OVERWRITE table abc
insert OVERWRITE table xyz

set hive.optimize.bucketmapjoin = true;

-- the hint must name the small table (or its alias) as used in the query
explain select /*+ MAPJOIN(xyz) */ abc.* from abc, xyz where abc.col0 = xyz.col0;

Re: Bucket Map Join not getting called even after setting hive.optimize.bucketmapjoin to true

Yes Rajkumar,

I have tried that, but it still uses a plain map join. Bucket map join is not invoked.

Regards,

Amit

Re: Bucket Map Join not getting called even after setting hive.optimize.bucketmapjoin to true

@Amit Ranjan

What is the value of hive.enforce.bucketing on your setup? It should be set to true. Can you try your explain query after setting hive.ignore.mapjoin.hint=false?
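
A minimal sketch of that check, reusing the abc and xyz tables from the earlier reply (substitute your own table names). It assumes a Hive-on-MapReduce setup where the MAPJOIN hint is what triggers the bucket map join rewrite. Turning off hive.auto.convert.join is an extra assumption on my part, added only to keep automatic conversion out of the picture, and may not be needed on your version.

```sql
-- Only relevant on older releases: Hive 2.x and later always enforce bucketing
-- on insert, so skip this line there.
set hive.enforce.bucketing = true;

-- hive.ignore.mapjoin.hint defaults to true in recent releases, which makes
-- Hive drop the MAPJOIN hint. Setting it to false makes the hint count.
set hive.ignore.mapjoin.hint = false;

-- Assumption: disable automatic map join conversion so the hint alone drives the plan.
set hive.auto.convert.join = false;

set hive.optimize.bucketmapjoin = true;

-- EXPLAIN EXTENDED prints more operator detail than plain EXPLAIN.
explain extended
select /*+ MAPJOIN(xyz) */ abc.*
from abc join xyz on abc.col0 = xyz.col0;
```

If the plan still shows an ordinary map join, it may be worth confirming that each table directory really holds 16 files, since data loaded without bucketing enforced would not be laid out in buckets.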
