
Is SMB Join or SMB Map join Enabled in TEZ

The conversion of a join to SMB seems to depend on the execution engine. If I run the commands below using MR:

set hive.execution.engine=mr;

set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;

set hive.auto.convert.sortmerge.join=true;

set hive.optimize.bucketmapjoin=true;

set hive.optimize.bucketmapjoin.sortedmerge=true;

set hive.enforce.bucketing=true;

set hive.enforce.sorting=true;

set hive.auto.convert.join=true;

drop table key_value_large; drop table key_value_small;

create table key_value_large ( key int, value string ) partitioned by (ds string) CLUSTERED BY (key) SORTED BY (key ASC) INTO 8 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE;

create table key_value_small ( key int, value string ) partitioned by (ds string) CLUSTERED BY (key) SORTED BY (key ASC) INTO 4 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE;

explain extended select count(*) from key_value_large a JOIN key_value_small b ON a.key = b.key;
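A quick sanity check, sketched here as a suggestion rather than something from the original post: `describe formatted` prints each table's storage metadata, so it can confirm that both tables really carry the bucket and sort columns an SMB join depends on.

```sql
-- Look for "Num Buckets", "Bucket Columns" and "Sort Columns" in the output.
-- With the DDL above, these should report 8 and 4 buckets on column key.
describe formatted key_value_large;
describe formatted key_value_small;
```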

I can see a 'Sorted Merge Bucket Map Join Operator' in the explain output. But if I set the execution engine to Tez:

set hive.execution.engine=tez;

And when I then run the same explain, the plan shows 'Map Join Operator' instead of an SMB map join.
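One experiment that may help narrow this down, offered as an untested sketch and not as a confirmed answer: on Tez the optimizer may prefer a plain map join whenever the small side fits under the map-join size threshold, which is always the case for freshly created, empty tables. Turning off automatic map-join conversion for the session removes that option, so the explain output shows whether a sort-merge join is considered at all. The behavior depends on the Hive version, and on Tez the operator may be labelled differently (for example 'Merge Join Operator') than on MR.

```sql
set hive.execution.engine=tez;

-- Keep sort-merge join conversion enabled, as in the MR run.
set hive.auto.convert.sortmerge.join=true;

-- Assumption: with map-join conversion disabled, the planner can no longer
-- pick 'Map Join Operator' first. This is a diagnostic setting for the
-- session, not a recommended default.
set hive.auto.convert.join=false;

explain select count(*) from key_value_large a JOIN key_value_small b ON a.key = b.key;
```

If the plan still shows no merge join with these settings, that would point to the Hive version in use not supporting SMB on Tez, or to the tables' metadata not qualifying.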

I have read on some JIRA pages that SMB join is not implemented in Tez.

http://mail-archives.apache.org/mod_mbox/hive-user/201508.mbox/%3c4D4BDAE9-F6A8-456F-A90A-A550D3C289...

Can someone confirm whether Tez can run an SMB join?

4 REPLIES

Re: Is SMB Join or SMB Map join Enabled in TEZ

@viswanath kammula

Please share the explain plan for both execution engines, Tez and MR, using:

explain <query>;

Re: Is SMB Join or SMB Map join Enabled in TEZ

You can see the plans below.

Re: Is SMB Join or SMB Map join Enabled in TEZ

16381-mr.png

@Sindhu This is the explain plan for MR.

The query is:

explain select count(*) from key_value_large a JOIN key_value_small b ON a.key = b.key;

I also had to set the following, just for MR:

set hive.enforce.sortmergebucketmapjoin=false;

Re: Is SMB Join or SMB Map join Enabled in TEZ

16383-tez.png

And this is the explain plan for Tez.