How to properly perform SMB (Sort Merge Bucket) join in hive over tez execution engine?

I have a table that is bucketed and sorted on a bigint column, and I am performing a self join on that column. However, when I print the query plan with EXPLAIN or EXPLAIN EXTENDED, it shows a HybridGraceHashJoin in the map join phase instead of a sort merge bucket join.
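To make the setup concrete, here is a minimal sketch of the kind of table and query described above. The table name, the extra column, the bucket count, and the storage format are all hypothetical placeholders, not the actual schema.

```sql
-- Hypothetical names and bucket count, for illustration only.
-- The table is bucketed and sorted on the same bigint column used in the join.
CREATE TABLE events_bucketed (
  id      BIGINT,
  payload STRING
)
CLUSTERED BY (id) SORTED BY (id ASC) INTO 32 BUCKETS
STORED AS ORC;

-- Self join on the bucketing/sorting column; EXPLAIN prints the plan
-- so the chosen join operator can be inspected.
EXPLAIN
SELECT a.id, a.payload, b.payload
FROM events_bucketed a
JOIN events_bucketed b
  ON a.id = b.id;
```

With a layout like this, both sides of the join have the same bucket count and sort order on the join key, which is the precondition for a sort merge bucket join.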

Is there anything I need to do other than applying these settings?

set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;

Any help would be appreciated.

Re: How to properly perform SMB (Sort Merge Bucket) join in hive over tez execution engine?

@Sindhu @Raja Sudhan

I am not sure whether an SMB join can be performed on Tez. I could clearly see the SMB join in the explain plan when running on MR, but it did not show up on Tez. You can find my question on this below:

https://community.hortonworks.com/questions/107180/is-smb-join-or-smb-map-join-enabled-in-tez.html#c...