Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Hive - join partition field to another table

Hive - join partition field to another table

New Contributor

Hi,

my join query is very slow.

I have 2 tables in Hive 

table_text partitioned by YEAR , MONTH, DAY

table_partition with only one record    

when I create a Hive select if partition field are setted with costant, the query i very fast but if partition field are in join with anoither table the query is very slow

 

I have used mapjoin ma is not risolutive

 

query very fast (with constant):

select count(*) from table_text where year=2020 and month=1 and day=1

 

query very slow :

1 = where query 

2 = same query with join

3 - same query with MAPJOIN

 

1) select count(*) from table_text t , table_partition p where t.year=p.year AND t.month=p.month AND t.day=p.day

 

2) select count(*) from table_text t join table_partition p on t.year=p.year AND t.month=p.month AND t.day=p.day

 

3) set hive.auto.convert.join=true;
set hive.auto.convert.join.noconditionaltask=true;
set hive.auto.convert.join.noconditionaltask.size=10000000;
set hive.auto.convert.join.use.nonstaged=true;

select /*+ MAPJOIN(p) */ count(*) from table_text t , table_partition p where t.year=p.year AND t.month=p.month AND t.day=p.day

 

select /*+ MAPJOIN(p) */ count(*) from table_text t , table_partition p where t.year=p.year AND t.month=p.month AND t.day=p.day

 

thanks, Seba

 

Don't have an account?
Coming from Hortonworks? Activate your account here