New Contributor
Posts: 7
Registered: 05-03-2014

Hive partition table not working as expected.

This is my data file.

 

>cat dataPart
first,manish,2015-01-01,india
second,sanish,2015-02-01,US
Third,tanish,2015-02-01,CANADA
fourth,canish,2015-01-01,CHINA
Fifth,ianish,2015-02-01,india
Sixth,uanish,2015-01-01,US

 

I executed the following code.

 


create table logs(rank string, name string)
partitioned by (dt string, country string)
row format delimited
fields terminated by ','
STORED AS TEXTFILE;

load data local inpath '/home/cloudera/Desktop/dataPart'
into table logs
PARTITION(dt='2015-01-01', country='US');

 

When I execute the following command,

 

select * from logs;

 

I get the following result:

 

hive> select * from logs;                  
OK
first    manish    2015-01-01    US
second    sanish    2015-01-01    US
Third    tanish    2015-01-01    US
fourth    canish    2015-01-01    US
Fifth    ianish    2015-01-01    US
Sixth    uanish    2015-01-01    US
Time taken: 3.719 seconds
hive>

 

Hive appears to have modified the data rather than partitioning it: every row shows dt=2015-01-01 and country=US. Can someone let me know what is going wrong here?

Cloudera Employee
Posts: 30
Registered: 12-09-2014

Re: Hive partition table not working as expected.

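This is static partitioning: the PARTITION clause in your LOAD DATA statement forces every record in the file into the single partition (dt='2015-01-01', country='US'). Hive does not read the partition values from the file; the dt and country columns in your query result come from the partition you named.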
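
One way to confirm this is to list the partitions of the table. This is only a small sketch against the logs table from the question; it should report a single partition, the one named in the LOAD DATA statement. Note also that logs declares only two data columns (rank, name), so the third and fourth fields in the file are ignored when the rows are read.

```sql
-- List the partitions that exist for the table from the question.
SHOW PARTITIONS logs;

-- All six rows live in the one partition named in the LOAD DATA statement.
SELECT * FROM logs WHERE dt = '2015-01-01' AND country = 'US';
```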

 

What you probably want is dynamic partitioning, where Hive derives the partition for each row from column values. Read about it here: https://cwiki.apache.org/confluence/display/Hive/DynamicPartitions
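
A minimal sketch of the dynamic approach, assuming the same file and the logs table from the question. The staging table name logs_staging is hypothetical and introduced only for illustration; the idea is to load the raw file into an unpartitioned table first, then let Hive route each row to its partition with an INSERT ... SELECT.

```sql
-- Hypothetical staging table: one column per field in the data file.
CREATE TABLE logs_staging (rank STRING, name STRING, dt STRING, country STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/home/cloudera/Desktop/dataPart'
INTO TABLE logs_staging;

-- Optional: remove the partition created by the earlier static load,
-- otherwise its six rows stay in the table alongside the new ones.
ALTER TABLE logs DROP IF EXISTS PARTITION (dt = '2015-01-01', country = 'US');

-- Enable dynamic partitioning; nonstrict mode allows every partition
-- column to be dynamic (no static value required).
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The partition columns must come last in the SELECT list,
-- in the same order as in the PARTITION clause.
INSERT INTO TABLE logs PARTITION (dt, country)
SELECT rank, name, dt, country
FROM logs_staging;
```

After the insert, SHOW PARTITIONS logs should list one partition per distinct (dt, country) pair found in the file.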

 

Hope that helps.

Szehon