Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Hive partitions based on date from timestamp

avatar

I have a raw data where i have a column for timestamp.

 

Now i want to move the file to hive and create partions based on date. Daily partitions.

 

 

From timestamp data how we will create date partions?

 

 

 

 

1 ACCEPTED SOLUTION

avatar
Super Collaborator
hide-solution

This problem has been solved!

Want to get a detailed solution you have to login/registered on the community

Register/Login
3 REPLIES 3

avatar
Super Guru
You can create hive external table to link to the data in HDFS, and then write data into another table which will be partitioned by date.

Use date functions in Hive to convert timestamp to the value you want:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions

INSERT OVERWRITE TABLE partitioned_table PARTITION (date_column) SELECT ...., to_date(timestamp_column) as date_column FROM source_table;

Something like this.

avatar

Hi,

 

We have a timestamp in a table and we want o take date out of it then we can write a select statement to_date(timestamp column) from table name.

 

 

The question is while creating partition i dont have seperate date column in the table i have only timestamp column.

 

 

For static Partitions:

 

create external table hju(starttime string,h string,m string,mv double,country string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE location '/srf/ji/cana/'

 

create table hjuk(h string,m string,mv double,country string)partitioned by (starttime string) location '/hji/lo/p/'

 

INSERT into table tblename PARTITION(starttime='2017-08-09') SELECT h,m,mv,country from tblname where to_date(starttime)='2017-08-09'

 

 

For dynamic partitions:

 

 

SET hive.exec.dynamic.partition=true; SET hive.exec.dynamic.partition.mode=nonstrict;

 

create table tblename parg(h string,m string,mv double,country string)partitioned by (starttime string) location '/hiloi/kil'

INSERT overwrite table tblename PARTITION(starttime) SELECT h,m,mv,country ,starttime from tblename

 

 

INFO : Starting task [Stage-2:STATS] in serial mode
INFO : Partition naparg1{starttime=2017-08-01 18:40:03.0} sta ts: [numFiles=1, numRows=1, totalSize=26, rawDataSize=25]
INFO : Partition arg1{starttime=2017-08-02 18:40:03.0} sta ts: [numFiles=1, numRows=1, totalSize=25, rawDataSize=24]
INFO : Partition aparg1{starttime=2017-08-03 18:40:03.0} sta ts: [numFiles=1, numRows=1, totalSize=24, rawDataSize=23]
INFO : Partition arg1{starttime=2017-08-04 18:40:03.0} sta ts: [numFiles=1, numRows=1, totalSize=24, rawDataSize=23]
INFO : Partition arg1{starttime=2017-08-05 18:40:03.0} sta ts: [numFiles=1, numRows=1, totalSize=24, rawDataSize=23]
INFO : Partition arg1{starttime=2017-08-06 18:40:03.0} sta ts: [numFiles=1, numRows=1, totalSize=25, rawDataSize=24]
INFO : Partition arg1{starttime=2017-08-09 18:40:03.0} sta ts: [numFiles=1, numRows=7, totalSize=172, rawDataSize=165]

 

 

avatar
Super Collaborator
hide-solution

This problem has been solved!

Want to get a detailed solution you have to login/registered on the community

Register/Login