Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

How to change partition on existing Hive Table with data

How to change partition on existing Hive Table with data

Explorer

Hi,

 

We have an existing external Hive Table containing millions of rows partitioned by columnA of type string.

 

We want to change this to ColumnB of type timestamp

 

What's the most efficient way to go about this, considering we have all this rows of data already stored in the existing partition structure

 

6 REPLIES 6

Re: How to change partition on existing Hive Table with data

Guru
Hi,

The easiest way I can see is to:

1. drop the table (since it is external, data should NOT be dropped)
2. create a new table on top of it and specify as partitioned by ColumnA of type timestamp (the column name should remain the same as before, can't be changed to ColumnB, otherwise step 3 will not be able to pick it up)
3. run "msck repair table {tablename}" to recover the partitions

This assumes that the partition values will remain unchanged.

Hope this can help.
Highlighted

Re: How to change partition on existing Hive Table with data

Explorer

I was considering ....

 

1. Create new ext table with new partition

2.  insert into newtable select ... from oldtable ... to new hdfs location

3. drop old table and delete hdfs folders

 

problem here is... at some point both tables will have to exists

 

 

Re: How to change partition on existing Hive Table with data

Guru
Have you tried my suggestion, which won't need to do data copying?

Cheers
Eric

Re: How to change partition on existing Hive Table with data

Explorer

Hi EricL

 

ColumnA is of a different data type than ColumnB

 

ColumnA contains Department Names (string) and ColumnB contains TimeStamps (Date-Time)

 

the table is already paritioned by the department names which is strings

 

now we want to change and partition by the TimeStamp column (date-time)

 

could you explain your process little more

Re: How to change partition on existing Hive Table with data

Guru
Hi,

I thought you just update the partition column type. If they contain different data, how will you plan to manage the old partitions? This will affect how you implement the change.

Cheers
Eric

Re: How to change partition on existing Hive Table with data

Explorer

We do not want to keep the old partitions.

 

We just want to re-partition the data using the timestamps value.

 

The data only exists currently as partitioned by the string value

Don't have an account?
Coming from Hortonworks? Activate your account here