Created on 05-03-2019 01:02 PM - edited 09-16-2022 07:21 AM
Hi,
We have an existing external Hive table containing millions of rows, partitioned by columnA of type string.
We want to partition it by columnB, of type timestamp, instead.
What's the most efficient way to go about this, considering all these rows are already stored in the existing partition structure?
Created 05-07-2019 09:08 PM
Created 05-08-2019 06:52 AM
I was considering:
1. Create a new external table with the new partition column
2. insert into newtable select ... from oldtable ..., writing to a new HDFS location
3. Drop the old table and delete its HDFS folders
The problem here is that at some point both tables will have to exist.
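The three steps above could be sketched in HiveQL roughly as follows. This is only an illustration under assumptions: the table names, column names (dept standing in for columnA, event_ts for columnB, payload for the remaining columns) and the HDFS path are hypothetical, and the new table is partitioned by a date string derived from the timestamp rather than by the raw timestamp, since one partition per distinct timestamp value would likely produce far too many partitions.

```sql
-- Hypothetical names throughout: old_table, new_table, dept, event_ts, payload, /data/new_table.
-- Allow Hive to create partitions from the values produced by the SELECT.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Step 1: new external table. The old partition column (dept) becomes a regular column.
-- Add the same ROW FORMAT / STORED AS clauses as the old table if it uses non-default ones.
CREATE EXTERNAL TABLE new_table (
  payload  STRING,
  dept     STRING,
  event_ts TIMESTAMP
)
PARTITIONED BY (event_date STRING)
LOCATION '/data/new_table';

-- Step 2: copy the data. The dynamic partition column must come last in the SELECT.
INSERT OVERWRITE TABLE new_table PARTITION (event_date)
SELECT payload,
       dept,
       event_ts,
       CAST(to_date(event_ts) AS STRING) AS event_date
FROM old_table;

-- Sanity check before removing anything: the two counts should match.
SELECT COUNT(*) FROM old_table;
SELECT COUNT(*) FROM new_table;

-- Step 3: drop the old table. For an external table this removes only the metadata;
-- the old HDFS folders still have to be deleted separately (e.g. hdfs dfs -rm -r <old location>).
DROP TABLE old_table;
```

If many distinct dates are involved, limits such as hive.exec.max.dynamic.partitions may also need to be raised for the insert to succeed.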
Created 05-08-2019 11:33 PM
Created 05-09-2019 09:01 AM
Hi EricL,
ColumnA is of a different data type than ColumnB.
ColumnA contains department names (string) and ColumnB contains timestamps (date-time).
The table is already partitioned by the department names, which are strings.
Now we want to change this and partition by the timestamp column (date-time).
Could you explain your process a little more?
Created 05-09-2019 05:08 PM
Created 05-10-2019 06:51 AM
We do not want to keep the old partitions.
We just want to re-partition the data using the timestamp values.
The data currently exists only as partitioned by the string value.
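Since the old partitions do not need to be kept, one possible way to shorten the period in which both full copies exist is to move the data one old partition at a time. The sketch below assumes the same hypothetical names as before (old_table, new_table, dept, event_ts, payload), a new_table already created and partitioned by event_date, and an example department value 'SALES'; it is not a tested procedure for this particular table.

```sql
-- Hypothetical names: old_table, new_table, dept, event_ts, payload, and the value 'SALES'.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Copy a single old partition. INSERT INTO (not OVERWRITE) is used because
-- rows from different departments can land in the same event_date partition.
INSERT INTO TABLE new_table PARTITION (event_date)
SELECT payload,
       dept,
       event_ts,
       CAST(to_date(event_ts) AS STRING) AS event_date
FROM old_table
WHERE dept = 'SALES';

-- After checking the copied rows, remove the old partition from the metastore.
-- For an external table the files stay on HDFS and must be deleted separately.
ALTER TABLE old_table DROP IF EXISTS PARTITION (dept = 'SALES');
```

Repeating this per department keeps the extra storage close to the size of one old partition at a time, at the cost of running one job per partition. Because INSERT INTO appends, a failed and retried step can duplicate rows, so each step should be verified before its old partition is dropped.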