We have an existing external Hive table containing millions of rows, partitioned by columnA (string).
We want to change the partitioning to ColumnB, which is of type timestamp.
What's the most efficient way to go about this, considering all these rows are already stored in the existing partition structure?
I was considering:

1. Create a new external table with the new partition column
2. `INSERT INTO newtable SELECT ... FROM oldtable`, writing to a new HDFS location
3. Drop the old table and delete its HDFS folders

The problem here is that at some point both tables will have to exist side by side.
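The three steps above could be sketched in HiveQL roughly as follows. All names (`old_table`, `new_table`, `col_a`, `col_b`, the HDFS paths) are placeholders, and the exact column list depends on your schema:

```sql
-- Dynamic partitioning lets Hive derive the partition from col_b's values
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- 1. New external table, partitioned by the timestamp column
CREATE EXTERNAL TABLE new_table (
  col_a STRING
  -- ... remaining non-partition columns
)
PARTITIONED BY (col_b TIMESTAMP)
LOCATION '/data/new_table';

-- 2. Repartition the data; the partition column must come last in the SELECT
INSERT OVERWRITE TABLE new_table PARTITION (col_b)
SELECT col_a /* , ... other columns */ , col_b
FROM old_table;

-- 3. After verifying the new table, drop the old one. Since it is EXTERNAL,
-- DROP TABLE removes only the metadata; delete the old HDFS folders
-- separately (e.g. hdfs dfs -rm -r /data/old_table).
DROP TABLE old_table;
```

One caveat: dynamic partitioning on a fine-grained timestamp creates one partition per distinct value, which can easily exceed Hive's partition limits on a table with millions of rows.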
ColumnA is a different data type than ColumnB:
ColumnA contains department names (string) and ColumnB contains timestamps (date-time).
The table is already partitioned by the department names (strings);
now we want to change it to partition by the timestamp column (date-time) instead.
Could you explain your process a little more?
We do not want to keep the old partitions.
We just want to re-partition the data using the timestamp values.
The data currently exists only under the partition structure keyed by the string value.
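Since partitioning on every distinct timestamp would produce an enormous number of tiny partitions, one common alternative (sketched here with hypothetical names `old_table`, `new_table_by_day`, `col_a`, `col_b`) is to keep the timestamp as a regular column and partition on a day value derived from it with Hive's built-in `to_date`:

```sql
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- Keep the raw timestamp as data; partition on the derived day
CREATE EXTERNAL TABLE new_table_by_day (
  col_a STRING,
  col_b TIMESTAMP
)
PARTITIONED BY (event_day STRING)
LOCATION '/data/new_table_by_day';

-- to_date(col_b) collapses each timestamp to its calendar day,
-- giving one partition per day instead of one per distinct timestamp
INSERT OVERWRITE TABLE new_table_by_day PARTITION (event_day)
SELECT col_a, col_b, to_date(col_b) AS event_day
FROM old_table;
```

This keeps partition counts manageable while still letting queries prune by date range.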