How to create partitions on an existing Hive table?
Created ‎06-03-2019 01:41 AM
I have a table with 38M rows that is updated daily. It is a managed table stored in ORC format.
I want to partition this table, but I couldn't find a way to alter an existing non-partitioned table so that it becomes partitioned.
I searched extensively, but the only option I found was to create a new partitioned table and insert the data into it from the old table.
Is there any way to create partitions on an existing table? My Hive version is 3.1.
Thanks & Regards,
Kartik
Created ‎06-03-2019 09:31 PM
Yep, create a new table defined with the partitions you want, then insert into it from the old table using dynamic partitioning, and you'll be good to go. Good luck and happy Hadooping.
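As a rough sketch of that approach, the small Python helper below assembles the HiveQL for the two steps: create the partitioned table, then fill it with a dynamic-partition insert. The table names (`events`, `events_partitioned`) and columns (`id`, `payload`, `event_date`) are hypothetical placeholders, not the poster's actual schema. The two `SET` lines are the standard Hive properties that enable dynamic partitioning. Note that the partition column has to come last in the `SELECT` list.

```python
def build_migration_script(source, target, columns, partition_col,
                           partition_type="STRING"):
    """Return HiveQL that copies `source` into a new table partitioned by `partition_col`.

    columns: list of (name, type) pairs for the non-partition columns.
    All table and column names are supplied by the caller (hypothetical here).
    """
    column_defs = ",\n  ".join(f"{name} {ctype}" for name, ctype in columns)
    select_list = ", ".join(name for name, _ in columns)
    return "\n".join([
        # Allow Hive to derive partition values from the query output.
        "SET hive.exec.dynamic.partition=true;",
        "SET hive.exec.dynamic.partition.mode=nonstrict;",
        # New table: the partition column is declared in PARTITIONED BY,
        # not in the regular column list.
        f"CREATE TABLE {target} (\n  {column_defs}\n)",
        f"PARTITIONED BY ({partition_col} {partition_type})",
        "STORED AS ORC;",
        # Dynamic-partition insert: the partition column goes last in SELECT.
        f"INSERT INTO TABLE {target} PARTITION ({partition_col})",
        f"SELECT {select_list}, {partition_col} FROM {source};",
    ])


script = build_migration_script(
    source="events",
    target="events_partitioned",
    columns=[("id", "BIGINT"), ("payload", "STRING")],
    partition_col="event_date",
)
print(script)
```

The printed script could then be run through whatever Hive client you normally use. Once the copy is verified, the old table can be dropped and the new one renamed if you need to keep the original name.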
Created ‎06-04-2019 07:39 AM
Okay, thanks! But inserting the data takes too long for a table with 38M records. Is there a more efficient way, or a trick, to speed up the insert? @Lester Martin
Created ‎06-04-2019 01:06 PM
You could try writing a single INSERT INTO statement per partition and running as many of these simultaneously as your cluster has resources for.
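One way this might look in practice is sketched below: generate one static-partition `INSERT` per partition value, then submit them through a bounded thread pool. Everything named here is an assumption for illustration: the table and column names are hypothetical, the partition values are made up, and `execute` is a stand-in for whatever actually submits a statement to Hive (a client library call or a command-line invocation). The value quoting is deliberately naive and only suits trusted, simple values such as dates.

```python
from concurrent.futures import ThreadPoolExecutor


def build_partition_inserts(source, target, columns, partition_col,
                            partition_values):
    """Return one static-partition INSERT statement per partition value."""
    select_list = ", ".join(columns)
    return [
        f"INSERT INTO TABLE {target} PARTITION ({partition_col}='{value}') "
        f"SELECT {select_list} FROM {source} WHERE {partition_col}='{value}'"
        for value in partition_values
    ]


def run_concurrently(statements, execute, max_workers=4):
    """Run each statement through `execute`, at most `max_workers` at a time.

    Results come back in the same order as `statements`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(execute, statements))


statements = build_partition_inserts(
    source="events",
    target="events_partitioned",
    columns=["id", "payload"],
    partition_col="event_date",
    partition_values=["2019-06-01", "2019-06-02", "2019-06-03"],
)

# Dry run: the stand-in executor only reports what it would submit.
results = run_concurrently(
    statements,
    execute=lambda sql: f"would run: {sql}",
    max_workers=2,
)
for line in results:
    print(line)
```

`max_workers` is the knob that corresponds to "as many as your cluster has resources for"; start low and raise it while watching queue capacity, since each statement launches its own job.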
