Hive 10TB table add partition performance issue

Techies,

Background: We have an existing 10 TB Hive table that is range partitioned on column A. The business case has changed and now requires adding column B as a partition column in addition to column A.

Problem statement: Because the data on HDFS is so large and must be restructured to pick up the new partition column B, we are having difficulty copying the table to a backup and re-ingesting it into the main table with a simple Impala INSERT OVERWRITE.

We want to explore whether there is an efficient way to add partition columns to such a large table.
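
To make the restructuring concrete, here is a rough sketch of one possible approach, not something confirmed to work at this scale: instead of a single INSERT OVERWRITE over the whole table, generate one statement per existing value of A, so each run rewrites only one slice and can be retried on its own. Everything named here is an assumption for illustration: the table names `main_tbl` and `main_tbl_ab`, the columns `c1` and `c2`, the lowercase partition column names `a` and `b`, and A being a string-typed column. It also assumes the target table has already been created with partitions on (a, b).

```python
# Sketch only: builds per-partition INSERT OVERWRITE statements as strings.
# Table names, column names, and string-typed partition values are
# hypothetical; adapt them to the real schema before running anything.

def build_repartition_statements(source, target, columns, a_values):
    """Return one INSERT OVERWRITE statement per existing value of A.

    Each statement fixes partition column a (static) and lets the engine
    derive partition column b from the last column of the SELECT (dynamic).
    """
    # The dynamic partition column must come last in the select list.
    select_list = ", ".join(columns + ["b"])
    statements = []
    for a in a_values:
        statements.append(
            f"INSERT OVERWRITE TABLE {target} PARTITION (a='{a}', b) "
            f"SELECT {select_list} FROM {source} WHERE a='{a}'"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical inputs; the real list of A values would come from
    # SHOW PARTITIONS on the source table.
    for stmt in build_repartition_statements(
        "main_tbl", "main_tbl_ab", ["c1", "c2"], ["2019", "2020"]
    ):
        print(stmt + ";")
```

The generated statements could then be submitted one at a time (or a few in parallel) through whatever client is already in use, which keeps each job small and lets a failed slice be rerun without redoing the rest.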
