Hive: Create uniform partitions from non-uniformly partitioned data

I have some CSV files that I imported from Vertica through Sqoop. The split-by column I used during the import left the data highly non-uniform: out of 300 part files, only 8 contain data. Now I am creating an ORC table from this data. I created an external table pointing to the data location, and an ORC table. While running

INSERT OVERWRITE TABLE table_orc SELECT * FROM table_csv;

it takes forever because effectively only 8 mappers are running. Is there a way I can split the whole data into 120 equal parts and run the query?

Thanks in advance.
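
One commonly used approach, offered here only as a sketch and not as a confirmed fix for this cluster, is to force a reduce stage and spread the rows across 120 reducers with DISTRIBUTE BY on a random bucket. The table names are the ones from the question; the reducer count of 120 and the use of RAND() are assumptions. The buckets come out roughly equal, not exactly equal, and the 8 non-empty files are still read by the same small number of mappers, so only the ORC write is parallelized.

```sql
-- Sketch only: assumes the MapReduce execution engine and that
-- table_csv / table_orc already exist as described in the question.

-- Ask Hive for 120 reducers (assumed value, taken from the question).
SET mapred.reduce.tasks=120;

-- Assign each row a pseudo-random bucket in [0, 119]; rows with the same
-- bucket go to the same reducer, so each reducer writes one ORC file of
-- roughly equal size.
INSERT OVERWRITE TABLE table_orc
SELECT *
FROM table_csv
DISTRIBUTE BY CAST(RAND() * 120 AS INT);
```

Because RAND() is not deterministic, a retried map task may route rows differently than the first attempt did; distributing by a well-spread column from the data instead is a possible alternative if that is a concern.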