orc small files Concatenate in Hive

Expert Contributor

Hi,

I have an ORC table that is updated every 5 minutes across different date partitions. I want to run the ALTER TABLE ... CONCATENATE command, but how do I run it on all partitions at once? With the command below I can only do it on a single partition:

ALTER TABLE tablename PARTITION (dt=20180109) CONCATENATE;

Thanks.

4 REPLIES

Re: orc small files Concatenate in Hive

New Contributor

I don't think there is a way to do it for all partitions at once. The best you can do is specify multiple partitions, like ALTER TABLE tableName PARTITION(dt=20180109, dt=20180110..) CONCATENATE. Please note that there are known issues with ALTER TABLE CONCATENATE in versions earlier than HDP 2.6, and running CONCATENATE is not recommended.
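
One way to cover every partition without typing each statement by hand is to generate one ALTER TABLE ... CONCATENATE statement per partition from the output of SHOW PARTITIONS. The Python sketch below is only an illustration: the function name `concatenate_statements` and its `quote_values` parameter are made up for this example, and it assumes each input line looks like `dt=20180109` (or `dt=20180109/hr=01` for multi-level partitioning) with no escaped special characters in the values.

```python
def concatenate_statements(table, partition_lines, quote_values=True):
    """Build one ALTER TABLE ... CONCATENATE statement per partition.

    partition_lines: lines in the form printed by SHOW PARTITIONS,
    e.g. "dt=20180109" or "dt=20180109/hr=01" (assumed, unescaped).
    quote_values: wrap each value in single quotes; set to False to
    emit bare values as in the original question.
    """
    statements = []
    for line in partition_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the captured output
        pairs = []
        for part in line.split("/"):
            # Split "key=value" at the first "=" only.
            key, _, value = part.partition("=")
            if quote_values:
                value = "'" + value.replace("'", "\\'") + "'"
            pairs.append(f"{key}={value}")
        spec = ", ".join(pairs)
        statements.append(f"ALTER TABLE {table} PARTITION ({spec}) CONCATENATE;")
    return statements


if __name__ == "__main__":
    # Hypothetical sample input; in practice this would be the captured
    # output of SHOW PARTITIONS for the table.
    sample = ["dt=20180109", "dt=20180110"]
    for stmt in concatenate_statements("tablename", sample, quote_values=False):
        print(stmt)
```

The generated statements could be saved to a file and run with `hive -f` or `beeline -f`. Each statement still runs as a separate job per partition, so this only automates the typing.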

Re: orc small files Concatenate in Hive

Expert Contributor

@vgarg

Thanks for the answer.

I use HDP 2.5.3. When I run CONCATENATE once on a partition with many files, it only concatenates one or a few files each time, so I have to run it multiple times to merge everything into one large file. Was this one of the known issues as well? Could you please direct me to the issues with CONCATENATE in earlier versions?

Re: orc small files Concatenate in Hive

New Contributor

@vgarg / @PJ : I am using HDP 2.6, but I am still facing the same issue.

Re: orc small files Concatenate in Hive

Contributor