Compacting small ORC files within HDFS directories into larger files offline

New Contributor

Hello,

I need to compact small ORC files within HDFS directories, grouping them into larger files as an offline job.

Approach 1: Use a Spark DataFrame (I know how to do this).

Approach 2: Use the Hadoop and ORC libraries directly from Scala (I don't know how to do this).

Could you please help me with Approach 2? Sample code or guidance would be a great help.

Regards,

Rambabu

1 REPLY

Re: Compacting small ORC files within HDFS directories into larger files offline

New Contributor

Does anybody have an idea?
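
For Approach 2, one possible starting point is the `OrcFile.mergeFiles` helper in the ORC core Java library, called from Scala together with the Hadoop `FileSystem` API. The following is only a minimal sketch, not a tested solution. It assumes a recent `orc-core` release that includes `mergeFiles`, that all input files share a compatible schema and compression setting, and that the files end in `.orc`. The object name `OrcCompactor` and the two command-line arguments are hypothetical. As far as I understand, the merge copies existing stripes instead of re-encoding rows, so it reduces the number of files but does not make small stripes larger.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.orc.OrcFile

// On Scala 2.13+ scala.jdk.CollectionConverters is the preferred import.
import scala.collection.JavaConverters._

// Hypothetical entry point: OrcCompactor <inputDir> <outputFile>
object OrcCompactor {
  def main(args: Array[String]): Unit = {
    val inputDir   = new Path(args(0))
    // Assumption: the output lives outside inputDir so it is not picked up as an input.
    val outputFile = new Path(args(1))

    val conf = new Configuration()
    val fs   = inputDir.getFileSystem(conf)

    // Collect the files directly under the directory (non-recursive).
    // Assumption: ORC files carry an ".orc" suffix; adjust the filter otherwise.
    val inputs = fs.listStatus(inputDir)
      .filter(status => status.isFile && status.getPath.getName.endsWith(".orc"))
      .map(_.getPath)
      .toList

    if (inputs.isEmpty) {
      println(s"No ORC files found under $inputDir")
    } else {
      // mergeFiles returns the inputs that were actually merged; files that are
      // incompatible with the first one are left out of the returned list.
      val merged = OrcFile
        .mergeFiles(outputFile, OrcFile.writerOptions(conf), inputs.asJava)
        .asScala

      val skipped = inputs.filterNot(merged.contains)
      println(s"Merged ${merged.size} file(s) into $outputFile")
      skipped.foreach(path => println(s"Skipped (incompatible): $path"))
    }
  }
}
```

The sketch deliberately leaves the original files in place. Deleting or archiving them only after the merged file has been verified (for example by comparing row counts) seems the safer order. If larger stripes are also needed, reading the rows and rewriting them with an ORC `Writer`, or using the Spark approach, would be the alternative.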