How to handle small file issue

Contributor

Hi,

I'm referring to the article below:

https://community.cloudera.com/t5/Community-Articles/Create-Dynamic-Partitions-based-on-FlowFile-Con...

I'm trying to build a NiFi pipeline for data that arrives as a real-time stream (from Kafka, for example). When the data is written to partitioned locations in HDFS, it can end up as many small files, and I'm seeing slow query performance as a result. Can you suggest some approaches to resolve the small-file issue within NiFi itself, using the ORC file format?

Re: How to handle small file issue

Cloudera Employee

@varun_rathinam 

You can use the MergeContent processor if it fits your use case. It is a good way to handle the small-file issue within NiFi. Please refer to the link below for more details.

https://community.cloudera.com/t5/Support-Questions/Merge-Content-for-small-content-issue/td-p/16766...
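To illustrate the idea behind merging before writing to HDFS, here is a minimal Python sketch of size-based binning per partition. It is not NiFi code and does not reproduce MergeContent's implementation. The class `SmallFileMerger`, its parameters `min_group_size` and `max_entries`, and the partition key `dt=2020-01-01` are all hypothetical names chosen for illustration. Payloads are joined by plain byte concatenation, which only suits formats such as newline-delimited text. ORC files cannot be combined that way, so in a real flow the merge would likely need to happen on records before they are converted to ORC.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Bin:
    """Accumulates small payloads that share one correlation key."""
    payloads: list = field(default_factory=list)
    size: int = 0


class SmallFileMerger:
    """Conceptual stand-in for size-based binning; hypothetical, not NiFi code."""

    def __init__(self, min_group_size, max_entries):
        self.min_group_size = min_group_size  # bytes needed before a bin is emitted
        self.max_entries = max_entries        # hard cap on payloads per bin
        self.bins = defaultdict(Bin)

    def add(self, key, payload):
        """Add one payload; return (key, merged_bytes) once the bin is full, else None."""
        b = self.bins[key]
        b.payloads.append(payload)
        b.size += len(payload)
        if b.size >= self.min_group_size or len(b.payloads) >= self.max_entries:
            return self._flush(key)
        return None

    def _flush(self, key):
        # Remove the bin and join its payloads into one larger output.
        b = self.bins.pop(key)
        return key, b"".join(b.payloads)

    def flush_all(self):
        """Emit every partially filled bin, e.g. when a maximum bin age expires."""
        return [self._flush(k) for k in list(self.bins)]


if __name__ == "__main__":
    merger = SmallFileMerger(min_group_size=10, max_entries=100)
    for chunk in (b"aaaa", b"bbbb", b"cccc"):
        merged = merger.add("dt=2020-01-01", chunk)
        if merged is not None:
            print(merged)
```

Grouping by a key that matches the HDFS partition keeps each merged output inside a single partition directory, and `flush_all` stands in for a time-based cutoff so that slow partitions still get written eventually.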

 

BR,

Akash
