
NiFi error - Too many open files

Expert Contributor

Hi All,

I'm running into an issue while trying to merge a large number of small files in NiFi. I have about 800K files (350 MB total) in the queue at a MergeContent processor, and I'm waiting to accumulate about 1.2 million files to merge into one large file, but the MergeContent processor is throwing the error below:

MergeContent[id=3104122b-1077-115c-2e71-b264709ceb44] Failed to process bundle of 897788 files due to org.apache.nifi.processor.exception.FlowFileAccessException: Failed to read content of StandardFlowFileRecord[uuid=a2a32c84-f633-4a7a-8b82-2ba5547db9af,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1498156308912-3769, container=default, section=697], offset=429054, length=436953],offset=104885,name=9b425a01-a759-42b6-bcf6-67f9bc79c871,size=302]; rolling back sessions: org.apache.nifi.processor.exception.FlowFileAccessException: Failed to read content of StandardFlowFileRecord[uuid=a2a32c84-f633-4a7a-8b82-2ba5547db9af,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1498156308912-3769, container=default, section=697], offset=429054, length=436953],offset=104885,name=9b425a01-a759-42b6-bcf6-67f9bc79c871,size=302]
2017-06-22 13:37:49,515 ERROR [NiFi logging handler] org.apache.nifi.StdErr Caused by: java.io.FileNotFoundException: /data1/apache-nifi/content_repository/676/1498156300076-3748 (Too many open files)
2017-06-22 13:37:49,516 ERROR [NiFi logging handler] org.apache.nifi.StdErr 	at java.io.FileInputStream.open0(Native Method)
2017-06-22 13:37:49,516 ERROR [NiFi logging handler] org.apache.nifi.StdErr 	at java.io.FileInputStream.open(FileInputStream.java:195)
2017-06-22 13:37:49,516 ERROR [NiFi logging handler] org.apache.nifi.StdErr 	at java.io.FileInputStream.<init>(FileInputStream.java:138)

It looks like I'm hitting some kind of open-file limit.

Would you please let me know which of the content repository properties I should increase to allow more files to wait in the queue to be merged?

nifi.properties:

# Content Repository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.claim.max.appendable.size=10 MB
nifi.content.claim.max.flow.files=100
# nifi.content.repository.directory.default=./content_repository
nifi.content.repository.directory.default=/data1/apache-nifi/content_repository
nifi.content.repository.archive.max.retention.period=12 hours
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.archive.enabled=true
nifi.content.repository.always.sync=false
nifi.content.viewer.url=/nifi-content-viewer/
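For reference, the `Too many open files` message comes from the operating system's per-process file-descriptor limit rather than from any of the repository properties above. A minimal Python sketch (Linux/macOS only) for checking that limit on the NiFi host; raising it would typically be done in `/etc/security/limits.conf` or the service configuration, which this sketch does not attempt:

```python
import resource

# Query the per-process limit on open file descriptors (RLIMIT_NOFILE).
# NiFi needs a descriptor for each content-repository file it reads, so a
# MergeContent bin of ~900K inputs can blow past a default soft limit of 1024.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")
```

Run this as the same user that runs NiFi; the soft limit is what the JVM actually gets.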
1 ACCEPTED SOLUTION


Hi @Raj B,

I'd certainly recommend using multiple successive MergeContent processors instead of one. If your trigger is size and you want to end up with a 100 MB file, I'd use a first MergeContent to merge the small files into 10 MB files, and then a second one to merge those into a single 100 MB file. That's a typical approach with the MergeContent and SplitText processors to avoid this kind of issue.

Hope this helps.
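To see why chaining helps: a single MergeContent bin over N flow files can need to read up to N content-claim files at once, while two stages only ever hold one bin's worth. A rough back-of-the-envelope sketch (the bin size of 1,100 is an illustrative assumption, not a NiFi default, and this models worst-case input counts, not NiFi internals):

```python
import math

def open_files_bound(total_files: int, bin_size: int) -> tuple[int, int]:
    """Worst-case number of inputs a merge must read at once:
    single-stage vs. two-stage merging (illustrative model only)."""
    single_stage = total_files                         # one bin holds everything
    intermediates = math.ceil(total_files / bin_size)  # outputs of stage one
    two_stage = max(bin_size, intermediates)           # largest bin at either stage
    return single_stage, two_stage

single, two = open_files_bound(1_200_000, 1_100)
print(single, two)
```

Picking a bin size near the square root of the total keeps both stages comfortably under a typical descriptor limit.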


8 REPLIES

Guru

Expert Contributor

@Sonu Sahi, thanks; I'm going to try what @Pierre Villard suggested first before I go this route.


Expert Contributor

@Pierre Villard thanks, I'll give it a shot.

Expert Contributor

@Pierre Villard, chaining 2 MergeContent Processors, as you suggested, worked for me; thank you.

New Contributor

@pvillard How does this work exactly? I'm having issues segmenting large files as well. When I split them, do I do it multiple times, or just once and then recombine them successively? Thanks for your help!

Master Guru

Expert Contributor

@tspann, thank you