NiFi stream using ListenHTTP processor creates too many files

A new file is being created for every JSON object. I believe that's too much and will create too many small files in HDFS (I am using the PutHDFS processor for that).

Is it alright? Isn't it a bad idea to have too many small files in HDFS? Is there another way to get around this?

1 ACCEPTED SOLUTION

Simran, you can merge single JSON objects into a larger file before you put it to HDFS.

There is a dedicated processor for this: MergeContent

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.2.0/org.apache...

The processor also lets you configure how many JSON objects should be merged into a single file, through the property 'Minimum Number of Entries'.
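To make the idea concrete, here is a minimal Python sketch of what "merge at least N entries into one file" means. It is only an illustration of the batching concept, not NiFi's actual implementation: the function name `merge_json_objects`, the `min_entries` parameter, and the newline-delimited output format are all assumptions made for this example. Unlike a real flow, the sketch simply flushes whatever is left over at the end of the input.

```python
import json
from typing import Iterable, Iterator, List


def merge_json_objects(objects: Iterable[dict], min_entries: int) -> Iterator[str]:
    """Yield newline-delimited JSON payloads.

    Each payload holds min_entries objects, except the last one,
    which holds whatever is left over (hypothetical helper, not a NiFi API).
    """
    if min_entries < 1:
        raise ValueError("min_entries must be at least 1")

    batch: List[str] = []
    for obj in objects:
        batch.append(json.dumps(obj))
        # Emit one merged payload as soon as the batch is large enough.
        if len(batch) >= min_entries:
            yield "\n".join(batch)
            batch = []

    # Flush the remainder so no object is lost (a simplification for this sketch).
    if batch:
        yield "\n".join(batch)


# Five incoming JSON objects become three payloads instead of five files.
events = [{"id": i} for i in range(5)]
merged_files = list(merge_json_objects(events, min_entries=2))
```

With a larger value, such as a few thousand entries per payload, far fewer and larger files would reach HDFS.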

As a side note, when you have a processor on your canvas, you can right-click it and choose 'Usage' to display the processor's documentation.

Hope that helps.
