NIFI stream using ListenHttp Processor creates too many files

A new file is created for every JSON object. I think this will produce too many small files in HDFS (I am using the PutHDFS processor to write them).

Is this acceptable? Isn't it a bad idea to have so many small files in HDFS? Is there a way around this?

1 ACCEPTED SOLUTION

Simran, you can merge the individual JSON objects into a larger file before you write it to HDFS.

There is a dedicated processor for this: MergeContent

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.2.0/org.apache...

The processor also has a property, 'Minimum Number of Entries', that lets you specify how many JSON objects should be merged into a single file.
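To make the batching idea concrete, here is a minimal Python sketch of what merging by a minimum entry count looks like. It is only an illustration and not NiFi code: the function name `merge_json_objects`, the `min_entries` parameter, and the newline separator are assumptions chosen for the example, and the real processor has further settings that are not modeled here.

```python
import json
from typing import Any, Dict, List, Tuple


def merge_json_objects(
    objects: List[Dict[str, Any]], min_entries: int
) -> Tuple[List[str], List[Dict[str, Any]]]:
    """Group JSON objects into newline-delimited batches.

    Hypothetical helper: emits one merged string each time the current
    bin reaches `min_entries`, and returns any leftover objects that
    have not yet filled a bin.
    """
    if min_entries < 1:
        raise ValueError("min_entries must be at least 1")

    merged_files: List[str] = []
    current_bin: List[Dict[str, Any]] = []

    for obj in objects:
        current_bin.append(obj)
        if len(current_bin) >= min_entries:
            # One JSON object per line in the merged output.
            merged_files.append("\n".join(json.dumps(o) for o in current_bin))
            current_bin = []

    return merged_files, current_bin


if __name__ == "__main__":
    events = [{"id": i} for i in range(7)]
    files, pending = merge_json_objects(events, min_entries=3)
    print(len(files), "merged files;", len(pending), "object(s) still waiting")
```

With seven incoming objects and a minimum of three, the sketch produces two merged files and leaves one object waiting. As far as I understand it, in NiFi a partially filled bin similarly waits for more flow files, or is flushed once 'Max Bin Age' expires if that property is set.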

As a side note, when you have a processor on your canvas, you can right-click it and choose 'Usage' to display its documentation.

Hope that helps.
