NiFi stream using ListenHTTP processor creates too many files
Labels: Apache NiFi
Created ‎06-01-2017 10:00 AM
A new file is being created for every JSON object. I believe that is too many and will leave a lot of small files in HDFS (I am using the PutHDFS processor to write them).
Is this alright? Isn't it a bad idea to have too many small files in HDFS? Is there a way around this?
Created ‎06-01-2017 11:21 AM
Simran, you can merge the individual JSON objects into a larger file before writing it to HDFS.
There is a dedicated processor for this: MergeContent.
The processor has a property, 'Minimum Number of Entries', that sets how many JSON objects must be collected before they are merged into a single file.
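To make the effect of 'Minimum Number of Entries' concrete, here is a minimal Python sketch of the batching idea. It is not NiFi code and does not reproduce how MergeContent is implemented. The function name `merge_json_objects`, the newline separator, and the sample events are all assumptions chosen for illustration. It also simplifies by closing each batch at exactly the minimum count, whereas a real bin may hold more.

```python
import json
from typing import Iterable, Iterator, List


def merge_json_objects(objects: Iterable[dict], min_entries: int) -> Iterator[str]:
    """Yield newline-delimited JSON batches holding `min_entries` objects each.

    Hypothetical helper: it only mimics the idea of collecting several
    small records before writing one larger file.
    """
    batch: List[str] = []
    for obj in objects:
        batch.append(json.dumps(obj))
        if len(batch) >= min_entries:
            # Enough entries collected: emit one merged "file".
            yield "\n".join(batch)
            batch = []
    if batch:
        # Leftover entries that never reached the minimum. In this sketch
        # they are flushed at the end of the input.
        yield "\n".join(batch)


# Ten single-object events, as ListenHTTP might receive them one by one.
events = [{"id": i} for i in range(10)]
merged_files = list(merge_json_objects(events, min_entries=4))

# Ten objects become three merged files (4 + 4 + 2) instead of ten small ones.
print(len(merged_files))
```

In an actual flow the partial last batch would wait in the queue until enough entries arrive, so it is worth checking the MergeContent 'Usage' page for a bin age setting if records arrive slowly.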
As a side note, when you have a processor on your canvas, you can right-click it and choose 'Usage' to display its documentation.
Hope that helps.
