Created 03-28-2024 10:51 PM
We have a requirement to merge files in a specific HDFS path. The merged file should also be created in the same HDFS path but with a different name. Could you please suggest how we can achieve this using NiFi processors?
Created 03-29-2024 06:51 AM
@s198 NiFi has no ability to merge files remotely. NiFi would need to consume all the files (ListHDFS --> FetchHDFS), then merge the content of those FlowFiles (MergeContent or MergeRecord), then use UpdateAttribute to set the desired filename on the merged file, and finally write the merged file back to HDFS using the PutHDFS processor.
If you are using a NiFi cluster, you would need to do all of this merging on one node of the cluster, because each NiFi node can only execute against the FlowFiles present on that specific node.
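To make the sequence above easier to follow, here is a minimal Python sketch of the same steps. It is not NiFi code and does not talk to HDFS: the `FlowFile` class, the in-memory `store` dictionary standing in for HDFS, the newline demarcator, and all function names are assumptions made purely for illustration. In a real flow each step is a processor configured in the NiFi UI.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a NiFi FlowFile: content bytes plus attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def list_and_fetch(store, directory):
    """Rough analogue of ListHDFS --> FetchHDFS (non-recursive).

    `store` is a hypothetical dict mapping full paths to file bytes.
    """
    prefix = directory.rstrip("/") + "/"
    flowfiles = []
    for path in sorted(store):
        name = path[len(prefix):]
        # Keep only files directly inside `directory`.
        if path.startswith(prefix) and name and "/" not in name:
            flowfiles.append(
                FlowFile(store[path], {"path": directory, "filename": name})
            )
    return flowfiles


def merge_content(flowfiles, demarcator=b"\n"):
    """Rough analogue of MergeContent: concatenate all contents into one."""
    merged = demarcator.join(ff.content for ff in flowfiles)
    return FlowFile(merged, {"merge.count": str(len(flowfiles))})


def merge_directory(store, directory, merged_name):
    """Fetch, merge, rename, and write back to the same directory."""
    flowfiles = list_and_fetch(store, directory)
    merged = merge_content(flowfiles)
    # UpdateAttribute step: give the merged FlowFile its new filename.
    merged.attributes["filename"] = merged_name
    # PutHDFS step: write the merged content back to the same path.
    store[directory.rstrip("/") + "/" + merged_name] = merged.content
    return merged


if __name__ == "__main__":
    hdfs = {"/data/in/a.txt": b"one", "/data/in/b.txt": b"two"}
    result = merge_directory(hdfs, "/data/in", "merged.txt")
    print(result.attributes, sorted(hdfs))
```

Note that in this sketch the source files stay in place and the merged file lands next to them. In a real flow you would probably want to make sure the merged file is not picked up again by a later listing, for example by filtering on the filename.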
Please help our community thrive. If any of the suggestions or solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them.
Thank you,
Matt
Created 04-01-2024 02:44 AM
Thanks a lot @MattWho