Member since 03-01-2017
30 Posts
2 Kudos Received
0 Solutions
10-08-2022
09:16 AM
Hi, I am using NiFi 1.14.0, installed with Podman on OpenShift. The problem I am facing is that after some time NiFi gets stuck and is unable to serve any requests. In addition, new FlowFiles are not being generated and threads remain stuck on running processors. During initial debugging with JavaMelody, I found that the 'Provenance Repository maintenance' and 'Lucene Index' threads are in the Blocked state. Please assist with this. Thanks.
Labels: Apache NiFi
04-27-2020
12:57 PM
The answer from @mpayne is correct. One caveat: setting the header in MergeContent does not add a line break at the end of the header and the records. Since Expression Language (EL) is supported, please include a line break at the end.
05-16-2017
01:25 PM
@Gaurav Jain Was I able to successfully answer your question? If so, please mark the answer as accepted. Thank you, Matt
05-09-2017
12:36 PM
1 Kudo
@Gaurav Jain Ideally, load-balancing would be handled by the systems pushing data to your NiFi. When that is not possible and you are forced to ingest all data on a single node in your NiFi cluster, load-balancing must be handled via a dataflow implementation. Using a Remote Process Group (RPG) is the most common way to redistribute already ingested data across all nodes in a cluster, but you can also use multiple PostHTTP processors (one for every node in your cluster) and a single ListenHTTP processor to build a FlowFile distribution dataflow. See the following for more info: https://community.hortonworks.com/articles/16120/how-do-i-distribute-data-across-a-nifi-cluster.html Thanks, Matt
05-11-2017
04:13 PM
@Gaurav Jain NiFi does not redistribute FlowFiles between nodes behind the scenes at this time. Any redistribution of FlowFiles between nodes in a cluster has to be done programmatically through your dataflow design, using components (processors such as PostHTTP to ListenHTTP, or an RPG) to push FlowFiles to other nodes. Thanks, Matt
05-10-2017
12:32 PM
@Gaurav Jain You can build into your dataflow the ability to redistribute FlowFiles between your nodes. Below are just some of the benefits NiFi clustering provides:
1. Redundancy - You don't have a single point of failure. Your dataflows will still run even if a node is down or temporarily disconnected from your cluster.
2. Scalability - You can scale out your cluster by adding nodes at any time.
3. Ease of Management - Often a dataflow or multiple dataflows are constructed within the NiFi canvas. The volume of data may eventually push the limits of your hardware, requiring additional hardware to support the processing load. You could stand up another standalone NiFi instance running the same dataflow, but then you have two separate dataflows/canvases to manage. Clustering allows you to make a change in only one UI and have that change synced across multiple servers.
4. Provides Site-To-Site for load-balanced data delivery between NiFi end-points.
As you design your dataflows, you must take into consideration how the data will be ingested.
- Are you running a listener of some sort on every node? In that case, source systems can push data to your cluster through some external load-balancer.
- Are you pulling data into your cluster? Are you using a cluster-friendly source like JMS or Kafka, where multiple NiFi nodes can pull data at the same time? Are you using non-cluster-friendly protocols to pull data, like SFTP or FTP? (In cases like this, load-balancing should be handled through a List<protocol> --> RPG Input port --> Fetch<protocol> model.)
NiFi has data HA on its future roadmap, which will allow other nodes to pick up work on the data of a down node. Even when this is complete, I do not believe it will do any behind-the-scenes data redistribution. Thanks, Matt
03-01-2017
02:27 PM
2 Kudos
Evaluate the MergeContent processor: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.MergeContent/index.html.
03-07-2017
02:05 AM
3 Kudos
@Gaurav Jain There are many ways to do it. Assuming that you would use NiFi, you could build a flow starting with a GetFile processor pointing to a folder that holds all your CSV files, followed by MergeContent and PutFile processors. Otherwise, it could be even simpler without NiFi. If your files have the same structure and each has a header, a simple shell script can extract the header from one file, remove the header from the others, and merge the content. You can easily do all this using the sed Unix command.
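For anyone who prefers a script over sed, here is a rough Python sketch of the same non-NiFi idea: keep the header from the first file and skip it in the rest. This is only an illustration, not part of the original answer. The function name merge_csv_files, the "*.csv" pattern, and the rule that a mismatching header is an error are all my own assumptions.

```python
import csv
from pathlib import Path


def merge_csv_files(input_dir, output_file, pattern="*.csv"):
    """Concatenate CSV files, writing the header row only once.

    Hypothetical helper: assumes every input file starts with the same header.
    Returns the number of data rows written.
    """
    output_path = Path(output_file).resolve()
    # Sort for a stable order and never read the output file as an input.
    paths = sorted(
        p for p in Path(input_dir).glob(pattern) if p.resolve() != output_path
    )
    header = None
    rows_written = 0
    with open(output_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # empty file, nothing to merge
                if header is None:
                    header = file_header
                    writer.writerow(header)  # header from the first file only
                elif file_header != header:
                    raise ValueError(f"Header mismatch in {path}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Calling merge_csv_files("/path/to/csv/folder", "/path/to/merged.csv") (both paths are placeholders) writes one combined file. Using the csv module rather than plain line splitting means quoted fields that contain commas or line breaks stay intact.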