Member since: 07-30-2019
Posts: 3427
Kudos Received: 1631
Solutions: 1010
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 459 | 01-13-2026 11:14 AM |
| | 916 | 01-09-2026 06:58 AM |
| | 865 | 12-17-2025 05:55 AM |
| | 926 | 12-15-2025 01:29 PM |
| | 750 | 12-15-2025 06:50 AM |
05-10-2017
12:32 PM
@Gaurav Jain You can build into your dataflow the ability to redistribute FlowFiles between your nodes. Below are just some of the benefits NiFi clustering provides:

1. Redundancy - You don't have a single point of failure. Your dataflows will still run even if a node is down or temporarily disconnected from your cluster.
2. Scalability - You can scale out the size of your cluster by adding additional nodes at any time.
3. Ease of management - Oftentimes a dataflow or multiple dataflows are constructed within the NiFi canvas. The volume of data may eventually push the limits of your hardware, necessitating additional hardware to support the processing load. You could stand up another standalone NiFi instance running the same dataflow, but then you would have two separate dataflows/canvases to manage. Clustering allows you to make a change in only one UI and have that change synced across multiple servers.
4. Site-to-Site - Provides load-balanced data delivery between NiFi endpoints.

As you design your dataflows, you must take into consideration how the data will be ingested:

- Are you running a listener of some sort on every node? In that case, source systems can push data to your cluster through some external load balancer.
- Are you pulling data into your cluster? Are you using a cluster-friendly source like JMS or Kafka, where multiple NiFi nodes can pull data at the same time? Or are you using non-cluster-friendly protocols to pull data, like SFTP or FTP? In cases like the latter, load balancing should be handled through the List&lt;protocol&gt; --> RPG --> Input port --> Fetch&lt;protocol&gt; model (see the sketch below).

NiFi has data HA on its future roadmap, which will allow other nodes to pick up work on the data of a down node. Even when this is complete, I do not believe it will do any behind-the-scenes data redistribution.

Thanks, Matt
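As an illustration, here is a minimal sketch of that List/Fetch redistribution pattern, using SFTP as the protocol. ListSFTP, FetchSFTP, Remote Process Groups, and Input Ports are standard NiFi components; the port name and overall layout are assumptions for illustration:

```
List side (runs on one node only):
  ListSFTP  (Execution: Primary node)
      |            zero-byte FlowFiles describing each remote file
      v
  Remote Process Group  --> points back at this same cluster's
                            "distribute" remote Input Port

Fetch side (exists on every node, so work is spread across the cluster):
  Input Port "distribute"
      |
      v
  FetchSFTP  --> pulls the actual content for each listed file
      |
      v
  ...rest of dataflow
```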
05-10-2017
12:13 PM
1 Kudo
@Muhammad Umar The log is telling you that the port NiFi is trying to use for its HTTP or HTTPS interface is already in use on the server where you have installed NiFi. HDF installed via Ambari uses port 9090 for HTTP and 9091 for HTTPS by default. You will need to change the NiFi configuration to use an available port on your server. Thanks, Matt
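For reference, the relevant keys live in nifi.properties (the port value below is just an example; pick any port that is free on your host):

```properties
# nifi.properties -- web server ports
# Set one of the two; leave the other blank.
nifi.web.http.port=9095     # example: move HTTP off the conflicting port
nifi.web.https.port=        # or set this instead when running secured
```

On an Ambari-managed HDF install, make this change through the Ambari NiFi configuration screen rather than hand-editing the file, so Ambari does not overwrite it on the next restart.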
05-09-2017
04:05 PM
I literally hit the "tab" key on my keyboard.
05-09-2017
03:53 PM
1 Kudo
@Prabir Guha You can use the ReplaceText processor to replace tabs with commas in a text/plain input file. Assume the input file's content is tab-delimited. You could then configure the ReplaceText processor with the Search Value set to a tab and the Replacement Value set to a comma; the resulting content will be comma-delimited (a sample configuration is sketched below). Thanks, Matt
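A minimal sketch of that configuration. Search Value, Replacement Value, Replacement Strategy, and Evaluation Mode are real ReplaceText properties; the sample data is hypothetical:

```
ReplaceText properties:
  Replacement Strategy : Regex Replace
  Search Value         : \t        (a literal tab character also works)
  Replacement Value    : ,
  Evaluation Mode      : Entire text

Example input FlowFile content:        Resulting output content:
  col1<TAB>col2<TAB>col3                 col1,col2,col3
  a<TAB>b<TAB>c                          a,b,c
```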
05-09-2017
03:18 PM
@Sunil Neurgaonkar There are global access policies and component-level access policies. Component-level access policies are set against components (processors, input ports, output ports, Remote Process Groups, etc.). There are no access policies for the icons in the toolbar used to create dataflows.

Component-level access policies can be assigned to process groups and sub-process groups, or they can be assigned to specific components (processors, labels, input ports, output ports, Remote Process Groups, etc.).

If I am understanding you correctly, you want to control which dataflow-building tools specific users have access to, correct? If so, that level of control does not exist. The assumption is that the admin assigns different users the ability to view/modify only those users' assigned process groups. Once they have modify on a process group, they will be able to use all the icons in the dataflow-building toolbar to construct their dataflow. The only exception to that is components marked as restricted (this includes some processors and controller services), which require the user to have been granted the "access restricted components" global access policy.

Such granular control would be challenging to implement without significant changes to NiFi. Take the following template example:

- Templates can contain process groups, sub-process groups, and controller services. What would the expected behavior be if a user tried to instantiate such a template onto the canvas? Fail altogether because it contains components the user (TEST1) is not authorized to create?

Once a dataflow is created, you can set component-level access policies very granularly against specific components rather than against the process group they reside in. While this granular access control would limit a user to viewing/modifying the specific component, the user would not be able to add new components to the process group. Thanks, Matt
05-09-2017
02:24 PM
What policies did you authorize the new user for? A user will not be able to load the canvas unless they at least have the "view the user interface" global access policy assigned to them. Thanks, Matt
05-09-2017
02:20 PM
@Sunil Neurgaonkar You should avoid hand-editing the users.xml file; let NiFi do that for you to avoid typos. Can you share the new error you are seeing? Thanks, Matt
05-09-2017
12:45 PM
@Sertac Kaya Glad you were able to get the performance improvement you were looking for by allowing your NiFi instance access to additional system threads. If this answer helped you get to your solution, please mark it as accepted. Thank you, Matt
05-09-2017
12:41 PM
The "sleep" command is a linux command. The command simply runs and waits the configured amount of time before exiting. Typically this command in linux is found under /usr/bin/sleep. I noticed above is missing the leading "/" . But if it is still not found, try searching your linux for it. It is installed as part of the linux "coreutils" rpm. Matt
05-09-2017
12:36 PM
1 Kudo
@Gaurav Jain Ideally, load balancing would be handled by the systems pushing data to your NiFi. When that is not possible and you are forced to ingest all data on a single node in your NiFi cluster, load balancing must be handled via a dataflow implementation. Using a Remote Process Group (RPG) is the most common solution for redistributing already-ingested data across all nodes in a cluster, but you can also use multiple PostHTTP processors (one for every node in your cluster) and a single ListenHTTP processor to build a FlowFile distribution dataflow (sketched below). See the following for more info: https://community.hortonworks.com/articles/16120/how-do-i-distribute-data-across-a-nifi-cluster.html Thanks, Matt
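A rough sketch of that PostHTTP/ListenHTTP alternative. PostHTTP, ListenHTTP, and DistributeLoad are real NiFi processors, and "contentListener" is ListenHTTP's default base path; the node names and port are placeholders for illustration:

```
Ingest node:                             Every node (including the ingest node):
  <ingest processors>                      ListenHTTP
      |                                      (Listening Port: 9999,
      v                                       Base Path: contentListener)
  DistributeLoad (round-robin)                 |
      |-> PostHTTP -> http://node1:9999/contentListener
      |-> PostHTTP -> http://node2:9999/contentListener      v
      |-> PostHTTP -> http://node3:9999/contentListener    ...rest of dataflow
```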