Member since
07-30-2019
3471
Posts
1642
Kudos Received
1020
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 143 | 06-03-2026 06:06 PM |
| | 458 | 05-06-2026 09:16 AM |
| | 823 | 05-04-2026 05:20 AM |
| | 493 | 05-01-2026 10:15 AM |
| | 620 | 03-23-2026 05:44 AM |
10-19-2017
06:21 PM
@dhieru singh I am not clear on what you mean by "stopping a dataflow resulted in data loss". NiFi does not delete any data when a dataflow is stopped. Data will remain queued between the stopped components until the dataflow is restarted or a user manually purges the data from those queues. There is no notion of a "lock" in NiFi that can be set on a component or set of components. In addition, requiring a double confirmation every time a user wants to stop a component to make an edit may be more annoying than beneficial. That being said, it might be an interesting idea to add the ability to "lock" the current running state of a process group, basically putting all components in that process group into read-only mode until the lock is removed. It might be worth creating an Apache NiFi Jira for such a thing. If your NiFi is secured, you can prevent such issues by taking away users' "modify" access to the components. Without the modify access policy, users can only view the components. They will not be able to change the active state (start, stop, enable, or disable) or the configuration. But this would also require that you re-add "modify" any time a change is desired. Thank you, Matt
10-19-2017
12:45 PM
@Ben Morris NiFi has no explicitly defined maximum for the number of nodes that can be added to a single NiFi cluster. Just keep in mind that the more nodes you add, the more request replication must occur between nodes. For example, if a user is connected to node 1 of 100 nodes and makes a change, that change must be replicated to all 99 other nodes. NiFi is configured with a number of node protocol threads (default 10), so NiFi is only capable of replicating that change to 10 nodes at a time. This value should be increased to accommodate larger clusters. Failing to adjust this value may result in nodes disconnecting because they did not receive the change request fast enough. In addition, you may need to be more tolerant with your connection and heartbeat timeouts.

As far as max data per second, that is a hard number to lay out. It is highly dependent on a number of factors, most of all your particular dataflow implementation. Since NiFi is just a blank canvas on which you build your dataflow, in the end your dataflow design defines your performance/throughput in most cases. This comes down to which processors you use and how they are configured. Assuming you have a well designed and optimized dataflow, you can expect upwards of the following:

*** These numbers will still be affected by the use of some processors. CompressContent, for example, can be CPU intensive over longer periods of time when compressing large files, so it can become a bottleneck.

If you found that this answer addressed your question, please take a moment to click "accept". Thank you, Matt
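To put the replication example above in rough numbers, here is a minimal sketch. It assumes the simplification that a change is pushed out in batches of one request per node protocol thread; the class and method names are hypothetical and this is not how NiFi itself is implemented.

```java
final class ReplicationRounds {
    // Number of sequential batches needed to reach every other node,
    // assuming each protocol thread handles one node per batch (a simplification).
    static int rounds(int clusterSize, int protocolThreads) {
        int targets = clusterSize - 1; // the node the user is connected to needs no replication
        return (targets + protocolThreads - 1) / protocolThreads; // integer ceiling division
    }

    public static void main(String[] args) {
        System.out.println(rounds(100, 10)); // 99 targets with the default 10 threads
        System.out.println(rounds(100, 50)); // same cluster with 50 threads
    }
}
```

Under that assumption, a 100 node cluster with the default 10 threads needs 10 batches per change, while 50 threads cuts it to 2, which is why the thread count is worth raising as the cluster grows.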
09-01-2017
12:22 PM
@Kiem Nguyen I highly recommend starting a new question in Hortonworks community connection for this. Diagnosing what caused your node to disconnect and how to resolve it is a different topic from how to stop a processor with a disconnected node. It would also be helpful to explain what you mean by "overloaded queue" and what makes you feel the size of your queue triggered your node to disconnect. What error did you see in the nifi-app.log on the node that disconnected? Thanks, Matt
08-31-2017
12:24 PM
1 Kudo
@Kiem Nguyen In a NiFi cluster, NiFi wants to ensure consistency across all nodes. You can't have each node in a NiFi cluster running a different version/state of the flow.xml.gz file. In a cluster, NiFi will replicate a request (such as stop x processor(s)) to all nodes. Since a node is not connected, that replication cannot occur. So to protect the integrity of the cluster, the NiFi canvas is essentially read-only while a node is disconnected. Your two options are:
1. Reconnect the disconnected node and then stop your dataflow(s).
2. Drop the disconnected node from your cluster via the "cluster" UI found in the hamburger menu in the upper right corner of the UI. This will make your cluster a 2 of 2 cluster and will return the UI to full functionality. You will then need to restart that dropped node, once fixed, in order to get it to try to join the cluster again.
Thanks, Matt
08-02-2017
01:15 PM
@Hadoop User Please start a new question rather than asking multiple unrelated questions in a single post. This makes it easier for community users to find similar issues. It also helps other members identify unanswered questions so they may address them. This question would likely go unnoticed otherwise. I would need to do some investigation to come up with a good solution, but other community members may have already handled this exact scenario. By starting a new question, all members following the "data-processing" or "nifi-processor" or "nifi-streaming" will get notified of your question. Thanks, Matt
08-01-2017
04:31 PM
1 Kudo
@Hadoop User The ExtractText processor will extract the text that matches your regex and assign it to an attribute on the FlowFile named after the property. The content of the FlowFile remains unchanged. Then you update a FlowFile's attribute and finally use PutHDFS to write the content (which at this time you have not changed at all) to HDFS. If your intent is to write the modified string to HDFS, you need to update the actual content of the FlowFile and not just create and modify attributes. For that use case, you would want to use the ReplaceText processor instead. You would configure ReplaceText similar to the following: The above will result in the actual content of the FlowFile being changed to: [hdfs file="/a/b/c" and' the; '''', "", file is streamed. The location=["/location"] and log is some.log"] Thanks, Matt
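The difference between updating an attribute and updating content can be sketched in plain Java. This is only an illustrative model: `FakeFlowFile`, `extractToAttribute`, and `replaceContent` are hypothetical names, not the NiFi API, and the sketch does not reproduce the actual ReplaceText configuration from the post.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AttributeVsContentSketch {
    // Hypothetical stand-in for a FlowFile: content plus a map of attributes.
    static final class FakeFlowFile {
        String content;
        final Map<String, String> attributes = new HashMap<>();
        FakeFlowFile(String content) { this.content = content; }
    }

    // ExtractText-like behavior: store the first capture group in an attribute, leave content alone.
    static void extractToAttribute(FakeFlowFile ff, String attributeName, Pattern pattern) {
        Matcher m = pattern.matcher(ff.content);
        if (m.find()) {
            ff.attributes.put(attributeName, m.group(1));
        }
    }

    // ReplaceText-like behavior: overwrite the content itself with the first capture group.
    static void replaceContent(FakeFlowFile ff, Pattern pattern) {
        Matcher m = pattern.matcher(ff.content);
        if (m.find()) {
            ff.content = m.group(1);
        }
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("(\\[hdfs.*log\"\\])");
        FakeFlowFile ff = new FakeFlowFile("x [hdfs a.log\"] y");

        extractToAttribute(ff, "extracted", pattern);
        System.out.println(ff.content); // still the original text; only an attribute was added

        replaceContent(ff, pattern);
        System.out.println(ff.content); // now only the matched text; this is what a writer would persist
    }
}
```

In this model, anything that writes `content` downstream sees a change only after `replaceContent` runs, which mirrors why PutHDFS wrote the unmodified data.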
08-01-2017
03:08 PM
@Foivos A The banner is a NiFi core feature and is not tied in any way to the dataflows you select or build on your canvas. You are correct that the best approach for identifying which dataflows on a single canvas are designated dev, test, or production is through the use of "labels". In a secure NiFi setup, you can use NiFi's granular multi-tenancy user authorization to control what components a user can interact with and view. If you use labels, you should set a policy allowing all users to view that specific component, so even if they are not authorized to access the labeled components, they will be able to see why via the label text. Thanks, Matt
08-01-2017
03:00 PM
@Hadoop User Your Java regular expression needs to escape the "[" and "]" since they have reserved meaning in Java regular expressions (they delimit a character class). Try using the following Java regular expression instead: (\[hdfs.*log"\]) Thanks, Matt
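A small sketch of why the escaping matters. The expressions are the one from the post and its unescaped form; the sample line, class name, and method name are made up for illustration. The extra backslashes and `\"` exist only because the pattern is written as a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BracketRegexDemo {
    // Escaped form from the post: matches a literal "[hdfs ... log"]" span.
    static final Pattern ESCAPED = Pattern.compile("(\\[hdfs.*log\"\\])");
    // Unescaped form: the brackets define a character class that matches ONE character.
    static final Pattern UNESCAPED = Pattern.compile("([hdfs.*log\"])");

    // Returns capture group 1 of the first match, or null when nothing matches.
    static String firstMatch(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "INFO [hdfs file=\"/a/b/c\" and log is some.log\"] done"; // hypothetical sample input
        System.out.println(firstMatch(ESCAPED, line));   // the whole bracketed span
        System.out.println(firstMatch(UNESCAPED, line)); // a single character from the class
    }
}
```

The unescaped pattern still compiles, which is what makes the mistake easy to miss: it simply matches one character from the set instead of the bracketed text.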
07-25-2017
03:01 PM
9 Kudos
The intent of this article is to show how NiFi policies in Ranger map to what you would see when using NiFi's default file based authorizer via the NiFi UI. It covers what access each policy grants to the entities (users and servers) that are assigned to it.

There are controller level policies and component level policies in NiFi. The controller level policies are not tied to any specific component uuid. In Ranger those policies will just show as /<some policy name>. These include the following:

| Ranger Policy (Base policies) | NiFi Policies (Hamburger menu) | Ranger permissions description |
|---|---|---|
| /resources *** Note: No policies will be available until this policy is manually added. | N/A | This policy allows Ranger to retrieve a listing of all available policies from NiFi. The server/user from the keystore being used by Ranger must be granted “read” privileges to this resource. |
| /flow * See note [3] below | View the user interface | Read/View - This policy gives users the ability to view the NiFi UI. All users must be granted “read” privileges to this policy or they will not be able to open the NiFi UI. If you are running a NiFi cluster and/or accessing your NiFi via a proxy, you need to grant all nodes and any proxies read access to this policy as well. Write/Modify - N/A |
| /system | View system diagnostics | Read/View - Gives granted users access to the system diagnostics. In a NiFi cluster, nodes will need access as well to display system diagnostic stats returned by other nodes. Write/Modify - N/A |
| /controller | Access the controller | Read/View - Gives granted users and/or NiFi cluster nodes the ability to view: controller thread pool configuration, cluster management page, controller level reporting tasks, controller level controller services. Write/Modify - Gives granted users and/or NiFi cluster nodes the ability to create/modify: controller thread pool configuration, cluster management page, controller level reporting tasks, controller level controller services. |
| /counters | Access counters | Read/View - Gives granted users the ability to view counters. Write/Modify - Gives granted users the ability to modify counters. |
| /provenance | Query provenance | Read/View - Gives granted users the ability to run provenance queries or access provenance lineage graphs. Write/Modify - N/A |
| /restricted-components * See note [1] below | Access restricted components | Read/View - N/A. Write/Modify - Gives granted users the ability to add components to the canvas that are tagged as “restricted”. |
| /proxy * See note [2] below | Proxy user requests | Read/View - Allows proxy servers to send requests on behalf of other users. Write/Modify - Required |
| /site-to-site | Retrieve site-to-site details | Read/View - Allows other NiFi nodes to retrieve site-to-site details about this NiFi. |
| /policies *** This policy has no purpose when using Ranger and does not need to be used. | Access all policies | Read/View - Gives granted users the ability to view existing policies. Write/Modify - Gives granted users the ability to create new policies and modify existing policies. |
| /tenants *** This policy has no purpose when using Ranger and does not need to be used. | Access users/user groups | Read/View - Gives granted users the ability to view currently authorized users and user groups. Write/Modify - Gives granted users the ability to add, delete, and modify existing users and user groups. |
| /parameter-contexts | Access parameter contexts | Read/View - Allows users to view and use ALL existing parameter contexts. Write/Modify - Allows users to create, modify, and delete ALL parameter contexts. |
| /parameter-contexts/<uuid> | Access specific existing parameter context | Read/View - Allows users to view and use a specific existing parameter context. Write/Modify - Allows users to modify or delete a specific parameter context. |

[1] New sub policies were introduced for "/restricted-components" as of HDF 3.2 (Apache NiFi 1.12+). See the following article for details: https://community.cloudera.com/t5/Community-Articles/NiFi-Restricted-Components-Policy-Descriptions/ta-p/249157
[2] All nodes in your NiFi cluster must be assigned to the "/proxy" policy.
[3] All users must at a minimum be assigned to the "/flow" policy in order to view the NiFi UI.

The component level granular policies are based on the component's assigned uuid. For connections, the policies are enforced based upon the processor component the connection originates from. This includes the following policies:

| Ranger component based policies | NiFi component based policies: component | Equivalent NiFi file based authorizer policy | Ranger permissions description |
|---|---|---|---|
| /data-transfer/input-ports/<uuid> | Each NiFi remote input port is assigned a unique <uuid> | Receive data via site-to-site | Both read and write are required and should be granted to the source NiFi servers sending data to this NiFi via this input port. |
| /data-transfer/output-ports/<uuid> | Each NiFi remote output port is assigned a unique <uuid> | Send data via site-to-site | Both read and write are required and should be granted to the source NiFi servers pulling data from this NiFi via this output port. |
| /process-groups/<uuid> | Each NiFi process group is assigned a unique <uuid> | View the component, Modify the component | Read - (allows user to view process group details only) Write - (allows user to start, stop or delete the process group. Users are able to add components inside the process group and add controller services to the process group) |
| /data/process-groups/<uuid> | Each NiFi process group is assigned a unique <uuid> | View the data, Modify the data | Read - (allows user to view data that was processed by components in this process group and list queues) Write - (allows users to empty queues/purge data from queues within the process group) |
| /policies/process-groups/<uuid> *** Not needed when using Ranger | Each NiFi process group is assigned a unique <uuid> | View the policies, Modify the policies | Read - N/A in Ranger. Write - N/A in Ranger |
| /processors/<uuid> | Each NiFi processor is assigned a unique <uuid> | View the component, Modify the component | Read - (allows user to view processor configuration only) Write - (allows user to start, stop, configure and delete the processor) |
| /data/processors/<uuid> | Each NiFi processor is assigned a unique <uuid> | View the data, Modify the data | Read - (allows user to view data processed by this processor and list queues on this processor's outbound connections) Write - (allows users to empty queues/purge data from this processor's outbound connections) |
| /policies/processors/<uuid> *** Not needed when using Ranger | Each NiFi processor is assigned a unique <uuid> | View the policies, Modify the policies | Read - N/A in Ranger. Write - N/A in Ranger |
| /controller-services/<uuid> | Each NiFi controller service is assigned a unique <uuid> | View the component, Modify the component | Read - (allows user to view controller service configuration only) Write - (allows user to enable, disable, configure and delete controller services) |
| /provenance-data/<component-type>/<component-UUID> | Each NiFi component is assigned a unique <uuid> | View provenance | Read - Allows users to view provenance events generated by this component. Write - N/A in Ranger |
| /operation/<component-type>/<component-UUID> | Each NiFi component is assigned a unique <uuid> | Operate the component | Read - N/A in Ranger. Write - Allows users to operate components by changing component run status (start/stop/enable/disable), remote port transmission status, or terminating processor threads. |

There will be a unique policy available for each and every component, based on the specific component's assigned uuid. Component level authorizations are inherited from the parent process group when no specific processor or sub process group component level policy is set. Ranger supports the " * " wildcard when assigning policies.

In a NiFi cluster, all nodes must be granted the ability to view and modify component data in order for users to list or empty queues in processor component outbound connections. With Ranger this can be accomplished by using a wildcard to grant all the NiFi nodes read and write to the "/data/*" NiFi resource.

*** Users should not be given global access to all data, but instead be restricted to the specific process groups they have been granted access to.

*** Also note that at the time of writing, Ranger groups are not supported by NiFi for authorization. UPDATE: Ranger based group support was added as a new feature/capability in HDF 3.1.x
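The inheritance rule described above (a component falls back to its parent process group when it has no policy of its own) can be modeled with a short sketch. Everything here is a hypothetical illustration: the class, its maps, and the component ids are made up, and this is not NiFi's or Ranger's actual authorizer code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class PolicyInheritanceSketch {
    // Hypothetical model: component id -> parent process group id (the root has no entry).
    private final Map<String, String> parentOf = new HashMap<>();
    // Hypothetical model: component id -> users explicitly granted read on that component.
    private final Map<String, Set<String>> explicitReaders = new HashMap<>();

    void addComponent(String id, String parentId) {
        parentOf.put(id, parentId);
    }

    void grantRead(String id, Set<String> users) {
        explicitReaders.put(id, users);
    }

    // Walk up from the component until a policy is found; the nearest policy decides.
    boolean canRead(String user, String componentId) {
        String current = componentId;
        while (current != null) {
            Set<String> readers = explicitReaders.get(current);
            if (readers != null) {
                return readers.contains(user);
            }
            current = parentOf.get(current);
        }
        return false; // no policy anywhere in the chain
    }

    public static void main(String[] args) {
        PolicyInheritanceSketch policies = new PolicyInheritanceSketch();
        policies.addComponent("child-group", "root-group");
        policies.addComponent("processor-1", "child-group");
        policies.grantRead("root-group", Set.of("alice"));

        System.out.println(policies.canRead("alice", "processor-1")); // inherited from root-group

        policies.grantRead("processor-1", Set.of("bob"));
        System.out.println(policies.canRead("alice", "processor-1")); // specific policy now takes over
    }
}
```

In this model a policy set directly on a component replaces, rather than adds to, whatever would have been inherited, so granting access on one processor can remove access that users previously had through the parent group.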
07-25-2017
01:04 PM
@Sanaz Janbakhsh This question revolves around setting the correct file based authorizer permissions for listing and emptying queues. Since you are using Ranger, I suggest starting a new question so as not to add confusion, since the process is different. Thanks, Matt