Member since: 07-30-2019
Posts: 2906
Kudos Received: 1438
Solutions: 844
05-16-2022
08:12 AM
@alperenboyaci Things to keep in mind with NiFi authorizations:

When a user authenticates with NiFi, they are authenticating only with the node they are connecting to. That request then gets replicated to all the other nodes in the cluster, which is where the "proxy user requests" authorization policy comes into play. The node the user authenticated to replicates the request on behalf of that authenticated user to the other nodes (this way the user does not need to authenticate to all nodes). Some requests that get replicated will result in data being returned by other nodes in the cluster so that data can be displayed on the originating node where the user initiated the action. In order for that originating node to display the data, it must be authorized to do so. So while you may have authorized your authenticated user to "view the data" for a specific component, you also need to authorize your hosts for the same. This is why you see the "Insufficient Permissions" dialog telling you that "server 2" is not authorized to "view the data" for the requested component.

In addition to "proxy user requests", nodes need to be authorized for:
1. "view the data" in order to list a connection queue of a component.
2. "modify the data" in order to empty a connection queue of a component.
(Your user would also need these same authorizations.)

Other policies that apply to the NiFi hosts include:
1. "receive data via site-to-site", set on "remote input ports" (set to your local hosts if you are sending FlowFiles to yourself within NiFi; set to the hosts of other external NiFi instances if receiving FlowFiles from a Remote Process Group not on this cluster's hosts).
2. "send data via site-to-site", set on a "remote output port" (same logic as above).
3. "retrieve site-to-site details", set from the global menu --> Policies.

NiFi's component-level authorization policies are set via the "key" icon found in the Operate panel for components added to the canvas. Clicking the key icon displays the available policies; depending on the component selected, some policies may be greyed out if they do not apply to that component. While NiFi allows you to set policies on every component, it is more typical to set policies on a process group, because components (processors, controller services, child process groups) inherit permissions from the parent process group unless authorizations have been set on the component itself.

When you first installed and started NiFi, it generated the root-level Process Group (the canvas you see when you access the UI). With nothing selected on the canvas, the Operate panel displays the name and UUID of the root Process Group (this assumes you have not clicked into a child process group). There are breadcrumbs in the lower left corner to help users navigate the hierarchy of parent --> child process groups (the label furthest to the left is the root process group).

I know this was a bit of extra detail, but I hope it helps you be more successful. If you found any of the supplied responses assisted with your queries, please take a moment to login and click on "Accept as Solution" below each of those posts. As a community we want to make sure we share the path to solutions with other community members through "Accept as Solution" marked responses. Thank you, Matt
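To make the node-level policies above concrete, here is a hypothetical fragment of the kind of entries NiFi keeps in ./conf/authorizations.xml once a node identity is granted "view the data" (read) and "modify the data" (write) on a process group. The UUIDs and identities below are placeholders, and in practice you manage these policies through the UI rather than editing this file by hand:

```xml
<!-- Hypothetical example: node identity granted data policies on one process group.
     "R" = view the data, "W" = modify the data. -->
<policy identifier="policy-uuid-1" resource="/data/process-groups/pg-uuid" action="R">
    <user identifier="node1-user-uuid"/>
</policy>
<policy identifier="policy-uuid-2" resource="/data/process-groups/pg-uuid" action="W">
    <user identifier="node1-user-uuid"/>
</policy>
```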
05-06-2022
12:17 PM
@sam_s0ni Verify that all 3 NiFi nodes are:
1. Running the exact same version of NiFi.
2. Running the exact same version of Java 8 or 11 (11 is only supported by the newest releases).
3. Carrying the exact same contents in the NiFi lib directory, extensions directory, and any custom lib directories.

Then try removing the NiFi work directory from all three nodes. On restart, NiFi will rebuild the contents of the work directory from the lib and extensions folders above. If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
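One way to check item 3 is to compute a single fingerprint of each node's lib and extensions directories and compare the values across nodes. A minimal sketch (the function name and approach are mine, not a NiFi tool):

```python
import hashlib
import os

def dir_fingerprint(path):
    """Return one digest covering every file name and its contents under
    'path', so two directories match only if their contents match."""
    h = hashlib.sha256()
    for root, _dirs, files in sorted(os.walk(path)):
        for name in sorted(files):
            h.update(name.encode())             # fold in the file name
            with open(os.path.join(root, name), "rb") as f:
                h.update(f.read())              # fold in the file bytes
    return h.hexdigest()

# Run e.g. dir_fingerprint("/opt/nifi/lib") on each node and compare.
```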
05-06-2022
12:05 PM
1 Kudo
@Shanoj You really do not want to be using the Apache NiFi 0.x line anymore. That release line is more than 6 years old, and many security bug fixes and improvements have been made since then. Not to mention that the Cluster Manager was a single point of failure: if it goes down, you lose all access to your NiFi. The Apache NiFi 1.x line introduced zero-master clustering, allowing users to access NiFi from any node in the cluster. While I strongly encourage the use of an external ZooKeeper with the Apache NiFi 1.x line, NiFi does offer an embedded ZooKeeper option: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#embedded_zookeeper

I encourage you to read through the following walkthrough documentation: https://nifi.apache.org/docs/nifi-docs/html/walkthroughs.html It includes sections on installing NiFi, securing it, and deploying a cluster, which even covers using the embedded ZooKeeper. If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
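As a rough illustration, enabling the embedded ZooKeeper is driven from nifi.properties on each node. The hostnames below are placeholders for a hypothetical 3-node cluster; see the admin guide link above for the full set of related properties (including zookeeper.properties and the myid file):

```properties
# Start an embedded ZooKeeper server on this node (hypothetical example):
nifi.state.management.embedded.zookeeper.start=true
# All nodes point at the full ZooKeeper connect string:
nifi.zookeeper.connect.string=nifi-node1:2181,nifi-node2:2181,nifi-node3:2181
```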
05-06-2022
11:50 AM
@Ghilani While I agree that record-based processors, which let you work with single FlowFiles containing multiple records, make for more efficient dataflows, what you are doing here should be possible in the interim with a ReplaceText processor using "Literal Replace": here we are searching for the literal pattern _" and replacing it with just ". If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
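In "Literal Replace" mode, ReplaceText performs a plain substring swap with no regex interpretation, equivalent to a simple string replace. A small illustration (the sample record content is hypothetical):

```python
# Literal replace of the two-character sequence _" with ",
# mirroring ReplaceText's "Literal Replace" replacement strategy.
record = '{"first_name_": "jane", "last_name_": "doe"}'
cleaned = record.replace('_"', '"')
# cleaned is now '{"first_name": "jane", "last_name": "doe"}'
```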
05-06-2022
11:20 AM
@Brenigan The ExtractText processor supports 1 to 40 capture groups in a Java regular expression. The user-added property defines the attribute into which the value from capture group 1 will be placed; the processor creates additional attributes by capture group number. So in your case you added a new property with a single capture group which reads 4 digits. With your example content (9999, text), this results in creating the attributes:

number = 9999 <-- always contains the value from capture group 1.
number.1 = 9999 <-- the ".1" signifies the capture group the value came from.

number.0 contains the entire match of the Java regular expression. This attribute is controlled by the "Include Capture Group 0" property; setting it to false will stop this one from being added to your FlowFiles.

To help understand this better, let's look at another example. Suppose your Java regular expression had 2 capture groups instead, and assume "Include Capture Group 0" is set to "true". Now, with the same source text of "9999, text", we would expect to see these attributes added:

number = 9999 <-- always contains the value from capture group 1.
number.0 = 9999, text <-- the complete match from the Java regular expression.
number.1 = 9999 <-- the ".1" signifies the capture group the value came from.
number.2 = text <-- the ".2" signifies the capture group the value came from.

Setting "Include Capture Group 0" to "false" would have resulted in "number.0" not being created; however, number, number.1, and number.2 would still have been created. This functionality allows this processor to handle multiple use cases. If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
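The two-group example can be sketched in code. The regex below is my guess at a pattern matching "9999, text" with two capture groups, and the dictionary stands in for the FlowFile attributes ExtractText would add for a property named "number":

```python
import re

# Hypothetical ExtractText property "number" with a two-capture-group
# Java-style regex, applied to content "9999, text".
m = re.search(r'(\d{4}),\s*(\w+)', '9999, text')
attributes = {
    'number':   m.group(1),  # always the value from capture group 1
    'number.0': m.group(0),  # entire match (Include Capture Group 0 = true)
    'number.1': m.group(1),  # value from capture group 1
    'number.2': m.group(2),  # value from capture group 2
}
```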
05-05-2022
01:09 PM
@alperenboyaci In a NiFi cluster you have multiple nodes, but only one of those nodes is elected as the cluster coordinator by ZooKeeper. When you start only one node in your cluster, you effectively have a cluster with only one node in it, and that same node must then be elected as the cluster coordinator.

When you access the UI of any node in a NiFi cluster, that request must be replicated to all nodes by the elected cluster coordinator. Keep in mind that as a user accessing a node in the NiFi cluster, you have authenticated only to the node whose URL you entered. This means your request to access the canvas gets proxied by the cluster coordinator to the other nodes. When you have only one node, nothing needs to be proxied, since you are authenticated to the only node in the cluster. So for you, both of your nodes (since either can be elected cluster coordinator at any time) must exist as users and be authorized to "proxy user requests".

I see you provided your authorizers.xml file, which shows how your NiFi handles authorizations, and I do see a few configuration issues. It is easiest to read this file from the bottom up:

1. We start with your authorizer, "managed-authorizer", which is configured to use the "file-access-policy-provider".
2. Reading up, we find the "file-access-policy-provider", which is configured to use the "ldap-user-group-provider".
3. The "ldap-user-group-provider" establishes user-to-group associations from your LDAP. This provider does not reference any other providers in this file.

So while you also set up a "composite-configurable-user-group-provider" and a "file-user-group-provider", these are never actually used by the configured "managed-authorizer". To resolve this issue you need to modify the configuration of your "file-access-policy-provider" so that: <property name="User Group Provider">ldap-user-group-provider</property> becomes: <property name="User Group Provider">composite-configurable-user-group-provider</property> That composite provider is configured to make use of both the "file-user-group-provider" and the "ldap-user-group-provider".

I see you have your NiFi node DNs properly configured in the "file-user-group-provider" and "file-access-policy-provider", so you are good there... but... the file-access-policy-provider only generates the ./conf/authorizations.xml file if it does NOT already exist, and since the file-user-group-provider was not being used by your authorizer, it likely did not initially create this file correctly. So I recommend removing the authorizations.xml file (not the authorizers.xml file) so that NiFi recreates it after you fix the configuration issues in authorizers.xml as outlined above. If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
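For reference, a corrected file-access-policy-provider block would look roughly like the sketch below. The DNs are placeholders for your actual admin and node identities, and the property list is abbreviated to the ones discussed above:

```xml
<!-- Hypothetical corrected provider: note the composite user group provider. -->
<accessPolicyProvider>
    <identifier>file-access-policy-provider</identifier>
    <class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
    <property name="User Group Provider">composite-configurable-user-group-provider</property>
    <property name="Authorizations File">./conf/authorizations.xml</property>
    <property name="Initial Admin Identity">cn=admin,ou=users,dc=example,dc=com</property>
    <property name="Node Identity 1">cn=nifi-node1,ou=servers,dc=example,dc=com</property>
    <property name="Node Identity 2">cn=nifi-node2,ou=servers,dc=example,dc=com</property>
</accessPolicyProvider>
```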
04-18-2022
05:54 AM
@Neil_1992 I agree that the first step here is to increase the open file limit for the user that owns your NiFi process. Check your current limit by becoming the user that owns the NiFi process and executing the "ulimit -a" command. You can also inspect the /etc/security/limits.conf file. NiFi can open a very large number of files: the heavier the FlowFile load, the larger the dataflows, the more concurrent tasks, etc., the more open file handles. I recommend setting the ulimit to a very large value like 999999, restarting NiFi, and seeing if your issue persists. If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
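Besides "ulimit -a", you can read the same limit programmatically. A small sketch (Unix-only, since it uses the stdlib resource module):

```python
import resource

# Soft/hard open-file limits for the current process; a JVM started by
# the same user inherits the same limits.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open files: soft={soft} hard={hard}")
```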
03-09-2022
11:48 AM
1 Kudo
@Onkar_Gagre Is the $.name field unique for every record, or do batches of records share the same $.name value? If they are not unique, did you consider using the ConsumeKafkaRecord processor feeding a PartitionRecord processor to split your records by common name values? This would still allow you to work with batches of records rather than an individual record per FlowFile. It might also be helpful if you shared the details of your end-to-end use case, as that may give folks the ability to offer even more dataflow design options. Thanks, Matt
03-09-2022
11:30 AM
1 Kudo
@sachin_32 You can accomplish this by utilizing the "Advanced UI" capability found in the UpdateAttribute processor. The Advanced UI allows you to create Rules (think of these as an if/then capability). So you would set up 3 rules:
1. If the current date falls on Mon - Fri, do X.
2. If the current date falls on Sat, do nothing.
3. If the current date falls on Sun, do Y.

See the Expression Language guide for the functions used below. I created 3 rules (Day1-5, Day6, and Day7). Once you create a Rule, you need to provide a Condition (this is your boolean "if" statement). In this case I use it to determine the current day of the week (with 1 = Monday and 7 = Sunday) and check whether the day falls before or after Saturday in the current week. If a rule's Condition resolves to boolean "true", then the configured Actions (the "then" statement) are evaluated.

For my "Day1-5" rule, I set:
Condition: ${now():format('u'):lt(6)}
Action: ${now():toNumber():minus(${now():format('u'):plus(1):multiply(86400000)}):toDate():format("EEE, dd MMM yyyy")}

For my "Day6" rule, I set:
Condition: ${now():format('u'):equals(6)}
Action: ${now():format('EEE, dd MMM yyyy')}

For my "Day7" rule, I set:
Condition: ${now():format('u'):gt(6)}
Action: ${now():toNumber():minus(86400000):toDate():format("EEE, dd MMM yyyy")}

About the above:
- The "now()" function returns the current date.
- 86400000 is the number of milliseconds in 1 day.
- For Day1-5, I get the current date, convert it to milliseconds using the "toNumber()" function, and subtract a multiple of one day's milliseconds based on the current day of the week.
- For Day6, I do nothing other than reformat the current day's date.
- For Day7, I just subtract one day, or 86400000 milliseconds.

No matter which rule is applied, the final date I write to an attribute named "PreviousSaturday" on the FlowFile is formatted using the Java simple date format "EEE, dd MMM yyyy". Example: "Sat, 05 Mar 2022". If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
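The three rules above can be sketched as one function, using the same ISO day-of-week numbering (1 = Monday ... 7 = Sunday) that format('u') returns:

```python
from datetime import date, timedelta

def previous_saturday(d):
    """Mirror the three UpdateAttribute rules for the PreviousSaturday attribute."""
    u = d.isoweekday()            # 1 = Monday ... 7 = Sunday, like format('u')
    if u < 6:                     # Day1-5 rule: Mon-Fri, go back u+1 days
        d = d - timedelta(days=u + 1)
    elif u == 7:                  # Day7 rule: Sun, go back 1 day
        d = d - timedelta(days=1)
    # Day6 rule: Sat, keep today's date
    return d.strftime("%a, %d %b %Y")   # same shape as "EEE, dd MMM yyyy"
```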
03-09-2022
09:18 AM
2 Kudos
@Harsh__Tanwar I am not clear on the exact failure you are trying to report on. If the processor produces a Bulletin when the failure to read from Event Hub occurs, you could set up the SiteToSiteBulletinReportingTask and have it send bulletins (of course, it will capture all bulletins being produced by your NiFi) to a Remote Input Port on your NiFi, where you programmatically extract what you need from the bulletin(s) and send an alert, perhaps via a PutEmail processor, or send those bulletins to some external monitoring service to handle. If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
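To show the "extract what you need" step, here is a rough sketch of filtering a batch of bulletin records for Event Hub errors. The field names (bulletinLevel, bulletinSourceName, bulletinMessage) are my assumption about the reporting task's record schema, so verify them against your NiFi version, and the source-name match is a placeholder:

```python
import json

def eventhub_errors(bulletin_json):
    """Return messages of ERROR-level bulletins from EventHub-related components.
    Field names are assumptions; check your SiteToSiteBulletinReportingTask output."""
    bulletins = json.loads(bulletin_json)
    return [b["bulletinMessage"]
            for b in bulletins
            if b.get("bulletinLevel") == "ERROR"
            and "EventHub" in b.get("bulletinSourceName", "")]
```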