Member since: 07-30-2019
Posts: 3400
Kudos Received: 1621
Solutions: 1003

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 156 | 12-05-2025 08:25 AM |
| | 283 | 12-03-2025 10:21 AM |
| | 563 | 11-05-2025 11:01 AM |
| | 427 | 11-05-2025 08:01 AM |
| | 816 | 11-04-2025 10:16 AM |
06-13-2017
03:21 PM
@Oleksandr Solomko You can see where these files are queued via the Summary UI. Once the Summary UI opens, select the "CONNECTIONS" tab. You can sort on any column by clicking that column. Once you have found the row for your queued connection, click the "view connection details" icon on the far right side of the row. This will pop open a new UI that shows the queue breakdown per node in the cluster. This will help you identify whether you are having a cluster-wide issue here or whether it is localized to one specific node. If it is just one node with all this queued data, you could manually disconnect that node from your cluster. Then go directly to the URL for that disconnected node and see if you can empty the queue there. Check for ERROR or WARN logs specifically in that node's nifi-app.log, nifi-user.log, and nifi-bootstrap.log. Also, what OS and Java version are you running? Thanks, Matt
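P.S. One quick way to scan those three logs from a shell (a sketch; /opt/nifi/logs is an assumed install path, adjust it to yours):

grep -E "ERROR|WARN" /opt/nifi/logs/nifi-app.log /opt/nifi/logs/nifi-user.log /opt/nifi/logs/nifi-bootstrap.log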
06-13-2017
12:49 PM
1 Kudo
@forest lin Backpressure is not used to control the data rate in your dataflow. The intent of the backpressure settings on connections is to control the amount of queued data allowed. Both backpressure settings are "soft" limits: once backpressure kicks in on a connection, the processor feeding that connection is no longer allowed to run.

So in your case above, you have backpressure set to 5 objects (FlowFiles) or 5 KB of content. Since your queue was empty, no backpressure was being applied when the 37.05 MB FlowFile arrived at your ConvertCSVToAvro processor, so that processor was allowed to run. That one FlowFile was processed through and placed on the outbound connection. It is at that point that backpressure kicked in, because you exceeded one of your backpressure settings. The ConvertCSVToAvro processor will now be prevented from running until the queue drops back below 5 FlowFiles or 5 KB of queued data.

If all your processors are processing FlowFiles rapidly, backpressure will be applied only sparsely. Also keep in mind that, for efficiency, some processors work on batches of FlowFiles. With a backpressure object threshold of 5, you may therefore see a queue with more than 5 FlowFiles: the whole batch is placed on the outbound queue, and the processor that did the batch processing is then not allowed to run again until that outbound connection drops back below 5 FlowFiles.

The ControlRate processor is what allows you to actually control the throughput of a dataflow. It does not slow the processing itself; instead, data is allowed to queue on its input side and, based on its configured settings, only x number of FlowFiles (or x amount of data) is let through per y amount of time. Let's say it is configured to let 5 KB of data through every 1 minute. If you feed it a 37 MB FlowFile, it does not transfer just pieces of that FlowFile; it will feed through the entire 37 MB FlowFile and then not allow another FlowFile through until the average data per 1 minute is back down to 5 KB.

Because of how the above works, data can continue to queue in front of ControlRate. This is where the backpressure settings become important, to stop upstream processors from running. You can set backpressure all the way upstream to your data ingest processors so they stop accepting new FlowFiles.

Thanks, Matt
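To make the numbers above concrete, here is roughly what the two configurations would look like (a sketch; the property names are the standard connection and ControlRate settings, and the values are just the example figures from this answer):

Connection backpressure (your example):
    Back Pressure Object Threshold    : 5
    Back Pressure Data Size Threshold : 5 KB

ControlRate ("5 KB through every 1 minute"):
    Rate Control Criteria : data rate
    Maximum Rate          : 5 KB
    Time Duration         : 1 min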
06-12-2017
02:12 PM
@Justin R. Is this a NiFi cluster installation with multiple nodes running on the same host? If that is the case, whichever node manages to bind to the port first wins; all other nodes on the same host will report that the port is already in use. Matt
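A quick check from a shell on that host (a sketch; 8080 stands in for whichever port is being reported as in use) will show which process already holds the port:

netstat -tnlp | grep 8080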
06-12-2017
01:18 PM
@Ahmad Mehr When you start NiFi, the UI does not become available until the application has completed loading.

/bin/nifi.sh status

The above command simply shows that the application is running; it does not indicate that the UI is available yet. To verify that NiFi has completed the startup process and the UI is now available, you will need to look in the nifi-app.log for the following lines:

2017-06-12 09:16:16,029 INFO [main] org.apache.nifi.web.server.JettyServer NiFi has started. The UI is available at the following URLs:
2017-06-12 09:16:16,029 INFO [main] org.apache.nifi.web.server.JettyServer http://<HOSTNAME>:8075/nifi
2017-06-12 09:16:16,031 INFO [main] org.apache.nifi.BootstrapListener Successfully initiated communication with Bootstrap
2017-06-12 09:16:16,031 INFO [main] org.apache.nifi.NiFi Controller initialization took 14617467433 nanoseconds.

Until you see these log lines, the UI will not be accessible. You can also run the following Linux command to see if "something" is listening on port 8075 yet:

netstat -ant|grep LISTEN|grep 8075

Thank you, Matt
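P.S. If you want to block until that UI-ready line shows up, a one-liner sketch (assumes it is run from the NiFi install directory):

tail -F logs/nifi-app.log | grep -m 1 "NiFi has started"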
06-09-2017
03:51 PM
3 Kudos
@Eric Lloyd Input and output ports are designed to send or receive data from one level up. When an input or output port is added at the root canvas level, "one level up" is outside of the NiFi instance itself, in other words, another system. You will also notice that ports added to the root canvas are rendered a little differently. There is an open Apache Jira on this subject; feel free to add your comments and use case to it: https://issues.apache.org/jira/browse/NIFI-2933 The current feeling is that adding remote input and output ports should be left to the system administrator. This is because, for a secured connection, the admin must add the connecting systems as new users and authorize them to access these ports. Users are not typically granted this level of access. Thanks, Matt
06-08-2017
04:08 PM
@Daniel Frank If you do not include @Matt Clarke in your response, I do not get an email notification. I am not following how you use the filename and path of file (B) to parse a totally different file (C) from the filesystem. Have you looked at the FetchFile processor? It accepts a FlowFile as input and uses attributes set on the incoming FlowFile to specify which file to fetch and from where. So you could GetFile (B), then extract what you need from file (B) into attributes that FetchFile can use to get file (C). FetchFile will stream the content of file (C) into the FlowFile originally belonging to file (B); however, the resulting FlowFile will retain all the FlowFile attributes that already existed on FlowFile (B). Thanks, Matt. If you found this answer addressed your question, please mark it as accepted to close out this thread in the community.
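A rough sketch of that flow (the "c.path" attribute name is hypothetical; "File to Fetch" is the FetchFile property that tells it what to retrieve):

GetFile            picks up file (B)
  -> ExtractText   pulls file (C)'s location out of (B)'s content into an attribute, e.g. c.path
  -> FetchFile     File to Fetch = ${c.path}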
06-08-2017
02:18 PM
@Daniel Frank What format is your data in? (Text?) Is all the information you need in the content of these files? The GetFile processor already writes standard attributes (for example, filename, path, and absolute.path) on every FlowFile it creates. You could use the ExtractText processor to read the FlowFile content and extract bits of it to FlowFile attributes. Thanks, Matt
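For example, adding a dynamic property to ExtractText along these lines (a sketch; the property name and regex are hypothetical):

c.path : path=(.*)

would write the regex's first capture group from the content into a "c.path" attribute on the FlowFile.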
06-08-2017
02:05 PM
@Anthony Murphy NiFi is designed to be resilient. It is designed to restore processors to their last known state on startup (that state may be enabled, disabled, started, or stopped). Are you sure these processors were not stopped before the abrupt shutdown/restart of the server occurred? This is odd since you say it only happens occasionally, and I will be honest, this is the first time I have heard of this issue. Is it always the same processors that fail to start? Are the processors that fail to start configured to use any NiFi controller services? If so, are those controller services failing to start as well? Check the nifi-app.log during startup to see if there were any logged ERROR or WARN messages related to these processors or controller services. Thanks, Matt
06-08-2017
12:50 PM
1 Kudo
@Mahmoud Shash There was a bug identified in the controller service UI of HDF 2.1.3. This bug affected users' ability to modify, enable, disable, and delete controller services, and the HDF 2.1.3 release was pulled down because of it. The bug was addressed in HDF 2.1.4; if you upgrade to HDF 2.1.4, you will be able to successfully access the controller services in the CS UI. Thanks, Matt
06-08-2017
12:17 PM
1 Kudo
@Anishkumar Valsalam There are two parts that need to be successful to access NiFi:

1. User authentication: In your case, you are using LDAP to authenticate your users. The NiFi login-identity-providers.xml file is used to configure the ldap-provider. NiFi offers two supported configurable "Identity Strategy" options (USE_DN or USE_USERNAME); USE_DN is the default. With "USE_DN", the full DN returned by LDAP after successfully authenticating a user is used. With "USE_USERNAME", the username entered at login is used. Whichever strategy is chosen, the resulting value is passed through any configured "Identity Mapping Properties" in NiFi before the mapped value is handed to part two. (Review the LDAP settings and Identity Mapping Properties sections in the NiFi Admin Guide for more details on setup.)

2. User authorization: In your case, you are using Ranger for user authorization (the default is NiFi's file-based authorizer). The final value derived from step one above is passed to the configured authorizer to determine which NiFi resources that authenticated user has been granted access to.

Based on your output above, you appear to have two possible options to match your authenticated value with your LDAP-sync'd user in Ranger:

1. Configure an "Identity Mapping Property" in NiFi that extracts just the CN= value from the entire returned DN (see the worked example below). Based on the DN pattern you shared, your pattern mapping would look like this:

nifi.security.identity.mapping.pattern.dn=^CN=(.*?), OU=(.*?), OU=(.*?), OU=(.*?), DC=(.*?), DC=(.*?), DC=(.*?)$
nifi.security.identity.mapping.value.dn=$1

This will return just "anish" from the DN, and that is what will be passed to the authorizer.

2. Change your "Identity Strategy" configuration in your login-identity-providers.xml file to "USE_USERNAME". This assumes the username supplied at login matches exactly the LDAP-sync'd username. Add/modify the following line in your ldap-provider:

<property name="Identity Strategy">USE_USERNAME</property>

Thanks, Matt
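To illustrate what option 1's mapping does (a sketch; the sample DN is hypothetical but follows the pattern above):

Input DN returned by LDAP : CN=anish, OU=users, OU=corp, OU=eng, DC=example, DC=co, DC=com
First capture group ($1)  : anish
Identity passed to Ranger : anish

So the Ranger policy must reference the user as "anish", not as the full DN.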