Member since
07-30-2019
3390
Posts
1617
Kudos Received
999
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 226 | 11-05-2025 11:01 AM |
| | 439 | 10-20-2025 06:29 AM |
| | 579 | 10-10-2025 08:03 AM |
| | 394 | 10-08-2025 10:52 AM |
| | 435 | 10-08-2025 10:36 AM |
05-04-2017
04:44 PM
3 Kudos
@Eric Lloyd The NiFi expression language was written specifically for working with NiFi attributes. You would first need to use the ExtractText processor to move the relevant bits of your content into NiFi FlowFile attributes. Add a new property to your ExtractText processor configured as follows: Note that my regex above has a white space at the end. This regex will result in multiple new FlowFile attributes being created for you: So there is no need to follow up with any substring NiFi expression language manipulation commands. Thanks, Matt
05-04-2017
04:29 PM
2 Kudos
@Prabir Guha You would certainly use the UpdateAttribute processor to do this, with a NiFi expression language statement as follows: ${filename:substringAfterLast('/')} Thanks, Matt
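For anyone who wants to see what that expression does to a value, here is a minimal Python sketch that approximates `substringAfterLast`. It is not NiFi code; the function name and the sample path are made up for illustration, and the handling of an empty delimiter is an assumption.

```python
def substring_after_last(subject: str, delimiter: str) -> str:
    """Rough Python stand-in for NiFi's substringAfterLast (illustration only)."""
    # If the delimiter does not occur, the subject is returned unchanged.
    # An empty delimiter is treated the same way here (an assumption).
    if not delimiter or delimiter not in subject:
        return subject
    # rpartition splits on the LAST occurrence; index 2 is the text after it.
    return subject.rpartition(delimiter)[2]


# Hypothetical filename attribute value that still carries a directory path.
print(substring_after_last("/data/in/report.csv", "/"))
```

A value with no "/" in it comes back unchanged, so the expression is safe to apply to FlowFiles whose filename is already bare.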
05-04-2017
04:22 PM
1 Kudo
@Ninad Patkhedkar You are correct that the file size as it exists on the SFTP server is not written to an attribute on the listed FlowFile. On every FlowFile, NiFi creates a FlowFile property ( fileSize ) which records the size of the content associated with that FlowFile. This FlowFile property is not editable and is updated with the actual size of the fetched content after FetchSFTP. I understand you don't want to actually fetch the data, but only want to get some metadata (including size) about what currently exists on the SFTP server. Interesting idea. I suggest creating an Apache Jira to add an additional FlowFile attribute on list (for example, maybe "sftp.server.file.size"). We have to make sure the attribute name is very descriptive so users do not confuse it with the "fileSize" that already exists on the FlowFile. We can never assume that "fileSize" and "sftp.server.file.size" will be exactly the same, and "fileSize" will change depending on how the content is manipulated as it progresses through a NiFi dataflow. I see a valid use case here: adding this attribute would allow users to make routing decisions on listed files. Perhaps you don't want to fetch any files from the SFTP server if they are larger than XXX in size. Thanks, Matt
05-03-2017
06:37 PM
@uttam kumar I have been unable to reproduce what you are seeing. Just to make sure I am doing the same steps, I have detailed what I have done below:
- I set up an Apache NiFi 1.1.2 standalone unsecured instance.
- With the root process group selected, I uploaded a template:
- I then performed the following actions to upload my template.
- Then I deleted the template I uploaded from my NiFi.
- After I deleted the template, I followed the same procedure above to upload the exact same template again.
This process worked for me every time. I uploaded and deleted the same template repeatedly without issue. Am I doing the same thing you have been doing? Thank you, Matt
05-03-2017
12:44 PM
1 Kudo
@Sertac Kaya FlowFiles are transferred in batches between process groups, but that transfer amounts to updating FlowFile records. Each transfer should take fractions of a millisecond to complete, so many threads should execute per second. This raises the question of whether your flow is thread starved, concurrent tasks have been over-allocated across your processors, your NiFi max timer driven thread count is too low, or your disk IO is very high. I would start by looking at your "Max Timer Driven Thread Count" setting. The default is only 10. By default every component you add to the NiFi canvas uses timer driven threads. The above count restricts how many system threads can be allocated to components at any one time. I set up a simple 4 CPU VM running a default configuration. The number of FlowFiles passed through the connection between process group 1 and process group 2 ranged from 7,084/second to 12,200/second. Thanks, Matt
05-02-2017
02:48 PM
2 Kudos
@Sertac Kaya A few questions come to mind...
1. What kind of processor is feeding the connection with the large queue inside ExampleA?
2. How large is that queue?
The reason I ask is that NiFi uses swapping to help limit JVM heap usage by queued FlowFiles. How swapping is handled is configured in the nifi.properties file:
nifi.queue.swap.threshold=20000
nifi.swap.in.period=5 sec
nifi.swap.in.threads=1
nifi.swap.out.period=5 sec
nifi.swap.out.threads=4
The above shows the NiFi defaults. A few things you can do to improve performance:
1. Set backpressure thresholds on your connections to limit the number of FlowFiles that will queue at any time. Setting the value lower than the swapping threshold will prevent swapping from occurring on the connection. Newer versions of NiFi by default set the FlowFile object threshold on newly created connections to 10,000. Swapping is per connection and not per NiFi instance.
2. Adjust the swap.threshold value to a larger value to prevent swapping. Keep in mind that any FlowFiles not being swapped are held in JVM heap memory. Setting this value too high may result in Out Of Memory (OOM) errors. Make sure you adjust the heap settings for your NiFi in the bootstrap.conf file.
3. Adjust the number of swap in and swap out threads.
Thanks, Matt
05-02-2017
02:28 PM
@uttam kumar What version of NiFi are you running? Is this a standalone NiFi or a NiFi cluster? Is your NiFi secured? Thanks, Matt
05-02-2017
02:18 PM
2 Kudos
@Sanaz Janbakhsh The HDF and HDP stacks may each use a different version of Ranger. NiFi would not have been tested against a newer version of Ranger than what is included in the HDF stack. That being said, there is no reason a single Ranger install could not be used to manage multiple services. The Ranger that is included with HDP will not include the service definition for NiFi, so it would need to be installed manually. The following link discusses how to set this up: http://bryanbende.com/development/2016/04/25/building-a-plugin-for-apache-ranger I do not know what impact manually adding service definitions to HDP will have on HDP upgrades. Will those added service definitions be lost following an upgrade? I would hope not, but I personally have no knowledge in that area. Thanks, Matt
05-02-2017
12:53 PM
@Jatin Kheradiya A couple of things...
1. ZooKeeper is not going to work very well with a single instance running. A ZooKeeper ensemble should have an odd number of servers (3, 5, 7, etc.), with 3 as the minimum needed to keep quorum if one server is lost.
2. When NiFi nodes start, they communicate with ZK to find out who the currently elected cluster coordinator is. They will all request to become the cluster coordinator and an election process will begin. Until this election completes, the nodes will not join the cluster. You should see "election will end" messages in the nifi-app.log while an election is ongoing. There are two properties in the nifi.properties file that control the election process:
nifi.cluster.flow.election.max.candidates=
nifi.cluster.flow.election.max.wait.time=5 mins
By default, candidates is left blank, which means the election will always run the full 5 minutes each time your NiFi cluster is restarted. To reduce how long the election takes to complete, set the candidates property to the number of nodes you have in your cluster. The election will complete once the configured number of candidates have checked in with ZK or 5 minutes have passed. Thanks, Matt
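The two points above can be sketched in a few lines of Python. This is only a model of the arithmetic and of the stop condition described in the post, not NiFi or ZooKeeper code; all function and parameter names are hypothetical.

```python
from typing import Optional


def quorum_size(servers: int) -> int:
    """Smallest strict majority of a ZooKeeper ensemble of the given size."""
    return servers // 2 + 1


def tolerated_failures(servers: int) -> int:
    """How many servers can be lost while a majority is still available."""
    return servers - quorum_size(servers)


def election_complete(checked_in: int, elapsed_seconds: float,
                      max_candidates: Optional[int] = None,
                      max_wait_seconds: float = 300.0) -> bool:
    """Stop condition as described above: enough candidates OR timeout."""
    if elapsed_seconds >= max_wait_seconds:
        return True
    # A blank max.candidates property is modelled as None: only the timer ends it.
    return max_candidates is not None and checked_in >= max_candidates


for n in (1, 2, 3, 4, 5):
    print(n, "servers -> quorum", quorum_size(n), "tolerates", tolerated_failures(n))
```

The loop shows why odd ensemble sizes are preferred: going from 3 to 4 servers raises the quorum size without letting you lose any more servers.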
05-01-2017
04:01 PM
1 Kudo
@kannan chandrashekaran
You are correct that a function does not exist at this time for padding the left or right of a string. That being said, you can easily accomplish string padding using a simple flow consisting of a "RouteOnAttribute" and an "UpdateAttribute" processor. The RouteOnAttribute would contain one routing rule that checks the length of an attribute's value and, if it is not long enough, routes the FlowFile to the UpdateAttribute where you add one character of padding. My rule looks like this: -- "10" is the length I want my attribute to be. -- "test" is the attribute that I am calculating the length of. My UpdateAttribute processor then simply pads the value assigned to "test" with a single character. FlowFiles will continue in this loop until the value of "test" has reached 10 characters in length. Any FlowFile where the value associated with "test" is already 10 characters or longer is just passed on without any change. Thanks, Matt
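To make the loop easier to picture, here is a small Python sketch of the same logic. It is not NiFi configuration; the function name, the "0" padding character, and the choice of left versus right padding are assumptions for illustration.

```python
from typing import Tuple


def pad_in_loop(value: str, target_length: int = 10,
                pad_char: str = "0", left: bool = True) -> Tuple[str, int]:
    """Simulate the RouteOnAttribute / UpdateAttribute loop described above.

    Returns the padded value and how many times the FlowFile went around the loop.
    """
    if len(pad_char) != 1:
        raise ValueError("pad_char must be exactly one character")
    passes = 0
    # RouteOnAttribute step: is the value still shorter than the target?
    while len(value) < target_length:
        # UpdateAttribute step: add exactly one padding character per pass.
        value = pad_char + value if left else value + pad_char
        passes += 1
    # Values already at or beyond the target length fall through unchanged.
    return value, passes


padded, loops = pad_in_loop("42")
print(padded, loops)
```

Each missing character costs one trip around the loop, so the approach is best suited to short, bounded padding.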