Member since: 07-30-2019
Posts: 3426
Kudos Received: 1631
Solutions: 1010
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 388 | 01-13-2026 11:14 AM |
| | 744 | 01-09-2026 06:58 AM |
| | 778 | 12-17-2025 05:55 AM |
| | 839 | 12-15-2025 01:29 PM |
| | 703 | 12-15-2025 06:50 AM |
05-05-2017
04:14 PM
1 Kudo
@ismail patel For now you can force a better load distribution in such a case by doing the following for your 0-byte, low-volume flow: no need to change any configuration on the RouteOnAttribute processor. Simply connect the pre-existing "unmatched" relationship to your RPG. Thanks, Matt
05-05-2017
04:04 PM
1 Kudo
@ismail patel When the Remote Process Group was originally written, the list- and fetch-type processors did not exist and NiFi was primarily being used to process large files. Given that, the design worked pretty well for load balancing. Here is how it works:
1. The RPG connects to the target NiFi cluster.
2. The target NiFi returns a list of available peer nodes along with their current load.
3. The RPG creates a distribution allocation. (You will see that distribution in the logs in the form of percentages: Node 1 50%, Node 2 50%.)
4. The RPG connects to node 1 and sends data from the incoming queue for 5 seconds, then connects to node 2 and sends for 5 seconds.

Let's assume the target is a 4-node cluster and, based on load, the following distribution was calculated:
Node 1 -- 10%
Node 2 -- 20%
Node 3 -- 20%
Node 4 -- 50%
As a result, the RPG would connect in this pattern to send the data: Node 1, Node 2, Node 3, Node 4, Node 2, Node 3, Node 4, Node 4, Node 4, Node 4. With a high-volume dataflow, or a dataflow dealing with larger files, this balances out nicely over time. For a low-volume flow, or a flow dealing with very small files, it does not work so well. In your case, you have a bunch of small files (all 0 bytes), so in 5 seconds every one of them is sent to node 1. There are improvements coming to the RPG to set batch sizes and improve how the distribution of data occurs. Thank you, Matt
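To make the connection ordering concrete, here is a small Python sketch of the interleaving behavior described above. This is an illustration only, not NiFi's actual RPG code: each node's percentage is converted to a number of slots in a 10-connection cycle, and nodes with remaining slots are visited round-robin.

```python
def connection_pattern(weights):
    """Build an interleaved connection order from per-node percentages.

    Sketch of the behavior described above (not NiFi source): each node
    gets round(pct / 10) slots in a 10-connection cycle, and nodes with
    slots remaining are visited in round-robin order.
    """
    remaining = {node: round(pct / 10) for node, pct in weights.items()}
    pattern = []
    while any(remaining.values()):
        for node in weights:
            if remaining[node] > 0:
                pattern.append(node)
                remaining[node] -= 1
    return pattern

weights = {"Node1": 10, "Node2": 20, "Node3": 20, "Node4": 50}
print(connection_pattern(weights))
# ['Node1', 'Node2', 'Node3', 'Node4', 'Node2', 'Node3', 'Node4',
#  'Node4', 'Node4', 'Node4']
```

This reproduces the example pattern above, and it also makes the small-file problem visible: the pattern only governs which node is contacted next, while everything queued during one 5-second window still goes to a single node.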
05-05-2017
12:22 PM
@Ayaskant Das Just wondering if the above was able to resolve your issue. The nifi-user.log screenshot you provided clearly shows that you have reached NiFi and successfully authenticated with the above user DN; however, the user is not authorized to access the NiFi /flow resource. Thank you, Matt In order for users to get notifications of a comment to a post, you must tag them in the response using @<username> (example: @Matt Clarke)
05-04-2017
09:03 PM
@Bharadwaj Bhimavarapu Whether you are using the ConsumeKafka or PublishKafka processors, if Kafka is kerberized you will need to set up a JAAS file in your NiFi which provides the keytab and principal used to establish that secured connection. By default the /etc/krb5.conf will be used, but you can also tell NiFi to use a different krb5.conf file via a property in nifi.properties (nifi.kerberos.krb5.file=). You will need to create a JAAS file (example: kafka-jaas.conf) that contains the following (update to use the appropriate keytab and principal for your user):

KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  keyTab="nifi.keytab"
  serviceName="kafka"
  principal="nifi@DOMAIN";
};

Add the following line to NiFi's bootstrap.conf file (make sure arg number 20 is not already being used; if so, change it to an unused number):

java.arg.20=-Djava.security.auth.login.config=/<path-to>/kafka-jaas.conf

Update the following configuration properties in your ConsumeKafka processor:

SecurityProtocol= SASL_PLAINTEXT
ServiceName= kafka

Basically you are setting up the Kafka client Kerberos environment for your NiFi JVM. If this is a NiFi cluster, you will need to do the above on every node. You will need to restart NiFi for these changes to take effect. Thanks, Matt
05-04-2017
07:43 PM
@Jatin Kheradiya There are a few things that do not look right in your nifi.properties configuration above. On every node the following properties should be configured with the FQDN of the node:
1. nifi.remote.input.host=
2. nifi.cluster.node.address=
3. nifi.web.http.host= or nifi.web.https.host=
I noticed you are configuring nifi.remote.input.host= with the IP of a different node. It is not clear from the above whether you set a value for nifi.web.http.host= or nifi.web.https.host=. If you did not, Java may be resolving your hostname to localhost. This can be problematic for cluster communications, since nodes may end up trying to talk to themselves rather than actually talking to the other nodes. Also make sure that the following ports are open in any firewalls between your nodes:
1. nifi.remote.input.socket.port=
2. nifi.cluster.node.protocol.port=
3. nifi.web.http.port=8080 or nifi.web.https.port=
Also make sure all three of your nodes can talk to ZooKeeper on port 2181. Thanks, Matt
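As a quick sanity check of the port list above, a few lines of Python can verify TCP reachability from each node. This is just a sketch; the hostnames below are placeholders for your actual node FQDNs, and the ports should match what you configured in nifi.properties.

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder hostnames; substitute your actual node FQDNs and configured ports.
for host in ("node1.example.com", "node2.example.com", "node3.example.com"):
    for port in (8080, 2181):  # nifi.web.http.port and ZooKeeper
        print(host, port, "open" if port_open(host, port) else "closed")
```

Run this from each node so you exercise every node-to-node path, not just connections from one machine.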
05-04-2017
05:19 PM
@Eric Lloyd You can still use ExtractText to get all the bits broken out at once by adding multiple new properties. Thanks, Matt
05-04-2017
04:44 PM
3 Kudos
@Eric Lloyd The NiFi Expression Language was written specifically for working with NiFi attributes. You would first need to use the ExtractText processor to move the bits from your content into NiFi FlowFile attributes. Add a new property to your ExtractText processor configured as follows: Note that my regex above has a white space at the end. This regex will result in multiple new FlowFile attributes being created for you: So there is no need to follow up with any substring NiFi Expression Language manipulation. Thanks, Matt
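To illustrate what ExtractText does with capture groups, here is a rough Python sketch. ExtractText applies a Java regex to the FlowFile content and, broadly speaking, writes each capture group to an attribute named after the property plus the group number; the property name, regex, and sample line below are made up for illustration, since the original regex was in a screenshot.

```python
import re

def extract_text(property_name, pattern, content):
    """Sketch of ExtractText behavior: each capture group N becomes
    an attribute named '<property_name>.N'. Illustration only."""
    match = re.search(pattern, content)
    if match is None:
        return {}  # the FlowFile would route to the 'unmatched' relationship
    return {f"{property_name}.{i}": group
            for i, group in enumerate(match.groups(), start=1)}

# Hypothetical sample: three space-delimited fields.
attrs = extract_text("field", r"(\S+) (\S+) (\S+)", "alpha beta gamma")
print(attrs)  # {'field.1': 'alpha', 'field.2': 'beta', 'field.3': 'gamma'}
```

Each resulting attribute can then be used directly in downstream Expression Language, which is why no substring manipulation is needed afterward.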
05-04-2017
04:29 PM
2 Kudos
@Prabir Guha You would certainly use the UpdateAttribute processor to do this, with a NiFi Expression Language statement as follows: ${filename:substringAfterLast('/')} Thanks, Matt
05-04-2017
04:22 PM
1 Kudo
@Ninad Patkhedkar You are correct that the file size as it exists on the SFTP server is not written to an attribute on the listed FlowFile. On every FlowFile, NiFi records a FlowFile property ( fileSize ) with the size of the content associated with that FlowFile. This property is not editable and is updated with the actual size of the fetched content after FetchSFTP. I understand you don't want to actually fetch the data but only want to get some metadata (including size) about what currently exists on the SFTP server. Interesting idea. I suggest creating an Apache Jira to add an additional FlowFile attribute on list (for example, "sftp.server.file.size"). We have to make sure the attribute name is very descriptive so users do not confuse it with the "fileSize" that already exists on the FlowFile. We can never assume that "fileSize" and "sftp.server.file.size" will be exactly the same, and "fileSize" will change depending on how the content is manipulated as it progresses through a NiFi dataflow. I see a valid use case here: adding this attribute would allow users to make routing decisions on listed files. Perhaps you don't want to fetch any files from the SFTP server if they are larger than XXX in size. Thanks, Matt
05-03-2017
06:37 PM
@uttam kumar I have been unable to reproduce what you are seeing. Just to make sure I am doing the same steps, I have detailed what I have done below:
- I set up an Apache NiFi 1.1.2 standalone unsecured instance.
- With the root process group selected, I uploaded a template.
- I then deleted the template I uploaded from my NiFi.
- After I deleted the template, I followed the same procedure to upload the exact same template again.
This process worked for me every time. I uploaded and deleted the same template repeatedly without issue. Am I doing the same thing you have been doing? Thank you, Matt