Member since
07-30-2019
3386
Posts
1617
Kudos Received
998
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 337 | 10-20-2025 06:29 AM |
| | 477 | 10-10-2025 08:03 AM |
| | 343 | 10-08-2025 10:52 AM |
| | 369 | 10-08-2025 10:36 AM |
| | 400 | 10-03-2025 06:04 AM |
05-30-2016
01:18 PM
Are there any WARN/ERROR messages being produced in the nifi-app.log or nifi-bootstrap.log?
05-27-2016
12:53 PM
Keep in mind that FlowFile Attributes live in memory. Loading a FlowFile Attribute with the entire content of the file is going to have an impact on heap usage in your flow. That being said, there are two things to consider when building dataflows like this:
1. Increase the size of the available heap for the NiFi application. Heap settings for NiFi are configured in the bootstrap.conf file and are very small by default (512 MB):
# JVM memory settings
java.arg.2=-Xms512m
java.arg.3=-Xmx512m
2. Take into consideration the data volumes you will be working with in this particular dataflow. To help prevent out-of-memory errors in NiFi, there is a threshold on how many FlowFiles can queue on a connection before FlowFile attributes are swapped out of heap to disk. The default configuration in the nifi.properties file is 20,000 (nifi.queue.swap.threshold=20000). This is per connection, not per flow. So if the FlowFiles whose content you extracted into attributes begin to queue on numerous connections, you run the risk of hitting the out-of-memory condition sooner. You can decrease this value so swapping happens sooner, but that will in turn have an impact on performance. I would start by increasing the heap memory for your NiFi and then go from there.
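To get a feel for why the per-connection threshold matters, here is a rough back-of-the-envelope sketch. The function name, the 3 connections, and the 10 KiB of attribute data per FlowFile are made-up illustration values, and the estimate ignores JVM object overhead, so treat the result as a loose lower bound rather than a measurement.

```python
def worst_case_attribute_bytes(connections, attr_bytes_per_flowfile, swap_threshold=20000):
    """Rough lower bound on heap held by FlowFile attributes just before swapping starts.

    Assumes every connection fills up to the swap threshold and every
    FlowFile carries the same amount of attribute data (both hypothetical).
    """
    return connections * swap_threshold * attr_bytes_per_flowfile


# Hypothetical flow: 3 backed-up connections, 10 KiB of extracted content per FlowFile.
estimate = worst_case_attribute_bytes(connections=3, attr_bytes_per_flowfile=10 * 1024)
print(f"{estimate / (1024 * 1024):.0f} MiB")  # prints "586 MiB"
```

Even under these modest assumptions the attributes alone would exceed the default 512 MB heap, which is why raising the heap is the first thing to try.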
05-27-2016
12:35 PM
While it is still possible to use multicast, it is very uncommon. Its original intent was to let you set up your NiFi cluster so that nodes could auto-discover the NCM. The idea was that if the NCM died, a new one could quickly be stood up and the nodes would auto-discover and join that new NCM without needing to be restarted. This multicast setup has been around since clustering was first added to NiFi, long before Site-to-Site capability was added. With Site-to-Site, using multicast for the intent described above is not possible without some unique setups within DNS. The RemoteProcessGroup is very dependent on a specific NCM URL, so having the URL change would break Site-to-Site.
05-25-2016
10:51 PM
The fact that it was started without any configuration modification will have only one impact. With the default configuration, the NiFi instance would have started as a standalone instance over HTTP. As a result, it would have generated a flow.xml.gz file and a templates directory inside the NiFi conf directory. If the cluster NCM you are joining this node to already has an existing flow or templates, this node will fail to join because they will not match. There is no need to reinstall to fix this if that is the case. Simply delete the flow.xml.gz file and the templates directory before starting it again. When it joins the cluster, it will get the current flow and templates from the NCM.
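If you would rather script the cleanup, something along these lines should work. This is only a sketch: the function name is made up, and you need to pass in the conf directory of your own install and run it while NiFi is stopped.

```python
import shutil
from pathlib import Path


def reset_node_flow(conf_dir: Path) -> list[str]:
    """Delete the locally generated flow.xml.gz and templates directory.

    conf_dir is the NiFi conf directory of the node (path supplied by the caller).
    Returns the names of the entries that were actually removed.
    """
    removed = []
    flow = conf_dir / "flow.xml.gz"
    if flow.is_file():
        flow.unlink()
        removed.append(flow.name)
    templates = conf_dir / "templates"
    if templates.is_dir():
        shutil.rmtree(templates)
        removed.append(templates.name)
    return removed
```

Running it a second time is harmless: it returns an empty list when there is nothing left to delete.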
05-25-2016
10:17 PM
That state directory you found only exists because at some point you started your NiFi instance and the application generated it. Had this been a fresh install, it would not have existed and you would have needed to create it yourself to complete the ZooKeeper setup.
05-25-2016
10:14 PM
1 Kudo
Yes, you can use that state directory and just create the zookeeper subdirectory in which you will place the myid file. I do recommend, however, creating your state directory somewhere outside of the base NiFi install path. This can simplify future upgrades of NiFi, since newer versions will still want to reference the existing cluster-wide state created by your existing NiFi version. If you do choose to move it from the default location, update the zookeeper properties file and create the new path.
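As a minimal sketch of creating that layout, assuming you pick the state directory and the server id yourself (both are placeholders here, and the function name is made up):

```python
from pathlib import Path


def prepare_zookeeper_state(state_dir: Path, server_id: int) -> Path:
    """Create <state_dir>/zookeeper/myid containing this node's ZooKeeper server id.

    state_dir and server_id are chosen by the caller; the id should match the
    server entry for this node in your zookeeper properties file.
    """
    zk_dir = state_dir / "zookeeper"
    zk_dir.mkdir(parents=True, exist_ok=True)  # also creates state_dir if missing
    myid = zk_dir / "myid"
    myid.write_text(f"{server_id}\n")
    return myid
```

The zookeeper properties file still has to be pointed at the new path by hand if you move the directory away from the default.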
05-17-2016
04:06 PM
2 Kudos
Is that the entire log message? Can you share the lines preceding this stack trace?
Marco,
The NoClassDefFoundError you have encountered is most likely caused by the contents of your core-site.xml file. Check whether the following value exists and, if it does, remove it from the file:
“com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec” from the “io.compression.codecs” property in the “core-site.xml” file.
Thanks,
Matt
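If you prefer not to edit the file by hand, the same change can be sketched in Python. This assumes the usual Hadoop layout of property elements with name and value children; the function name is made up, and you should keep a backup of the original file before writing anything back.

```python
import xml.etree.ElementTree as ET

# The two codec class names called out above.
LZO_CODECS = {
    "com.hadoop.compression.lzo.LzoCodec",
    "com.hadoop.compression.lzo.LzopCodec",
}


def strip_lzo_codecs(core_site_xml: str) -> str:
    """Return core-site.xml text with the LZO codecs removed from io.compression.codecs."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() != "io.compression.codecs":
            continue
        value = prop.find("value")
        if value is not None and value.text:
            codecs = [c.strip() for c in value.text.split(",")]
            kept = [c for c in codecs if c and c not in LZO_CODECS]
            value.text = ",".join(kept)
    return ET.tostring(root, encoding="unicode")
```

Any other codecs listed in the property are left in place and in their original order.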
04-28-2016
09:36 PM
2 Kudos
Understanding your flow will help us understand what is going on.
1. Are you creating a zero byte file that you are using as the trigger for your InvokeHTTP processor?
2. How do you have the InvokeHTTP processor configured? (Is Put Response Body In Attribute set?)
If Put Response Body In Attribute is set, the response is written to that attribute and the content of the FlowFile on the "original" relationship will still be zero bytes in size. NiFi does not support the replay of FlowFiles that are zero bytes in size. (A Jira is being entered for this, as I see that replay of zero-byte files can have a valid use case at times.)
If you did not configure the "Put Response Body In Attribute" property, a new FlowFile would have been generated where the response becomes the content, and that FlowFile is routed to the "response" relationship. NiFi cannot replay files at their creation point in the flow. The way replay works, FlowFiles are reinserted on the connection feeding the processor that produced the event. In cases where the processor producing the event actually created the FlowFile, there is nowhere to reinsert that claim for replay. You should, however, be able to replay that file at the next processor that produced a provenance event.
If that replay message is generated at a provenance event later in the processing line, it indicates that the content no longer exists in the content repository archive. Typically this is because the retention duration configured in the nifi.properties file has been exceeded for this content. It could also be caused by other factors, such as the content repository exceeding the configured allowable disk utilization threshold percentage (also configured in the nifi.properties file) or the content being manually deleted from the repository (less likely). Queued active data in the flow takes precedence over archive data retention, so if you have a lot of queued data in your flow, you may not have any archived data at all because of the max disk utilization percentage configured for your NiFi.
04-26-2016
09:21 PM
There are additional items that will need to be taken in to consideration if you are running a NiFi cluster. See the following for more details:
https://community.hortonworks.com/content/kbentry/28180/how-to-configure-hdf-12-to-send-to-and-get-data-fr.html
04-26-2016
07:28 PM
Can you provide a little more detail on your use case? Where will the URLs you want to use originate from?