Member since: 07-30-2019
Posts: 3406
Kudos Received: 1622
Solutions: 1008
08-07-2018
11:03 AM
@yong lau

If you don't want to use a dataflow design that redistributes your to-be-merged FlowFiles to a single node, the only other option you have is to control the delivery of the source data so it arrives on a single node. You'll need to ask yourself: how are the files you want to merge getting to your NiFi? Can that delivery be controlled so this particular flow of data goes to only one node in your cluster?

Thanks, Matt
08-06-2018
06:56 PM
@Harish Vaibhav Kali

Once a processor is assigned a UUID, that processor keeps that UUID unless you delete and re-add it. All processors within a single NiFi cluster run an identical flow.xml.gz.

I am guessing you are creating templates and then importing them into different NiFi instances? Templates do not preserve UUIDs; each time you instantiate a template, all of its components get new UUIDs.

If you are trying to version-control flows across multiple independent NiFi installations, you will want to take a look at nifi-registry. Version-controlled flows and nifi-registry were introduced with Apache NiFi 1.5.0, and with nifi-registry 0.2 these version-controlled flows can even be pushed to Git.

Using nifi-registry allows independent NiFi instances to push and pull version-controlled flows.

https://nifi.apache.org/registry.html

Thank you, Matt
08-06-2018
03:39 PM
@mojgan ghasemi

I recommend starting a new HCC question for this. This thread was originally about TailFile and splitting files, and it is best to keep one question per HCC post.

Thank you, Matt
08-06-2018
01:29 PM
@yong lau

The "Execution" processor configuration has nothing to do with FlowFile distribution at all. It simply controls whether the configured processor will be scheduled to run on every node or only on the currently elected primary node. When a processor is scheduled to run, it works on the FlowFiles in its incoming connection queues on that specific node. So if you have a processor configured for "Primary node" execution and there are FlowFiles queued on every node, only the FlowFiles on the primary node get processed.

It is the role of the dataflow designer to construct a dataflow that routes all data to one node if creating a single FlowFile via merge is needed. Currently this can be accomplished using the PostHTTP and ListenHTTP processors, which support sending FlowFiles (content plus FlowFile attributes) between NiFi instances. The PostHTTP processor can be configured to send to one specific node in your cluster, so ideally you would build into your flow a path that routes the FlowFiles you need merged to a PostHTTP configured to "send as FlowFile" to that node. On that node, a ListenHTTP processor acts as the target of the PostHTTP processor and routes the received FlowFiles to your MergeContent processor (a sketch of these settings is appended below).

There is work in progress to make this a lot easier: a new capability currently in development will allow redistribution of FlowFiles via a connection's configuration, with distribution strategies such as sending all FlowFiles with matching criteria (for example, a matching FlowFile attribute) to the same node.

Thank you, Matt

If you found this answer addressed your original question, please take a moment to login and click "Accept" below the answer.
08-02-2018
06:05 PM
3 Kudos
Have you ever noticed some lingering old rolled log files in your NiFi logs directory that never seem to get deleted? This is a by-product of how logback works, depending on how you have it configured.

Let's take a look at a default logback.xml configuration from NiFi:

<appender name="APP_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>${org.apache.nifi.bootstrap.config.log.dir}/nifi-app.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<!--
For daily rollover, use 'app_%d.log'.
For hourly rollover, use 'app_%d{yyyy-MM-dd_HH}.log'.
To GZIP rolled files, replace '.log' with '.log.gz'.
To ZIP rolled files, replace '.log' with '.log.zip'.
-->
<fileNamePattern>${org.apache.nifi.bootstrap.config.log.dir}/nifi-app_%d{yyyy-MM-dd_HH}.%i.log</fileNamePattern>
<maxFileSize>100MB</maxFileSize>
<!-- Control the maximum number of log archive files kept and asynchronously delete older files -->
<maxHistory>30</maxHistory>
<!-- optional setting for keeping 10GB total of log files
<totalSizeCap>10GB</totalSizeCap>
-->
</rollingPolicy>
<immediateFlush>true</immediateFlush>
<encoder>
<pattern>%date %level [%thread] %logger{40} %msg%n</pattern>
</encoder>
</appender>

The above app log configuration logs to a file named nifi-app.log. Once that file reaches 100 MB in size or the top of the hour passes, it is rolled. You may end up with numerous rolled log files within a single hour if there is an excessive amount of logging occurring in your NiFi.

A "maxHistory" of 30 means the logger keeps only 30 hours (the %d{yyyy-MM-dd_HH} pattern rolls hourly) of rolled logs. But that is not the full story of how logback works here. "maxHistory" controls not only the number of hours to keep, but also the maximum age of logs that are even evaluated for deletion: rolled log files more than 30 hours old are simply ignored when the deletion thread runs.

This naturally raises the question of how these files got left behind in the first place. Typically it happens when files cross the 30-hour age threshold while the application is stopped; when the application is restarted, those older files end up being ignored. While the application is running continuously, this works as one would normally expect.

To clean up these older rolled log files, you could run a touch command on them so their filesystem timestamps update and they are no longer more than 30 hours old. They will then be considered within the 30-hour window and be deleted once the "maxHistory" count reaches 30.

However, the above is not a permanent solution. I recommend instead controlling file deletion via the "totalSizeCap" setting (commented out by default in the NiFi logback.xml). It addresses a couple of issues:

1. The "%i" in the fileNamePattern creates sequentially numbered log files every "maxFileSize" (100 MB) within each hour. This helps prevent any one log file from getting too large, but those individual files are not counted by "maxHistory". So "maxHistory" set to 15 is 15 hours of logs even if each hour contains 2,000 100 MB log files (that would be roughly 3 TB). You can see that under heavy logging you can end up using a lot of log space.

2. "totalSizeCap" will start deleting old rolled log files as long as the log file's age is less than the "maxHistory" age. So let's say we want to retain up to 100 GB of log history: we would set "maxHistory" to some very large value like 8760 (~1 year of hours) and set "totalSizeCap" to 100GB, provided you hit 100 GB before you hit 8,760 hours.

Here is an example configuration:

<appender name="APP_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>${org.apache.nifi.bootstrap.config.log.dir}/nifi-app.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<!--
For daily rollover, use 'app_%d.log'.
For hourly rollover, use 'app_%d{yyyy-MM-dd_HH}.log'.
To GZIP rolled files, replace '.log' with '.log.gz'.
To ZIP rolled files, replace '.log' with '.log.zip'.
-->
<fileNamePattern>${org.apache.nifi.bootstrap.config.log.dir}/nifi-app_%d{yyyy-MM-dd_HH}.%i.log</fileNamePattern>
<maxFileSize>10MB</maxFileSize>
<!-- keep up to 8,760 hours' worth of log files -->
<maxHistory>8760</maxHistory>
<!-- optional setting for keeping 100 GB total of log files -->
<totalSizeCap>100GB</totalSizeCap>
<!-- archive removal will be executed on appender start up -->
<cleanHistoryOnStart>true</cleanHistoryOnStart>
</rollingPolicy>
<immediateFlush>true</immediateFlush>
<encoder>
<pattern>%date %level [%thread] %logger{40} %msg%n</pattern>
</encoder>
</appender>

Of course, there is always a chance you could accumulate 8,760 hours' worth of logs before reaching 100 GB of generated app logs, so you may need to tailor these settings based on the app log volume generated by your particular running NiFi.
08-02-2018
01:05 PM
@Harish Vaibhav Kali

This thread is moving off topic from the original question, which has been answered. It is probably best to start a new question.

That being said, what you are showing me looks like correct functionality, provided the following is true: when NiFi was started for the first time, there was no pre-existing flow.xml.gz. For a brand-new secure flow, providing the "Initial Admin Identity" gives that user access to the UI and the ability to manage users, groups, and policies. But if that user wants to start modifying the flow, they need to grant themselves policies on the root process group. The system cannot do this automatically because, in a new flow, the UUID of the root process group is not permanent until the flow.xml.gz is generated. If the NiFi instance was upgraded from an existing flow.xml.gz, or is a 1.x instance going from unsecured to secured, then the "Initial Admin Identity" user is automatically given the privileges to modify the flow.

Also keep in mind that if the users.xml and authorizations.xml files do not exist and you have configured both the "Initial Admin Identity" and a legacy "authorized-users.xml" file, NiFi will fail to start. The initial seeding of the users.xml and authorizations.xml files can be done via one or the other, but not both.

Thank you, Matt
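For reference, a minimal sketch of the old-style file-provider entry in authorizers.xml where both seeding options live (the admin identity shown is illustrative):

<authorizers>
    <authorizer>
        <identifier>file-provider</identifier>
        <class>org.apache.nifi.authorization.FileAuthorizer</class>
        <property name="Authorizations File">./conf/authorizations.xml</property>
        <property name="Users File">./conf/users.xml</property>
        <!-- seed users.xml/authorizations.xml with ONE of these two, never both -->
        <property name="Initial Admin Identity">CN=admin, OU=NIFI</property>
        <property name="Legacy Authorized Users File"></property>
    </authorizer>
</authorizers>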
08-02-2018
12:44 PM
@Seongmin Park

The log is telling you that authentication for your login user "admin" was successful; however, the authorization for that user was not.

Nothing stands out to me in your basic authorizers.xml configuration, so my thought is that this is not the original configuration of that file. The file-provider is used to initially generate the users.xml and authorizations.xml files. Once these files exist, they will not be regenerated or modified by later changes to the configuration XML; if users.xml and authorizations.xml already exist, the file-provider does nothing.

I suggest taking a look at what is currently in your users.xml and authorizations.xml files. My guess is that you will find no user entry for "admin" in the users.xml file.

If you remove or rename these two files and restart your NiFi instance, the authorizer will build new versions of them based on the current configuration in your authorizers.xml file.

Thank you, Matt

If you found this answer addressed your original question, please take a moment to login and click "Accept" below the answer.
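For reference, a generated users.xml contains entries along these lines (the identifier UUID shown is illustrative; the identity must exactly match the authenticated login name):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<tenants>
    <groups/>
    <users>
        <!-- this is the entry that must exist for the "admin" login user -->
        <user identifier="8b80d20e-0000-1000-0000-000000000000" identity="admin"/>
    </users>
</tenants>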
08-01-2018
09:21 PM
@Harish Vaibhav Kali

The authorization policies are very granular. "View/modify the component" and "view/modify the data" policies set on a process group are inherited by all components and sub-process groups created within that process group.

Keep in mind that the root-level canvas you see when you log in to a new install is itself just another process group (the root process group).

The applicable policies must be granted to the logged-in user in order for that user to perform the desired action. The admin user does not need policies on specific flow components in order for other users to perform actions; each user acts based on the authorizations that user has been granted.
08-01-2018
04:54 PM
@Romain Guay

I am not sure I am following your comments completely. Keep in mind that this article was written against Apache NiFi 0.x versions; the look of the UI and some of the configuration/capabilities relevant to RPGs have changed as of Apache NiFi 1.x.

When you say "source NiFi", are you referring to the NiFi instance with the RPG or the NiFi instance with an input or output port?

Keep in mind the following:
1. The NiFi with the RPG on its canvas is always acting as the client; it establishes the connection to the target instance/cluster.
2. An RPG added to the canvas of a NiFi cluster runs on every node in that cluster with no regard for any other node in the cluster.
3. An RPG regularly connects to the target NiFi cluster to retrieve S2S details, which include the number of nodes, the load on each node, the available remote input/output ports, etc. (Even if the URL provided in the RPG is of a single node in the target cluster, the details collected cover all nodes in the target cluster.)
4. A node distribution strategy is calculated based on the details collected.

During the actual sending of FlowFiles to a remote input port on a target NiFi instance/cluster, the number of FlowFiles sent is based on the port properties configured in the RPG. It may be that those settings are at their defaults, so FlowFiles are not load-balanced very well (see the sketch below).

During the actual retrieving of FlowFiles from a remote output port, the RPG round-robins the nodes in the target NiFi, pulling FlowFiles based on the port configuration properties in the RPG. It may be that one source node's RPG runs before the others, connects, and is allocated all FlowFiles on the remote output port before any other node in the source NiFi cluster runs. There are some limitations to load balancing in such a get/pull setup.

For more info on configuring your remote ports via the RPG, see the following article (based on the Apache NiFi 1.2+ versions of the RPG): https://community.hortonworks.com/content/kbentry/109629/how-to-achieve-better-load-balancing-using-nifis-s.html

Thanks, Matt
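A sketch of the per-port batch settings exposed in the RPG's remote port configuration dialog (values shown are illustrative; smaller batches generally spread FlowFiles more evenly across target nodes):

Remote port configuration (via the RPG's remote ports dialog):
    Concurrent Tasks = 1
    Compressed       = false
    Batch Count      = 100     (max FlowFiles per transaction)
    Batch Size       = 10 MB   (optional max bytes per transaction)
    Batch Duration   = 500 ms  (optional max time per transaction)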
07-31-2018
02:35 PM
@Veerendra Nath Jasthi

Most likely your GetFile processor is trying to perform a listing of a very large number of source files. If that is the case, the listing may take a considerable amount of time before FlowFiles even begin to be generated.

Stopping a processor only tells the controller to stop scheduling that processor to run; any currently executing threads will continue to run until completion.