About MattWho

MattWho · ‎01-14-2019

@Mr Anticipation

*** Community Forum Tip: Try to avoid starting a new answer in response to an existing answer. Instead, use comments to respond to existing answers. There is no guaranteed order to different answers, which can make it hard to follow a discussion.

1. NiFi and NiFi Registry are two totally different pieces of software. Each of these services is likely running as a different service user. HDF service user defaults:
NiFi service ---> default service user is "nifi"
NiFi Registry service ---> default service user is "nifiregistry"

2. The NiFi service is where you build your dataflows on the canvas. The NiFi Registry service is used to store version-controlled dataflows from your NiFi.

3. Make sure that the directory you are trying to ingest files from is accessible by the nifi service user. I suggest accessing the server via the command line, becoming the nifi service user (# sudo su - nifi), and then navigating to the target directory (cd /home/xxx/receive). Keep in mind that even though the "receive" directory may be set to 777, if the nifi service user can't access /home or /home/xxx, it will not be able to see "/home/xxx/receive" regardless of what permissions are set on that directory.

Thank you,
Matt
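The check in point 3 can also be scripted. The following is a minimal sketch, not an official tool: check_path_access is a hypothetical helper name, and it assumes you run it as the nifi service user (for example after sudo su - nifi). It walks the path from the root down and reports the first directory that user cannot enter.

```shell
# check_path_access PATH: walk PATH one directory at a time and report the
# first component the current user cannot traverse. Hypothetical helper;
# run it as the service user whose access you want to verify.
check_path_access() {
  rest=${1#/}
  current=""
  while [ -n "$rest" ]; do
    part=${rest%%/*}                 # next path component
    case $rest in
      */*) rest=${rest#*/} ;;        # drop the component just taken
      *)   rest="" ;;
    esac
    [ -z "$part" ] && continue       # skip empty components from "//"
    current="$current/$part"
    if [ ! -d "$current" ]; then
      echo "not a directory (or hidden by a parent): $current"
      return 1
    fi
    if [ ! -x "$current" ]; then
      echo "no traverse (x) permission: $current"
      return 1
    fi
  done
  current=${current:-/}
  if [ ! -r "$current" ]; then
    echo "no read (r) permission to list: $current"
    return 1
  fi
  echo "ok: $current"
}
```

For example, check_path_access /home/xxx/receive stops at /home or /home/xxx if either one blocks the user, even when the "receive" directory itself is set to 777.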

MattWho · ‎01-11-2019

@Mr Anticipation

The ERROR says you have a permissions issue. The user who owns the NiFi Java process does not have permission to navigate down the path /home/xxx/receive and/or does not have permissions on the files you want to ingest.

By default, Ambari creates the "nifi" service user account, which is used to run NiFi. As such, that "nifi" user must be able to traverse that directory path and consume the target file(s).

The following command can be used to see which user owns the two NiFi processes:
# ps -ef|grep -i nifi

Thank you,
Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
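If you only want the owning user rather than the full ps listing, a small filter helps. This is a sketch under assumptions: nifi_owners is a hypothetical function name, and it expects input lines of the form "USER COMMAND...".

```shell
# nifi_owners: read "USER COMMAND..." lines on stdin and print each distinct
# user owning a process whose command line mentions nifi (case-insensitive).
# The [n]ifi pattern keeps this awk process from matching itself in ps output.
nifi_owners() {
  awk '{ user = $1; $1 = ""; if (tolower($0) ~ /[n]ifi/) print user }' | sort -u
}

# Typical use on the NiFi host (two -o options, each with an empty header):
#   ps -e -o user= -o args= | nifi_owners
```

Whatever user this prints is the one that needs access to the directory path and the files.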

MattWho · ‎01-03-2019

@Adam J

The Remote Process Group (RPG) was not designed with any logic to make sure specific FlowFiles go to one node versus another. It was designed to simply build a delivery model based on the load on the target NiFi cluster nodes. That delivery model can change each time the latest cluster status is retrieved.

If you need to be very specific as to which node gets a specific FlowFile, your best bet is to use a direct-delivery dataflow design. The best option here is to have your SplitText processor send to a RouteOnContent processor that sends the splits with URL 1/2 to one new connection and the splits with URL 3/4 to another connection. Each of these connections would feed a different PostHTTP processor (this processor can be configured to send as FlowFile). One of them would be configured to send to a ListenHTTP processor on node 1 and the other configured to point at the same ListenHTTP processor on node 2.

You may want to think about this setup from an HA standpoint. If you lose either node 1 or node 2, those FlowFiles will just stack up and not transfer until the node is back online, while the other URLs continue to transfer.

Something else you may want to look into is the new load-balanced connections capability introduced in NiFi 1.8: https://blogs.apache.org/nifi/entry/load-balancing-across-the-cluster

There is a "Partition by Attribute" option with this new feature that makes sure FlowFiles with a matching attribute go to the same node. While you still can't specify a specific node, it does allow similar FlowFiles to be moved to the same node. If a node goes down, you don't end up with an outage; FlowFiles with matching attributes will stay together, going to a different node that is still available.

Thanks,
Matt
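To illustrate the idea behind partitioning by attribute (this is not NiFi's actual implementation, only a conceptual sketch), hashing the attribute value and taking it modulo the number of available nodes sends equal values to the same node without letting you pick which node. The function name node_for and the sample values are made up for the example.

```shell
# node_for VALUE NODE_COUNT: map an attribute value to a node index by hashing.
# Illustration only; NiFi's real partitioning logic may differ.
node_for() {
  hash=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $(( hash % $2 ))
}

# Equal attribute values always land on the same node index.
for url in url1 url2 url3 url1; do
  echo "$url -> node $(node_for "$url" 2)"
done
```

If the node count changes because a node is lost, each value maps to one of the remaining nodes, and matching values still stay together.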

MattWho · ‎01-02-2019

@Nimrod Avni

The config.json generated as output when you stood up your NiFi CA (server) is there to simplify running client mode, so that you do not have to manually pass all the server info as client mode input. This was just a choice made by the development team: generate this file rather than expect users to remember what they entered when they stood up the server. You can delete this file if you want, as long as you have stored or can remember the pertinent information yourself for running the tls-toolkit client mode later.

As far as client mode goes, the generated config.json is also just there to provide you with the pertinent information about the client keystore that was created. This is all information you should already know (unless you did not provide a password and the toolkit auto-generated one for you, in which case you would need to get it from the output config.json file).

Thanks,
Matt

MattWho · ‎01-02-2019

You can run the tls-toolkit in client mode directly from any node, but you will either need to provide the CA server info or copy the CA config.json to each node manually. I was not trying to imply that you must execute client mode from the same server where the CA server was installed.

The NiFi CA was not built with the intent for use in a production environment. It was built as a tool that allows users to easily and quickly set up secured NiFi instances/clusters for development and testing purposes. For production environments, a corporately/privately managed CA should be used.

There should only ever be one NiFi CA installed and used to sign all certificates. I apologize if what I wrote was confusing and led you to believe multiple NiFi CAs were needed or should be used.

Feel free to open an Apache NiFi Jira to add the ability to update an existing nifi.properties file, or output a new one, when client mode is used. I don't see that as a bad request at all.

Thank you,
Matt

MattWho · ‎01-02-2019

@Nimrod Avni

The Standalone option is not ideal for setting up a NiFi cluster. Since the certificates generated are not signed by a Certificate Authority, the truststore will need to contain a trustedCertEntry for each certificate created. Adding nodes to a cluster would require going back and modifying the truststore on every node in the cluster.

The Client/Server mode allows you to stand up a Certificate Authority (Server mode) that is used to sign all the client certificates created (one for each NiFi node). When you stand up the server, a config.json is generated that can be used as input to the client mode operation. Because of this, it is common for each of the client certificates to also be generated on the same server where the CA (server) was created/started. Client mode outputs a config.json file for each client certificate, which provides the information needed to set the relevant nifi.properties properties on each of your NiFi nodes.

It is safe to say that the structure will remain unchanged within a major NiFi release version. An external script could be used to update a node's nifi.properties file from the config.json output generated by client mode. HDF, for example, already does this. If you choose to use the NiFi CA in HDF, it will take care of obtaining the client certificates and updating nifi.properties on each node. This allows a new client certificate to be generated on demand for each node. There is no option to configure NiFi to read these security parameters directly from the client-mode-generated config.json file.

Thank you,
Matt
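As a rough sketch of what such an external script could look like (this is not what HDF runs), the functions below copy the keystore and truststore settings from a client-mode config.json into nifi.properties. The JSON field names (keyStore, keyStoreType, keyStorePassword, keyPassword, trustStore, trustStoreType, trustStorePassword) are assumed here, so verify them against the config.json your toolkit version actually writes. The helper names are hypothetical, and the sed-based JSON parsing is deliberately naive: it does not handle values containing escaped quotes.

```shell
# json_get FILE FIELD: print the string value of FIELD from a flat JSON file.
json_get() {
  sed -n 's/.*"'"$2"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# set_prop FILE KEY VALUE: replace the "KEY=" line in FILE, or append it.
# Values are passed through the environment so awk does not interpret them.
set_prop() {
  tmp=$(mktemp) || return 1
  KEY=$2 VALUE=$3 awk '
    BEGIN { k = ENVIRON["KEY"]; v = ENVIRON["VALUE"] }
    index($0, k "=") == 1 { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$1" > "$tmp" && cat "$tmp" > "$1"
  rm -f "$tmp"
}

# copy_field CONFIG PROPS JSON_FIELD PROPERTY: copy one value if it is present.
copy_field() {
  value=$(json_get "$1" "$3")
  if [ -n "$value" ]; then
    set_prop "$2" "$4" "$value"
  fi
}

# apply_tls_config CONFIG_JSON NIFI_PROPERTIES (field names are assumptions)
apply_tls_config() {
  copy_field "$1" "$2" keyStore           nifi.security.keystore
  copy_field "$1" "$2" keyStoreType       nifi.security.keystoreType
  copy_field "$1" "$2" keyStorePassword   nifi.security.keystorePasswd
  copy_field "$1" "$2" keyPassword        nifi.security.keyPasswd
  copy_field "$1" "$2" trustStore         nifi.security.truststore
  copy_field "$1" "$2" trustStoreType     nifi.security.truststoreType
  copy_field "$1" "$2" trustStorePassword nifi.security.truststorePasswd
}
```

A run would look like apply_tls_config ./config.json /path/to/conf/nifi.properties, followed by a NiFi restart on that node. Back up nifi.properties first, since the file is rewritten in place.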

MattWho · ‎01-02-2019

@john y

The rest-api endpoint you are using is incorrect for instantiating an existing template on the canvas. You should instead be using a curl command that looks something like this:
# curl 'http://localhost:8080/nifi-api/process-groups/<PROCESS GROUP UUID>/template-instance' -H 'Content-Type: application/json' --data-binary '{"templateId":"<THE_TEMPLATE_UUID>","originX":100,"originY":100,"disconnectedNodeAcknowledged":false}' --compressed

The rest-api endpoint contains the UUID of the process group in which you will be instantiating your template. You need to include a header like the one above that defines the content type, and then provide "--data-binary" JSON that includes the template's UUID and the x and y coordinates on the graph where the template should be placed.

Thank you,
Matt
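If you instantiate templates often, the same request can be wrapped in a couple of shell functions. This is only a sketch built around the command above: the function names are hypothetical, it targets an unsecured NiFi over HTTP, and a secured instance would also need credentials or a client certificate.

```shell
# template_payload TEMPLATE_ID X Y: print the JSON request body.
template_payload() {
  printf '{"templateId":"%s","originX":%s,"originY":%s,"disconnectedNodeAcknowledged":false}\n' "$1" "$2" "$3"
}

# instantiate_template BASE_URL PROCESS_GROUP_ID TEMPLATE_ID X Y
# Sends the body to the template-instance endpoint of the given process group.
instantiate_template() {
  curl --silent --show-error --fail \
    -H 'Content-Type: application/json' \
    --data-binary "$(template_payload "$3" "$4" "$5")" \
    "$1/nifi-api/process-groups/$2/template-instance"
}

# Example call (UUIDs left as placeholders):
#   instantiate_template http://localhost:8080 '<PROCESS GROUP UUID>' '<THE_TEMPLATE_UUID>' 100 100
```

Keeping the payload in its own function makes it easy to print and inspect the JSON before sending anything to NiFi.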

MattWho · ‎12-07-2018

@Nicolas Osorio. You are correct. I wrote this article some time ago. The less secure aes128 is not going to be accepted by newer versions of browsers and NiFi. Switching to the more secure aes256 will resolve the issue.

MattWho · ‎11-26-2018

@Mauro Beltrame

Keep in mind that every node in your NiFi cluster runs its own copy of the flow.xml, has its own set of repositories, and works on its own set of FlowFiles. When a "primary node" change occurs, this does not mean that FlowFiles being processed on the old primary node are moved over to the new primary node. It is still the responsibility of the old primary node to finish processing the FlowFiles on that node. The "Primary node" execution setting on processors simply controls whether the processor is scheduled to execute on all nodes at the same time or only on whichever node is currently elected the primary node. This is important for the non-cluster-friendly protocols that some processors use (e.g. ListSFTP, ListFile, etc.).

Keep in mind that only processors that are responsible for ingesting data (those that create the FlowFile in NiFi) should be configured for "Primary node" only operation. All processors within the body of a dataflow (any processor that accepts an inbound connection) should be configured to always run on all nodes in your cluster.

If I had to guess, one or more of the following is occurring for you:

1. Processors within the body of your dataflows are configured with "Primary node". This means that any FlowFiles ingested on the old primary node will end up queued in front of one of these primary-node-only processors, which is no longer getting scheduled on that old primary node.

2. Your primary-node-only processors have some configuration that depends on a local file that is not present on every node in the same directory with the same permissions (for example, using a private key in ListSFTP when that private key was not placed on all nodes).

Thank you,
Matt

MattWho · ‎11-16-2018

Article content updated to reflect new provenance implementation recommendation and change in JVM Garbage Collector recommendation.

Last Visited	‎07-09-2026 02:31 PM

Member Since	‎07-30-2019 10:41 AM
Posts	3,472
Kudos received	1638

Cloudera Community

Re: ListenNetFlow processor does not decode Cisco ...

Re: Can we detect who did a particular operation i...

Re: How to invoke a url in nifi which is protected...

Re: Retry impacts scheduler

Re: 503 error while copying/versioning big process...

Re: HDF NiFi issue. Not recognizing directories

Re: HDF NiFi issue. Not recognizing directories

Re: How to send flowfile to other Nifi instances i...

Re: nifi security configuration

Re: nifi security configuration

Re: nifi security configuration

Re: Truoble with creating templates and processors...

Re: How to create user generated keys for securing...

Re: Why some NiFi process get blocked when switch ...

Re: HDF/NIFI Best practices for setting up a high ...