About MattWho

MattWho · ‎01-25-2021

@vchhipa If you are forced to "terminate" the processor after requesting it to stop, this can mean that you have a stuck or very long-running processor thread. The "terminate" action does not actually kill the thread; it disassociates that JVM thread from the processor and the current FlowFile that thread is associated with. The terminated thread continues to run until it completes (NiFi does request that the thread quit/exit when terminating it, but success varies since not all processes support that ability). Any terminated threads still active are represented on the processor by a small number in parentheses, for example (1), displayed in its upper right corner. The previously associated FlowFile is left on the inbound connection and will be handled based on queue priority when the consuming processor is started again. If the "terminated" thread eventually completes, any output/return from that thread, including logging, is simply sent to null.

To figure out what is going on when you have a seemingly hung thread, get a series of NiFi thread dumps:

./nifi.sh dump <dump-filename-nodenum-01.txt>

Getting at least 3 dumps at intervals of 2-5 minutes is usually good. What you are looking for is a thread associated with your processor class (InvokeHTTP in this case) where the same thread number exists in every thread dump collected. Then you will want to look at the thread stacks to see if all are identical. If the thread output changes between thread dumps, it indicates that the thread is not hung but rather just long running. If the thread output does not change, you'll want to dig into that output to see what it is waiting on.

Hope this helps, Matt
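The comparison across dumps described above can be scripted rather than done by eye. The following is only a minimal sketch in Python: it assumes each thread entry in the dump file begins with a header line starting with the quoted thread name and that entries are separated by blank lines, which may not match every NiFi or JVM version. The function names `parse_threads` and `unchanged_threads` are made up for this illustration.

```python
import re
from typing import Dict, List

# Assumption: a thread entry starts with a line like
#   "Some Thread Name-7" Id=85 RUNNABLE
# Adjust the pattern if your dump files look different.
HEADER = re.compile(r'^"([^"]+)"')


def parse_threads(dump_text: str) -> Dict[str, List[str]]:
    """Map each thread name to the stripped, non-blank lines that follow its header."""
    threads: Dict[str, List[str]] = {}
    current = None
    for line in dump_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            threads[current] = []
        elif not line.strip():
            current = None  # a blank line ends the current entry
        elif current is not None:
            threads[current].append(line.strip())
    return threads


def unchanged_threads(dumps: List[str], class_hint: str) -> List[str]:
    """Return names of threads whose stack mentions class_hint and is
    identical in every dump (candidates for a hung thread)."""
    if not dumps:
        return []
    parsed = [parse_threads(d) for d in dumps]
    result = []
    for name, stack in parsed[0].items():
        if not any(class_hint in frame for frame in stack):
            continue
        if all(p.get(name) == stack for p in parsed[1:]):
            result.append(name)
    return result
```

To use it, read each dump file into a string and call `unchanged_threads([dump1, dump2, dump3], "InvokeHTTP")`. A thread that shows up in the result has not moved between dumps and is worth inspecting by hand; a thread that does not show up is either changing (long running) or absent from at least one dump.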

MattWho · ‎01-25-2021

@adhishankarit The issue is being caused by the line returns used in the middle of the NiFi Expression Language (EL) ifElse() function you are using. The text box where you enter your NiFi EL uses an editor that highlights proper EL format. You'll notice your EL stops highlighting once it reaches the first line return. You'll also notice that character 32 is the first single quote character. Since the EL breaks at this point, it fails to find the expected matching second single quote.

This leaves you with two options:
1. Create flat JSON without the line returns.
2. Looking at the result you are trying to achieve, design your NiFi EL differently: [screenshot of the reworked EL]

Note the proper NiFi EL highlighting in the screenshot above.

Hope this helps, Matt
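Option 1 above can be done outside NiFi before pasting the JSON into the EL. This is a minimal sketch in Python, assuming the text is valid JSON on its own (it will not parse if unquoted EL is embedded in it); the helper name `flatten_json` is made up for this example.

```python
import json


def flatten_json(text: str) -> str:
    """Re-serialize JSON onto a single line with no whitespace between tokens."""
    return json.dumps(json.loads(text), separators=(",", ":"), ensure_ascii=False)
```

Any line return that sits inside a string value is emitted as the two-character escape `\n`, so the result never contains a literal line break.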

MattWho · ‎01-21-2021

@Lallagreta Make sure you do not have any line returns in the values of the dynamic properties added in the UpdateAttribute processor. When you click on the value field for each property, you should not see a line "2". For example: [screenshot of a property value spanning two lines] The above would result in the value assigned to the FlowFile attribute containing a line return. If this is the case, edit the property value(s) to remove the line returns so you only see one line (1). Hope this helps, Matt

MattWho · ‎01-21-2021

@pal_lee A FlowFile represents each object moving through the system, and for each one NiFi keeps track of a map of key/value pair attribute strings and its associated content of zero or more bytes. A FlowFile consists of two parts: FlowFile attributes/metadata and FlowFile content (see the NiFi overview).

I assume what you are really asking here is how to programmatically construct a NiFi dataflow via rest-api calls rather than by dragging and dropping onto the canvas in the NiFi UI. Probably the easiest way to become familiar with how to do this is through the developer tools available in your browser. Each request made while constructing a dataflow via the NiFi UI is captured by the developer tools, and you can right click on those calls to get the curl command that was actually executed against the rest-api endpoint to initiate that change on the canvas. These work as great examples of how to add components and establish connections between those components. The NiFi rest-api docs are also a good source of reference, but they do not really cover the specific nuances of every component.

Hope this helps, Matt
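As a rough illustration of what one of those captured calls looks like when rebuilt in code, here is a minimal Python sketch that only constructs (and does not send) a request to add a processor to a process group. The host, process group id, and processor type are placeholders, the helper name is made up, and authentication/TLS are left out entirely. Treat the payload shape as an assumption and confirm it against what your own browser's developer tools capture for your NiFi version.

```python
import json
from urllib.request import Request


def build_create_processor_request(base_url: str, group_id: str,
                                   processor_type: str,
                                   x: float = 0.0, y: float = 0.0) -> Request:
    """Build a POST request that asks NiFi to add a processor to a process group.

    Assumption: a new component is created with revision version 0 and a
    'component' object carrying the processor type and canvas position.
    """
    payload = {
        "revision": {"version": 0},
        "component": {"type": processor_type, "position": {"x": x, "y": y}},
    }
    url = f"{base_url.rstrip('/')}/nifi-api/process-groups/{group_id}/processors"
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` once you have added whatever authentication your NiFi requires. Building the request separately from sending it makes it easy to compare the generated URL and body with the curl command copied from the browser.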

MattWho · ‎01-21-2021

@Siddo The current strategy you are using is the best option for a use case where the client is sending/pushing data to listeners across your NiFi cluster nodes. Whenever you have a client that is pushing data to NiFi, this setup avoids, as you mentioned, having a single point of failure. If a load balancer can't be used, it becomes the responsibility of the client to detect delivery problems and switch to delivering to a different node.

Load balancing within NiFi's dataflows is the best option when the dataflow is consuming from a source system. Some data consumption methods are not cluster friendly (for example, FTP). This is because every node in a NiFi cluster executes the same flow.xml.gz. If you had, for example, the ListSFTP/GetSFTP processors running on every node, you would have data duplication and potentially issues as every node tried to consume the same data. So in this scenario you would configure the processor to execute on the primary node only and then use load-balanced (LB) connections to immediately redistribute those FlowFiles across your cluster before doing further processing.

This is why we created the List and Fetch processor pairs for these typically non-cluster-friendly types of processors. A ListSFTP produces FlowFiles with zero content and only attributes with details on where to fetch a specific FlowFile's content. Those 0-byte FlowFiles quickly load balance across the cluster, where the FetchSFTP processor fetches the actual content of the specific data file and inserts it into the FlowFile. This type of setup also avoids a single point of failure, since loss of the currently elected primary node (where the data lister/consumer is running) would result in a new node being elected as the new primary node. That new primary node reads state from the cluster state provider and begins listing where the previously elected node's list processor stopped. So you can see that each strategy has very specific benefits/use cases.

Another scenario may be that even with an external F5 LB, you find one node in your cluster ends up with a larger share of the workload (maybe one node ends up with the bulk of the larger data files). That data can be redistributed on connections where such single-node bottlenecks occur to re-balance the load at that point in a dataflow. So at times a combination may make sense as well, but I would not apply this strategy unless needed, since it adds to network usage.

NiFi's internal LB connections can also be used to move all data to a single node for some use cases. Let's say there is a batch of data spread out across multiple NiFi nodes that you want to merge into a single FlowFile. NiFi nodes each work on only the FlowFiles on their own node. But using an LB connection in specific spots in your flow would allow you to move all like data to the same node before a merge type processor.

Hope this helps, Matt

MattWho · ‎01-15-2021

MiNiFi offers a C++ (CPP) version that is well suited for Windows event log ingestion.

MattWho · ‎01-15-2021

@dzbeda Just to add to this, MiNiFi offers a C++ agent. There are many users out there using MiNiFi CPP to collect Windows event logs and forward them to NiFi via InvokeHTTP (on MiNiFi CPP) to ListenHTTP (on NiFi). Thanks, Matt

MattWho · ‎01-15-2021

@Melissa It may be helpful if you can share screenshots of your controller services (JsonTreeReader and JsonRecordSetWriter complete configurations) and the complete exception written to the nifi-app.log.

MattWho · ‎01-15-2021

@Lyoung The NiFi client (the NiFi or MiNiFi instance running the Remote Process Group (RPG)) has no control over the connection with the server (the NiFi configured with remote input or output ports). The RPG is provided with an http or https address of one or more target NiFi nodes in a NiFi cluster. A background thread connects to that target NiFi to fetch Site-To-Site (S2S) details. If the target is https enabled, a mutual TLS handshake will happen. This means the client must have a keystore and truststore configured in nifi.properties (NiFi) or config.yaml (MiNiFi) that can successfully be used to mutually authenticate with the target NiFi server.

The server-side NiFi must have the properties you listed configured:

nifi.remote.input.host=<Must be set to the hostname of the NiFi on which you are configuring this property. This is the hostname returned to the client in the S2S details. Be careful that whatever you set here does not resolve to localhost.>
nifi.remote.input.secure=false <This tells the client whether the connection is secure or unsecure. If false, the "nifi.web.http.port" property must be set and the URL used in the RPG must be "http://<target nifi>:<http port>/nifi". If set to true, the "nifi.web.https.port" property must be set and the URL used in the RPG must be "https://<target nifi>:<https port>/nifi".>
nifi.remote.input.socket.port=<This is the RAW port that will be used to actually send or receive the FlowFiles from remote input or output ports on the target NiFi node(s). If this property is not set on the target NiFi node(s), the RAW transport protocol will not be supported. (S2S details are always fetched over HTTP.)>
nifi.remote.input.http.enabled=true <This property states whether the "http" transport protocol can be used for sending the FlowFiles.>
nifi.remote.input.http.transaction.ttl=30 sec
nifi.remote.contents.cache.expiration=30 secs

Based on the log output shared, it sounds like the above properties were not set on the target NiFi node(s). Did you set them on the client NiFi (the NiFi actually running the RPG)?

In addition to the S2S details above for each target NiFi node, the details returned to the client also include the FlowFile load on each node, the remote input ports the client has been authorized to use, and the remote output ports the client has been authorized to use. If the target server-side NiFi node(s) are unsecured, there is no authorization set for ports and all clients have access to all remote input/output ports.

Also keep in mind that any changes to NiFi's/MiNiFi's configuration files require a restart of the service before they are applied.

Aside from the above, I would need to see screenshots and the nifi.properties/config.yaml configs of both the client and server side of this S2S connection to help further.

Hope this helps, Matt
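The server-side checks described above can be run as a quick sanity pass over the target node's nifi.properties. This is only a sketch in Python: it encodes the rules as stated in this post, the function names are made up, and the localhost test is a simple literal comparison rather than a real DNS lookup.

```python
from typing import Dict, List


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_s2s_server(props: Dict[str, str]) -> List[str]:
    """Return human-readable problems with the S2S settings of a target node."""
    issues: List[str] = []

    host = props.get("nifi.remote.input.host", "")
    if not host:
        issues.append("nifi.remote.input.host is not set")
    elif host in ("localhost", "127.0.0.1"):
        # Literal check only; a hostname can still resolve to localhost via DNS/hosts.
        issues.append("nifi.remote.input.host points at localhost")

    secure = props.get("nifi.remote.input.secure", "false").lower() == "true"
    port_key = "nifi.web.https.port" if secure else "nifi.web.http.port"
    if not props.get(port_key):
        issues.append(f"{port_key} must be set for this nifi.remote.input.secure value")

    raw_enabled = bool(props.get("nifi.remote.input.socket.port"))
    http_enabled = props.get("nifi.remote.input.http.enabled", "").lower() == "true"
    if not raw_enabled and not http_enabled:
        issues.append("neither RAW (socket.port) nor http transport is enabled")

    return issues
```

Read the file with `open(path).read()`, pass the text to `parse_properties`, and then pass the result to `check_s2s_server`. An empty list only means these particular rules passed; it says nothing about keystores, truststores, or port authorizations.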

MattWho · ‎01-15-2021

@Fierymech If you clear all the FlowFiles out of your test dataflow, stop all processors, and start only your GetFile processor, how many FlowFiles get queued on the success connection out of the GetFile processor? How many "out" does it show in the GetFile stats?

Last Visited	‎12-26-2025 01:55 PM

Member Since	‎07-30-2019 10:41 AM
Posts	3,406
Kudos received	1618

Cloudera Community

Re: Error importing NiFi workflow template from ve...

Re: Error importing NiFi workflow template from ve...

Re: How to elevate a default nifi user to admin - ...

Re: NiFi EnvokeHTTP - putting current date on HTTP...

Re: Invoking Nifi rest api in Data Flow

Re: Flowfiles stuck in queue to InvokeHttp...

Re: Nifi Expression language ifelse in ReplaceText...

Re: AttributesToJson into Elasticsearch doesn't wo...

Re: Can we create flow files using NIFI Api's in ...

Re: NiFi Load balancing, internal or external

Re: How to collect windows event log using Nifi

Re: How to collect windows event log using Nifi

Re: PublishKafkaRecord_2_6 with JsonRecordSetWrite...

Re: Nifi Site To Site Input port not seen by RPG

Re: Unable to route Text in nifi using the route t...