Member since: 07-30-2019
Posts: 3434
Kudos Received: 1632
Solutions: 1012
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 117 | 01-27-2026 12:46 PM |
|  | 533 | 01-13-2026 11:14 AM |
|  | 1160 | 01-09-2026 06:58 AM |
|  | 963 | 12-17-2025 05:55 AM |
|  | 475 | 12-17-2025 05:34 AM |
10-17-2018 12:23 PM
1 Kudo
@pavan srikar

The design you have in place looks to be the correct solution based on your described use case. Every node in your cluster runs the exact same flow.xml.gz. You would typically configure your "PutDistributedMapCache" and "FetchDistributedMapCache" processors to use a "Distributed Cache Service" that every node has access to.

This allows you to run a single "primary node only" flow that retrieves the token on a one-hour cron schedule and writes it to the distributed map cache, and then a second flow, run by every node, that pulls the stored token value from the distributed map cache and uses it for your downstream calls.

Using the "RedisDistributedMapCacheClientService" controller service, for example, allows you to set a TTL on the values you store in the cache, so the stored token expires before it stops being valid. For example, if the token is good for 1 hour, you could set the TTL to 50 - 55 minutes (see the sketch below).

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
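For illustration outside NiFi, here is a minimal Python sketch of the same cache-with-TTL pattern using redis-py. The key name, TTL choice, and fetch_token() helper are hypothetical stand-ins, not properties of the RedisDistributedMapCacheClientService itself:

```python
import redis

# Hypothetical stand-in for the primary-node-only flow that actually
# requests the token (e.g. via InvokeHTTP).
def fetch_token() -> str:
    return "example-token"

r = redis.Redis(host="localhost", port=6379)

# "Primary node" side: store the token with a TTL shorter than its
# 1-hour validity (50 minutes here), so a stale token is never served.
r.setex("api_token", 50 * 60, fetch_token())

# "Every node" side: read the cached token; None means the TTL expired
# and the token needs to be refreshed.
token = r.get("api_token")
if token is not None:
    print(token.decode("utf-8"))
```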
10-15-2018 01:06 PM
1 Kudo
@Stephen Greszczyszyn

NiFi is designed to be data agnostic, meaning it has no dependency on any specific type(s) of data. This is accomplished by wrapping ingested content in a NiFi FlowFile. A NiFi FlowFile consists of two parts:

1. FlowFile content (the bytes of data, which are simply written to claims in the content repository)
2. FlowFile attributes/metadata (information about the FlowFile and its content)

While NiFi itself has no dependency on data types, various processors available in NiFi likely will, so you will need to take a closer look at the documentation for any processor you use that needs to interact with the FlowFile content. NiFi already has some syslog-based processors.

When it comes to writing out the raw data, NiFi simply transmits the bytes. If the target will accept the raw data, then all is good.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
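As a conceptual sketch only (this is not NiFi's actual API), the two-part FlowFile model described above can be pictured like this:

```python
from dataclasses import dataclass, field

# Conceptual model of a FlowFile as described above; the class and
# field names are illustrative, not NiFi's real API.
@dataclass
class FlowFile:
    content: bytes                                   # raw bytes, type-agnostic
    attributes: dict = field(default_factory=dict)   # metadata about the content

# Any bytes can be wrapped, regardless of format (syslog, JSON, binary, ...).
ff = FlowFile(
    content=b"<34>Oct 11 22:14:15 myhost app: an example syslog message",
    attributes={"filename": "syslog.1", "mime.type": "text/plain"},
)
print(ff.attributes["mime.type"], len(ff.content))
```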
10-15-2018 12:23 PM
@Amit Mishra

Based on the ERROR above, it appears someone has added a non-standard jar file to NiFi's default lib directory: phoenix-4.7.0.2.5.3.0-37-client.jar was added to the /usr/hdf/2.1.1.0-2/nifi/lib/ directory.

The first thing I would try is removing this file from NiFi's default lib directory on every NiFi node, then restarting your NiFi cluster to make sure the ERROR goes away.

If you then find you need this jar for something in your dataflow, try creating a new custom lib directory for NiFi and adding it there. This is done by simply adding a new property to the nifi.properties file:

1. Create a new custom lib directory on each of the NiFi nodes (for example: /var/lib/nifi/custom-lib/).
2. Move your custom phoenix-4.7.0.2.5.3.0-37-client.jar into that new directory. (I recommend moving any other custom-added jar/nar files here as well. You should not be adding any non-standard files to NiFi's default lib directory.)
3. Make sure proper directory and file ownership and permissions are set.
4. Add a new custom property named nifi.nar.library.directory.custom1= ("custom1" is an example and can be whatever you like).
5. Set this new property's value to the path of the custom lib directory you created on each node (for example: nifi.nar.library.directory.custom1=/var/lib/nifi/custom-lib/).
6. Restart all your NiFi nodes.
7. Verify the ERROR no longer appears. (A quick verification sketch follows below.)

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
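If it helps, here is a small Python sketch that sanity-checks steps 4 and 5 on a node. The nifi.properties location is an assumption based on the lib directory above and may differ in your install:

```python
from pathlib import Path

# Sanity-check the custom-lib setup described above. Paths are the
# examples from this answer; adjust for your own nodes.
NIFI_PROPERTIES = Path("/usr/hdf/2.1.1.0-2/nifi/conf/nifi.properties")  # assumed location
PROPERTY_NAME = "nifi.nar.library.directory.custom1"
JAR_NAME = "phoenix-4.7.0.2.5.3.0-37-client.jar"

custom_lib = None
for line in NIFI_PROPERTIES.read_text().splitlines():
    if line.startswith(PROPERTY_NAME + "="):
        custom_lib = Path(line.split("=", 1)[1].strip())

if custom_lib is None:
    print(f"{PROPERTY_NAME} is not set in nifi.properties")
elif not (custom_lib / JAR_NAME).is_file():
    print(f"{JAR_NAME} not found in {custom_lib}")
else:
    print(f"OK: {custom_lib / JAR_NAME} is in place")
```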
10-12-2018 12:47 PM
2 Kudos
@David Sargrad

The example in the link you provided is dealing with a zip that contains zipped files (a zip of zips). If you are talking about a single zip that contains a directory tree with sub-files, this is relatively easy to do.

After ingesting your zip file via GetHTTP, feed it to an "UnpackContent" processor and then to a "PutFile" processor.

When the "UnpackContent" processor unzips the source file, it creates a new FlowFile for each unique file found, and a variety of FlowFile attributes are set on each of those generated FlowFiles, including "path". In my example, I created a directory named "zip-root", created 4 sub-directories within that zip-root directory with one file in each, and then zipped up the tree (zip -r zip-root.zip zip-root).

After "UnpackContent" executed, it produced 4 new FlowFiles (one for each file found in the sub-directories within the zip).

The "path" FlowFile attribute on each of these generated FlowFiles can then be used to maintain the original directory structure when writing out the FlowFiles via "PutFile": configure the PutFile directory so that each incoming FlowFile is placed based on the value of its "path" attribute. Here I decided that my target base directory should be /tmp/target/, so the original zipped directory structure is preserved/recreated beneath there. A sketch of the equivalent logic follows below.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
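As a standalone illustration of what UnpackContent and PutFile are doing here, this Python sketch unpacks a zip while preserving the directory tree under a target base directory. The file names mirror the zip-root example above:

```python
import zipfile
from pathlib import Path

# Each zip entry becomes one unit of work carrying a relative path,
# just as UnpackContent emits one FlowFile per file with a "path"
# attribute; writing under TARGET_BASE mirrors the PutFile step.
TARGET_BASE = Path("/tmp/target")  # same example base directory as above

with zipfile.ZipFile("zip-root.zip") as zf:
    for entry in zf.infolist():
        if entry.is_dir():
            continue
        # entry.filename plays the role of the "path" FlowFile attribute.
        dest = TARGET_BASE / entry.filename
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_bytes(zf.read(entry))
        print(f"wrote {dest}")
```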
10-12-2018 12:01 PM
@David Sargrad

NiFi is designed to prevent data loss. This means that NiFi needs to do something with a FlowFile when the processing of that FlowFile encounters a failure somewhere within a dataflow.

When it comes to ingest-type processors like GetHTTP, a FlowFile is only generated upon success. As such, there is no FlowFile created during a failure that would need to be handled/routed to some failure relationship.

Upon the next scheduled run, the GetHTTP processor will simply try to execute just like it did on the previous run. If successful, a FlowFile will be produced and routed to the outbound "success" relationship connection.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
10-11-2018 02:41 PM
@David Sargrad

It would potentially be a waste of resources to use separate NiFi instances for each of your dataflows, and a single instance also provides no HA at all. A better approach is to set up a NiFi cluster on which you run multiple dataflows. To help keep your dataflows organized, users typically make use of a "Process Group" for each unique dataflow: on the root canvas you simply have a Process Group for each unique dataflow you create, which keeps the UI clean and manageable. This type of setup also allows you to easily "version control" each of these Process Groups independently to a NiFi Registry.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
10-10-2018 12:17 PM
@Alex Coast

The 5-minute retention of bulletins is a hard-coded value that cannot be edited by the end user. It is normal to see the occasional bulletin from some NiFi processors, for example a PutSFTP that fails because of a filename conflict or network issue but succeeds on retry. A continuous problem would result in non-stop bulletins being produced, which would be easily noticed.

Take a look at my response further up on using the "SiteToSiteBulletinReportingTask" if you are looking to retain bulletin info longer, or to manipulate it, route it, store it somewhere, etc.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
10-10-2018 12:11 PM
@spdvnz

My suggestion here would be to handle this via the "SiteToSiteBulletinReportingTask".

You can build a dataflow to receive these bulletin events, manipulate them as you want, and store them in a location of your choice for your auditing needs.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
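As a rough sketch of such a consumer, the snippet below filters bulletin records and appends them to an audit log. The SiteToSiteBulletinReportingTask delivers bulletins as JSON, but the exact field names used here are an assumption; verify them against what your NiFi version emits:

```python
import json

def audit_bulletins(raw_json: str, audit_path: str = "bulletin-audit.log") -> None:
    """Append WARN/ERROR bulletins to an audit file.

    Field names (bulletinLevel, bulletinMessage, bulletinTimestamp) are
    assumptions about the reporting task's JSON output.
    """
    bulletins = json.loads(raw_json)
    with open(audit_path, "a", encoding="utf-8") as audit:
        for b in bulletins:
            # Keep only WARN/ERROR bulletins for the audit trail.
            if b.get("bulletinLevel") in ("WARN", "ERROR"):
                audit.write(f"{b.get('bulletinTimestamp')} "
                            f"{b.get('bulletinLevel')}: "
                            f"{b.get('bulletinMessage')}\n")
```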
10-09-2018 02:58 PM
1 Kudo
@walid hfaiedh @mina kh

As you noted, there was a change between NiFi 1.1 and NiFi 1.5 in the jython jar being used. This change was made to address https://issues.apache.org/jira/browse/NIFI-4301. As a result, the SAXParser your script used in NiFi 1.1 is no longer available.

The following Apache Jira was opened to address this issue: https://issues.apache.org/jira/browse/NIFI-5650

Possible alternative solutions in the meantime:

1. Rewrite your script in Groovy.
2. If the script still executes fine via the command line from the NiFi server, try using the ExecuteStreamCommand processor instead for now. (This may not be an option depending on the nature of your jython script.)
3. Swap out the scripting nar provided in NiFi 1.5 with the scripting nar from NiFi 1.1 until the above issue is addressed.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
10-09-2018 02:14 PM
1 Kudo
@Thuy Le

Does your Python script execute fine via the command line on your NiFi servers? If so, did you consider just using the ExecuteStreamCommand processor to execute your script? It will behave the same way as executing the script yourself from the command line and will not rely on NiFi's built-in libs. (A quick test sketch follows below.)

The latest versions of NiFi contain numerous Mongo-specific processors. I am not a Mongo user myself, but have you looked to see if any of those would meet the needs of your dataflow here?

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
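As a quick way to confirm the first point, you can run the script the same way ExecuteStreamCommand would invoke it and inspect the exit code. The script path and arguments below are hypothetical examples:

```python
import subprocess

# Run the script from the command line, as ExecuteStreamCommand would.
# Path and arguments are made-up examples; substitute your own.
result = subprocess.run(
    ["python", "/opt/scripts/mongo_load.py", "--batch-size", "100"],
    capture_output=True,
    text=True,
    timeout=60,
)
print("exit code:", result.returncode)  # a non-zero code signals failure
print("stdout:", result.stdout)         # what would become FlowFile content
if result.returncode != 0:
    print("stderr:", result.stderr)
```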