Member since: 07-30-2019
Posts: 3021
Kudos Received: 1498
Solutions: 877
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 322 | 08-05-2024 09:00 AM
 | 317 | 08-02-2024 06:44 AM
 | 329 | 08-01-2024 06:25 AM
 | 283 | 07-29-2024 07:13 AM
 | 482 | 07-26-2024 01:25 PM
02-01-2017
07:32 PM
1 Kudo
@Narasimma varman Make sure that the user the NiFi process runs as on your server has the necessary permissions to access that directory path and remove files from it.
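A quick way to test this (assuming NiFi runs as a "nifi" service user, and using a hypothetical directory /data/input) is to attempt a create and delete as that user:
sudo -u nifi touch /data/input/.perm-test && sudo -u nifi rm /data/input/.perm-test
If either command fails, adjust the ownership or permissions on that directory.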
Matt
01-31-2017
07:44 PM
1 Kudo
@bhumi limbu NiFi FlowFile attributes/metadata live in heap. The list-based processors return a complete listing from the target and then create a FlowFile for each file in that returned listing. None of those FlowFiles are committed to the list processor's success relationship until all of them have been created, so with a listing this large you run out of NiFi JVM heap memory before that can happen. As NiFi stands now, the only option is to use multiple list processors, each producing a listing of a subset of the total files on your source system. You can use the "Remote Path", "Path Filter Regex", and/or "File Filter Regex" properties in ListSFTP to filter what data is selected and reduce the heap usage; see the sketch below.
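As a rough sketch (the regex values are hypothetical and would need to match your actual file naming), two ListSFTP processors pointed at the same Remote Path could split the listing alphabetically:
ListSFTP #1 -> File Filter Regex: ^[a-m].*
ListSFTP #2 -> File Filter Regex: ^[^a-m].*
Each processor then creates FlowFiles for only its own subset of the listing, cutting the per-listing heap usage roughly in half.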
You could also increase the heap available to your NiFi's JVM in the bootstrap.conf file; however, given the number of files you are listing, you would likely still run out of heap memory. I logged a Jira in Apache NiFi with a suggested change to how these types of processors produce FlowFiles from the returned listing: https://issues.apache.org/jira/browse/NIFI-3423 Thanks, Matt
01-31-2017
02:33 PM
1 Kudo
@Raj B Not all NiFi processors write attributes to FlowFiles about failures or errors. The documentation for each processor should list which attributes that processor writes and what information those attributes contain. There is no global enforcement by the NiFi controller of what attributes a processor must create; this is completely in the control of the developer who wrote each processor. That being said, it is good practice that any processor with a "Failure" relationship output an "Error" level log message describing the nature of the failure. This error log message would identify the specific processor that produced the ERROR, the specific FlowFile that was routed to failure, and the nature of the failure. It is possible to build a dataflow that monitors NiFi's nifi-app.log (TailFile processor) for ERROR log messages, parses out the relevant information, and passes it along to some monitoring system; a minimal sketch follows below.
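A minimal sketch of such a monitoring flow (the log path is typical of many installs but is an assumption; adjust it and the match pattern to your environment):
TailFile -> File(s) to Tail: /var/log/nifi/nifi-app.log
RouteOnContent -> Match Requirement: content must contain match
RouteOnContent -> errors (user-added property): \bERROR\b
The resulting "errors" relationship can then feed whatever processor delivers to your monitoring system (PutEmail, PostHTTP, etc.). Thanks, Matt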
01-31-2017
01:47 PM
@Joshua Adeleke There is obviously something else going on within your system that is affecting leader election. When you start your NiFi, do you see a leader election/cluster coordinator countdown timer running? It looks like your NiFi is having timeout issues talking to your ZooKeeper. I still don't understand why you are running your NiFi as a one-node cluster if all you want is a single standalone instance of NiFi. A NiFi configured as a standalone instance does not need ZooKeeper and does not perform election of a cluster coordinator or primary node.
Setting the following property in your nifi.properties and restarting will make your NiFi a truly standalone instance:
nifi.cluster.is.node=false
Matt
01-31-2017
01:26 PM
@Anishkumar Valsalam The intent of an "Admin" account in NiFi is to set up users who can do the following:
- Access the UI
- Set up NiFi controller level Controller Services and Reporting Tasks
- Add new users and groups
- Set access policies for those users
When it comes to building dataflows on the canvas, that is more of a dataflow manager's role. The "Initial Admin Identity" by default does not even get this role's capabilities/accesses, but through the policies he was granted he has the ability to grant himself or other users the access needed to build dataflows.
In order to enable the dataflow-building icons along the top of the UI, those users will need to be granted the "view the component" and "modify the component" access policies on the specific process group in which they want to build their dataflows. For more information on the various access policies and the capabilities they provide to the assigned users, the NiFi Admin Guide can be found under Help within your installed NiFi's UI (most accurate for whichever version you have installed) or at the following link:
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#multi-tenant-authorization Thanks, Matt
01-30-2017
06:06 PM
@Anishkumar Valsalam
Glad to hear you got it set up. The "access all policies" access policy will not work if you have not also granted the users the "access users/user groups" access policy; they need to be able to view users in order to grant them access policies. If this answer was helpful in solving your issue, will you please accept it. Thank you, Matt
01-30-2017
05:00 PM
1 Kudo
@Anishkumar Valsalam Hello, during the initial setup of a secured NiFi installation, NiFi allows you to specify a single "Initial Admin Identity". Upon first startup, NiFi uses that "Initial Admin Identity" to set up that user and grant them the "Access Policies" needed to administer that NiFi instance/cluster. That identity will be able to log in, add new users, and grant "Access Policies" to those users. The default "Access Policies" given to that "Initial Admin Identity" include:

Capability | NiFi File-Based Policy | Ranger-Based Policy
---|---|---
view the UI | view the user interface | /flow
view the controller | access the controller (view) | /controller (read)
modify the controller | access the controller (modify) | /controller (write)
view the users/groups | access users/user groups (view) | /tenants (read)
modify the users/groups | access users/user groups (modify) | /tenants (write)
view policies | access all policies (view) | /policies (read)
modify policies | access all policies (modify) | /policies (write)
Granting these same "Access Policies" to other users you have added will effectively make them an Admin as well.
Thanks, Matt
01-30-2017
01:21 PM
4 Kudos
@Saminathan A The PutSQL processor expects each FlowFile to contain a single SQL statement; it does not support multiple insert statements as you have tried above. You can have the GetFile processor route its success relationship twice, with each success going to its own ReplaceText processor. Each ReplaceText processor is then configured to create either the "table_a" or the "table_b" insert statement. The success from both ReplaceText processors can then be routed to the same PutSQL processor; see the sketch below.
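As a rough sketch (the two-column CSV layout and the column names are hypothetical; adjust the regex to your actual record format), each ReplaceText could build its table's insert statement like this:
ReplaceText #1 (table_a) -> Replacement Strategy: Regex Replace
ReplaceText #1 (table_a) -> Search Value: ^(.*),(.*)$
ReplaceText #1 (table_a) -> Replacement Value: INSERT INTO table_a (col1, col2) VALUES ('$1', '$2')
ReplaceText #2 (table_b) -> Replacement Value: INSERT INTO table_b (col1, col2) VALUES ('$1', '$2')
Thanks, Matt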
01-30-2017
01:11 PM
2 Kudos
@Joshua Adeleke The message you are seeing here indicates that your NiFi instance has been set up as a cluster (possibly a one-node cluster). NiFi cluster configurations require ZooKeeper in order to handle cluster coordinator elections and cluster state management. If you truly only want a standalone NiFi installation with no dependency on these things, make sure the following property in your nifi.properties file is set to false:
nifi.cluster.is.node=false
Even as a single-node NiFi cluster, if ZooKeeper (internal or external) was set up, the election should complete eventually and the node will then become accessible. How long it takes for the cluster coordinator to get elected is controlled by the following lines in your nifi.properties file (defaults shown):
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=
If max.candidates is left blank (it is normally set to the number of nodes in your NiFi cluster), the election process will take the full 5 minutes to complete before the UI becomes available. If max.candidates is set, the election completes as soon as either all nodes check in or 5 minutes pass, whichever occurs first.
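For example, if you intend to keep this as a one-node cluster, setting the candidate count to 1 lets the election complete as soon as that single node checks in:
nifi.cluster.flow.election.max.candidates=1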
Thanks, Matt
01-27-2017
08:53 PM
7 Kudos
With HDF 2.x, Ambari can be used to deploy a NiFi cluster. Let's say you deployed a 2-node cluster and want to go back at a later time and add an additional NiFi node to the cluster. While the process is very straightforward when your NiFi cluster has been set up non-secure (http), the same is not true if your existing NiFi cluster has been secured (https). The walkthrough below assumes an existing 2-node secured NiFi cluster that was installed via Ambari.

STEP 1: Add the new host through Ambari. You can skip this step if the host on which you want to install the additional NiFi node is already managed by your Ambari.

STEP 2: Under "Hosts" in Ambari, click on the host from the list where you want to install the new NiFi node and add the NiFi component to it. The NiFi component will be in a "stopped" state after it is installed on this new host. *** DO NOT START NIFI YET ON THE NEW HOST OR IT WILL FAIL TO JOIN THE CLUSTER. ***

STEP 3: (This step only applies if NiFi's file-based authorizer is being used.) Before starting this new node, we need to clear out some NiFi configs. This step is necessary because of how the NiFi application starts. When NiFi starts, it looks for the existence of users.xml and authorizations.xml files. If they do not exist, it uses the configured "Initial Admin Identity" and "Node Identities (1, 2, 3, etc.)" to build the users.xml and authorizations.xml files. This causes a problem because your existing cluster's users.xml and authorizations.xml files likely contain many more entries by now, and any mismatch in these files will prevent a node from being able to join the cluster. If these configurations are not present, the new node will instead inherit the files from the cluster it joins, so clear the "Initial Admin Identity" and "Node Identities" values for this node before it starts. *Note: Another option is to simply copy the users.xml and authorizations.xml files from an existing cluster node to the new node before starting it; see the sketch below.

STEP 4: (Do this step if using Ambari metrics.) When a new node is added by Ambari and Ambari metrics are enabled, Ambari will create a flow.xml.gz file that contains just the Ambari reporting task. Later, when this node tries to join the cluster, the flow.xml.gz files on this new node and on the cluster will not match. This mismatch will cause the new node to fail to join the cluster and shut back down. To avoid this problem, the flow.xml.gz file must be copied from one of the cluster's existing nodes to this new node.
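As a sketch of the copy approach used in Steps 3 and 4 (the /var/lib/nifi/conf path is a typical HDF/Ambari layout but is an assumption; adjust to your install), run the following on the new node before its first start:
scp existing-node:/var/lib/nifi/conf/users.xml /var/lib/nifi/conf/
scp existing-node:/var/lib/nifi/conf/authorizations.xml /var/lib/nifi/conf/
scp existing-node:/var/lib/nifi/conf/flow.xml.gz /var/lib/nifi/conf/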
STEP 5: Start NiFi on the new node. After the node has started, it should successfully join your existing cluster. If it fails, the nifi-app.log will explain why, but the cause will likely be one of the above configs not being cleared out, causing the users.xml and authorizations.xml files to be generated rather than inherited from the cluster. If that is the case, you will need to fix the configs and delete those files manually before restarting the node.

STEP 6: Your cluster is now up and running with the additional node, but you will notice you cannot open the UI of that new node without getting an untrusted proxy error screen. You will, however, still be able to access your other two nodes' UIs. So we need to authorize this new node in your cluster.

A. If NiFi handles your authorizations, follow this procedure:
1. Log in to the UI of one of the original cluster nodes.
2. Grant the new node's identity the "proxy user requests" access policy, which is needed to allow users to access the UI through your nodes.
NOTE: There may be additional component-level access policies (such as "view the data" and "modify the data") for which you may also want to authorize this new node.

B. If Ranger handles your NiFi authorizations, follow this procedure:
1. Access the Ranger UI and add a user for the new node.
2. Click Save to create this new user for your new node. The username MUST match exactly the DN displayed in the untrusted proxy error screen.
3. Access the NiFi Service Manager in Ranger and authorize your new node in your existing access policies as needed.

You should now have a fully functional new node added to your pre-existing secured NiFi cluster that was deployed/installed via Ambari.