Member since: 07-30-2019
Posts: 944
Kudos Received: 197
Solutions: 91
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1503 | 10-05-2021 01:53 PM |
| | 16313 | 09-23-2019 06:03 AM |
| | 6814 | 05-04-2019 08:42 PM |
| | 1502 | 06-11-2018 12:45 PM |
| | 12617 | 06-04-2018 01:11 PM |
05-03-2017 12:42 PM
1 Kudo
@Ravi Teja The flowfile isn't going out the FAILED connection because the session is rolled back to the incoming queue; that is the part of the log with the rollback information. The flowfile is penalized and then put back on the incoming queue for the PutSQL processor.

If you want to write out the file that is causing the rollback with a PutFile processor, there are ways to pull that flowfile out of the flow. You can use a RouteOnAttribute processor with the uuid of the flowfile as a property value, and then route the matching flowfile out to the PutFile processor.

For provenance events, there is a slight delay in the writing of events based on the configuration of NiFi. This is the property that controls how quickly the provenance events are available in the UI: nifi.provenance.repository.rollover.time -- The amount of time to wait before rolling over the latest data provenance information so that it is available in the User Interface. The default value is 30 secs.

To see events for a particular processor, right-click on the processor and select Data provenance from the menu that pops up. Another window will then open displaying the provenance events for just that processor.
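As a minimal sketch of that RouteOnAttribute setup (the property name and the uuid value below are placeholders, not taken from your flow):

```
Routing Strategy : Route to Property name

# Hypothetical dynamic property; substitute the uuid of the problem flowfile
failed-sql : ${uuid:equals('11111111-2222-3333-4444-555555555555')}
```

Connect the resulting "failed-sql" relationship to the PutFile processor, and handle the "unmatched" relationship however the rest of your flow expects.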
05-02-2017 12:57 PM
@spdvnz There is no specific configuration for an AWS cloud instance. There are best practices when configuring NiFi; they can be found in the documentation embedded with NiFi, and this HCC article is also a good place to start: Best Practices for high performance NiFi instance
05-01-2017 08:46 PM
2 Kudos
@Raj B The PutHDFS will "close" the file after each append. Writing a file every minute will be less efficient than, say, writing a bigger file every ten minutes, but you will have to determine which matters more for your use case. Even waiting just a minute, though, is still far more efficient than writing 100 files a second.
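To put rough numbers on that (using the 100-files-a-second rate from your question): 100 files a second is roughly 6,000 file creates and closes per minute against HDFS, while batching to one file per minute is a single close, and one file every ten minutes cuts the NameNode metadata operations by another factor of ten.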
05-01-2017 01:45 PM
@spdvnz NiFi will work "out of the box" in almost any environment as long as Java is installed. Download the tar.gz file, extract the NiFi directories, and then run the command bin/nifi.sh start from the NiFi root directory, and it will work (a sketch follows below). NiFi uses port 8080 for the UI by default. Is there something specific you want to know?
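As a minimal sketch (the version in the file name is a placeholder for whichever release you downloaded):

```
# Extract the binary tarball (substitute your actual file name)
tar xzf nifi-1.x.y-bin.tar.gz
cd nifi-1.x.y

# Start NiFi as a background process
./bin/nifi.sh start

# Once started, the UI is at http://localhost:8080/nifi
```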
04-29-2017 12:19 AM
1 Kudo
@Bala Vignesh N V That is normal. Now just click the "/nifi" link and it will take you to the UI.
04-28-2017 08:17 PM
1 Kudo
@Raj B I posted an answer to your other question before I saw this one, so maybe a combination of the two methods would be the best approach. You could set a time limit on the merge of, say, 30 minutes or an hour, and then check the file size in HDFS: if it isn't the desired size, append; if it is, start writing to a new file in HDFS. That way you still have only a small impact on HDFS, checking the file size once every 30 to 60 minutes instead of every time a new file comes in.
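If you script the size check, a quick sketch of the HDFS side (the path here is hypothetical; %b prints the file length in bytes):

```
# Print the size in bytes of the current target file in HDFS
hdfs dfs -stat %b /data/landing/current.avro
```

Compare that number against your desired size to decide whether to keep appending or roll to a new file.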
04-28-2017 08:04 PM
2 Kudos
@Raj B It might be more efficient to just merge the incoming files to the desired size and then write the result to HDFS. The MergeContent processor can be configured to merge files up to a desired size and create one flowfile from all of the merged files. Configured as sketched below, for example, it would create a file between 1 GB and 1.5 GB. The processor can also be configured to use time limits on the merged file generation. Overall, I think this approach would be less imposing on the HDFS system.
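A sketch of the relevant MergeContent properties (the property names are the processor's own; the size values simply illustrate the 1 GB to 1.5 GB window and are assumptions, not taken from the original configuration):

```
Merge Strategy     : Bin-Packing Algorithm
Minimum Group Size : 1 GB
Maximum Group Size : 1.5 GB
Max Bin Age        : (optional, e.g. 30 min, to force a merge on a timer)
```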
04-28-2017 05:38 PM
@Simon Jespersen Does the script have to be written in Python? There is a three-part article written by @Matt Burgess that has some great examples: ExecuteScript Cookbook
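For reference, the basic ExecuteScript pattern in Python (Jython) that the cookbook builds on looks like this; a minimal sketch that just passes the flowfile through:

```python
# session, REL_SUCCESS and REL_FAILURE are bound by ExecuteScript
# before the script runs; no imports are needed for this skeleton.
flowFile = session.get()
if flowFile is not None:
    # inspect or modify the flowfile here
    session.transfer(flowFile, REL_SUCCESS)
```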
04-27-2017 01:58 PM
1 Kudo
@Mohammed El Moumni You can use the REST API to get the value of a counter. You could create a script with the needed curl commands to get the value of the counter and then use the ExecuteScript processor to run the script. You can access the REST API documentation via your NiFi instance; the URL will be similar to this: http://nifi-host:port/nifi-docs/rest-api. For example, the curl command to get the counters in my flow is:

```
curl 'http://nifi-host:port/nifi-api/counters'
```

with output like this (truncated):

```
{"counters":{"aggregateSnapshot":{"generated":"15:12:55 BST","counters":[
{"id":"46f421a7-d348-3e65-8a21-290cf1fc8fa1","context":"ExecuteScript (abde5770-015b-1000-0000-0000722fe055)","name":"my-counter","valueCount":0,"value":"0"},
{"id":"bd62b47c-5124-35b1-926b-2149eb519f79","context":"ExecuteScript (abfdad0a-015b-1000-ffff-fffff0fcbeb9)","name":"my-counter","valueCount":56400,"value":"56,400"},
{"id":"c6d58dcc-8171-3d63-adb2-8cc8e239cc9a","context":"Groovy script of counter (abfdad0a-015b-1000-ffff-fffff0fcbeb9)","name":"my-counter","valueCount":6930,"value":"6,930"},
{"id":"776c5c43-40ee-372e-957c-f40b465d3326","context":"All ExecuteScript's","name":"my-counter" ...
```
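A hedged sketch of such a script (it assumes jq is installed; the host, port, and counter name are placeholders):

```
#!/bin/sh
# Fetch all counters and print the value of the one named "my-counter"
curl -s 'http://nifi-host:port/nifi-api/counters' \
  | jq -r '.counters.aggregateSnapshot.counters[] | select(.name == "my-counter") | .value'
```

Running it by hand first is an easy way to confirm the jq filter matches your counter before wiring it into the flow.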
04-27-2017 01:01 PM
@Mohammed El Moumni I owe you an apology. While trying to get a screenshot of the trash can, I discovered that there is in fact currently no way to delete the counter. I have opened an Apache Jira for this issue; here is the link: NIFI-3751. I've updated the post above with the correct information.