Member since 06-09-2016 · 48 Posts · 10 Kudos Received · 0 Solutions
06-07-2017
08:35 AM
The above command works perfectly as a curl call. Can we achieve the same from the NiFi processors? I am trying InvokeHTTP and PostHTTP, but where do we specify the --data-binary payload in those processors?
09-09-2016
05:53 PM
1 Kudo
@INDRANIL ROY Here is a template that shows how to get the data formatted properly: delimitedtohbase.xml. The first two processors (GenerateFlowFile and ReplaceText) just create fake data every 30 seconds; you would replace that with whatever your actual data source is.
09-08-2016
03:59 AM
Yep, what you describe with UpdateAttribute/MergeContent sounds perfectly fine. What you'll want there precisely will depend on how many relationships you have coming out of RouteText. As for concurrent tasks, I'd suggest roughly:

- GetFile: 1
- SplitFile: 1
- RouteText: 2 to 5 or so
- MergeContent: 1
- PutHDFS: 1 to 2

No need to go too high generally. You don't have to stress too much about those numbers out of the gate: run with minimal threads first, find any bottlenecks, and increase if necessary.
12-30-2016
09:21 PM
In addition to @Pierre Villard's answer (which nicely gets the job done with ExecuteScript; I have a similar example here), since you are looking to do row-level operations (i.e., select columns from each row), you could use SplitText to split the large file into individual lines, then your ReplaceText from above, then MergeContent to put the whole thing back together. I'm not sure which approach is faster; it would be an interesting exercise to try both.
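As a rough, NiFi-independent sketch, the row-level transformation that the SplitText → ReplaceText → MergeContent chain performs looks like this in plain Python (the column indices and delimiter here are illustrative, not taken from the original question):

```python
def select_columns(text, indices, delimiter=","):
    """Keep only the given column indices from each delimited row.

    Mirrors the flow conceptually: split the file into lines
    (SplitText), transform each line (ReplaceText), then
    reassemble the result (MergeContent).
    """
    out_lines = []
    for line in text.splitlines():               # SplitText step
        fields = line.split(delimiter)
        kept = [fields[i] for i in indices]      # ReplaceText step
        out_lines.append(delimiter.join(kept))
    return "\n".join(out_lines)                  # MergeContent step

sample = "a,b,c\n1,2,3"
print(select_columns(sample, [0, 2]))  # → a,c / 1,3
```

In the actual flow the per-line transform would be a ReplaceText regex rather than Python, but the data movement is the same.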
10-12-2016
06:04 PM
The MergeContent processor simply bins and merges the FlowFiles it sees on an incoming connection at run time. In your case you want each bin to hold a minimum of 100 FlowFiles before merging, so you will need to specify that in the "Minimum Number of Entries" property. I never recommend setting a minimum value without also setting the "Max Bin Age" property. Say you only ever get 99 FlowFiles, or the time it takes to reach 100 exceeds the useful age of the data being held: those files will sit in a bin indefinitely, or for an excessive amount of time, unless that exit age has been set. Also keep in mind that if more than one connection feeds your MergeContent processor, on each run it looks at the FlowFiles on only one connection, moving in round-robin fashion from connection to connection. NiFi provides a "funnel" which lets you merge FlowFiles from many connections into a single connection. Matt
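The binning rule described above can be sketched in plain Python (a simplified model for illustration, not NiFi's actual API): a bin releases either when it reaches the minimum entry count or when its oldest entry exceeds the max bin age, whichever comes first.

```python
import time

class Bin:
    """Toy model of MergeContent binning: merge once the bin holds
    min_entries FlowFiles, OR once it has aged past max_bin_age
    seconds -- whichever happens first."""

    def __init__(self, min_entries=100, max_bin_age=60.0):
        self.min_entries = min_entries
        self.max_bin_age = max_bin_age
        self.entries = []
        self.created = None  # timestamp of the first entry

    def add(self, flowfile):
        if self.created is None:
            self.created = time.monotonic()
        self.entries.append(flowfile)

    def ready(self, now=None):
        if not self.entries:
            return False
        now = time.monotonic() if now is None else now
        full = len(self.entries) >= self.min_entries
        expired = (now - self.created) >= self.max_bin_age
        # Without the age check, a bin stuck at 99 entries
        # would wait forever -- hence the Max Bin Age advice.
        return full or expired
```

For example, a bin configured with `min_entries=3` and only 2 entries is not ready one second in, but becomes ready once its age passes the max bin age.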
08-11-2016
05:45 AM
Thanks a lot. It works.
12-21-2016
11:57 AM
This should help: https://community.hortonworks.com/questions/64771/unable-to-updateexecute-processor-though-nifi-rest.html
08-02-2016
09:41 PM
The following HCC how-to shows a NiFi flow where the first steps read from and process a config file; hope it may be useful. (Shout-out to @Matt Burgess for initial guidance on this.) Using NiFi to ingest and transform RSS feeds to HDFS using an external config file
10-28-2016
11:21 AM
@INDRANIL ROY Hi INDRANIL ROY, are you able to get the continuously streaming data (flat file) into Hadoop? Which ecosystem components did you use to get the real-time data into Hadoop? Please provide the component details or the steps you followed to get the flat files into Hadoop.
01-16-2017
09:14 AM
One question: does the ID of a Processor or ProcessFlow change if NiFi is rebooted?