Member since
07-30-2019
105
Posts
129
Kudos Received
43
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1343 | 02-27-2018 01:55 PM |
| | 1745 | 02-27-2018 05:01 AM |
| | 4785 | 02-27-2018 04:43 AM |
| | 1338 | 02-27-2018 04:18 AM |
| | 4250 | 02-27-2018 03:52 AM |
09-06-2016
02:10 PM
1 Kudo
"A single concurrent task can work on a single file." That is worth clarifying. More precisely, a single concurrent task works on a single process session. When RouteText creates a process session it pulls in a single flow file; other processors can pull in many more. It depends on the use case and design, but fundamentally a single concurrent task can work on far more than a single file. For this use case and this processor, the recommendation is to split the input up so that parallelism can be taken advantage of.
09-06-2016
01:45 PM
3 Kudos
Hello, this is a perfectly fine use case, but I'd recommend breaking the input data up a bit so you can take advantage of the parallelism. Given that you have a 50M line input, I'd recommend running it first through SplitText to break it into files of, say, 10,000 lines each. That would yield about 5,000 splits, each with around 10,000 lines. Then feed those into the RouteText processor. This way the data can be processed in a far better divide-and-conquer manner. You should see rates pretty close to the ideal rate of your underlying storage system. In very conservative terms, assume that is about 50MB/s, so it should take about 5 minutes at most (and that can certainly be improved). Thanks Joe
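The arithmetic behind those numbers can be checked with a short sketch. The function names are made up for illustration, and the post does not state the size of the input, so the second function only shows how much data fits in the 5 minute window at the assumed 50MB/s.

```python
def split_count(total_lines: int, lines_per_split: int) -> int:
    """Number of split files produced, rounding up for a partial last split."""
    return (total_lines + lines_per_split - 1) // lines_per_split


def megabytes_in_window(rate_mb_per_s: float, seconds: float) -> float:
    """Data volume (MB) that can be moved at a given rate within a time window."""
    return rate_mb_per_s * seconds


# 50M lines broken into 10,000-line files, as suggested above.
splits = split_count(50_000_000, 10_000)

# Assumed conservative storage rate of 50MB/s over 5 minutes (300 seconds).
window_mb = megabytes_in_window(50, 5 * 60)

print(splits, window_mb)
```

Under these assumptions the estimate of 5 minutes holds for inputs up to roughly 15,000 MB; a larger input or a slower disk would change the figure.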
09-02-2016
10:54 AM
4 Kudos
If the processor in question believes there is something about a given flowfile that is temporary and may resolve itself, it will mark the flowfile as penalized. When it routes that penalized flowfile to some outgoing connection, the flowfile will not be accessible to the processor that might consume it until the penalty period expires. You can read a bit more about that here https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#settings-tab A good example of where this is useful is delivery of flow files to some remote system using PutSFTP. It is common to route 'failures' of PutSFTP back to itself so it will keep trying. But sometimes there can be conflicts, such as matching filenames on the remote server, so you want to wait until they clear out and try again. In this case penalization lets us operate on other data while we put the problematic flowfiles off to the side. It's all just part of helping ensure the most productive action possible can happen and we're not just sitting there pounding the remote system with the same flow file over and over.
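To make the behavior concrete, here is a toy model of a connection queue that skips penalized flowfiles until their penalty expires. This is only a sketch of the idea and not NiFi's implementation; the class names, the `poll` method, and the 30 second penalty are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FlowFile:
    name: str
    penalty_expires_at: float = 0.0  # time before which the flowfile is skipped


@dataclass
class Connection:
    queue: List[FlowFile] = field(default_factory=list)

    def poll(self, now: float) -> Optional[FlowFile]:
        """Return the first flowfile whose penalty has expired, or None."""
        for index, flowfile in enumerate(self.queue):
            if flowfile.penalty_expires_at <= now:
                return self.queue.pop(index)
        return None


def penalize(flowfile: FlowFile, now: float, penalty_seconds: float = 30.0) -> FlowFile:
    """Mark a flowfile as unavailable until the penalty period has passed."""
    flowfile.penalty_expires_at = now + penalty_seconds
    return flowfile


# A failed delivery is penalized and routed back; other data stays available.
conn = Connection()
conn.queue.append(penalize(FlowFile("conflict.csv"), now=0.0))
conn.queue.append(FlowFile("other.csv"))

first = conn.poll(now=1.0)      # the penalized file is skipped
second = conn.poll(now=1.0)     # nothing else is eligible yet
retried = conn.poll(now=31.0)   # penalty has expired, so the file is retried
```

In this model the consumer keeps working on `other.csv` while `conflict.csv` waits out its penalty, which mirrors the PutSFTP retry loop described above.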
09-02-2016
08:00 AM
When using a secured instance of NiFi, the user either logs in with a username and password or is identified by their certificate. When the user first attempts to access NiFi, an account is automatically created without any permissions. An administrator can then grant permissions, and you'll see them on the page you're showing above.
09-02-2016
01:52 AM
2 Kudos
Hello Obaid, You can go to Help in NiFi and bring up the docs. Scroll down in the left-hand pane to the section titled 'developer' and select REST API. The docs can also be found here https://nifi.apache.org/docs.html From there you can select 'provenance' to get a detailed breakdown of the requests and the information they require. A really good approach is to open Chrome's developer tools and use the NiFi UI to create requests. You can then see precisely what NiFi's client is doing and emulate the same thing programmatically. Thanks Joe
08-02-2016
03:28 PM
4 Kudos
When entering text values into NiFi you should be able to hit "Shift-Enter" and it will give you true new lines. See attached screenshot
07-28-2016
04:33 AM
3 Kudos
Regarding the first question about wanting to distribute data from a given node to another node... Site-to-Site is meant for sending data from one cluster to another cluster on explicit ports (named input/entry points). It then takes care of load-balancing and fail-over. At present, site-to-site does not support sending data to a limited subset of nodes based on some defined criteria (partitioning), though this is an interesting idea and something that has been talked about. However, as you've described your case thus far, you might find that simply using PostHTTP on the sending node(s) and ListenHTTP on the listening node(s) is sufficient. With PostHTTP you get to address a specific recipient and therefore will know that only that node is getting the data of interest. You could then route other data that can be more generally spread throughout the cluster to use site-to-site.
07-23-2016
01:58 AM
2 Kudos
You could use existing processors such as ExtractText for some types of emails to extract attributes which you can then use for routing. Or you could use the scripting processors and write your own code to extract features of the emails as attributes, then use RouteOnAttribute. In the NiFi community there was recently work merged https://issues.apache.org/jira/browse/NIFI-1899 which looks like it will help a lot. For now, probably the best approach is to use ExecuteScript or InvokeScriptedProcessor to put together a quick e-mail parsing processor. Thanks
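As a rough idea of what the parsing step of such a script might do, here is a standalone Python sketch that pulls a few headers out of a raw message into a dictionary of attributes. It is not wired into ExecuteScript, and the function name and the `email.*` attribute names are hypothetical; only the standard library `email` module is used.

```python
from email import message_from_string
from email.utils import parseaddr


def extract_email_attributes(raw_message: str) -> dict:
    """Parse a raw RFC 822 style message and return routing attributes."""
    msg = message_from_string(raw_message)
    # parseaddr splits "Display Name <addr>" into (name, address).
    _, sender = parseaddr(msg.get("From", ""))
    return {
        "email.from": sender.lower(),
        "email.subject": msg.get("Subject", ""),
    }


sample = (
    "From: Alice Example <Alice@Example.com>\n"
    "Subject: Invoice 42\n"
    "\n"
    "Please find the invoice attached.\n"
)
attributes = extract_email_attributes(sample)
print(attributes)
```

Attributes like these could then be matched downstream, for example by RouteOnAttribute, to decide where each message goes.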
07-20-2016
03:12 PM
1 Kudo
It is not available yet but the team is working hard on it and we hope to have it officially supported very soon!
07-18-2016
01:53 AM
2 Kudos
@P C I share your view that there are a number of scenarios for which a JVM based dataflow management tool would be unfit or suboptimal. Recognizing that and a number of other unique challenges that exist in the edge collection space, the Hortonworks DataFlow team is working as part of the Apache MiNiFi community that Matt just mentioned. MiNiFi is a subproject of Apache NiFi and is designed to work seamlessly with NiFi. To your specific question asking if Hortonworks is developing any support for QNX I can state that we are supporting a range of IoT and 'metal that moves' cases as mentioned in that article. A recent public example of our efforts in this area can be found in this article https://hortonworks.com/blog/qualcomm-hortonworks-showcase-connected-car-platform-tu-automotive-detroit/ Thanks