06-07-2017
02:29 PM
1 Kudo
@J. D. Bacolod Have you considered using the PutDistributedMapCache and GetDistributedMapCache processors? Use two separate dataflows. One runs on a cron schedule and is responsible for obtaining the token and writing it to the distributed map cache via the PutDistributedMapCache processor. The second flow performs all your other operations using that token: just before the InvokeHTTP processor, add a GetDistributedMapCache processor that reads the token from the distributed map cache into a FlowFile attribute, then use that attribute to pass the token in your connections.

One thing to keep in mind: a new token may be written to the cache after a FlowFile has already picked up the old one, which would result in an auth failure. So you will want your flow to loop back to the GetDistributedMapCache processor to get the latest token on an auth failure from your InvokeHTTP processor. This flow does not keep track of when a token expires, but if you know how long a token is good for, you can set your cron schedule accordingly.

Thanks, Matt
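Outside NiFi, the same cache-then-retry pattern looks like this minimal Java sketch. The endpoints, the header name, and the AtomicReference standing in for the distributed map cache are all assumptions for illustration, not part of the flow described above:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.atomic.AtomicReference;

public class CachedTokenClient {
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    // Stands in for the distributed map cache: the cron flow refreshes it,
    // the worker flow reads it.
    private static final AtomicReference<String> TOKEN_CACHE = new AtomicReference<>();

    // Hypothetical token endpoint for illustration only.
    static String fetchFreshToken() throws Exception {
        HttpRequest req = HttpRequest.newBuilder(
                URI.create("https://auth.example.com/token")).GET().build();
        return CLIENT.send(req, HttpResponse.BodyHandlers.ofString()).body();
    }

    static HttpResponse<String> callApi() throws Exception {
        String token = TOKEN_CACHE.get();        // the GetDistributedMapCache step
        if (token == null) {
            token = fetchFreshToken();
            TOKEN_CACHE.set(token);
        }
        HttpResponse<String> resp = send(token);
        if (resp.statusCode() == 401) {          // auth failure on the InvokeHTTP step
            token = fetchFreshToken();           // loop back for the latest token
            TOKEN_CACHE.set(token);
            resp = send(token);                  // retry once with the new token
        }
        return resp;
    }

    static HttpResponse<String> send(String token) throws Exception {
        HttpRequest req = HttpRequest.newBuilder(URI.create("https://api.example.com/data"))
                .header("Authorization", "Bearer " + token)
                .GET().build();
        return CLIENT.send(req, HttpResponse.BodyHandlers.ofString());
    }
}
```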
06-13-2017
05:49 PM
1 Kudo
Dear @Matt Burgess, what luck: yes, it worked after upgrading to HDF 3.0. Thanks a lot, man :) Much appreciated.
06-06-2017
01:53 PM
1 Kudo
The session provides methods to read and write the flow file content:

- If you are reading only, session.read with an InputStreamCallback will give you an InputStream to the flow file content.
- If you are writing only, session.write with an OutputStreamCallback will give you an OutputStream to the flow file content.
- If you are reading and writing at the same time, a StreamCallback will give you access to both an InputStream and an OutputStream.

In your case, if you are just looking to extract a value, you likely need an InputStreamCallback, and you would use the InputStream to read the content and parse it appropriately for your data. You can look at examples in the existing processors: https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractText.java#L313-L318

Keep in mind that the linked example reads the whole content of the flow file into memory, which can be dangerous with very large flow files, so whenever possible it is best to process the content in chunks; a sketch follows below.
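Here is a minimal sketch of the read-only case that processes the content in chunks rather than buffering it all at once. The class name, attribute name, and the `token=` pattern are hypothetical; only the ProcessSession/InputStreamCallback usage reflects the actual NiFi API:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicReference;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.io.InputStreamCallback;

public class ExtractValueSnippet {

    // Called from a processor's onTrigger(context, session).
    public static FlowFile extractValue(final ProcessSession session, FlowFile flowFile) {
        final AtomicReference<String> extracted = new AtomicReference<>();

        session.read(flowFile, new InputStreamCallback() {
            @Override
            public void process(final InputStream in) throws IOException {
                // Read line by line (a form of chunking) rather than
                // loading the entire content into one byte[].
                final BufferedReader reader = new BufferedReader(
                        new InputStreamReader(in, StandardCharsets.UTF_8));
                String line;
                while ((line = reader.readLine()) != null) {
                    if (line.startsWith("token=")) {   // hypothetical value to extract
                        extracted.set(line.substring("token=".length()));
                        break;
                    }
                }
                // NiFi closes the underlying stream when the callback returns.
            }
        });

        if (extracted.get() != null) {
            flowFile = session.putAttribute(flowFile, "extracted.value", extracted.get());
        }
        return flowFile;
    }
}
```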
06-01-2017
03:23 PM
1 Kudo
@Alvaro Dominguez The primary node could change at any time. You could use the PostHTTP and ListenHTTP processors to route FlowFiles from multiple nodes to a single node. My concern would be the heap usage needed to merge (zip) 160K FlowFiles on a single NiFi node: the FlowFile metadata for all of the FlowFiles being zipped would be held in heap memory until the zip is complete.

Any objection to having a zip of zips? In other words, you could still create 4 unique zip files (one per node, each with a unique filename), then send those zipped files to one node to be zipped once more into a new zip with the single name you want written into HDFS.

Thanks, Matt
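To see that a zip of zips is just an ordinary zip whose entries happen to be archives, here is a small Java sketch using java.util.zip. The per-node filenames and the final name are hypothetical, and in a real flow each stage would be a merge processor rather than hand-written code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipOfZips {
    public static void main(String[] args) throws IOException {
        // Hypothetical per-node archives produced in stage one (one per NiFi node).
        List<Path> nodeZips = List.of(
                Path.of("node1.zip"), Path.of("node2.zip"),
                Path.of("node3.zip"), Path.of("node4.zip"));

        // Stage two on the collector node: wrap the four archives in a single
        // outer zip with the final filename to be written to HDFS.
        try (ZipOutputStream out = new ZipOutputStream(
                Files.newOutputStream(Path.of("final.zip")))) {
            for (Path nodeZip : nodeZips) {
                out.putNextEntry(new ZipEntry(nodeZip.getFileName().toString()));
                Files.copy(nodeZip, out);   // inner zips become ordinary entries
                out.closeEntry();
            }
        }
    }
}
```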
06-13-2017
03:25 PM
@Oleksandr Solomko have you changed the default value of the "nifi.queue.swap.threshold" property in nifi.properties? If so, you may be running into NIFI-3897.
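For reference, the property in question lives in conf/nifi.properties; a sketch showing its stock default (to my knowledge, 20000 FlowFiles per connection queue before overflow is swapped to disk):

```properties
# conf/nifi.properties
# Number of FlowFiles a connection queue may hold in memory before
# the overflow is swapped out to disk (default shown).
nifi.queue.swap.threshold=20000
```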
06-01-2017
11:23 AM
@Simran Kaur I had a feeling your issue was related to a missing config. Glad to hear you got it working. If this answer addressed your original question, please mark it as accepted. As for your other question, I see you already started a new post (https://community.hortonworks.com/questions/105720/nifi-stream-using-listenhttp-processor-creates-too.html). That is the correct approach in this forum; we want to avoid asking unrelated questions in the same post. I will have a look at that post as well. Thank you, Matt
06-02-2017
02:16 PM
Thanks, @Matt Clarke. Will downgrade ASAP.
11-16-2018
01:06 PM
Article content updated to reflect the new provenance implementation recommendation and a change in the JVM garbage collector recommendation.
05-25-2017
07:23 PM
As Matt pointed out, in order to make use of 100 concurrent tasks on a processor, you will need to increase the Maximum Timer Driven Thread Count above 100. Also, as Matt pointed out, this means each node in your cluster has that many threads available.

As far as general performance goes, the performance of a single request/response with Jetty depends on what is being done in the request/response. We can't just say "Jetty can process thousands of records in seconds" unless we know what is being done with those records in Jetty. If you deployed a WAR with a servlet that immediately returned 200, its performance would be very different from that of a servlet that had to take the incoming request and write it to a database, an external system, or disk.

With HandleHttpRequest/Response, each request becomes a flow file, which means updates to the flow file repository and content repository (disk I/O), and then transferring those flow files to the next processor, which reads them (more disk I/O). I'm not saying this can't be fast, but there is more happening there than in a servlet that returns 200 immediately.

What I was getting at with the last question is that if you have 100 concurrent tasks on HandleHttpRequest and only 1 concurrent task on HandleHttpResponse, the response side will eventually become the bottleneck.
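For the baseline in the comparison above, a servlet that returns 200 immediately might look like the following minimal sketch (the Jakarta Servlet API, class name, and path are assumptions for illustration):

```java
import java.io.IOException;
import jakarta.servlet.annotation.WebServlet;
import jakarta.servlet.http.HttpServlet;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;

// The fastest possible baseline: no parsing, no persistence, no disk I/O.
// Anything HandleHttpRequest/Response does on top of this (flow file repo
// and content repo writes, queue transfers) adds per-request latency.
@WebServlet("/ping")
public class ImmediateOkServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        resp.setStatus(HttpServletResponse.SC_OK);  // return 200 immediately
    }
}
```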