Member since
07-30-2019
3427
Posts
1632
Kudos Received
1011
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 85 | 01-27-2026 12:46 PM |
| | 494 | 01-13-2026 11:14 AM |
| | 1034 | 01-09-2026 06:58 AM |
| | 923 | 12-17-2025 05:55 AM |
| | 984 | 12-15-2025 01:29 PM |
08-22-2023
05:24 AM
@sahil0915 What you are proposing would require you to ingest all ~100 million records from DC2 into NiFi, hash each record, and write all ~100 million hashes to a map cache such as Redis or HBase (which you would also need to install somewhere) using the PutDistributedMapCache processor. You would then ingest all ~100 million records from DC1, hash those records, and compare each hash against the hashes in the distributed map cache using DetectDuplicate. Any records routed to non-duplicate would represent what is not in DC2. You would then have to flush the distributed map cache and repeat the process, this time writing the hashes from DC3 to the cache. I suspect this is going to perform poorly: NiFi would be ingesting ~300 million records just to create hashes for a one-time comparison. If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
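To make the comparison logic concrete, here is a minimal Python sketch of the hash-and-compare idea outside of NiFi. It is only an illustration: the in-memory `set` stands in for the external map cache, SHA-256 is an assumed hash choice, and the function names and sample records are hypothetical. It only checks membership, so it does not reproduce everything DetectDuplicate does.

```python
import hashlib
from typing import Iterable, Iterator


def record_hash(record: str) -> str:
    """Return a SHA-256 hex digest of one record (hash choice is an assumption)."""
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def missing_from_reference(reference: Iterable[str],
                           candidates: Iterable[str]) -> Iterator[str]:
    """Yield candidate records whose hash is absent from the reference set."""
    # Stand-in for the map cache: one hash per reference record, held in memory.
    cache = {record_hash(r) for r in reference}
    for record in candidates:
        if record_hash(record) not in cache:
            yield record


# Hypothetical sample data: DC1 is the source of truth.
dc1 = ["a,1", "b,2", "c,3"]
dc2 = ["a,1", "c,3"]
dc3 = ["a,1", "b,2"]

# One pass per target data center, rebuilding the cache each time.
missing_in_dc2 = list(missing_from_reference(dc2, dc1))
missing_in_dc3 = list(missing_from_reference(dc3, dc1))
```

Even in this toy form, every record from every data center has to be read and hashed, which is the cost concern raised above.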
08-21-2023
10:09 AM
@abdullahvvs Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
08-21-2023
06:29 AM
@learner-loading Were you able to resolve your issue? If any of the above posts were the solution, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
08-20-2023
01:06 AM
I appreciate the comprehensive response. Thanks.
08-18-2023
08:23 AM
2 Kudos
For information, a Jira ticket has been created: https://issues.apache.org/jira/browse/NIFI-11967
08-18-2023
02:36 AM
1 Kudo
Hi, If you're using Apache NiFi and the token you're trying to capture with the InvokeHTTP processor is too large to be stored as an attribute, you can follow the steps below to work around this limitation: Keep the token in the content of the FlowFile if it's returned by the InvokeHTTP processor. You can use processors like ReplaceText to wrap the token in the header format you need. For instance, if you need the header to be Authorization: Bearer {token}, then you can configure a ReplaceText processor to replace the content (i.e., the token) to match this format.
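As a rough illustration of the transformation ReplaceText would perform on the FlowFile content, here is a small Python sketch. It runs outside NiFi; the function name and sample token are hypothetical, and trimming surrounding whitespace is an assumption about how the token is returned.

```python
def wrap_token(content: str) -> str:
    """Wrap a raw token in the header format mentioned above."""
    # Assumption: the content holds only the token, possibly with
    # surrounding whitespace or a trailing newline.
    return f"Authorization: Bearer {content.strip()}"


# Hypothetical token value, for illustration only.
header_line = wrap_token("abc123\n")
```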
08-17-2023
11:17 PM
I was able to resolve this error with the following ExecuteStreamCommand configuration:
Command Path: <complete path of python.exe>
Command Arguments: <complete path of your script>;<other arg>;<other arg>
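For anyone wondering why the arguments are separated by semicolons: as far as I know, ExecuteStreamCommand splits Command Arguments on its Argument Delimiter property, which defaults to `;`. The Python sketch below shows the argument list that results. It is an approximation, not NiFi's actual code, and the paths and arguments are hypothetical placeholders.

```python
from typing import List


def build_command(command_path: str, command_arguments: str,
                  delimiter: str = ";") -> List[str]:
    """Split a delimited argument string into an argv-style list."""
    # Skip empty pieces so a trailing delimiter does not add a blank argument.
    args = [arg for arg in command_arguments.split(delimiter) if arg]
    return [command_path] + args


# Hypothetical paths and arguments, for illustration only.
argv = build_command(r"C:\Python311\python.exe",
                     r"C:\scripts\job.py;--mode;fast")
```

The script path is the first argument to python.exe, followed by each remaining argument as its own entry.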
08-16-2023
06:57 AM
I set up the authorizers.xml file as you suggested and it's working perfectly, Thank you very much @MattWho !!
08-15-2023
06:23 AM
@Tenda Since you are saying you can freely navigate the NiFi UI when in this "stuck" state, NiFi is not stuck, as the UI and the processor components all operate within the same JVM. What you circled indicates that at that exact moment (the last time the browser refreshed) there were 24 active threads out of the 32 configured in the Max Timer Driven Thread pool setting. Milliseconds later that could still be 24 active threads, but consumed by different components. NiFi processors show a small number in the upper right corner when they have active threads, so step one is determining which processors are holding these 24 threads for a long time. Step two is looking at those processors and the thread dumps to figure out why those threads are long running. Typically we would see this with unstable connections to external services, network issues, local NiFi repository I/O, high NiFi CPU utilization, long or very frequent GC pauses, or even OOMs. It sounds like you have ruled out a few of these so far. Thank you, Matt
08-11-2023
08:18 AM
1 Kudo
@Madhav_VD Apache NiFi contains no native processors that utilize Apache Tika other than IdentifyMimeType (this processor does not do any extraction), but you can find others in the Apache community who have created custom processors that utilize Apache Tika. Adding custom nars to Apache NiFi is as easy as adding the custom nar to the auto-load directory: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#autoloading-processors
While I have no experience with any of these custom nars, you can give them a try to see if they meet your needs. If not, they may provide you with a stepping stone for creating your own custom variant.
https://github.com/tspannhw/nifi-extracttext-processor/releases/tag/html
https://community.cloudera.com/t5/Community-Articles/ExtractText-NiFi-Custom-Processor-Powered-by-Apache-Tika/ta-p/249392
https://community.cloudera.com/t5/Community-Articles/Creating-HTML-from-PDF-Excel-and-Word-Documents-using-Apache/ta-p/247968
https://github.com/tspannhw/nifi-extracttext-processor
Thank you, Matt
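If you want to script the "drop the nar into the auto-load directory" step, a minimal Python sketch might look like the following. The function name is hypothetical, and the auto-load directory must be whatever `nifi.nar.library.autoload.directory` points to in your nifi.properties (I believe the default is `./extensions`, but check your install).

```python
import shutil
from pathlib import Path


def install_nar(nar_file: Path, autoload_dir: Path) -> Path:
    """Copy a .nar file into the given auto-load directory and return its new path."""
    if nar_file.suffix != ".nar":
        raise ValueError(f"not a nar file: {nar_file}")
    # Create the directory if it does not exist yet.
    autoload_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(nar_file, autoload_dir))


# Example call (hypothetical paths, adjust to your environment):
# install_nar(Path("/tmp/custom-processor.nar"), Path("/opt/nifi/extensions"))
```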