Member since: 07-30-2019
Posts: 3436
Kudos Received: 1632
Solutions: 1012
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 157 | 01-27-2026 12:46 PM |
| | 572 | 01-13-2026 11:14 AM |
| | 1263 | 01-09-2026 06:58 AM |
| | 1036 | 12-17-2025 05:55 AM |
| | 504 | 12-17-2025 05:34 AM |
01-12-2021
05:21 AM
@CristoE Since this question already has an accepted solution and is specific to a DistCp replacement for HDFS, it would be much better to start an entirely new question in the community. You can always add a link to this question's solution as a reference in your new question post. You would get more visibility that way, and we would not dilute the answer to this question with suggestions related to ADLS rather than HDFS.
01-05-2021
09:47 PM
1 Kudo
Thank you Matt! Altering the "Max Wait Time" value was a game-changer. I still need to tune it further, but the thread problem is fixed now.
01-05-2021
10:49 AM
@kiranps11 Did you add and start a "DistributedMapCacheServer" controller service running on port 4557? The "DistributedMapCacheClientService" controller service only creates a client used to connect to a server that you must also create.

Keep in mind that the DistributedMapCacheServer does not offer High Availability (HA). Enabling this controller service starts a DistributedMapCacheServer on each node in your NiFi cluster, but those servers do not talk to each other. This is important to understand since you have configured your DMC client to use localhost, which means each node in your cluster would be using its own DMC server rather than a single shared DMC server.

For an HA solution you should use an external map cache via one of the other client offerings, such as "HBase_2_ClientMapCacheService" or "RedisDistributedMapCacheClientService", but this requires you to set up that external HBase or Redis service with HA yourself.

Hope this helps, Matt
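As a rough sketch of a minimal single-node (non-HA) setup, the two controller services might be configured like this; the hostname shown is a placeholder for your environment, not a value from this thread:

```
DistributedMapCacheServer (add and enable):
  Port: 4557

DistributedMapCacheClientService (add and enable):
  Server Hostname: nifi-node1.example.com   (a specific node, not localhost, if the whole cluster should share one cache)
  Server Port: 4557
```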
01-05-2021
10:29 AM
1 Kudo
@Boenu Beginning with Apache NiFi 1.12, the default for anonymous access to static resources was changed to false. This was done as part of https://issues.apache.org/jira/browse/NIFI-7170. That change also added a property that can be set in the nifi.properties file to restore the behavior of Apache NiFi 1.11 and older versions:

nifi.security.allow.anonymous.authentication=true

The above restores the previous behavior while work is done to change how NiFi handles access to these static endpoints. The following Jiras cover that work:
https://issues.apache.org/jira/browse/NIFI-7849
https://issues.apache.org/jira/browse/NIFI-7870

Hope this helps, Matt
01-04-2021
08:30 AM
@kalhan While it is possible to have a single ZK cluster support multiple services, the recommendation is that NiFi have its own dedicated ZK cluster. NiFi cluster stability is dependent on ZK, and many of the NiFi processors that can be used depend on cluster state, which is also stored in ZK. If ZK becomes overburdened, it can affect the overall stability and performance of NiFi.

Hope this helps. If you found any of the answers provided on this query helpful, please select "Accept as Solution" on each of them.

Thank you, Matt
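As a minimal sketch of pointing NiFi at a dedicated external ZooKeeper ensemble (the hostnames below are placeholders for illustration, not values from this thread), the relevant nifi.properties settings look like this:

```
# nifi.properties - point NiFi at its own dedicated ZooKeeper ensemble
# (hostnames are example placeholders)
nifi.zookeeper.connect.string=nifi-zk1:2181,nifi-zk2:2181,nifi-zk3:2181
# disable the embedded ZooKeeper when using an external ensemble
nifi.state.management.embedded.zookeeper.start=false
```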
12-24-2020
07:35 AM
@adhishankarit There is nothing you can pull from the NiFi REST API that is going to tell you about successful outcomes of processor execution on a FlowFile. Depending on data volumes, this also sounds like a resource-expensive endeavor.

That being said, NiFi does have a Site-To-Site (S2S) bulletin reporting task. When a processor throws an error it produces a bulletin, and this reporting task can capture those bulletins and send them via NiFi's S2S protocol to another NiFi instance, directly into a dataflow where you can handle them via dataflow design however you like. The only way to get INFO level logs into bulletins is by setting the bulletin level to INFO on all your processors. This only works if you have also configured your NiFi logback.xml so that all NiFi components log at the INFO level as well. Downsides to this:
1. Every processor would display the red bulletin square in the upper right corner of the processor, which makes it difficult to use bulletins to find components that are actually having issues.
2. This results in a lot of INFO level logging to the nifi-app.log.

You mention edge nodes. You could set up a TailFile processor that tails the nifi-app.log and then sends that log data via FlowFiles to some monitoring NiFi cluster, where another dataflow parses those records with a PartitionRecord processor by log_level and then routes based on that log_level for additional handling/notification processing. Downside here:
1. Since you want to track success, you still need INFO level logging enabled for all components. This means even this log collection flow is producing log output, so you end up with large logs being written even when no actual data is being processed in the other flows.

NiFi does have a master bulletin board which you could hit via the REST API, but this does not get you past the massive logging you may be producing to monitor success.
https://nifi.apache.org/docs/nifi-docs/rest-api/index.html

Hope this gives you some ideas, Matt
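As a rough sketch of the bulletin board approach via the REST API (the base URL and unsecured access are assumptions for illustration; a secured NiFi would need a token or client certificate), polling for recent bulletins might look like this:

```python
import requests

# Hypothetical base URL for an unsecured NiFi instance; adjust for your environment.
NIFI_API = "http://localhost:8080/nifi-api"

# Pull recent bulletins from NiFi's bulletin board endpoint.
resp = requests.get(f"{NIFI_API}/flow/bulletin-board", params={"limit": 100})
resp.raise_for_status()

for entry in resp.json()["bulletinBoard"]["bulletins"]:
    bulletin = entry.get("bulletin", {})
    # Each readable bulletin carries the level (e.g. ERROR, WARN), the source component name, and the message.
    print(bulletin.get("level"), bulletin.get("sourceName"), bulletin.get("message"))
```

Note that this only surfaces what already produces bulletins; as described above, it does not avoid the heavy INFO-level logging needed to track successes.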
12-24-2020
07:15 AM
@adhishankarit This only works if the intent is to replace or append the FlowFile attributes to the content of the FlowFile, but you still will not end up with a single FlowFile with all 15 attributes. ReplaceText does not merge anything and only has access to the FlowFile attributes on the FlowFile it is executing upon. You would still need to merge the content of those 2 FlowFiles to get a single FlowFile with all 15 attributes, which would then exist in the content of the new FlowFile.
12-23-2020
01:39 PM
@Anurag007 You did not share how your logs are getting into your NiFi, but once ingested, you could use a PartitionRecord processor with one of the following readers to handle parsing your log files:
- GrokReader
- SyslogReader
- Syslog5424Reader

You can then use your choice of Record Writer to output your individual split log outputs. You would then add one custom property that is used to group like log entries by log_level. This custom property becomes a new FlowFile attribute on the output FlowFiles. You can then use a RouteOnAttribute processor to filter out only the FlowFiles where log_level is set to ERROR.

Here is a simple flow I created that tails NiFi's app log, partitions logs by log_level, and then routes log entries for WARN or ERROR. I use the GrokReader with the following Grok Expression:
%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} \[%{DATA:thread}\] %{DATA:class} %{GREEDYDATA:message}

I then chose to use the JsonRecordSetWriter. The dynamic property I added on PartitionRecord is:
Property = log_level
Value = /level

In my RouteOnAttribute processor, I can route based on that new "log_level" attribute that will exist on each partitioned FlowFile, using two dynamic properties which each become a new relationship:
property = ERROR
value = ${log_level:equals('ERROR')}
property = WARN
value = ${log_level:equals('WARN')}

Hope this helps, Matt
12-23-2020
09:33 AM
Hi Matt, Great! With your suggestion, I got what I was expecting. Thank you, --Murali
12-23-2020
08:28 AM
@pjagielski It is always helpful to share the exact NiFi version you are running, as there may be known issues we can point you to. Assuming here that you may be running the latest Apache NiFi 1.12 release, my first thought is that it may be related to this issue: https://issues.apache.org/jira/browse/NIFI-7992

While your content repo is not filling up, I would suggest inspecting your logs to see how often content claims are being moved to archive. A background thread then removes those claims as a result of your archive settings.

Hope this helps, Matt
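For reference, the archive behavior mentioned above is governed by these nifi.properties settings; the values shown are the common defaults, included here as an illustrative sketch rather than a recommendation for this specific case:

```
# nifi.properties - content repository archive settings (example/default values)
nifi.content.repository.archive.enabled=true
nifi.content.repository.archive.max.retention.period=12 hours
nifi.content.repository.archive.max.usage.percentage=50%
```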