Member since: 07-30-2019
Posts: 3426
Kudos Received: 1631
Solutions: 1010
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 401 | 01-13-2026 11:14 AM |
| | 775 | 01-09-2026 06:58 AM |
| | 795 | 12-17-2025 05:55 AM |
| | 856 | 12-15-2025 01:29 PM |
| | 709 | 12-15-2025 06:50 AM |
05-08-2017
02:13 PM
@umair ahmed I just spoke with Dave, and he has cleaned up his template/response to you here:
https://community.hortonworks.com/questions/101496/nifi-invokehttp-retries.html#answer-101588 His solution for triggering a sleep based on the retry count set in my template is perfect for meeting your needs. It also scales very easily: simply add additional timer rules in the Advanced UI of the UpdateAttribute processor. Thanks, Matt
05-08-2017
01:00 PM
@umair ahmed The Retry loop template above allows you to configure the number of retry attempts before exiting the loop. I am not sure what you mean by "on time that is it retry at certain time". If the intent is to slow down how quickly the FlowFile is retried, you could add an additional RouteOnAttribute processor to the failure loop that keeps looping until the file has aged a certain amount of time.
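As a sketch of both ideas (the attribute name retry.count and the numbers below are just examples, not values from the template): UpdateAttribute increments a counter each time a FlowFile passes through the loop, and RouteOnAttribute routes on that counter and/or on the FlowFile's age.

UpdateAttribute property to increment the counter:
retry.count = ${retry.count:replaceNull(0):plus(1)}

RouteOnAttribute property to exit the loop after 3 attempts:
max.retries.reached = ${retry.count:ge(3)}

RouteOnAttribute property to only retry FlowFiles older than 30 seconds (entryDate here is the FlowFile's entry timestamp as exposed by the expression language):
old.enough = ${now():toNumber():minus(${entryDate:toNumber()}):gt(30000)}

Thanks, Matt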
05-08-2017
12:46 PM
1 Kudo
@Gaurav Jain A NiFi cluster consists of the following core capabilities:

1. Cluster Coordinator - One node in a NiFi cluster is elected through ZooKeeper to be the cluster coordinator. Once an election is complete, all other nodes in the cluster send health and status heartbeats directly to this cluster coordinator. If the currently elected cluster coordinator stops heartbeating to ZooKeeper, a new election is held to elect one of the other nodes as the new cluster coordinator.

2. Each node in a NiFi cluster runs independently of the others. Each runs its own copy of the flow.xml.gz, has its own repositories, and works on its own FlowFiles. A node that becomes disconnected from the cluster (failed heartbeats, network issues between nodes, etc.) will continue to run its dataflow. If it disconnected due to a missed heartbeat, it will reconnect upon the next successful heartbeat.

3. Primary Node - Every cluster elects one of its nodes as the primary node. The role of the primary node is to run any processor that has been scheduled to run on "primary node only". The intent of this scheduling strategy is to help with processor protocols that are not cluster friendly, for example GetSFTP, ListSFTP, GetFTP, etc. Since every node in a cluster runs the same dataflow, you don't want these competing protocols fighting for the same files on every node. If the node currently elected as primary becomes disconnected from the cluster, it stops running any processors configured as "primary node only"; the cluster elects a new primary node, and that node starts running those processors instead.

4. When a cluster has a disconnected node, changes to the dataflows are not allowed. This prevents the flow.xml.gz from becoming mismatched across the cluster nodes. The disconnected node must be rejoined to the cluster or dropped completely from it before editing is restored.
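For reference, clustering is driven by a handful of entries in each node's nifi.properties plus a ZooKeeper quorum. A minimal sketch of the cluster-related entries (the hostnames, port, and connect string below are placeholders, not recommended values):

nifi.cluster.is.node=true
nifi.cluster.node.address=node1.example.com
nifi.cluster.node.protocol.port=11443
nifi.zookeeper.connect.string=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181

Thanks, Matt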
05-08-2017
12:22 PM
2 Kudos
@Gaurav Jain The URL provided when adding the Remote Process Group (RPG) to your canvas only needs to be reachable when the RPG is initially added. Once a successful connection is established, the target instance returns a list of currently connected cluster nodes, and the source instance with the RPG records those hosts in peer files. From that point forward the RPG constantly updates its list of available nodes and will not only load-balance to those nodes but will also use any one of them to get an updated status. If your source instance of NiFi has trouble getting a status update from any of the nodes, it will still attempt load-balanced delivery with failover to the last known set of nodes until it succeeds in getting an updated list. In addition, NiFi allows you to specify multiple URLs in the RPG when you create it: simply provide a comma-separated list of URLs for nodes in the same target cluster. This does not change how the RPG works; it will still constantly retrieve a new listing of available nodes. This allows the target cluster to scale up or down without affecting your Site-To-Site (S2S) functionality.
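As an illustration, the URL field for an RPG pointing at a three-node target cluster might look like this (hostnames and ports are placeholders):

http://node1.example.com:8080/nifi,http://node2.example.com:8080/nifi,http://node3.example.com:8080/nifi

Thanks, Matt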
05-08-2017
12:07 PM
@ismail patel Backpressure thresholds are soft limits, and some processors do batch processing. The ListHDFS processor produces a listing of files from HDFS and a single 0-byte FlowFile for each file in that listing. It then commits all those FlowFiles to the success relationship at once. So if the backpressure threshold were set to 5, ListHDFS would still dump the entire listing onto the connection (even if the listing consisted of thousands of files). At that point backpressure would be applied and prevent ListHDFS from running again until the queue dropped back below 5, but this is not the behavior you need here. The RouteOnAttribute processor is one of those processors that works on one FlowFile at a time, which allows us to adhere much more strictly to the backpressure setting of 5 on its unmatched relationship. The fact that I used a RouteOnAttribute processor is not important; any processor that works on FlowFiles one at a time would work. I picked RouteOnAttribute because it operates on FlowFile attributes, which live in heap memory, making processing here very fast. Thanks, Matt
05-05-2017
10:00 PM
@Pradhuman Gupta You cannot set up logging for a specific processor instance, but you can set up a new logger for a specific processor class. First you would create a new appender in the NiFi logback.xml file: <appender name="PROCESSOR_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>${org.apache.nifi.bootstrap.config.log.dir}/nifi-processor.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
<!--
For daily rollover, use 'user_%d.log'.
For hourly rollover, use 'user_%d{yyyy-MM-dd_HH}.log'.
To GZIP rolled files, replace '.log' with '.log.gz'.
To ZIP rolled files, replace '.log' with '.log.zip'.
-->
<fileNamePattern>${org.apache.nifi.bootstrap.config.log.dir}/nifi-processor_%d.log</fileNamePattern>
<!-- keep 5 log files worth of history -->
<maxHistory>5</maxHistory>
</rollingPolicy>
<encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
<pattern>%date %level [%thread] %logger{120} %msg%n</pattern>
<immediateFlush>true</immediateFlush>
</encoder>
</appender>
Then you create a new logger that writes to that appender's log file: <logger name="org.apache.nifi.processors.attributes.UpdateAttribute" level="WARN" additivity="false">
<appender-ref ref="PROCESSOR_FILE"/>
</logger> In the above example I am creating a logger for the UpdateAttribute processor. Now any WARN or ERROR log messages produced by this specific processor class will be written to this new log. You can expand upon this by configuring loggers for each processor class you want to monitor and sending them all to the same appender. Then use a TailFile processor to tail the new log, a SplitText processor to split the content of the FlowFile produced by TailFile, and a RouteOnContent processor to route the log lines produced by each processor class to a different PutEmail processor (or simply create a different message body attribute for each).
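To monitor a second processor class with the same appender, add another logger block. For example (using the standard bundle's RouteOnAttribute class; substitute the full class name of whatever processor you want to watch):

<logger name="org.apache.nifi.processors.standard.RouteOnAttribute" level="WARN" additivity="false">
<appender-ref ref="PROCESSOR_FILE"/>
</logger>

Thanks, Matt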
05-05-2017
07:48 PM
1 Kudo
@uttam kumar Using the NiFi REST API, I was able to upload, delete, and upload again the same template using the following commands.

To upload:

curl -X POST -v -F template=@"/<path to template>/template.xml" http://<host>:<port>/nifi-api/process-groups/<UUID of process group>/templates/upload

The UUID of my template was in the response to the above command.

To delete:

curl -X DELETE -v http://<host>:<port>/nifi-api/templates/<UUID of template>

I then repeated the above two commands successfully several times.
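If you want to script the cycle, the template UUID can be pulled out of the upload response. A rough sketch (assumes GNU grep; the host, port, path, and process group are placeholder values, and the upload response is XML, so verify where the template id appears in your version before relying on this):

RESPONSE=$(curl -s -X POST -F template=@"/tmp/template.xml" http://localhost:8080/nifi-api/process-groups/root/templates/upload)
TEMPLATE_ID=$(echo "$RESPONSE" | grep -oP '(?<=<id>)[^<]+' | head -1)
curl -X DELETE "http://localhost:8080/nifi-api/templates/$TEMPLATE_ID"

Thanks, Matt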
05-05-2017
06:17 PM
@nyakkanti Only processor properties that support the NiFi expression language can be configured to use FlowFile attributes. For some properties, even if the processor documentation says "supports expression language: false", you may be able to use a simple ${<FlowFile-attribute-name>} to return a value from an attribute key/value pair. But this is never true for sensitive properties such as passwords. Password properties are especially difficult since those values are encrypted when applied. So if you entered ${<FlowFile-attribute-name>}, that literal string is what gets encrypted, and the processor decrypts that same string when making the connection. Nowhere in that process does it ever retrieve a value from the FlowFile.
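To make the distinction concrete (the attribute names here are hypothetical):

Remote Path = ${target.directory} -> expression language is evaluated and the FlowFile attribute's value is used
Password = ${sftp.password} -> the literal text "${sftp.password}" is encrypted and stored; it is never resolved against the FlowFile

Thanks, Matt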
05-05-2017
05:51 PM
@ismail patel Nodes in a NiFi cluster really don't know about each other. They find out from ZooKeeper who the elected cluster coordinator is and start sending health and status messages to it via heartbeats. Each node in the cluster runs its own copy of the dataflow and works on its own set of FlowFiles. When a node goes down, other nodes do not pick up that node's FlowFiles and work on them. I am not following exactly what you did above. If you disconnect a node from the cluster, its health and status heartbeats will no longer be coming in, so any queued data on that node should not be reflected in the cluster view until the node is reconnected. If the RPG was not sending FlowFiles to Node 2, there is likely an issue with the connection. Make sure you correctly configured the S2S properties in the nifi.properties file on both nodes and that there is no firewall blocking the connection.
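The S2S entries to check in each node's nifi.properties look like this (the values shown are examples only):

nifi.remote.input.host=node2.example.com
nifi.remote.input.socket.port=10000
nifi.remote.input.secure=false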
05-05-2017
04:31 PM
@ismail patel The ability to set a FlowFile batch size is coming in Apache NiFi 1.2.0, which should be up for vote any day now. https://issues.apache.org/jira/browse/NIFI-1202 Thanks, Matt