Member since: 09-02-2016
Posts: 56
Kudos Received: 6
Solutions: 0
03-03-2017
01:13 PM
@spdvnz
NiFi Processors: NiFi processor components that are likely to encounter failures have a "failure" routing relationship. Often, failure is handled by looping that failure relationship back on the same processor so that the operation against the failed FlowFile is re-attempted after the FlowFile penalty duration expires (default 30 secs). However, you may also wish to route failure through additional processors. For example, suppose your failure occurs in a PutSFTP processor configured to send data to system "abc". Instead of looping the failure relationship, you could route failure to a second PutSFTP processor configured to send to an alternate destination server "xyz". Failure from the second PutSFTP could then be routed back to the first PutSFTP. In this scenario, a complete failure only occurs if delivery to both systems fails.

NiFi Process Groups: I am not sure what failover condition at the process group level you are trying to account for here. Process groups are nothing more than a logical container of individual dataflows; failover would still be handled through dataflow design.

NiFi node-level failover: In a NiFi cluster, there is always a "cluster coordinator" and a "primary node" elected. The primary node runs all processors that are configured with "on primary node" only on their scheduling tab. Should the cluster coordinator stop receiving heartbeats from the current primary node, a new node will be designated as the primary node and will start the "on primary node" processors. If the cluster coordinator is lost, a new cluster coordinator will be elected and will assume the role of receiving heartbeats from the other nodes. A node that has become disconnected from the cluster will continue to run its dataflows as long as NiFi is still running.

NiFi FlowFile failover between nodes: Each node in a NiFi cluster is responsible for the FlowFiles it is currently working on and has no knowledge of what FlowFiles are queued on any other node in the cluster. If a NiFi node is completely down, the FlowFiles it had queued at the time of failure will remain in its repositories until that NiFi is brought back online. The content and FlowFile repositories are not locked to a specific NiFi instance. While you cannot merge these repositories with the existing repositories of another node, it is possible to stand up an entirely new NiFi node and have it use the repositories from the down node to pick up operation where it left off. It is therefore important to protect the FlowFile and content repositories via RAID so that a disk failure does not result in data loss. Data HA across NiFi nodes is a future roadmap item.

Thanks, Matt
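For readers who think in code, here is a minimal Java sketch of the retry-then-alternate routing described above. It is illustrative only, not NiFi code: the Destination interface and the "abc"/"xyz" stand-ins are hypothetical, and the sleep plays the role of the FlowFile penalty.

import java.time.Duration;

public class FailoverSketch {

    interface Destination {
        boolean deliver(byte[] payload);   // false stands in for routing to the "failure" relationship
    }

    static boolean deliverWithFailover(byte[] payload,
                                       Destination primary,   // e.g. system "abc"
                                       Destination alternate, // e.g. system "xyz"
                                       Duration penalty,
                                       int maxAttempts) throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (primary.deliver(payload)) {
                return true;                  // success on the primary path
            }
            if (alternate.deliver(payload)) {
                return true;                  // "failure" routed to the alternate destination succeeded
            }
            Thread.sleep(penalty.toMillis()); // analogue of the 30 sec FlowFile penalty before re-attempting
        }
        return false;                         // complete failure: both destinations failed on every attempt
    }

    public static void main(String[] args) throws InterruptedException {
        Destination abc = payload -> false;   // simulate a primary destination that is down
        Destination xyz = payload -> true;    // alternate destination accepts the data
        System.out.println(deliverWithFailover("hello".getBytes(), abc, xyz, Duration.ofSeconds(30), 3));
    }
}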
03-03-2017
01:32 PM
1 Kudo
@spdvnz When NiFi dataflows are designed so that they pull data from source systems, load-balancing and auto-scaling can be handled fairly easily through the use of the primary node and Remote Process Groups.

For example: a ListSFTP processor can be configured to run on the primary node only. It is responsible for producing a 0-byte FlowFile for every file returned in the listing. Those 0-byte FlowFiles are then routed to a Remote Process Group (RPG) that is configured to point back at the same NiFi cluster. The RPG handles smart load-balancing of data across all currently connected nodes in the cluster (if nodes are added or removed, the RPG auto-scales accordingly). On the receiving side of the RPG (a remote input port), the 0-byte FlowFile is routed to a FetchSFTP processor that actually retrieves the content for the FlowFile. There are many list and fetch processors; they generally exist where the process is not exactly cluster friendly. (It is not a good idea to have GetSFTP running on every node, as they would all compete for the same data.)

If your only option is to have your source systems push data to your NiFi cluster, those source systems would need to know the currently available nodes at all times. One option is to use MiNiFi on each of your source systems, which can make use of NiFi's Site-to-Site (Remote Process Group) protocol to send data in a smart, load-balanced way to all the NiFi nodes currently connected in the target cluster. Another option: https://community.hortonworks.com/articles/16120/how-do-i-distribute-data-across-a-nifi-cluster.html

Thanks, Matt
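A rough Java sketch of the list/distribute/fetch pattern described above (illustrative only, not NiFi or Site-to-Site code; the node names and filenames are made up). Each listing entry stands in for a 0-byte FlowFile, and the round-robin assignment plays the role of the RPG spreading work over whichever nodes are currently connected.

import java.util.List;

public class ListFetchSketch {

    // Round-robin assignment over whichever nodes are currently connected,
    // the way a Remote Process Group re-balances as the cluster grows or shrinks.
    static String assign(int itemIndex, List<String> connectedNodes) {
        return connectedNodes.get(itemIndex % connectedNodes.size());
    }

    public static void main(String[] args) {
        // "ListSFTP on the primary node only": a single listing, so nodes never compete for the same files.
        List<String> listing = List.of("a.csv", "b.csv", "c.csv", "d.csv", "e.csv");

        List<String> connectedNodes = List.of("node1", "node2", "node3"); // changes as nodes join or leave

        for (int i = 0; i < listing.size(); i++) {
            String node = assign(i, connectedNodes);
            // On the receiving side of the RPG, "FetchSFTP" on that node retrieves the real content.
            System.out.println(node + " fetches " + listing.get(i));
        }
    }
}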
03-03-2017
05:23 AM
Uninstalled and reinstalled, then installed the HDF mpack. Thanks @Jay SenSharma
02-06-2017
02:05 PM
Using commons-collections-3.2.1 solved the issue. Thanks, @Matt Burgess.
07-31-2019
03:23 AM
@slachterman I just followed the instructions provided in the article to move the generated flow files to ADLS using the PutHDFS processor, but I am getting the errors below. Please help. I have specified all the configurations for PutHDFS as per the article. It keeps saying "unable to find valid certification path to requested target", but I am not sure where I need to upload the ADLS credential certificate in the PutHDFS processor.

2019-07-30 17:24:20,657 ERROR [Timer-Driven Process Thread-9] o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=44e51785-016c-1000-901a-6aa4d9167c2c] Failed to access HDFS due to com.microsoft.azure.datalake.store.ADLException: Last encountered exception thrown after 5 tries. [javax.net.ssl.SSLHandshakeException, javax.net.ssl.SSLHandshakeException, javax.net.ssl.SSLHandshakeException, javax.net.ssl.SSLHandshakeException, javax.net.ssl.SSLHandshakeException] [ServerRequestId:null]:
com.microsoft.azure.datalake.store.ADLException: Error getting info for file /
Operation GETFILESTATUS failed with exception javax.net.ssl.SSLHandshakeException : sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Last encountered exception thrown after 5 tries. [javax.net.ssl.SSLHandshakeException, javax.net.ssl.SSLHandshakeException, javax.net.ssl.SSLHandshakeException, javax.net.ssl.SSLHandshakeException, javax.net.ssl.SSLHandshakeException] [ServerRequestId:null]
    at com.microsoft.azure.datalake.store.ADLStoreClient.getExceptionFromResponse(ADLStoreClient.java:1194)
    at com.microsoft.azure.datalake.store.ADLStoreClient.getDirectoryEntry(ADLStoreClient.java:741)
    at org.apache.hadoop.fs.adl.AdlFileSystem.getFileStatus(AdlFileSystem.java:487)
    at org.apache.nifi.processors.hadoop.PutHDFS$1.run(PutHDFS.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1942)
    at org.apache.nifi.processors.hadoop.PutHDFS.onTrigger(PutHDFS.java:236)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1162)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:209)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
    at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1946)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:316)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:310)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1639)
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:223)
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1037)
    at sun.security.ssl.Handshaker.process_record(Handshaker.java:965)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1064)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
    at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1564)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:347)
    at com.microsoft.azure.datalake.store.HttpTransport.makeSingleCall(HttpTransport.java:307)
    at com.microsoft.azure.datalake.store.HttpTransport.makeCall(HttpTransport.java:90)
    at com.microsoft.azure.datalake.store.Core.getFileStatus(Core.java:691)
    at com.microsoft.azure.datalake.store.ADLStoreClient.getDirectoryEntry(ADLStoreClient.java:739)
    ... 18 common frames omitted
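A note for anyone hitting the same PKIX error: it generally indicates that the JVM running NiFi does not trust the certificate chain presented by the ADLS endpoint. Below is a minimal, hedged Java diagnostic sketch that performs the same TLS handshake using the JVM's default truststore; run it with the same Java that runs NiFi to confirm which truststore is missing the CA certificate. The host name is only a placeholder.

import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class TlsTrustProbe {
    public static void main(String[] args) throws Exception {
        // Placeholder account name; pass the real ADLS host as the first argument.
        String host = args.length > 0 ? args[0] : "youraccount.azuredatalakestore.net";
        try (SSLSocket socket = (SSLSocket) SSLSocketFactory.getDefault().createSocket(host, 443)) {
            socket.startHandshake();   // throws an SSLHandshakeException (PKIX) if the chain is not trusted
            System.out.println("Handshake OK -- this JVM's default truststore trusts " + host);
        }
    }
}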
06-28-2017
09:19 AM
Hi, thanks for the tutorial. Do you know if it is possible to use "StoreInKiteDataset" with Kerberos to write to HDFS?
11-22-2016
05:15 PM
Thanks for explaining the missing bits @Shishir Saxena, this is very valuable information. I moved the input/output ports to the root-level process group and I could transfer data from A to B.
04-10-2019
07:35 AM
This doesn't work for me. I placed a flow.xml.gz from the dev cluster onto the prod cluster and cleared all repositories on prod, but I still see state in the processors. Could you please suggest another way to clear state for all processors in one go? I tried deleting the state folder contents under /nifi/conf, but that didn't help either; it gave me some error.
09-19-2016
09:06 PM
1 Kudo
You can use the JMS Connection Factory Provider controller service to specify your vendor-specific details. https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.jms.cf.JMSConnectionFactoryProvider/index.html
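As a hedged illustration of what that controller service amounts to (not its actual implementation), the sketch below loads a vendor ConnectionFactory class by name and then drives it through the standard javax.jms API, which is roughly what the NiFi JMS processors do once the controller service hands them a factory. The vendor class name, broker URI, credentials, and queue name are all hypothetical.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class VendorJmsSketch {
    public static void main(String[] args) throws Exception {
        // In NiFi you supply the fully qualified class name, the path to the vendor client JARs,
        // and the broker URI in the controller service; here the equivalent is done directly.
        ConnectionFactory factory = (ConnectionFactory) Class
                .forName("com.example.mq.ExampleConnectionFactory")   // hypothetical vendor class
                .getDeclaredConstructor(String.class)
                .newInstance("tcp://broker.example.com:61616");        // hypothetical broker URI

        Connection connection = factory.createConnection("user", "password");
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(session.createQueue("example.queue"));
            producer.send(session.createTextMessage("hello over a vendor-specific JMS connection"));
        } finally {
            connection.close();
        }
    }
}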
09-12-2016
12:40 PM
3 Kudos
@spdvnz NiFi's Hadoop-based processors already include the Hadoop client libraries, so there is no need to install them outside of NiFi or to install NiFi on the same hardware where Hadoop is running. The various NiFi processors for communicating with Hadoop use the core-site.xml, hdfs-site.xml, and/or hbase-site.xml files as part of their configuration. These files need to be copied from your Hadoop system(s) to a local directory on each of your NiFi instances for use by these processors.

Detailed processor documentation is provided by clicking on "help" in the upper right corner of the NiFi UI. You can also get to the processor documentation by right-clicking on a processor and selecting "usage" from the context menu that is displayed.

Thanks, Matt
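A hedged sketch of what those processors do with the copied site files: the Hadoop client loads them into a Configuration object and resolves the cluster endpoints from there. The /etc/nifi/hadoop directory is just an example location you might choose on each NiFi host.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HadoopConfigSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The same files you would point the processor's Hadoop configuration property at.
        conf.addResource(new Path("/etc/nifi/hadoop/core-site.xml"));
        conf.addResource(new Path("/etc/nifi/hadoop/hdfs-site.xml"));

        FileSystem fs = FileSystem.get(conf);   // resolves fs.defaultFS from core-site.xml
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
    }
}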