ListSftp works but FetchSftp doesn't work in Cluster mode
- Labels: Apache NiFi
Created ‎08-09-2016 07:01 PM
Hello,
I deployed a three-node cluster in AWS. One of the nodes is the NCM.
The embedded ZooKeeper servers run on the two worker nodes.
The data flow is: ListSFTP -> FetchSFTP -> PutFile.
ListSFTP is scheduled on the primary node.
The issue is:
ListSFTP works well. The test files are queued in the connection ahead of FetchSFTP.
The error in FetchSFTP is:
18:36:05 UTC ERROR 5cdfac90-2d07-443e-97b6-b06a1a883a22 172.31.48.155:8080
FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22] FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22] failed to process due to org.apache.nifi.processor.exception.ProcessException: IOException thrown from FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22]: java.io.IOException: error; rolling back session: org.apache.nifi.processor.exception.ProcessException: IOException thrown from FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22]: java.io.IOException: error
I tried GetSFTP -> PutFile with the same SFTP settings, and it works well.
I was wondering whether the issue is related to ZooKeeper or to the primary node talking to the other worker node.
I didn't set the site-to-site properties in nifi.properties.
I didn't set up a distributed cache service either.
How can I get more log details about this processor's IOException?
Thanks.
Created ‎08-09-2016 08:57 PM
You certainly need to configure state-management.xml and fill in the "Connect String".
The Admin Guide in the NiFi docs, available from the "Help" link in the UI, has the steps to stand up the embedded ZooKeeper.
Also, I'm not sure whether you saw my previous comment; it could be that you need this:
On FetchSFTP, are you putting in:
"${path}/${filename}"
for the Remote path setting?
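To illustrate why that value matters: ListSFTP is documented as writing `path` and `filename` attributes on each FlowFile it emits, and FetchSFTP evaluates its remote file property against those attributes for every incoming FlowFile. Below is a minimal Python sketch of that substitution. It is not NiFi's Expression Language implementation; it only handles plain `${name}` references, the function name `resolve` and the sample paths are hypothetical, and treating a missing attribute as an empty string is an assumption.

```python
import re

# Matches simple ${attribute} references only (no Expression Language functions).
_ATTR_REF = re.compile(r"\$\{([^{}]+)\}")


def resolve(expression: str, attributes: dict) -> str:
    """Replace each ${name} in expression with the matching FlowFile attribute.

    Assumption: a missing attribute is replaced by an empty string.
    """
    def lookup(match):
        return attributes.get(match.group(1).strip(), "")

    return _ATTR_REF.sub(lookup, expression)


# Hypothetical attributes as ListSFTP might set them for one listed file.
flowfile_attributes = {"path": "/upload/in", "filename": "test1.csv"}

print(resolve("${path}/${filename}", flowfile_attributes))
# prints: /upload/in/test1.csv
```

If the property held only `${filename}`, the sketch would resolve to `test1.csv` with no directory, which hints at how a fetch could point at a file that does not exist relative to the SFTP login directory.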
Created ‎08-09-2016 07:13 PM
Alvin,
You should be able to get more details by adding the following line to your conf/logback.xml file:
<logger name="org.apache.nifi.processors.standard.FetchSFTP" level="DEBUG" />
That will cause it to log the full stack trace so that you can see what's going on.
FetchSFTP does not interact with ZooKeeper or site-to-site, so you should be okay there. The Distributed Cache Service is also not necessary to use FetchSFTP.
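If the extra detail does not show up after editing the file, it can help to check which level the logger actually resolves to. The following is a rough Python sketch, not a logback API: it assumes levels are declared through the `level` attribute on `<logger>` and `<root>` elements, and the function name `effective_level` is made up for illustration. It follows logback's inheritance rule, where a logger without its own level takes the level of its nearest configured ancestor and finally the root.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def effective_level(logback_xml: str, logger_name: str) -> Optional[str]:
    """Return the level that logger_name would inherit from the given config text."""
    root = ET.fromstring(logback_xml)

    # Collect explicitly configured logger levels, keyed by logger name.
    levels = {
        el.get("name"): el.get("level")
        for el in root.iter("logger")
        if el.get("name") and el.get("level")
    }

    # Walk up the dotted name: a.b.c -> a.b -> a -> (root logger).
    name = logger_name
    while name:
        if name in levels:
            return levels[name].upper()
        name = name.rpartition(".")[0]

    root_el = root.find("root")
    if root_el is not None and root_el.get("level"):
        return root_el.get("level").upper()
    return None


# Hypothetical, heavily trimmed configuration for demonstration only.
sample = """
<configuration>
  <logger name="org.apache.nifi" level="INFO"/>
  <logger name="org.apache.nifi.processors.standard.FetchSFTP" level="DEBUG" />
  <root level="WARN"/>
</configuration>
"""

print(effective_level(sample, "org.apache.nifi.processors.standard.FetchSFTP"))
# prints: DEBUG
```

To check a real installation, the contents of conf/logback.xml could be read from disk and passed in as the first argument.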
Created ‎08-09-2016 07:36 PM
Alvin,
Try setting FetchSFTP to run on the primary node as well, and see if that clears the error.
Created ‎08-09-2016 07:57 PM
Hi @dwynne
I tried scheduling FetchSFTP on the primary node too.
The error is the same.
I was wondering whether I missed something when setting up ListSFTP -> FetchSFTP in a cluster.
Thanks.
Created ‎08-09-2016 08:01 PM
Alvin,
What run schedule do you have for the ListSFTP and FetchSFTP processors?
Created ‎08-09-2016 08:07 PM
I didn't change the default run schedule.
On both processors:
Scheduling strategy: On primary node
Concurrent tasks: 1
Run schedule: 0 sec
Created ‎08-09-2016 07:53 PM
I added the logger setting above to logback.xml. However, I still didn't find any hints.
The FetchSFTP processor is timer driven. Only ListSFTP is scheduled on the primary node.
The details are below. Thanks.
2016-08-09 18:59:48,090 INFO [Clustering Tasks Thread-2] org.apache.nifi.cluster.heartbeat Heartbeat created at 2016-08-09 18:59:47,963 and sent at 2016-08-09 18:59:48,090; send took 0 millis
2016-08-09 18:59:48,091 ERROR [Timer-Driven Process Thread-10] o.a.nifi.processors.standard.FetchSFTP FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22] FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22] failed to process due to org.apache.nifi.processor.exception.ProcessException: IOException thrown from FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22]: java.io.IOException: error; rolling back session: org.apache.nifi.processor.exception.ProcessException: IOException thrown from FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22]: java.io.IOException: error
2016-08-09 18:59:48,093 ERROR [Timer-Driven Process Thread-10] o.a.nifi.processors.standard.FetchSFTP
org.apache.nifi.processor.exception.ProcessException: IOException thrown from FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22]: java.io.IOException: error
    at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2013) ~[nifi-framework-core-0.7.0.jar:0.7.0]
    at org.apache.nifi.processors.standard.FetchFileTransfer.onTrigger(FetchFileTransfer.java:238) ~[na:na]
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.0.jar:0.7.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
Caused by: java.io.IOException: error
    at com.jcraft.jsch.ChannelSftp$2.read(ChannelSftp.java:1421) ~[na:na]
    at com.jcraft.jsch.ChannelSftp$2.read(ChannelSftp.java:1340) ~[na:na]
    at org.apache.nifi.stream.io.StreamUtils.copy(StreamUtils.java:35) ~[nifi-utils-0.7.0.jar:0.7.0]
    at org.apache.nifi.processors.standard.FetchFileTransfer$1.process(FetchFileTransfer.java:241) ~[na:na]
    at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:1998) ~[nifi-framework-core-0.7.0.jar:0.7.0]
    ... 13 common frames omitted
Created ‎08-09-2016 08:09 PM
I'm not sure if this is the root cause, but if you are using the embedded ZooKeeper, make sure you are not using the NCM as a ZooKeeper node, since it will not start up the embedded ZooKeeper.
Created ‎08-09-2016 08:30 PM
Hi @jsequeiros
The embedded ZooKeeper servers are set up only on the two worker nodes. I didn't use the NCM as a ZooKeeper node.
But I am not sure whether this is a permission issue.
Since I am in a dev cluster, I started NiFi with sudo. Is that an issue?
Unfortunately, the log only shows "java.io.IOException: error" without details.
Thanks.
Created ‎08-09-2016 08:52 PM
On FetchSFTP, are you putting in:
"${path}/${filename}"
for the Remote path setting?
