ListSftp works but FetchSftp doesn't work in Cluster mode


Contributor

Hello,

I deployed a 3-node cluster in AWS. One of the nodes is the NCM.

The embedded ZooKeeper servers run on the two worker nodes.

The data flow is: ListSFTP -> FetchSFTP -> PutFile.

ListSFTP is scheduled on the primary node.

The issue is:

ListSFTP works well. The test files queue up in the connection in front of FetchSFTP.

The error in FetchSFTP is:

18:36:05 UTC ERROR 5cdfac90-2d07-443e-97b6-b06a1a883a22 172.31.48.155:8080 FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22] FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22] failed to process due to org.apache.nifi.processor.exception.ProcessException: IOException thrown from FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22]: java.io.IOException: error; rolling back session: org.apache.nifi.processor.exception.ProcessException: IOException thrown from FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22]: java.io.IOException: error

I tried GetSFTP -> PutFile with the same SFTP settings, and it works well.

I was wondering whether the issue is related to ZooKeeper or to the primary node talking to the other worker node.

I didn't set up the site-to-site properties in nifi.properties.

I also didn't set up a distributed cache service.

How can I get more log details about this processor's IOException?

Thanks.

Accepted Solutions

Re: ListSftp works but FetchSftp doesn't work in Cluster mode

Explorer

You certainly need to configure state-management.xml and fill in the "Connect String" property.

The Admin Guide in the NiFi docs, under the "Help" link in the UI, has the steps to stand up the embedded ZooKeeper.
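
As a rough illustration of what goes into "Connect String": ZooKeeper expects a comma-separated list of host:port pairs, one per ZooKeeper server. The sketch below builds such a string for the two worker nodes. The hostnames are hypothetical placeholders, and port 2181 is assumed because it is ZooKeeper's default client port; substitute whatever your cluster actually uses.

```python
def build_connect_string(hosts, port=2181):
    """Join ZooKeeper hosts into a comma-separated host:port list."""
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    return ",".join(f"{host}:{port}" for host in hosts)


# Hypothetical hostnames for the two worker nodes running embedded ZooKeeper.
connect_string = build_connect_string(
    ["worker1.example.com", "worker2.example.com"]
)
print(connect_string)
```

The resulting value is what would be pasted into the "Connect String" property in state-management.xml. The NCM is deliberately left out of the list, since it does not run the embedded ZooKeeper.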

Also, I'm not sure if you saw my previous comment, but it could be that you need this:

In FetchSFTP, are you putting in:

"${path}/${filename}"

for the Remote Path setting?
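
To see why that value matters: ListSFTP writes attributes such as path and filename onto each FlowFile it emits, and the expression is resolved against those attributes to produce the full remote file name that FetchSFTP requests. The sketch below only mimics that substitution. It is not NiFi's Expression Language implementation, the attribute values are made up, and treating a missing attribute as an empty string is a simplifying assumption.

```python
import re


def resolve(expression, attributes):
    """Replace each ${name} with the matching FlowFile attribute value."""

    def lookup(match):
        name = match.group(1)
        # Simplification: a missing attribute becomes an empty string.
        return attributes.get(name, "")

    return re.sub(r"\$\{([^}]+)\}", lookup, expression)


# Hypothetical attributes, as ListSFTP might set them on a listed file.
listed = {"path": "/data/incoming", "filename": "report.csv"}
remote_file = resolve("${path}/${filename}", listed)
print(remote_file)

# With no attributes, the same expression collapses to a bare "/".
print(resolve("${path}/${filename}", {}))
```

If the setting holds only a directory, or the attributes are absent, the resolved name does not point at a real file, and the fetch fails when the remote stream is read.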


14 REPLIES

Re: ListSftp works but FetchSftp doesn't work in Cluster mode

Rising Star

Alvin,

You should be able to get more details by adding the following line to your conf/logback.xml file:

<logger name="org.apache.nifi.processors.standard.FetchSFTP" level="DEBUG" />

That will cause it to log the full stack trace so that you can see what's going on.

FetchSFTP does not interact with ZooKeeper or site-to-site, so you should be okay there. The Distributed Cache Service is also not necessary to use FetchSFTP.


Re: ListSftp works but FetchSftp doesn't work in Cluster mode

Alvin,

Try setting FetchSFTP to run on the primary node as well and see if that clears the error.


Re: ListSftp works but FetchSftp doesn't work in Cluster mode

Contributor

Hi @dwynne

I tried setting FetchSFTP to run on the primary node too.

The error is the same.

I was wondering whether I missed something when setting up ListSFTP -> FetchSFTP in a cluster.

Thanks.


Re: ListSftp works but FetchSftp doesn't work in Cluster mode

Alvin,

What run schedule do you have for the ListSFTP and FetchSFTP processors?


Re: ListSftp works but FetchSftp doesn't work in Cluster mode

Contributor

I didn't change the defaults.

On both processors:

Scheduling strategy: On primary node

Concurrent tasks: 1

Run schedule: 0 sec


Re: ListSftp works but FetchSftp doesn't work in Cluster mode

Contributor

@mpayne

I added the logger setting above to logback.xml. However, I still didn't find any hints.

The FetchSFTP processor is Timer Driven. Only ListSFTP is on the primary node.

The details are below. Thanks.

2016-08-09 18:59:48,090 INFO [Clustering Tasks Thread-2] org.apache.nifi.cluster.heartbeat Heartbeat created at 2016-08-09 18:59:47,963 and sent at 2016-08-09 18:59:48,090; send took 0 millis
2016-08-09 18:59:48,091 ERROR [Timer-Driven Process Thread-10] o.a.nifi.processors.standard.FetchSFTP FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22] FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22] failed to process due to org.apache.nifi.processor.exception.ProcessException: IOException thrown from FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22]: java.io.IOException: error; rolling back session: org.apache.nifi.processor.exception.ProcessException: IOException thrown from FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22]: java.io.IOException: error
2016-08-09 18:59:48,093 ERROR [Timer-Driven Process Thread-10] o.a.nifi.processors.standard.FetchSFTP
org.apache.nifi.processor.exception.ProcessException: IOException thrown from FetchSFTP[id=5cdfac90-2d07-443e-97b6-b06a1a883a22]: java.io.IOException: error
    at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2013) ~[nifi-framework-core-0.7.0.jar:0.7.0]
    at org.apache.nifi.processors.standard.FetchFileTransfer.onTrigger(FetchFileTransfer.java:238) ~[na:na]
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.0.jar:0.7.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
Caused by: java.io.IOException: error
    at com.jcraft.jsch.ChannelSftp$2.read(ChannelSftp.java:1421) ~[na:na]
    at com.jcraft.jsch.ChannelSftp$2.read(ChannelSftp.java:1340) ~[na:na]
    at org.apache.nifi.stream.io.StreamUtils.copy(StreamUtils.java:35) ~[nifi-utils-0.7.0.jar:0.7.0]
    at org.apache.nifi.processors.standard.FetchFileTransfer$1.process(FetchFileTransfer.java:241) ~[na:na]
    at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:1998) ~[nifi-framework-core-0.7.0.jar:0.7.0]
    ... 13 common frames omitted
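
When a trace like this is long, the innermost "Caused by:" entry is usually the quickest pointer to where the failure originated. Here it lands in JSch's ChannelSftp read, which suggests the error occurs while reading the remote file stream. The sketch below pulls that entry out of a log excerpt with a regular expression. The function name and the sample string are illustrative only, and the pattern is a simplification that assumes the exception message itself does not contain the word "at" surrounded by whitespace.

```python
import re

# Exception class, message, and first stack frame after "Caused by:".
CAUSE = re.compile(
    r"Caused by: ([\w.$]+): (.*?)\s+at\s+([\w.$]+\([^)]*\))", re.S
)


def root_cause(log_text):
    """Return (exception, message, first_frame) for the last cause, or None."""
    matches = CAUSE.findall(log_text)
    return matches[-1] if matches else None


# Sample taken from the trace above, shortened to the relevant entry.
sample = (
    "Caused by: java.io.IOException: error "
    "at com.jcraft.jsch.ChannelSftp$2.read(ChannelSftp.java:1421) ~[na:na]"
)
print(root_cause(sample))
```

The same function works whether the frames are on one line or split across lines, because the pattern accepts any whitespace before "at".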


Re: ListSftp works but FetchSftp doesn't work in Cluster mode

Explorer

I'm not sure if this is the root cause, but if you are using the embedded ZooKeeper, make sure you are not using the NCM as a ZooKeeper node, since it will not start up the embedded ZooKeeper.


Re: ListSftp works but FetchSftp doesn't work in Cluster mode

Contributor

Hi @jsequeiros

The embedded ZooKeeper servers are set up only on the two worker nodes. I didn't use the NCM as a ZooKeeper node.

But I am not sure whether this is a permission issue.

Since I am in a dev cluster, I started NiFi with sudo. Is that an issue?

Unfortunately, the log only shows "java.io.IOException: error" without details.

Thanks.


Re: ListSftp works but FetchSftp doesn't work in Cluster mode

Explorer

In FetchSFTP, are you putting in:

"${path}/${filename}"

for the Remote Path setting?
