Member since: 04-05-2016
Posts: 130
Kudos Received: 93
Solutions: 29
My Accepted Solutions
Views | Posted
---|---
3791 | 06-05-2018 01:00 AM
5142 | 04-10-2018 08:23 AM
5649 | 07-18-2017 02:16 AM
2913 | 07-11-2017 01:02 PM
3342 | 07-10-2017 02:10 AM
04-10-2018
08:23 AM
Hi @Benjamin Bouret, thank you very much for reporting the issue. It was a bug in the HTTP S2S transport protocol: it cannot send more than 2GB of data at once. I filed an Apache NiFi JIRA and submitted a patch for it: https://issues.apache.org/jira/browse/NIFI-5065 As a workaround, please use the RAW S2S transport protocol instead; it can send large files without issue.
04-10-2018
07:25 AM
UPDATES: Excuse me, the previous diagnosis was wrong. I was trying to reproduce the issue by tweaking timeout settings; however, it turned out that the issue is not caused by the timeout setting. Instead, there is a problem with how the HTTP S2S transport transfers data. I got the following exception when I tried to send an 8GB file with HTTP S2S:
2018-04-10 16:05:45,006 ERROR [I/O dispatcher 25] o.a.n.r.util.SiteToSiteRestApiClient Failed to send data to http://HW13076.local:8080/nifi-api/data-transfer/input-ports/ad9a3887-0162-1000-e312-dee642179c9c/transactions/608f1ce4-56da-4899-9348-d2864e364d40/flow-files due to java.lang.RuntimeException: Sending data to http://HW13076.local:8080/nifi-api/data-transfer/input-ports/ad9a3887-0162-1000-e312-dee642179c9c/transactions/608f1ce4-56da-4899-9348-d2864e364d40/flow-files has reached to its end, but produced : read : wrote byte sizes (659704502 : 659704502 : 9249639094) were not equal. Something went wrong.
java.lang.RuntimeException: Sending data to http://HW13076.local:8080/nifi-api/data-transfer/input-ports/ad9a3887-0162-1000-e312-dee642179c9c/transactions/608f1ce4-56da-4899-9348-d2864e364d40/flow-files has reached to its end, but produced : read : wrote byte sizes (659704502 : 659704502 : 9249639094) were not equal. Something went wrong.
at org.apache.nifi.remote.util.SiteToSiteRestApiClient$4.produceContent(SiteToSiteRestApiClient.java:848)
at org.apache.http.impl.nio.client.MainClientExec.produceContent(MainClientExec.java:262)
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.produceContent(DefaultClientExchangeHandlerImpl.java:140)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.outputReady(HttpAsyncRequestExecutor.java:241)
at org.apache.http.impl.nio.DefaultNHttpClientConnection.produceOutput(DefaultNHttpClientConnection.java:290)
at org.apache.http.impl.nio.client.InternalIODispatch.onOutputReady(InternalIODispatch.java:86)
at org.apache.http.impl.nio.client.InternalIODispatch.onOutputReady(InternalIODispatch.java:39)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.outputReady(AbstractIODispatch.java:145)
at org.apache.http.impl.nio.reactor.BaseIOReactor.writable(BaseIOReactor.java:188)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:341)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
at java.lang.Thread.run(Thread.java:745)
2018-04-10 16:06:25,009 ERROR [Timer-Driven Process Thread-3] o.a.nifi.remote.StandardRemoteGroupPort RemoteGroupPort[name=input,targets=http://localhost:8080/nifi] failed to communicate with remote NiFi instance due to java.io.IOException: Failed to confirm transaction with Peer[url=http://HW13076.local:8080/nifi-api] due to java.io.IOException: Awaiting transferDataLatch has been timeout.
2018-04-10 16:06:25,009 ERROR [Timer-Driven Process Thread-3] o.a.nifi.remote.StandardRemoteGroupPort
java.io.IOException: Failed to confirm transaction with Peer[url=http://HW13076.local:8080/nifi-api] due to java.io.IOException: Awaiting transferDataLatch has been timeout.
at org.apache.nifi.remote.AbstractTransaction.confirm(AbstractTransaction.java:264)
at org.apache.nifi.remote.StandardRemoteGroupPort.transferFlowFiles(StandardRemoteGroupPort.java:369)
at org.apache.nifi.remote.StandardRemoteGroupPort.onTrigger(StandardRemoteGroupPort.java:285)
at org.apache.nifi.controller.AbstractPort.onTrigger(AbstractPort.java:250)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Awaiting transferDataLatch has been timeout.
at org.apache.nifi.remote.util.SiteToSiteRestApiClient.finishTransferFlowFiles(SiteToSiteRestApiClient.java:938)
at org.apache.nifi.remote.protocol.http.HttpClientTransaction.readTransactionResponse(HttpClientTransaction.java:93)
at org.apache.nifi.remote.AbstractTransaction.confirm(AbstractTransaction.java:239)
... 12 common frames omitted
I will continue investigating the cause. I tested sending the same file with RAW S2S and it worked just fine. Please use the RAW transport protocol if possible.
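A side note on the numbers in the log above: the 'wrote' count (9249639094) exceeds the 'produced' and 'read' counts (659704502) by exactly 2 * 2^32 bytes, which is what you would see if a byte counter were narrowed to 32 bits somewhere along the way. This is only an observation about the logged values, not a confirmed root cause. The sketch below reproduces the arithmetic; the class and method names are made up for illustration and are not part of NiFi.

```java
public class ByteCountCheck {
    // Values copied from the logged "produced : read : wrote" message.
    static final long PRODUCED = 659704502L;
    static final long WROTE = 9249639094L;

    /** Returns the value a long has after a narrowing cast to a 32-bit int. */
    static int truncateToInt(long value) {
        return (int) value;
    }

    /** Returns how many whole multiples of 2^32 separate the two counts. */
    static long wrapsBetween(long larger, long smaller) {
        return (larger - smaller) / (1L << 32);
    }

    public static void main(String[] args) {
        // Keeping only the low 32 bits of the 'wrote' count gives the 'produced' count.
        System.out.println(truncateToInt(WROTE));          // prints 659704502
        // The difference is exactly two multiples of 2^32.
        System.out.println(wrapsBetween(WROTE, PRODUCED)); // prints 2
    }
}
```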
04-10-2018
03:10 AM
Hi @Benjamin Bouret, thanks for reporting this. The NiFi Site-to-Site client implements several kinds of timeout and expiration settings, such as cache expiration, idle connection expiration, penalization period, batch duration, and timeout. The error you shared can occur if an S2S client waits longer than the 'idle connection expiration'. The problem is that 'idle connection expiration' is not configurable by NiFi users at the moment. So, if a data transfer takes more than the default 30 seconds, it will fail with the reported message, even if a longer 'Communication Timeout' is set in the Remote Process Group configuration.

From the error message you shared, I assume you are using the HTTP transport protocol for S2S. I wonder if using RAW can be a workaround. Looking at the NiFi code, though, that may not be the case, because RAW also uses the 'idle connection expiration' to shut down existing sockets. The Split/Merge pattern will not work because, as you found, S2S clients distribute FlowFiles among the nodes in the target cluster.

I think a possible workaround is to use other ListenXXXX processors (e.g. ListenHTTP or ListenTCP) at the target NiFi cluster, and then send data using the corresponding processors such as PostHTTP or PutTCP. This way, you can control how the segmented FlowFiles are distributed to the target nodes. You need to manually pick a target hostname for load balancing. This can be done with NiFi Expression Language and a certain set of processors. Please refer to this template: https://gist.github.com/ijokarumawak/077d7fdca57b9c8ff386f28c5198efd1

I will raise an Apache NiFi JIRA so that 'idle connection expiration' can be set based on the 'Communication Timeout' value. In the meantime, I hope the above workaround works for you.
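To illustrate the idea of picking a target hostname manually, here is a minimal Java sketch. It is not the logic of the linked template. It only assumes that each segment carries an identifier shared by all segments of the same original file (for example, something like the fragment.identifier attribute that NiFi split processors write), and it maps that identifier to one host so that all segments of a file land on the same node and can be merged there. The host names and the class and method names are hypothetical.

```java
import java.util.List;

public class TargetHostPicker {

    /**
     * Maps a shared segment identifier to one host from the list.
     * The same identifier always yields the same host, so all
     * segments of one original file go to the same target node.
     */
    static String pickHost(List<String> hosts, String fragmentIdentifier) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one target host is required");
        }
        // floorMod keeps the index non-negative even for negative hash codes.
        int index = Math.floorMod(fragmentIdentifier.hashCode(), hosts.size());
        return hosts.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical target cluster nodes.
        List<String> hosts = List.of(
                "nifi-node1.example.com",
                "nifi-node2.example.com",
                "nifi-node3.example.com");

        // Two segments of the same file resolve to the same host.
        String id = "608f1ce4-56da-4899-9348-d2864e364d40";
        System.out.println(pickHost(hosts, id));
        System.out.println(pickHost(hosts, id));
    }
}
```

Different files spread across the nodes according to their identifiers' hash codes, which gives a rough form of load balancing without splitting any single file across nodes.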
07-18-2017
02:16 AM
2 Kudos
Hi @Gabriel Queiroz, if you'd like to use the ID FlowFile attribute as the DetectDuplicate processor's 'Cache Entry Identifier', you need to use NiFi Expression Language syntax. Currently you have it configured as '$ID', but it needs to be '${ID}' (wrap the attribute name in curly brackets).
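To show why the curly brackets matter, here is a toy Java sketch of attribute substitution. It is not NiFi's Expression Language implementation, and the class and method names are invented for illustration. It only mimics the one rule relevant here: text of the form ${name} is replaced by the attribute value, while a bare $name is left as literal text.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AttributeReferenceSketch {

    // Matches ${name} and captures the name between the brackets.
    private static final Pattern REFERENCE = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces every ${name} with the attribute value; anything else is kept literally. */
    static String resolve(String text, Map<String, String> attributes) {
        Matcher matcher = REFERENCE.matcher(text);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String value = attributes.getOrDefault(matcher.group(1), "");
            matcher.appendReplacement(result, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        Map<String, String> attributes = Map.of("ID", "42");
        System.out.println(resolve("$ID", attributes));   // prints $ID
        System.out.println(resolve("${ID}", attributes)); // prints 42
    }
}
```

With the literal '$ID', every FlowFile would present the same cache key, so the value of the ID attribute would never take part in the duplicate check.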
07-11-2017
01:02 PM
Hi @Pavan Challa, if I understand your use case correctly, I think I have come up with a Groovy script to do the job. It loops through the dataFlow elements, tests whether filePattern matches, then resolves the path with Expression Language. Please check whether this Gist works for you: https://gist.github.com/ijokarumawak/a4ef40b49b45cecf3c43b56493683725
I had to change filePattern to be a regular expression:
<filePattern>salary_*.gz</filePattern>
/* Added a dot before the star */
<filePattern>salary_.*.gz</filePattern>
Hope this helps.
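For reference, here is a small Java sketch of why the original glob-style pattern fails when it is treated as a regular expression. It assumes the pattern is matched against the whole file name with java.util.regex, which may differ from what the Gist does. The file names and the class name are hypothetical. It also shows that the unescaped dot before gz matches any character, so escaping it (salary_.*\.gz) would be stricter if that matters for your file names.

```java
import java.util.regex.Pattern;

public class FilePatternCheck {

    /** Returns true if the whole file name matches the regular expression. */
    static boolean matches(String regex, String fileName) {
        return Pattern.matches(regex, fileName);
    }

    public static void main(String[] args) {
        // As a regex, "_*" means "zero or more underscores", so this does not match.
        System.out.println(matches("salary_*.gz", "salary_2017.gz"));     // prints false
        // ".*" means "any characters", which is what the glob "*" intended.
        System.out.println(matches("salary_.*.gz", "salary_2017.gz"));    // prints true
        // The unescaped dot before "gz" also accepts other characters.
        System.out.println(matches("salary_.*.gz", "salary_2017xgz"));    // prints true
        // Escaping the dot requires a literal "." before "gz".
        System.out.println(matches("salary_.*\\.gz", "salary_2017xgz"));  // prints false
    }
}
```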
07-11-2017
12:25 PM
Great, thank you very much!
07-11-2017
12:09 PM
Excuse me @Pavan Challa, I should have looked at the related question more carefully. So, what you'd like to do is loop through the 'dataFlow' elements to find the one whose 'filePattern' matches the name of the incoming file? If so, that might be too much to do with XMLFileLookupService. I'd write a script with ExecuteScript that parses the XML file and does the matching.
07-11-2017
05:29 AM
1 Kudo
Hello @Eric Lloyd, it seems you're using NiFi 1.3.0. If so, GrokReader might be helpful for extracting the whole stack trace. In fact, the GrokReader documentation has an example that reads a Java stack trace; please check it: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.3.0/org.apache.nifi.grok.GrokReader/additionalDetails.html
07-11-2017
05:23 AM
Hello @Pavan Challa, the SplitXml processor will probably be helpful. Specify a depth of '2' and you'll get FlowFiles that each contain a single 'dataFlow' element as their content.
07-11-2017
05:09 AM
Unfortunately, I'm not aware of any existing client app that can deserialize what the NiFi state manager stores. But since the source code of the ZookeeperStateProvider.deserialize method is available and is not that complicated, you can write a simple app that connects to ZooKeeper, gets the value from the znode, and deserializes it.