[NiFi] [Ceph] [S3] NiFi misses some files when connecting to Ceph through the S3 interface

New Contributor

NiFi: 1.11.3

Ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)

 

Hello, I'm using ListS3 and FetchS3 to get objects from a Ceph cluster, with this config:
ListS3:

  • Bucket: test-empty-bucket01.
  • Region: US West (Oregon).
  • Write Object Tags: False.
  • Write User Metadata: False.
  • Communications Timeout: 30 secs.
  • Endpoint Override URL: http://localhost:12345/ (My Ceph cluster)
  • Use Versions: false.
  • List Type: List Objects V1.
  • Minimum Object Age: 0 sec.
  • Requester Pays: False.

FetchS3:

  • Bucket: test-empty-bucket01.
  • Object Key: ${filename}.
  • Region: AWS GovCloud (US).
  • Communications Timeout: 30 secs.
  • Endpoint Override URL: http://localhost:12345/ (My Ceph cluster).
  • Requester Pays: False.

My issues:

  • When I use List Objects V2, ListS3 does not save its current state.
  • With the config above, it runs very fast, but after 1-2 hours ListS3 reports this error:

2020-03-04 13:30:52,521 ERROR [Timer-Driven Process Thread-3] org.apache.nifi.processors.aws.s3.ListS3 ListS3[id=a3675f45-0170-1000-a9c4-825011327395] ListS3[id=a3675f45-0170-1000-a9c4-825011327395] failed to process session due to com.amazonaws.SdkClientException: Unable to execute HTTP request: Software caused connection abort: recv failed; Processor Administratively Yielded for 1 sec: com.amazonaws.SdkClientException: Unable to execute HTTP request: Software caused connection abort: recv failed
com.amazonaws.SdkClientException: Unable to execute HTTP request: Software caused connection abort: recv failed
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1175)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1121)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4926)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4872)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4866)
at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:881)
at org.apache.nifi.processors.aws.s3.ListS3$S3ObjectBucketLister.listVersions(ListS3.java:464)
at org.apache.nifi.processors.aws.s3.ListS3.onTrigger(ListS3.java:308)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1176)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Software caused connection abort: recv failed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1297)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
... 25 common frames omitted

Can you help me?

Thank you very much.

5 REPLIES

Super Mentor

@LunaLua 

 

I know next to nothing about Ceph, but the exception thrown by the client (the ListS3 processor) identifies the issue as:

Caused by: java.net.SocketException: Software caused connection abort: recv failed


This suggests that the server side closed the connection unexpectedly. I would look at the Ceph logs to see what exception(s) are being thrown on that side around the same time as you see the exception in NiFi. Perhaps that can give you more context about what is going wrong here.
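
To line up the two logs as suggested, one option is to take the timestamp of the NiFi error and search the Ceph logs in a small window around it. The following is a minimal Java sketch, not part of NiFi or Ceph: the class `LogWindow`, its methods, and the two-minute margin are hypothetical, and it assumes both hosts log in the same time zone with reasonably synchronized clocks.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class LogWindow {
    // Matches the timestamp prefix seen in the NiFi log lines in this thread,
    // e.g. "2020-03-04 13:30:52,521".
    private static final DateTimeFormatter NIFI_TS =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

    // Parses the first 23 characters of a log line as its timestamp.
    // Throws if the line is shorter than that or does not start with a timestamp.
    static LocalDateTime parseNifiTimestamp(String logLine) {
        return LocalDateTime.parse(logLine.substring(0, 23), NIFI_TS);
    }

    // Returns {start, end} of a window of +/- margin around the given time.
    static LocalDateTime[] around(LocalDateTime time, Duration margin) {
        return new LocalDateTime[] { time.minus(margin), time.plus(margin) };
    }

    public static void main(String[] args) {
        String line = "2020-03-04 13:30:52,521 ERROR [Timer-Driven Process Thread-3] "
                + "org.apache.nifi.processors.aws.s3.ListS3 (rest of line omitted)";
        LocalDateTime failure = parseNifiTimestamp(line);
        LocalDateTime[] window = around(failure, Duration.ofMinutes(2)); // margin is an arbitrary choice
        System.out.println("Search the Ceph logs from " + window[0] + " to " + window[1]);
    }
}
```

If the Ceph hosts log in UTC while NiFi logs in local time, the window would need to be shifted accordingly before searching.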

 

Hopefully there are other community members who know more about Ceph or maybe have used the Amazon SDK to interface with Ceph who can provide even more insight.

 

Hope this helps you,

Matt

New Contributor

I ran into a new issue with a MinIO server.

NiFi ListS3:

- Bucket: test-minio

- Region: US West (Oregon)

- Communications Timeout: 30 secs

- Endpoint Override URL: http://107.113.193.160:9000

- List Type: List Objects V1

 

Issue (in the NiFi logs):

2020-03-30 10:22:43,453 WARN [Timer-Driven Process Thread-11] o.a.n.controller.tasks.ConnectableTask Administratively Yielding ListS3[id=15549dd4-0171-1000-5c78-0061e4b538a5] due to uncaught Exception: com.amazonaws.SdkClientException: Unable to execute HTTP request: Read timed out
com.amazonaws.SdkClientException: Unable to execute HTTP request: Read timed out
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1175)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1121)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)

....

Caused by: java.net.SocketTimeoutException: Read timed out
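
For context, a `SocketTimeoutException: Read timed out` is generally raised when the TCP connection was established but no response bytes arrived within the socket timeout. The sketch below reproduces that situation locally with plain `java.net` sockets. It is only an illustration under that assumption: the class `ReadTimeoutDemo` and the 200 ms timeout are hypothetical and unrelated to NiFi's Communications Timeout property or to MinIO itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class ReadTimeoutDemo {
    // Connects to a local server that accepts TCP connections but never sends data,
    // then tries to read with the given timeout. Returns true if the read timed out.
    static boolean readTimesOut(int timeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, silentServer.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // maximum time a single read may block
            InputStream in = client.getInputStream();
            in.read(); // blocks, because the server never writes anything
            return false;
        } catch (SocketTimeoutException e) {
            return true; // connected, but no data arrived in time
        } catch (IOException e) {
            return false; // some other I/O failure, not a read timeout
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

In other words, this error by itself does not show that the port is unreachable; a server that is slow to answer a listing request could plausibly produce the same message.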

 

Master Guru

Was MinIO running? Did it crash? Is it running on a supported port? Does it need admin permissions? Was it rebooted? Is a firewall or something else blocking it?

 

This error suggests that the server is either down or requires HTTPS.

 

Can you connect with an Amazon S3 client, or with telnet on that port? Have you tried debugging with Wireshark?
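
If telnet is not available on the NiFi host, a rough equivalent of the port check can be done from Java. This is a minimal sketch using only `java.net.Socket`; the class `PortCheck` and the 5-second timeout are hypothetical choices, and a successful TCP connect only shows that something is listening, not that it speaks S3.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortCheck {
    // Returns true if a TCP connection to host:port can be opened within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, unknown hosts, and connect timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java PortCheck <host> <port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 5000));
    }
}
```

Run it from the machine where NiFi runs, with the host and port taken from the Endpoint Override URL, so that the same firewalls and routes are exercised.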

Master Guru

Ceph is not supported, and I don't know whether it really follows the S3 interface. Try the latest NiFi, 1.11.4, and use basic S3 mode.