Issue with S3

New Contributor

Hello all,
I'm setting up a flow in NiFi to download some files from an S3 bucket. The flow contains ListS3, UpdateAttribute, FetchS3Object and PutHDFS processors in a 3-node cluster environment. ListS3 retrieves the files, but they get stuck in the queue in front of FetchS3Object and never move past it. I don't receive any error message; the files just sit there. Is there any reason why the files are not moving into the FetchS3Object processor? I have all default settings on both ListS3 and FetchS3Object.

Jame1979_0-1681138742331.png

Thanks!

7 REPLIES

Community Manager

@Jame1979, Welcome to our community! To help you get the best possible answer, I have tagged in our NiFi experts @MattWho @SAMSAL @steven-matison @DigitalPlumber @cotopaul  who may be able to assist you further.

Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.



Regards,

Vidya Sargur,
Community Manager



Explorer

Hello Vidya,
Thanks for tagging the NiFi experts.


@Jame1979,

It would really help if you could post a screenshot of each of the processors, as the problem might be related to some of your configurations. You might be doing something with UpdateAttribute which will affect the behavior of FetchS3.

 

From what I can see, you have a terminated thread and a running thread, meaning that something happened there. I had a similar issue and solved it by restarting the NiFi cluster. There was a problem in the back end that resulted in threads being created without actually performing any work. After the restart, everything went back to normal and I started collecting data using Fetch :).

You could also set FetchS3 to DEBUG to see whether any logs are being generated, which might point you in the right direction.

Explorer

Hello CotoPaul,
I hadn't thought to set FetchS3 to DEBUG. After doing that, I received some alerts indicating an issue with the VPCE endpoint: "unable to get a match with the vpce endpoint pattern; using the configured region: us-east-1". I'm checking with the CloudOps team about the error message.

Thanks!

Expert Contributor

@Jame1979, as @cotopaul mentions, you seem to have a hung thread that you tried to terminate.
Restarting NiFi should solve that.
Additionally, and this is important, you are never really "spreading" the work among your 3 nodes.
The List happens on the primary node, and the work stays on the primary node.
What you should do is right-click the connection between List and UpdateAttribute and select the "Round Robin" load balancing strategy.
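
As a rough illustration of why this matters (a toy model only, not NiFi's actual implementation), the sketch below compares leaving every listed file on the primary node with handing files out in turn to each node. The node names and file names are hypothetical.

```python
from collections import defaultdict
from itertools import cycle


def primary_only(flowfiles, nodes):
    """No load balancing: everything stays on the node that ran the listing."""
    # Assumption for this sketch: the first entry plays the primary node.
    return {nodes[0]: list(flowfiles)}


def round_robin(flowfiles, nodes):
    """Round robin: hand each listed file to the next node in turn."""
    assignment = defaultdict(list)
    for flowfile, node in zip(flowfiles, cycle(nodes)):
        assignment[node].append(flowfile)
    return dict(assignment)


if __name__ == "__main__":
    nodes = ["node-1", "node-2", "node-3"]        # hypothetical node names
    listed = [f"file-{i}.csv" for i in range(7)]  # hypothetical S3 keys

    print("primary only:", primary_only(listed, nodes))
    print("round robin: ", round_robin(listed, nodes))
```

With seven files and three nodes, the round-robin case gives the first node three files and the other two nodes two each, whereas the primary-only case leaves all seven on a single node.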

Explorer

Hello DigitalPlumber,
Thanks for that information; I will make that change.

Explorer

Hello @MattWho @SAMSAL @steven-matison @DigitalPlumber @cotopaul, after confirming the access, I was able to verify that I can reach the bucket and download files with the AWS CLI from the same system NiFi is running on.

The problem I'm having now is that ListS3 picks up the list of files from the bucket, but FetchS3 doesn't do anything. When I enabled debugging on FetchS3, I received the following error message.

FetchS3 Error message 

jame1997_4-1682606943667.png

 

FetchS3 Configuration

jame1997_2-1682606842330.png

ListS3 Configuration 

jame1997_3-1682606879264.png

Any suggestions as to what is causing the issue?