ListSFTP taking a long time

Contributor

I need to load 3 terabytes of historical Unix files into HDFS. I am using the ListSFTP, FetchSFTP, UpdateAttribute, and PutHDFS processors for this. There are 16 directories, each with 3 subdirectories, each of which has 350 subdirectories. I have set Search Recursively to true in ListSFTP. The dataflow works for a smaller dataset when I point it at a specific directory/subdirectory/subdirectory, but when I try to run it against the whole parent directory, the ListSFTP processor does not seem to make any progress. This is a one-time historical load. Is there a way I could process only one directory/subdirectory/subdirectory at a time? Has anyone come across this issue? Thank you for your help.
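To put the size of the listing in perspective, here is a minimal Python sketch that counts the leaf directories described above and generates them one at a time. The counts (16, 3, 350) come from the post; the root path and the `dirNN/subN/leafNNN` naming scheme are purely hypothetical, since the real directory names are not given. Each generated path could, in principle, be used as the target of a separate, smaller listing instead of one recursive listing of the whole parent directory.

```python
from itertools import product
from typing import Iterator

# Counts taken from the post; the naming scheme below is hypothetical.
TOP_LEVEL = 16    # top-level directories
MID_LEVEL = 3     # subdirectories per top-level directory
LEAF_LEVEL = 350  # subdirectories per mid-level directory


def leaf_directories(root: str) -> Iterator[str]:
    """Yield one leaf path at a time so each can be handled as its own batch."""
    for top, mid, leaf in product(range(1, TOP_LEVEL + 1),
                                  range(1, MID_LEVEL + 1),
                                  range(1, LEAF_LEVEL + 1)):
        yield f"{root}/dir{top:02d}/sub{mid}/leaf{leaf:03d}"


total = TOP_LEVEL * MID_LEVEL * LEAF_LEVEL
paths = list(leaf_directories("/data/history"))  # hypothetical root path

print(total)      # 16800
print(paths[0])   # /data/history/dir01/sub1/leaf001
print(paths[-1])  # /data/history/dir16/sub3/leaf350
```

With 16,800 leaf directories, a single recursive listing has to enumerate every file under all of them in one pass, which is why working through the tree in smaller pieces may be worth trying.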

1 ACCEPTED SOLUTION

Super Mentor
4 REPLIES

Master Guru

Do you get an error? Is there anything in the error logs?

You may need more error information.

Contributor

It seems to get stuck in the first processor for a long time, because I don't see any data being passed to the next processor, FetchSFTP. I don't see any errors, though.

Contributor

Hi Timothy, this is the error I get:

ERROR [Timer-Driven Process Thread-2] o.a.nifi.processors.standard.ListSFTP java.lang.OutOfMemoryError: Java heap space
