Member since: 03-26-2024 · Posts: 26 · Kudos Received: 18 · Solutions: 0
04-09-2024
04:33 AM
1 Kudo
1) Initially, I faced the "NiFi PutSFTP failed to rename dot file" issue only when the child process group was configured with "Outbound Policy = Batch Output". It worked without the child process group.
2) I set the PutSFTP failure retry attempts to 3, and that fixed the issue.
3) Later, I introduced a RouteOnAttribute after the FetchHDFS processor for some internal logic, and the PutSFTP error started again.
4) This time, I changed the "Run Schedule" of the PutSFTP processor from 0 sec to 3 sec, and that again fixed the issue.
5) I also have a requirement to transfer stats for each file (file name, row count, file size, etc.), so I introduced one more PutSFTP processor, and the issue popped up again.
6) Finally, I made the following changes to both of my PutSFTP processors:
a) Set the failure retry attempts to 3.
b) Changed the "Run Schedule" of the first PutSFTP processor to 7 sec.
c) Changed the "Run Schedule" of the second PutSFTP processor to 10 sec.
Now it is working fine. Are we getting this issue because 20 FlowFiles are processed at a time? Could you please confirm whether this is the right way to fix the "NiFi PutSFTP failed to rename dot file" issue?
04-05-2024
11:45 AM
1 Kudo
Hi @MattWho As you suggested, I tried with a child process group as below, with "FlowFile Concurrency = Single FlowFile Per Node" and "Outbound Policy = Batch Output", to ensure that all fetched FlowFiles are successfully processed before the MergeContent processor starts:

Input Port --> GetHDFSFileInfo --> RouteOnAttribute --> UpdateAttribute --> FetchHDFS --> PutSFTP --> ModifyBytes --> Output Port

My GetHDFSFileInfo processor returns 20 HDFS files, and each execution successfully transfers 18 to 19 files to my SFTP server. However, during each execution, one or two file transfers fail in the PutSFTP processor with the error message "Failed to rename dot-file". An error screenshot is attached below. I am facing this issue only when the child process group is configured with "Outbound Policy = Batch Output"; without the child process group, it works. Am I missing some configuration settings here? Could you please help me fix the issue with the PutSFTP processor?
04-05-2024
06:49 AM
1 Kudo
My data flow starts from a single FlowFile produced by a Sqoop job, which then expands into multiple FlowFiles after the GetHDFSFileInfo processor (based on the number of HDFS files). To capture all failure scenarios, I have created a child process group with the following processors:

Input Port --> GetHDFSFileInfo --> RouteOnAttribute --> UpdateAttribute --> FetchHDFS --> PutSFTP --> ModifyBytes --> Output Port

Main process group:

RouteOnAttribute --> above-mentioned child process group --> MergeContent --> downstream processors

The child process group is configured with "FlowFile Concurrency = Single FlowFile Per Node" and "Outbound Policy = Batch Output" to ensure that all fetched FlowFiles are successfully processed (written to the SFTP server). My GetHDFSFileInfo processor returns 20 HDFS files, and each execution successfully transfers 18 to 19 files to my SFTP server. However, during each execution, one or two file transfers fail in the PutSFTP processor with the error message "Failed to rename dot-file". An error screenshot is attached below. I am facing this issue only when the child process group is configured with "Outbound Policy = Batch Output"; without the child process group, it works. Could you please help me fix the issue with the PutSFTP processor? This is in continuation of the solution provided in the thread https://community.cloudera.com/t5/Support-Questions/How-to-convert-merge-Many-flow-files-to-single-flow-file-in/m-p/385990#M245919
Labels:
- Apache NiFi
- HDFS
04-05-2024
05:33 AM
Thank you @MattWho for your timely support and quick solutions. Kudos to you!
04-03-2024
05:13 AM
1 Kudo
Thanks @jAnshula for your suggestion. We are trying to achieve this without using NiFi. Could you please let us know if any options are available using hdfs/hadoop commands?
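For the record, one hedged option using only hdfs and standard shell tooling (my own assumption, not something confirmed in this thread) is to stream each file from HDFS straight into ssh, so nothing is staged on the local disk. The real-world form would be `hdfs dfs -cat <src> | ssh user@remote 'cat > <dest>'`; the sketch below simulates the pipe with local temporary directories so it runs anywhere, and all paths and hostnames are made up:

```shell
#!/bin/sh
# Sketch: stream an HDFS file to a remote host with no local staging copy.
# Real-world form (hypothetical paths/host):
#   hdfs dfs -cat /user/etl/export/part-00000 | ssh user@remote 'cat > /data/in/part-00000'
# The simulation below uses local directories in place of HDFS and the remote host.
set -e
src_dir=$(mktemp -d)    # stand-in for the HDFS source directory
dst_dir=$(mktemp -d)    # stand-in for the remote destination directory
printf 'row1\nrow2\nrow3\n' > "$src_dir/part-00000"

for f in "$src_dir"/part-*; do
  name=$(basename "$f")
  # stand-in for: hdfs dfs -cat "$f" | ssh user@remote "cat > /data/in/$name"
  cat "$f" | sh -c "cat > '$dst_dir/$name'"
done

wc -l < "$dst_dir/part-00000"   # the three rows arrive through the pipe
```

The upside of the pipe is that disk space is needed only on the destination; the downside is that a broken ssh session mid-stream leaves a partial file, so a rename-after-transfer step on the remote side would be prudent.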
04-03-2024
03:20 AM
Thanks @jAnshula for your suggestion. My remote server does not support Hadoop-compatible file systems, so the DistCp command does not work for me. The primary objective is to copy the HDFS files as-is to a Linux machine.
04-02-2024
11:59 PM
We have a requirement to transfer files from an HDFS directory to a remote server. I've noticed options to copy files from HDFS to the local filesystem first (using copyToLocal) and then transfer them from the local filesystem to the remote server (using scp). But is there any direct method to copy files from HDFS to a remote server, such as Sqoop functionality or another approach, without copying to the local filesystem first?
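The two-step route described above can be sketched as follows (paths and hostname are hypothetical; local cp commands stand in for hdfs/scp so the sketch runs anywhere):

```shell
#!/bin/sh
# Sketch of the two-step transfer described in the post:
#   1) hdfs dfs -copyToLocal /user/etl/export/part-00000 /tmp/staging/
#   2) scp /tmp/staging/part-00000 user@remote:/data/in/
# Hypothetical paths/host; local cp commands simulate hdfs and scp below.
set -e
hdfs_sim=$(mktemp -d)     # stand-in for the HDFS directory
staging=$(mktemp -d)      # local staging area
remote_sim=$(mktemp -d)   # stand-in for the remote destination
printf 'a,b\n1,2\n' > "$hdfs_sim/part-00000"

cp "$hdfs_sim/part-00000" "$staging/"      # stand-in for: hdfs dfs -copyToLocal
cp "$staging/part-00000" "$remote_sim/"    # stand-in for: scp to the remote host

ls "$remote_sim"
```

The obvious cost of this route is temporary local disk space equal to the data being moved, which is what motivates the question about a direct method.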
Labels:
- Apache Sqoop
- HDFS
04-02-2024
09:20 PM
Thank you @MattWho I noticed that the number of files in the HDFS directory can be retrieved using the "hdfs.count.files" property. Can we use this property to trigger the merge instead of relying on bin age? If so, could you please suggest what changes we need to make in the MergeContent processor?
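One possible direction (my own hedged sketch, not a configuration confirmed in this thread): MergeContent's "Defragment" merge strategy merges a bin as soon as it has received `fragment.count` FlowFiles sharing a `fragment.identifier`, which avoids waiting on bin age. An UpdateAttribute processor placed before MergeContent could copy `hdfs.count.files` onto the standard fragment attributes. The attribute names below are the standard ones Defragment expects; the expression values are illustrative guesses, and Defragment also needs a unique, ordered `fragment.index` per FlowFile, which this flow would have to generate:

```
# UpdateAttribute (before MergeContent) -- illustrative, unverified values
fragment.identifier : ${hdfs.path}          # hypothetical: any value shared by all files of one run
fragment.count      : ${hdfs.count.files}   # from GetHDFSFileInfo, as noted above
fragment.index      : (a unique per-file index must be assigned here)

# MergeContent
Merge Strategy : Defragment
```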
04-01-2024
03:05 AM
1 Kudo
I am fetching file(s) from an HDFS path and transferring them to an SFTP server using NiFi. The HDFS file list is created by a Sqoop job, and the HDFS directory may contain one or more files. Here is the list of processors I am using right now:

RouteOnAttribute --> GetHDFSFileInfo --> RouteOnAttribute --> UpdateAttribute --> FetchHDFS --> PutSFTP --> UpdateAttribute

My data flow starts from a single FlowFile produced by a Sqoop job, which then becomes many FlowFiles after the GetHDFSFileInfo processor (based on the number of HDFS files). However, I require only a single FlowFile after PutSFTP for downstream processing of job completion. Could you please suggest a way to execute the processors after PutSFTP only once? Do we need to create a separate process group from GetHDFSFileInfo to PutSFTP? My data flow looks like below.
Labels:
- Apache NiFi
- HDFS