Support Questions


Getting error for GetHDFS NiFi processor

New Member

I am trying to get files from an HDFS directory with the Apache NiFi GetHDFS processor; however, I get the following error in nifi-app.log whenever I run the job:

Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://XXX:8020/user/sample_b.csv, expected: file:///

at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:649)

Does anyone have any idea what is causing this error?

1 ACCEPTED SOLUTION

Master Guru

@Pallavi Ab,

I think the issue is with hdfs-site.xml and core-site.xml.

Use the XML files from /usr/hdp/2.4.2.0-258/hadoop/conf instead of the /usr/hdp/2.4.2.0-258/etc/hadoop/conf.empty directory:

/usr/hdp/2.4.2.0-258/hadoop/conf/hdfs-site.xml
/usr/hdp/2.4.2.0-258/hadoop/conf/core-site.xml

Copy them to another directory and reference them in the Hadoop Configuration Resources property of the GetHDFS processor.
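A minimal sketch of that staging step, assuming the HDP paths above; the destination directory ($HOME/nifi-hdfs-conf) is an arbitrary choice, not a required location:

```shell
# Stage the populated Hadoop client configs where NiFi can read them.
# SRC is the HDP conf directory named above; DST is an assumed location.
SRC="${SRC:-/usr/hdp/2.4.2.0-258/hadoop/conf}"
DST="${DST:-$HOME/nifi-hdfs-conf}"
mkdir -p "$DST"
cp "$SRC/core-site.xml" "$SRC/hdfs-site.xml" "$DST/" 2>/dev/null \
  || echo "copy failed: check that $SRC holds the populated configs"
# Then set the GetHDFS "Hadoop Configuration Resources" property to:
#   $DST/core-site.xml,$DST/hdfs-site.xml
```

Both files go in the property as a comma-separated list.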


6 REPLIES

Master Guru
@Pallavi Ab

First, make sure your file is in the directory and that NiFi has permission to access the directory.

I am not sure about your GetHDFS configuration; take a look at the configs below and set up your processor to match the screenshot.

Configs:

(screenshot: 42835-gethdfs.png — GetHDFS processor configuration)

The important property is Keep Source File; configure it as per your needs.

Keep Source File (default: false)
  • true
  • false
Determines whether to delete the file from HDFS after it has been successfully transferred. If true, the file will be fetched repeatedly. This is intended for testing only.

New Member

@Shu

Thank you for the quick response. I made the changes you suggested, but now the processor is failing with the error below:

Caused by: java.io.IOException: PropertyDescriptor PropertyDescriptor[Directory] has invalid value /user/cmor/kinetica/files/sample_b.csv. The directory does not exist.

Here is how the processor configuration looks:

(screenshot: gethdfs.png)

Master Guru

@Pallavi Ab

As per your logs:

Caused by: java.io.IOException: PropertyDescriptor PropertyDescriptor[Directory] has invalid value /user/cmor/kinetica/files/sample_b.csv. The directory does not exist.

Can you check whether the above directory exists in HDFS using the commands below?

bash# hdfs dfs -test -d /user/cmor/kinetica/files
bash# echo $?
bash# hdfs dfs -test -e /user/cmor/kinetica/files/sample_b.csv
bash# echo $?
// if echo returns 0, the file or directory exists
// if echo returns 1, the file or directory does not exist

Make sure the path in the Directory property is correct and run the processor again.

Usage of the hdfs test command:

bash# hdfs dfs -test -[defsz] <hdfs-path>
Options:
-d: if the path is a directory, return 0.
-e: if the path exists, return 0.
-f: if the path is a file, return 0.
-s: if the path is not empty, return 0.
-z: if the file is zero length, return 0.
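The 0/1 convention mirrors the shell's own `test` builtin, so you can branch on the exit status directly. A local-filesystem illustration (swap `[ -d "$1" ]` for `hdfs dfs -test -d "$1"` when checking an HDFS path):

```shell
# A directory check that branches on the 0/1 exit status; swap
# `[ -d "$1" ]` for `hdfs dfs -test -d "$1"` to check an HDFS path.
check_dir() {
  if [ -d "$1" ]; then
    echo "directory exists: $1"
  else
    echo "missing: $1"
  fi
}
check_dir /tmp
check_dir /no/such/dir
```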

New Member

@Shu

This is what I see after running the commands:

(screenshot: 42838-hdfsoutput.png)

Master Guru

@Pallavi Ab,

I think the issue is with hdfs-site.xml and core-site.xml.

Use the XML files from /usr/hdp/2.4.2.0-258/hadoop/conf instead of the /usr/hdp/2.4.2.0-258/etc/hadoop/conf.empty directory:

/usr/hdp/2.4.2.0-258/hadoop/conf/hdfs-site.xml
/usr/hdp/2.4.2.0-258/hadoop/conf/core-site.xml

Copy them to another directory and reference them in the Hadoop Configuration Resources property of the GetHDFS processor.
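One way to sanity-check the config you point GetHDFS at is to confirm core-site.xml actually names the HDFS namenode; `fs.defaultFS` is the standard Hadoop property (`fs.default.name` on older releases), and this grep is a quick sketch rather than proper XML parsing:

```shell
# If core-site.xml does not set fs.defaultFS (fs.default.name on older
# releases), the Hadoop client falls back to file:///, which is exactly
# the "Wrong FS ... expected: file:///" error from the original post.
conf="${conf:-/usr/hdp/2.4.2.0-258/hadoop/conf/core-site.xml}"
grep -E -A1 'fs\.defaultFS|fs\.default\.name' "$conf" 2>/dev/null \
  || echo "fs.defaultFS not found in $conf (client would default to file:///)"
```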

New Member

Thank you @Shu; it is running now.