Support Questions

Find answers, ask questions, and share your expertise

How to make PutHDFS work with HDFS in HA?

avatar
Expert Contributor

I have a NiFi flow, but the PutHDFS processor returns an error:

failed to invoke @OnScheduled method due to java.lang.IllegalArgumentException: java.net.UnknownHostException: provaha; processor will not be scheduled to run for 30 sec: java.lang.IllegalArgumentException: java.net.UnknownHostException: provaha

provaha is the correct nameservice reference for HDFS High Availability.

Is there any particular configuration needed for PutHDFS to work with HDFS in HA?

1 ACCEPTED SOLUTION

avatar
Contributor

You're missing hdfs-site.xml in the config, which is where the NameNode HA details are found. The config requires both hdfs-site.xml and core-site.xml,

i.e., set "Hadoop Configuration Resources" similar to the following:

/etc/hadoop/2.3.4.0-3485/0/hdfs-site.xml,/etc/hadoop/2.3.4.0-3485/0/core-site.xml

Reference:

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hadoop.PutHDFS/index.ht...

A file or comma separated list of files which contains the Hadoop file system configuration. Without this, Hadoop will search the classpath for a 'core-site.xml' and 'hdfs-site.xml' file or will revert to a default configuration.
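For reference, the hdfs-site.xml passed to PutHDFS must carry the client-side HA properties for the nameservice. A minimal sketch for the 'provaha' nameservice from this thread (the NameNode IDs nn1/nn2 and the hostnames below are placeholders, not values from the original cluster):

```xml
<!-- Client-side HA properties expected in hdfs-site.xml.
     'provaha' is the nameservice from this thread; nn1/nn2 and the
     example.com hostnames are illustrative placeholders. -->
<property>
  <name>dfs.nameservices</name>
  <value>provaha</value>
</property>
<property>
  <name>dfs.ha.namenodes.provaha</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.provaha.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.provaha.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.provaha</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

Without these properties the HDFS client treats 'provaha' as a plain hostname and fails with exactly the UnknownHostException shown above.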


8 REPLIES

avatar
Master Mentor

Did you try passing /etc/hadoop/conf/core-site.xml to the PutHDFS processor, @Davide Isoardi?

avatar
Expert Contributor

(screenshot: 1758-nifi.png)

Yes @Artem Ervits, I am passing the correct path to core-site.xml, but PutHDFS doesn't work.

avatar
New Contributor

Please provide the full path with reference to the root directory (an absolute path), i.e. ?/?/?/etc/hadoop/conf/core-site.xml; then it should work fine.

avatar
Expert Contributor

It looks like it cannot resolve the hostname 'provaha' to an IP address. Can you check whether provaha is in your /etc/hosts file, or supply the FQDN instead?
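One quick way to check this is with the system resolver. A small sketch (the `check_host` helper is just for illustration; note that in a correct HA setup 'provaha' is a logical nameservice and is NOT expected to resolve via DNS or /etc/hosts):

```shell
#!/bin/sh
# Check whether a name resolves via the system resolver (DNS, /etc/hosts).
check_host() {
  getent hosts "$1" > /dev/null
}

if check_host provaha; then
  echo "provaha resolves as a plain hostname"
else
  echo "provaha does not resolve; it is likely an HA nameservice that must be defined in hdfs-site.xml"
fi
```

If the name does not resolve, that points back at the Hadoop client configuration rather than at DNS.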

avatar
Master Guru

That probably works, but it defeats the purpose of HA. Hadoop is supposed to dynamically resolve 'provaha' to one of the physical nodes in the HA cluster. I wonder whether PutHDFS is failing to resolve the physical node from the HA nameservice, or whether there is an error in the config (core-site.xml).


avatar
Master Guru

core-site.xml alone has worked for me.