
hdfs-site.xml and core-site.xml for remote cluster

Expert Contributor

I copied hdfs-site.xml and core-site.xml from the Hadoop master node. However, they contain references to private IPs and the local file system.

I guess I need to replace the private FQDN with the public one? Apart from that, what about the local file system paths they reference? Can I have sample hdfs-site.xml and core-site.xml files that I can use in the PutHDFS processor for a remote HDFS server?
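For reference, a minimal client-side core-site.xml for a remote cluster usually needs little more than fs.defaultFS pointing at the NameNode's public address; the hostname below is a placeholder, not a value from this cluster:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <!-- Placeholder: the NameNode's public FQDN and RPC port -->
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>

A matching client-side hdfs-site.xml can start out nearly empty, with properties added only as the client turns out to need them.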

Edit: I have replaced the private FQDN with the public one and I get:


ERROR [StandardProcessScheduler Thread-6] o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=487275f5-3155-3e96-6742-77d854d67d43] HDFS Configuration error - org.apache.hadoop.net.ConnectTimeoutException: 1000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=publicfqdn.compute.amazonaws.com/172.31.x.x:8020]: {}


7 REPLIES

Master Guru

If you use Ambari to download the HDFS client configs, the site files you get should be correct for use in NiFi. I'm not sure where you got your site files, but they may have been server-side configs (which use private IPs/names) rather than client configs.

Expert Contributor

@Matt Burgess: What does this error indicate? HDFS Configuration error - org.apache.hadoop.net.ConnectTimeoutException: 1000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel...

Master Guru

Did you get the client configs from Ambari, or just change your existing site files to use the public FQDN? If the latter, perhaps those ports are not exposed via the public FQDN, or perhaps they are mapped to different ports for external access?
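For example, if the RPC port were remapped for external access, fs.defaultFS in the client's core-site.xml would have to carry the external port; the mapping below is hypothetical:

<property>
  <name>fs.defaultFS</name>
  <!-- Hypothetical: external port 9000 forwarded to the NameNode's internal 8020 -->
  <value>hdfs://publicfqdn.compute.amazonaws.com:9000</value>
</property>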

Expert Contributor

I am using Cloudera with NiFi, so I got my config files from the Cloudera interface and replaced the private IP with the public one. In core-site.xml, the only port used is 8020, which I believe is not mapped to any other port. @Matt Burgess

Master Guru

Can you ping that node and/or telnet to that port? It's also possible that even if you connect to the NameNode, it will send back private IPs for the DataNodes, etc. In that case you may need to set the "dfs.datanode.use.datanode.hostname" property in hdfs-site.xml to "true" (see the Hadoop documentation for more information on that property).
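A sketch of that entry in the client's hdfs-site.xml; the companion client-side property dfs.client.use.datanode.hostname is often set alongside it, though whether it is needed here is an assumption:

<property>
  <name>dfs.datanode.use.datanode.hostname</name>
  <!-- Advertise DataNodes by hostname instead of their (private) IP -->
  <value>true</value>
</property>
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <!-- Have clients connect to DataNodes by hostname as well (assumed relevant here) -->
  <value>true</value>
</property>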

Lastly, what is the version of CDH that you are using, what version of Hadoop does it run, and what version of NiFi/HDF are you using? It is possible that Apache NiFi and/or HDF NiFi are built with Hadoop dependencies incompatible with your cluster. Additionally, HDF NiFi is built with HDP dependencies, so it is possible that HDF NiFi would not be compatible with CDH.

Expert Contributor

I needed to whitelist ports 8020 and 50071 on the Hadoop cluster instance. Worked 🙂 Thank you!
