@Figo C

The reason is by design: NiFi, as an HDFS client, communicates with the HDFS NameNode on port 8020, and the NameNode returns file locations using the DataNode's address, which is a private address. Since both your HDP and HDF instances are sandboxes, I think you should switch both to a host-only adapter. Your stack trace will show that the client can't connect to the DataNode, and it will list the internal IP instead of 127.0.0.1. That is what causes the minReplication issue, etc.

Change the HDP and HDF sandbox VM network settings from NAT to Host-only Adapter.

Here are the steps:

1. Gracefully shut down the HDF sandbox.

2. Change the sandbox VM network from NAT to Host-only Adapter. It will automatically pick your LAN or wireless adapter; save the configuration.

3. Restart the sandbox VM.

4. Log in to the sandbox VM and run the ifconfig command to get its IP address (in my case, 192.168.0.45).

5. Add the entry to /etc/hosts on your host machine, in my case: 192.168.0.45 sandbox.hortonworks.com

6. Check connectivity with telnet: telnet sandbox.hortonworks.com 8020

7. Restart NiFi (HDF).
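The host-side part of the steps above (4 through 6) can be sketched as shell commands. The IP address 192.168.0.45 is just the example from my setup; substitute whatever ifconfig reports on your VM:

```shell
# On the sandbox VM: find the host-only adapter's IP address
ifconfig

# On the host machine: map the sandbox hostname to that IP
# (needs root; the address below is an example, use yours)
echo "192.168.0.45 sandbox.hortonworks.com" | sudo tee -a /etc/hosts

# Verify the NameNode port is reachable through the new mapping
telnet sandbox.hortonworks.com 8020
```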

By default, HDFS clients connect to DataNodes using the IP address provided by the NameNode. Depending on the network configuration, this IP address may be unreachable from the clients. The fix is to let clients perform their own DNS resolution of the DataNode hostname. The following setting enables this behavior.
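The difference is whether the client dials the raw IP the NameNode handed back or resolves the DataNode's hostname itself (via /etc/hosts or DNS). A minimal Python sketch of that client-side resolution; "localhost" stands in for a DataNode hostname here, and with the /etc/hosts entry from the steps above, sandbox.hortonworks.com would resolve the same way on the host machine:

```python
import socket

def resolve_datanode(hostname: str) -> list[str]:
    """Resolve a DataNode hostname to IPv4 addresses using the
    client's own resolver (/etc/hosts first, then DNS)."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # getaddrinfo returns (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address string
    return sorted({info[4][0] for info in infos})

# "localhost" typically resolves to the loopback address
print(resolve_datanode("localhost"))
```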

If the above still fails, set dfs.client.use.datanode.hostname to true in the hdfs-site.xml that NiFi is using:

<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
  <description>Whether clients should use datanode hostnames when
    connecting to datanodes.
  </description>
</property>


Hope that helps
