Hi all, I have a test Ambari cluster. Each node has two interfaces: 1GbE for management and high-speed 25GbE for data.
Both interfaces have had DNS/rDNS configured on a central DNS server.
I have gone through this guide http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html
and enabled those options through Ambari.
When adding a new DataNode, I specify the hostname of its management interface. I ran a few benchmarks and found that HDFS does not use the 25GbE network at all.
I am wondering if I am still missing something to fully enable multihoming for HDFS?
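For anyone following along, here is a minimal sketch for checking that the options from the multihoming guide actually ended up in the hdfs-site.xml that Ambari deployed. The property names and suggested values are the ones listed in the HDFS multihoming guide. The function names are hypothetical, and the expected values are the guide's suggestions, not something read from this cluster.

```python
import xml.etree.ElementTree as ET

# Properties named in the HDFS multihoming guide, with the values it suggests.
# Assumption: your cluster wants exactly these values; adjust if not.
EXPECTED = {
    "dfs.namenode.rpc-bind-host": "0.0.0.0",
    "dfs.namenode.servicerpc-bind-host": "0.0.0.0",
    "dfs.namenode.http-bind-host": "0.0.0.0",
    "dfs.namenode.https-bind-host": "0.0.0.0",
    "dfs.client.use.datanode.hostname": "true",
    "dfs.datanode.use.datanode.hostname": "true",
}


def read_properties(xml_text):
    """Parse Hadoop-style <configuration> XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def check_multihoming(xml_text):
    """Return {property: actual value} for every expected setting that is
    missing (actual value None) or set to something other than EXPECTED."""
    props = read_properties(xml_text)
    problems = {}
    for name, wanted in EXPECTED.items():
        actual = props.get(name)
        if actual != wanted:
            problems[name] = actual
    return problems
```

To use it, read the deployed file as bytes and pass the contents to `check_multihoming`. On many installs the file is /etc/hadoop/conf/hdfs-site.xml, but that path is an assumption. An empty result only means the settings are present; it says nothing about whether DNS sends traffic to the right interface.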
When one DataNode connects to another, the connection has to go over the 25GbE network. This is most likely a DNS setup issue. Try ping and nc to see which network is used when connecting between two DataNodes.
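If you would rather script the check than run ping and nc by hand, here is a rough sketch in Python of the same idea: resolve the peer's hostname, then ask the kernel which local address it would use to reach it. The function names are hypothetical. The default port 50010 is the Hadoop 2.x DataNode transfer port and is used only as a placeholder, since the port does not affect route selection.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def local_source_ip(peer_ip, port=50010):
    """Return the local address the kernel would use to reach peer_ip.

    Connecting a UDP socket sends no packets; it only makes the kernel
    pick a route, so getsockname() reveals the outgoing interface address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((peer_ip, port))
        return s.getsockname()[0]


def report(hostname):
    """Print, for each address of hostname, the local address used to reach it."""
    for ip in resolve_ipv4(hostname):
        print(f"{hostname} -> {ip}, reached from local address {local_source_ip(ip)}")
```

Run `report("<peer DataNode hostname>")` on one DataNode, using the hostname the peer was registered with in Ambari. If both the resolved address and the local address sit on the 1GbE subnet, HDFS traffic between the two nodes will most likely stay on the management network as well.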
@Derrick Lin Please take a look at article https://community.hortonworks.com/content/kbentry/24277/parameters-for-multi-homing.html
There is more to multi-homed configuration than the HDFS document covers, and the more complete discussion may help you resolve your problem, especially wrt DNS and naming. The biggest question is: do the cluster hosts have the same name on all networks? Hope this helps.
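One way to answer the "same name on all networks" question is to reverse-resolve each of a host's interface addresses and compare the results. The sketch below does that in Python. The function names are hypothetical, and the `reverse` parameter exists only so the lookup can be swapped out for testing.

```python
import socket


def names_by_address(addresses, reverse=socket.gethostbyaddr):
    """Map each IP address to its reverse-DNS name, or None if lookup fails."""
    names = {}
    for ip in addresses:
        try:
            # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
            names[ip] = reverse(ip)[0].lower()
        except OSError:  # socket.herror and socket.gaierror are subclasses
            names[ip] = None
    return names


def has_single_name(addresses, reverse=socket.gethostbyaddr):
    """True only if every address reverse-resolves to one and the same name."""
    names = set(names_by_address(addresses, reverse).values())
    return len(names) == 1 and None not in names
```

Pass in the 1GbE and 25GbE addresses of a single node. A `False` result means the node is known under different names (or has no rDNS entry) on at least one network.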
Thanks everyone, I read:
This is not the case in my environment, but that's OK. We will just register all nodes via the 25GbE high-speed network for now.