Created on 02-27-2017 12:58 PM - edited 09-16-2022 04:09 AM
Hi,
Do you know of any issues with using network bonding on Hadoop clusters that run CentOS 7 as the OS?
Any comments appreciated.
Created 02-27-2017 03:53 PM
@mjohnson, @Ancil McBarnett I saw your conversation related to this on CentOS 6.
Do you have any comments, or any experience with CentOS 7?
Created 02-27-2017 04:35 PM
There shouldn't be any issue. Network bonding is done all the time, both to increase bandwidth and to provide redundancy.
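If it helps as a starting point, here is a minimal sketch of what bonding looks like on CentOS 7 with ifcfg files (all device names, addresses, and the bonding mode below are illustrative; the same thing can be built with nmcli, and mode=802.3ad needs matching LACP configuration on the switches, while active-backup does not):

# /etc/sysconfig/network-scripts/ifcfg-bond0  (illustrative values)
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
BONDING_OPTS="mode=802.3ad miimon=100"
BOOTPROTO=none
IPADDR=192.168.1.10
PREFIX=24
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-enp3s0f0  (one slave NIC; repeat for each member NIC)
DEVICE=enp3s0f0
TYPE=Ethernet
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
ONBOOT=yes

Once bond0 is up and carries the address Hadoop should use, nothing Hadoop-specific changes.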
Created 02-27-2017 07:26 PM
@Sedat Kestepe , Network bonding is done at the OS level, before Hadoop or other Application-level software knows anything about it. So if you have a properly configured bonded network, you can configure it into Hadoop just like any other network.
The only potentially tricky part depends on whether the bonded network is the ONLY network available on the server (which therefore makes it the default network), or whether other networks are also available and one of them is taking the "eth0" default designation. If so, you can either force the bonded network to be the default network (how you do that depends on the OS, the NIC, and the BIOS, so I won't try to advise; please consult your IT person), or you can configure Hadoop to use the non-default network.
For information on how to configure a non-default network, you might want to read the article at https://community.hortonworks.com/content/kbentry/24277/parameters-for-multi-homing.html
This article is for "multi-homed" servers (servers with multiple networks) in which you want to use more than one of the networks. But even if you only want to use ONE of the networks (in this case, your bonded network), if it is not the default network then you still need to understand the network configuration parameters in Hadoop, etc.
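As a taste of the kind of parameters that article walks through, here is a sketch of two real HDFS properties (hdfs-site.xml) that tell clients and DataNodes to address DataNodes by hostname rather than by whichever ip address happens to get reported; whether you actually need them depends on your layout:

<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
<property>
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
</property>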
To get the names and ip addresses of all networks on the server, run the command `ifconfig` while logged in to the server in question.
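Note that a minimal CentOS 7 install may not include `ifconfig` (it comes from the net-tools package); the `ip` tool shows the same information:

ip addr show     # interface names and ip addresses
ip route show    # which interface carries the default route
ifconfig -a      # equivalent, if net-tools is installed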
Hope this helps. If you still have questions after reading the article, please ask follow-up questions here, and include the `ifconfig` output if it's relevant.
Created 02-28-2017 10:22 AM
Thanks @Matt Foley,
Your article is the exact one I need.
I am keeping your offer of follow-up questions in mind, though 🙂
Created 03-01-2017 01:58 PM
Hi @Matt Foley,
I have read your article, but still have some questions.
We are planning to connect each server in the cluster to two different switches using two different ethernet cards.
Let's call these two networks the "(local) cluster network" and the "internal network".
Your article covers most of what I need for the Hadoop services. We are also planning to use one of these servers for third-party and our own custom applications. You mentioned in your article that different services may use different methods for binding to a non-default interface. So am I supposed to find out, for each service, which way I can bind it?
And I have two other general questions about your article.
When explaining the key parameters for the key questions, you touch on the bind address:
" ..'Bind Address Properties' provides optional additional addressing information which is used..."
How is a service bound to a preferred interface? Am I supposed to enter the IP address of my server's preferred network interface? (I understand the 0.0.0.0 notation.)
On "Theory and Operations" part, you mention hostnames and DNS servers. You suggest using the same hostname for both interfaces and/or network. Then on 7th practise example you say if we will use hosts files instead of DNS servers (likely for our case. We will not have DNS server for cluster network) hostnames (which also suggested to be identical) would lead to have host files only for one interface on servers. And you went on indicating the ability of clients on using different hosts files. How can they have different hosts files if host names will be the same? Is the following a right example of what you mean?
One of the servers' hosts file:
...
192.168.1.2   namenode2
...
client's hosts file:
...
10.0.9.42   namenode2
...
Created 03-01-2017 10:30 PM
Hi @Sedat Kestepe, regarding "third party and custom apps", this is of course outside the scope of my expertise. But yes, I think you will have to find out (a) whether they are even capable of being configured to use a non-default network; (b) if so, how to configure them; and (c) if desired, whether they can use the "0.0.0.0" configuration, meaning "listen on all networks" -- which is often a cheap way to get the desired result, although in some environments it has security implications.
Your second question is "How is service bound to preferred interface." Let's be clear that there are two viewpoints, the service's and the client's. From the service's viewpoint, "binding interface" really means, "what ip address(es) and port number shall I LISTEN on for connection requests?"
Regrettably there is no uniformity in how each application answers that question, even among the Hadoop Stack components -- else that article could have been about half the length! Some applications aren't configurable and always use the host's default ip address and a fixed port number. Sometimes the port number is just the starting point, where the app and client negotiate yet another port number for each connection. Sometimes a list of ip addresses can be configured, to listen on a specific subset of networks available on multi-homed servers. Sometimes the "magic" address "0.0.0.0" is understood to mean "listen on ALL available network ip addresses". (This concept is integrated with most tcp/ip libraries that applications might use.)
Almost always, though, a single application port number is used by a given service, which means only one instance of the service per host (although it can be multi-threaded if it uses that port negotiation trick). How the ip address(es) and port number are specified, in one or several config parameters, is up to the application. As noted in the article, sometimes Hadoop Stack services even use their own hostname and run it through DNS to obtain an ip address; the result is dependent on which DNS service is queried, which in turn may be configured by yet another parameter (see for example the Kerberos _HOST resolution feature).
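To make the service side concrete, here is a sketch for the HDFS NameNode using two real properties in hdfs-site.xml (the hostname and port are illustrative). dfs.namenode.rpc-address is the advertised hostname:port; if the optional dfs.namenode.rpc-bind-host is left unset, the NameNode listens only on the address that hostname resolves to, while setting it to 0.0.0.0 makes it listen on all interfaces:

<!-- advertised endpoint that clients are given -->
<property>
  <name>dfs.namenode.rpc-address</name>
  <value>namenode2:8020</value>
</property>
<!-- optional override of the listen address; 0.0.0.0 = all interfaces -->
<property>
  <name>dfs.namenode.rpc-bind-host</name>
  <value>0.0.0.0</value>
</property>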
From the client's viewpoint, it must answer the question "where do I FIND that service?" Clients are usually much simpler and are configured with a single ip address and port number where they should attempt to contact a service. That isn't necessarily so; clients often have lists of servers that they try to connect to in sequence, so they could have multiple ip addresses for the same service. But usually we prefer to put complexity in services and keep clients as simple as possible, because clients usually run in less controlled environments than services. It never makes sense to hand "0.0.0.0" to a client, because the client is initiating, not listening. The client has to know SPECIFICALLY where to find the service, which is typically (though not necessarily) on a different host than the client's own. We encourage use of a hostname rather than an ip address for this purpose, at which point the client will obtain the ip address from its default DNS server, or (in some cases) a configuration-specified DNS server.
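On the client side the corresponding sketch is simply a hostname and port in core-site.xml (reusing the namenode2 name from your example; the client resolves it through its own /etc/hosts or DNS):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode2:8020</value>
</property>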
Regarding your third question, in the "Theory of Operations" section, items 5, 6, and 7 should be taken together.
#5 is an example of why it is useful and even necessary to have the same hostname on all networks: "external clients [can] find and talk to Datanodes on an external (client/server) network, while Datanodes find and talk to each other on an internal (intra-cluster) network." If the particular Datanode has a different hostname on the two networks, all parties become confused, because they may tell each other to talk to a different Datanode by name.
#6 actually gives you an escape hatch: "(Network interfaces excluded from use by Hadoop may allow other hostnames.)" The "one hostname for all interfaces" only applies to the interfaces used for the Stack. An interface used only by administrators, for example, not for submitting or processing any Hadoop jobs, could assign the server a different hostname on that network. But if you violate #5, you threaten the stability of your configuration.
#7 describes precisely the constraints within which you can use /etc/hosts files instead of DNS. It should perhaps mention, for example, that it would work to have the cluster members know their peers only via one network, and know their clients via another network. The clients would know the cluster members by the second network, with the SAME hostnames, and that's okay. As long as clients and servers have different /etc/hosts files, and you use hostnames rather than ip addresses in configurations, it can often work. And as #7 points out, if these constraints don't work for your needs, it isn't that hard to add DNS.
The example you gave is a possible result, yes. More broadly, all the servers would share /etc/hosts files in which all the servers are identified with an ip address on the intra-cluster network:
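(illustrative entries; only namenode2 comes from your example, the other names are made up)
192.168.1.2   namenode2
192.168.1.3   datanode1
192.168.1.4   datanode2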
while the clients would share /etc/hosts files in which those servers are identified with an ip address on the client/server access network:
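(the same made-up names, now with their addresses on the client/server access network)
10.0.9.42   namenode2
10.0.9.43   datanode1
10.0.9.44   datanode2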
And this is a good example of why you MUST use hostnames, not ip addresses, and why dfs.client.use.datanode.hostname must be set true: the clients and servers must be able to identify cluster servers to each other, and the ip addresses won't be mutually accessible across the two networks.
Created 03-02-2017 04:33 PM
Thank you for the explanation @Matt Foley. 🙂