# Routing for clusters

Here are notes I've found useful for managing routing on clusters of GNU/Linux boxes. See more sections on GNU/Linux.

Read some good free chapters on networking at http://www.networkinglinuxbook.com/

** Diagnosing routing problems **

Make sure your interface card is bound to an IP address (check with ``ifconfig -a''). You should be able to ping every node from every other node. Make sure ``tracepath'' or ``traceroute'' do not show more hops than necessary. Make sure you have direct routes to all other nodes.

Any of the following print route tables. The ``-n'' forms show numeric IP addresses; the others show hostnames:

=>
$ route -n
$ route
or
$ netstat -rn
$ netstat -r
<=

To diagnose problems with routing on a cluster, examine the following information on the mayor node and on a worker node.

Make sure that your installation does not run a firewall packet filter by default. If you have the ``ipchains'' or ``iptables'' packages installed, you can disable firewall support either by uninstalling those packages or by running ``lokkit'' and selecting "no firewall".

See if you have identified hosts explicitly.

=>
$ cat /etc/hosts
<=

If so, make sure "hosts" are resolved by "files" first:

=>
$ cat /etc/nsswitch.conf
<=

We use the lookup order "hosts: files nis dns".

If you are not using DNS, then ``/etc/resolv.conf'' must be empty. If you are using DNS, then specify the DNS nameserver by IP address, and specify default domains for completing host names. DHCP may recreate this file for you every time you reboot.

Check DNS lookups from other machines with ``host hostname'', ``nslookup hostname'', or ``dig +short hostname''.

Check data transfer rates and lags between cluster nodes by running ``ttcp'' on two nodes. Start the receiver with ``-r'' first. Here's the result of a good 100 Mb/s connection.

=>
vacn2> ttcp -r -s -n 16384
ttcp-r: buflen=8192, nbuf=16384, align=16384/0, port=5001  tcp
ttcp-r: socket
ttcp-r: accept from 111.111.212.227
ttcp-r: 134217728 bytes in 11.41 real seconds = 11488.08 KB/sec +++
ttcp-r: 92640 I/O calls, msec/call = 0.13, calls/sec = 8119.62
ttcp-r: 0.1user 2.4sys 0:11real 22% 0i+0d 0maxrss 0+2pf 0+0csw
<=

=>
vacn7> ttcp -t -s vacn2 -n 16384
ttcp-t: buflen=8192, nbuf=16384, align=16384/0, port=5001  tcp  -> vacn2
ttcp-t: socket
ttcp-t: connect
ttcp-t: 134217728 bytes in 11.40 real seconds = 11495.07 KB/sec +++
ttcp-t: 16384 I/O calls, msec/call = 0.71, calls/sec = 1436.88
ttcp-t: 0.0user 0.8sys 0:11real 7% 0i+0d 0maxrss 0+2pf 0+0csw
<=

** Internal and external addresses **

A special routing problem may occur if a Linux cluster mayor node has two ethernet interfaces --- one for an external address and one for an internal cluster IP address.

If the mayor's hostname corresponds to the external address, then the machine will mis-identify itself to other cluster nodes. Those nodes will route through the external interface, if they are able to route at all.

A conservative fix would use the internal address of the mayor node as the first-hop gateway for the external address of the mayor node.

=>
$ route add -net 146.27.172.254 netmask 255.255.255.255 gw 172.16.0.1
<=

where ``172.16.0.1'' is the internal IP address of the mayor node and ``146.27.172.254'' is the external address of the mayor node.

Even better, set the route on all cluster nodes to use the internal address of the mayor as the first-hop gateway for any unknown external address:

=>
$ route add -net 0.0.0.0 netmask 0.0.0.0 gw 172.16.0.1
<=

This fix makes the previous fix unnecessary.

Outside machines might not have a route to the cluster nodes either.
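
One way to confirm that a cluster node really picked up the internal address of the mayor as its default gateway is to pull the default entry out of the ``route -n'' table. The following is only a minimal sketch and not part of the original notes: the function name ``default_gateway'' and the sample table are made up for illustration, and the column layout (Destination, Gateway, Genmask, ...) is assumed to be that of the net-tools ``route'' command.

```shell
# Print the gateway of the default route from `route -n` style text on stdin.
# Assumes the net-tools column order: Destination Gateway Genmask ...
default_gateway() {
    awk '$1 == "0.0.0.0" && $3 == "0.0.0.0" { print $2; exit }'
}

# Hand-written sample table for illustration (not captured from a real node).
sample_table='Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
172.16.0.0      0.0.0.0         255.255.0.0     U     0      0        0 eth0
0.0.0.0         172.16.0.1      0.0.0.0         UG    0      0        0 eth0'

# Feed the sample through the filter; on a live node use: route -n | default_gateway
printf '%s\n' "$sample_table" | default_gateway
```

On a worker node, ``route -n | default_gateway'' should print ``172.16.0.1'' once the default route shown earlier is in place. An empty result would suggest that no default route is set.
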
To add a route to a PC that needs to see a cluster node, set the route to use the external address of the mayor node as the first-hop gateway to all cluster node addresses. The command below uses the Windows ``route'' syntax, which takes ``mask'' rather than ``netmask'':

=>
$ route add 172.16.0.0 mask 255.255.0.0 146.27.172.254
<=

where ``172.16.0.0'' with a mask of ``255.255.0.0'' specifies the address range of the cluster nodes, and ``146.27.172.254'' is the external address of the cluster mayor node.

Bill Harlan, 2002-2005
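
To see which addresses a network and mask pair such as ``172.16.0.0'' and ``255.255.0.0'' actually covers, it can help to test an address against the range by hand. What follows is a minimal sketch in POSIX shell, assuming plain dotted-quad addresses without leading zeros. The helper names ``ip_to_int'' and ``in_network'' are hypothetical and introduced here only for illustration.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
# Assumes four decimal fields with no leading zeros.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit status 0) if address $1 lies in network $2 with netmask $3.
in_network() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Try two internal-style addresses and the external address from the notes.
for addr in 172.16.0.7 172.16.200.3 146.27.172.254; do
    if in_network "$addr" 172.16.0.0 255.255.0.0; then
        echo "$addr: inside the cluster range"
    else
        echo "$addr: outside the cluster range"
    fi
done
```

An address passes the check when masking it gives the same result as masking the network address, which is the same comparison a routing table entry implies.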