Great exhaustive work! Putting the MAC address of the DR LVS as a permanent
entry in the server arp tables sounds like a good alternative to the patch.
You end up with an extra reply from each server, but that shouldn't be a
problem.
Of course the ARP code in the kernel really does need to be fixed.
Another reason you still might need the patch, and probably the newest one
that I haven't made public yet, is if you have two ethernet interfaces. The
2.2.* kernels up to at least 2.2.13 will allow multiple ethernet interfaces
to arp for each other, sending the traffic back and forth and wasting the
interfaces. I really wonder if this is an issue in some of the Linux vs. NT
benchmarks.
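
For anyone who wants to try the permanent-entry approach, here is a minimal sketch. It assumes the VIP and director MAC from Joe's mail below; the helper names pin_vip_cmd and vip_has_mac are made up for illustration. The first helper only prints the arp -s command (run the printed command as root on the client or router to apply it), and the second checks `arp -a` style output, assuming it looks like the listing quoted below.

```shell
#!/bin/sh
# Values taken from the quoted mail; adjust for your own network.
VIP="192.168.1.110"
DIRECTOR_MAC="00:A0:CC:55:7D:47"

# Hypothetical helper: print the command that pins the VIP to the
# director's MAC. Printing instead of running keeps this safe without root.
pin_vip_cmd() {
    printf 'arp -s %s %s\n' "$1" "$2"
}

# Hypothetical helper: read "arp -a" style output on stdin and succeed
# only if the entry for IP $1 carries MAC $2 (compared case-insensitively).
# Assumes the MAC is the 4th whitespace-separated field, as in
#   lvs.mack.net (192.168.1.110) at 00:A0:CC:55:7D:47 [ether] PERM on eth0
vip_has_mac() {
    ip="$1"
    mac=$(printf '%s' "$2" | tr 'a-f' 'A-F')
    found=$(grep -F "($ip)" | tr 'a-f' 'A-F' | awk '{print $4}')
    [ "$found" = "$mac" ]
}

pin_vip_cmd "$VIP" "$DIRECTOR_MAC"
```

After applying the printed command, something like `arp -a | vip_has_mac "$VIP" "$DIRECTOR_MAC" && echo ok` on the client should confirm that the VIP resolves to the director.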
sdw
Joseph Mack wrote:
> Dear Wensong, Lars, Steve,
>
> Summary:
>
> You can avoid the arp problem by hard wiring the MAC address
> of the director as the MAC address of the VIP in the router
>
> eg $ arp -f /etc/ethers
> or $ arp -s 192.168.1.110 MAC_ADDRESS
>
> The kernels on the realservers do not need to be patched.
>
> You can do VS-DR with the eth0:1 on the realserver (you don't
> need lo) or dummy0, tunl0 or eth1 for realservers with 2.0.36
> and 2.2.x kernels.
>
> You need tunl0 on VS-Tun with 2.2.x kernels
>
> You don't need a route entry for the VIP on the realservers.
>
> Text:
>
> Someone on the net posted a message saying that the dummy0
> device replied to arps and showed how they had tested it.
>
> I had previously shown that I could make an LVS with the
> dummy0 device on a 2.2.x realserver and had put this info in the
> HOWTO. I wanted to find out why one person could get the dummy0
> device to reply to arps and I could get a working LVS using
> the dummy0 device.
>
> First thing I did was to check that my realservers weren't
> patched by mistake and I didn't realise it. Running ipvsadm on
> the realservers showed that the ipvs modules weren't present.
> I also confirmed that the dummy0 device in 2.2.x kernels did
> arp as posted on the net.
>
> I found that the -arp/arp option for ifconfig had no
> effect on any devices back to 2.0.36 kernels with net-tools 1.42.
> If a device normally arps then -arp has no effect; if it normally doesn't
> arp, then "arp" doesn't turn it on.
>
> Here's the data
>
> Experiment 1:
>
> node running 2.0.36 kernel, libc5, gcc-2.7.2.3, net-tools 1.42.
> or 2.2.13 kernel, glibc, gcc-2.95, net-tools 1.52
> IP=192.168.1.1/24 with VIP=192.168.1.110/32. The VIP was on either
> dummy0 or eth1 (another NIC). The test was to see if the VIP was
> pingable from another (external) machine on the 192.168.1.0/24 network
> or pingable from the machine itself (ie internally from the console).
> (I assume I had a route add -host for the VIP although I didn't
> record this). The test was done with ifconfig using arp or -arp
> (the output of ifconfig -a didn't change)
>
>                 -----2.0.36-------   -----2.2.13------
> ping from       internal  external   internal  external
> VIP device
> dummy ARP          +         -          +         +
>       NOARP        +         -          +         +
>       down         -         -          -         -   (control)
>
> eth1 has cable connected to 192.168.1.0 network
> eth1 ARP + +
> NOARP + +
>
> eth1 cable to network removed
> eth1 ARP + -
> NOARP + -
> works as realserver in LVS - yes
>
> Conclusion: for 2.0.36 dummy0 doesn't arp, and eth1 does arp.
> the arp/-arp option to ifconfig has no effect. LVS works
> as expected since VIP need only be resolved as local on the
> realserver and does not need to be visible to the network.
>
> I confirmed that I got a working LVS with an unpatched
> 2.2.x realserver with the VIP on a dummy0 device. (I have a
> client, director and 2 realservers on the 192.168.1.0 net
> running VS-DR with the VIP also on this net. There are no
> routers.)
>
> So I confirmed my previous observations and those of the
> poster mentioned above. We both are right. What was the explanation?
>
> I found that I could get a working LVS using almost
> anything to hold the VIP on the realservers, including eth0:1
> and eth1 (another NIC in the realserver). These devices carrying
> the VIP were pingable from the client and I could get the
> corresponding MAC addresses in the arp table of the client
> if the director was not setup with a VIP. When I setup a
> working LVS this way, I found each time that the MAC
> address for the VIP in the client's arp cache was the
> director's MAC address. For some reason that I don't know,
> whenever the client does an arp request for the VIP, it gets
> the director's MAC address.
>
> Possible reasons for the MAC address of the director always
> being associated with the VIP in my LVS -
>
> 1. I configure the director first (I can't imagine the client
> asking for the MAC address of the VIP until it makes a request
> - this doesn't happen till after I've configured the
> realservers).
>
> 2. The director is 3 times faster (CPU speed) than the next
> machine in the LVS and it always replies to arp request first.
>
> 3. I was lucky.
>
> With my understanding of LVS as it was about 4 months ago
> - that an LVS will not work if the realservers reply to arp
> requests for the VIP - I (erroneously) concluded that since
> the LVS was working with dummy0 devices on the realservers,
> then the dummy0 device was not arp'ing.
>
> I now know that the dummy0 device for 2.2.x kernels replies to
> arps. (It's the same code as for the 2.0.36 kernels - does anyone
> know why it arps with 2.2.x kernels and not with 2.0.36 kernels?
> I thought maybe it was the net-tools, but I'm using 1.47 net-tools
> on the 2.2.x and 1.42 on the 2.0.36. I would expect a change in
> major version number if the arp behaviour was going to change.)
>
> Since my hypothesis above is not true (you can make a working
> VS-DR LVS with the realserver VIP on an arp'ing eth0:1 device)
> I decided that the relevant piece of information was
>
> - an LVS will work if the client always gets the MAC address
> of the director when it asks for the MAC address of the VIP
>
> This is easy - you tell the client (or the router) the
> MAC address of the VIP with arp -s or arp -f.
>
> here's my /etc/ethers
>
> lvs.mack.net 00:A0:CC:55:7D:47
>
> After installing the MAC address of the DIP (grumpy) as
> the MAC address of the VIP (lvs) in the arp table
> ($ arp -f /etc/ethers) I get
>
> client:/usr/src/temp/lvs# arp -a
> realserver1.mack.net (192.168.1.1) at 00:90:27:66:CE:EB [ether] on eth0
> lvs.mack.net (192.168.1.110) at 00:A0:CC:55:7D:47 [ether] PERM on eth0
> director.mack.net (192.168.1.10) at 00:A0:CC:55:7D:47 [ether] on eth0
>
> notice the "PERM" in the VIP entry on the client.
>
> removing the permanent entry
>
> client:/usr/src/temp/lvs# arp -d lvs.mack.net
> client:/usr/src/temp/lvs# arp -a
> realserver1.mack.net (192.168.1.1) at 00:90:27:66:CE:EB [ether] on eth0
> lvs.mack.net (192.168.1.110) at <incomplete> on eth0
> director.mack.net (192.168.1.10) at 00:A0:CC:55:7D:47 [ether] on eth0
>
> If I edited /etc/ethers changing the MAC address of lvs to
> anything else, the LVS did not work anymore. So the arp
> information is coming from /etc/ethers.
>
> Experiment 2:
>
> Using the /etc/ethers approach for setting the MAC address of the
> VIP I then set up an LVS with a pair of realservers serving telnet.
> All IPs are 192.168.1.x, all machines have a route to 192.168.1.0
> via eth0.
>
> 1. 2.0.36, libc5, gcc 2.7.2.3, net-tools 1.42
> 2. 2.2.13, glibc-2.1.2, gcc-2.95, net-tools 1.52
>
> with the following devices holding the VIP, tunl0, eth0:1, lo:0, dummy0,
> eth1. In each case there was no route entry for the VIP device and
> there was no cable connected to eth1. The table below
> shows whether the LVS worked. The VIP is installed with
>
> ifconfig $DEVICE 192.168.1.110 netmask $NETMASK broadcast $BROADCAST
>
> with $NETMASK="255.255.255.255" $BROADCAST="192.168.1.110"
> or $NETMASK="255.255.255.0" $BROADCAST="192.168.1.255"
>
> the results belong to 1 of 3 groups
>
> +     works fine
> -     doesn't work (at $ prompt on client get
>       "unable to connect to remote host. Protocol not available"
>       then client gets $ prompt back)
> hangs client hangs, realserver cannot access network anymore, have to
>       run rc.inet1 from console prompt on realserver to start network again.
>
> netmask of VIP=255.255.255.255 (normal LVS setup)
>
> LVS type    -----VS-Tun------   ----VS-DR------
> kernel      2.0.36   2.2.13     2.0.36   2.2.13
>
> VIP on
> tunl0         +        +          +        +
> eth0:1        +        -          +        +
> lo:0          +        -          +        +
> dummy0        +        -          +        +
> eth1          +        -          +        +
>
> netmask of VIP=255.255.255.0 (not normally used for LVS)
>
> VIP on
> tunl0         +        +          +        +
> eth0:1        +        -          +        +
> lo:0          +      hangs        +      hangs
> dummy0        +        -          +        +
> eth1          +        -          +        +
>
> It would seem that any device and any netmask can be used
> for the VIP on a 2.0.36 realserver.
>
> For 2.2.13 realserver,
> VS-Tun, VIP on a tunl0 device only, any netmask
>
> VS-DR, lo:0 device netmask /32 only
> all other devices any netmask
>
> For VS-DR then on solaris/DEC/HP/NT...
> LVS can probably use a regular eth0 device
> rather than an lo:0 device.
>
> Does anyone know why the lo:0 device has to be /32
> for VS-DR on kernel 2.2.13 while the other devices can be /24?
>
> Experiment 3: Effect of route entry for VIP and connection to VIP
> The VIP normally has an entry in the routing table eg
>
> $ route add -host 192.168.1.110 $DEVICE
>
> I found in Experiment 1 that a route entry was not necessary
> for the LVS to work when the realserver had the VIP on eth0:1. Since
> I had always used a route entry for the VIP I wanted to find out when
> it was needed. The same LVS was used as for Experiment 2. The variables
> were a route entry/no route entry for VIP/32 and for eth1 whether the
> NIC was connected to the network by a cable.
>
> kernel          ------2.0.36-------     -------2.2.13-------
> VIP             eth1  eth1_nc  eth0:1   eth1  eth1_nc  eth0:1
>
> no route
> LVS              +       +       +       +       +       +
> ping internal    -       -       -       +       +       +
> ping external    +       -       +       +       +       +
>
> route
> LVS              +       +       +       +       +       +
> ping internal    +       +       +       +       +       +
> ping external    +       -       +       +       +       +
>
> Conclusion 1: LVS works for both cases of route/no_route
> for the VIP on eth0:1 and eth1.
>
> Conclusion 2: having a network cable/no network cable
> does not affect whether the LVS works.
>
> Conclusion 3: for 2.0.36 kernels you can choose to have
> the VIP pingable from the outside world but not pingable
> by the local host by having it on eth1 with a cable
> connection (this seems weird and I can't think
> of any use for it just yet) or the reverse - pingable
> from the localhost but not by the external world
> by not having a cable connection.
>
> Joe
> --
> Joseph Mack mack@xxxxxxxxxxx
--
OptimaLogic - Finding Optimal Solutions
Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@xxxxxxx Stephen D. Williams Senior Consultant/Architect
http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
5Jan1999