On Tue, 28 Aug 2001, Joe Peters wrote:
> Thanks all for the feedback about our problem.
> In examining this a little further, I think I stumbled upon a fix but I
> don't fully understand why it works.
> Since we are using IP tunneling, we had set the interface tunl0 to the VIP.
> For example:
> ifconfig tunl0 <VIP> netmask 255.255.255.255 broadcast <VIP>
BTW, the examples in
http://www.linuxvirtualserver.org/VS-IPTunneling.html are not quite correct.
It is very dangerous to set the hidden flag after the VIP is configured.
The order should be:
1. start the device (ifconfig DEVICE 0.0.0.0 up), so that the
/proc/sys/net/ipv4/conf/DEVICE entry appears (if it is not already there)
2. set the device flags, "hidden" in our case
3. add the VIPs
BTW, this issue is explained in the LVS-HOWTO (3.2 The cure(s))
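A minimal sketch of that order for tunl0, assuming a kernel with the
"hidden" patch. The VIP value, the DEV variable and the DRY_RUN switch are
illustrative assumptions, not something from your setup. By default it only
prints the commands.

```shell
#!/bin/sh
# Sketch only: VIP, DEV and DRY_RUN are illustrative assumptions.
# Needs a kernel with the "hidden" patch; run as root on the real server.
VIP="${VIP:-192.0.2.10}"        # placeholder address, substitute your VIP
DEV="${DEV:-tunl0}"
CONF=/proc/sys/net/ipv4/conf

# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        sh -c "$*"
    fi
}

setup_hidden_vip() {
    # 1. start the device so that $CONF/$DEV appears
    run "ifconfig $DEV 0.0.0.0 up"
    # 2. set the hidden flag, globally and on the device
    run "echo 1 > $CONF/all/hidden"
    run "echo 1 > $CONF/$DEV/hidden"
    # 3. only now add the VIP
    run "ifconfig $DEV $VIP netmask 255.255.255.255 broadcast $VIP up"
}

setup_hidden_vip
```

Once the printed commands look right, DRY_RUN=0 runs them for real.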
> If instead we set tunl0 to another IP (i.e. the Real Server IP) and then
> create an alias of tunl0 and set that to the VIP, the problem seems fixed.
Are you sure conf/tunl0 exists when you change conf/tunl0/hidden?
> For example:
> ifconfig tunl0:0 <VIP> netmask 255.255.255.255 broadcast <VIP>
You can also add the VIP to device "lo", for example:
insmod ipip # if ipip is built as a module
ifconfig tunl0 up
cd /proc/sys/net/ipv4/conf
echo 1 > all/hidden
echo 1 > lo/hidden
ip addr add <VIP> dev lo
> The problem had been that we occasionally would get page not found errors on
> our clustered Web server (Red Hat 6.2, kernel 2.2.14 on the director and two
> real servers). We didn't see the error consistently, which had us looking at
> different browsers etc. The error seemed to only occur on IE.
Then the problem is somewhere else. Did you test with Netscape from
the same box where IE is used?
> In looking at the description of the arp problem, it struck me that we might
> be experiencing some flavor of that. I noticed we had overlooked in the arp
> problem description <http://www.linuxvirtualserver.org/arp.html> of first
> bringing up tunl0, then hiding the arp, and then configuring a tunl0 alias
> to respond to the VIP.
This is the right order: VIP after hidden.
> As I said, this seems to work, but I don't understand why the alias is
> necessary -- which makes me wonder whether I actually fixed something or
Hm, maybe the problem is somewhere else. With tcpdump it will be very
easy to find out the following:
1. which real server causes the problem
2. for which clients (one or more) the problem occurs
3. whether it is an ARP problem
4. whether it is a content problem
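Something like the following could be a starting point for those captures.
It is only a sketch: IFACE and VIP are placeholders, and where to capture
(a client-side box, the router segment, a real server) depends on your
network. The two helpers just print the tcpdump command lines.

```shell
#!/bin/sh
# Sketch only: IFACE and VIP are placeholders, not values from this thread.
IFACE="${IFACE:-eth0}"
VIP="${VIP:-192.0.2.10}"

# ARP traffic that mentions the VIP: shows which MAC address answers for it.
arp_capture_cmd() {
    echo "tcpdump -n -e -i $IFACE arp and host $VIP"
}

# HTTP traffic involving the VIP: shows which clients are affected; -e prints
# MAC addresses, which helps tell the boxes apart where replies are visible.
http_capture_cmd() {
    echo "tcpdump -n -e -i $IFACE host $VIP and tcp port 80"
}

arp_capture_cmd
http_capture_cmd
```

If more than one MAC address answers ARP requests for the VIP, it is an ARP
problem; if not, look at which server sends the bad replies and compare the
content.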
> Joe Peters
> UMass Boston Webmaster
Julian Anastasov <ja@xxxxxx>