
Re: Bizarre LVS oddity - one VIP handled fine, another gives ip_rt_bug errors

To: Julian Anastasov <ja@xxxxxx>
Subject: Re: Bizarre LVS oddity - one VIP handled fine, another gives ip_rt_bug errors
Cc: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: John Line <jml4@xxxxxxxxxxxxxx>
Date: Tue, 20 Dec 2005 16:01:53 +0000 (GMT)
Thank you for the comments, and sorry about the delay in replying - I left replying until I had time to try some further tests.

On Wed, 7 Dec 2005, Julian Anastasov wrote:

> Hello,
>
> On Wed, 7 Dec 2005, John Line wrote:
>
>> With debug level 12, for WPAD request:
>> IPVS: lookup/in TCP <clientaddr>:2486->131.111.8.68:80 hit
>> IPVS: Incoming packet: TCP <clientaddr>:2486->131.111.8.68:80
>> Enter: ip_vs_dr_xmit, net/ipv4/ipvs/ip_vs_xmit.c line 451
>> ip_rt_bug: <clientaddr> -> 131.111.8.68, eth1
>> Leave: ip_vs_dr_xmit, net/ipv4/ipvs/ip_vs_xmit.c line 487
>
> May be some rerouting happens in LOCAL_OUT where saddr=client IP,
> do you have any iptables rules in OUTPUT that mangle/modify something in
> packet to real server?

I knew that I had not set up *any* iptables output processing myself, but your comment prompted me to have another look at SLES9's /etc/sysconfig/SuSEfirewall2 configuration, and (indirectly) resulted in me testing some changes to that, one of which was setting FW_ROUTE="yes" (was "no").

To my surprise, WPAD connections through the LVS director then worked instead of getting the ip_rt_bug failures seen previously. Still very strange, as the web cache (Squid) traffic on port 8080 had always worked, without that setting - only the WPAD traffic on TCP port 80 encountered the problem.

I then looked at what the firewall setup did differently in the FW_ROUTE="yes" case, and tried the individual commands to pin down which were actually needed to make WPAD work.

My conclusion was that only one extra command was needed:

$IPTABLES  -A PREROUTING -j TOS -m state --state NEW,ESTABLISHED,RELATED \
    -t mangle -p tcp --dport 80 --set-tos Maximize-Throughput

With the original (FW_ROUTE="no") setting and that command added, WPAD now worked. I tried other --set-tos values (to check if it simply needed to be different from the default, zero case), but only that specific value worked.
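For anyone wanting to repeat that comparison, here is a minimal sketch of how the candidate rules could be generated. It is a dry run: it only prints the commands and does not touch the firewall. The helper name tos_rule is hypothetical, IPTABLES is assumed to default to plain "iptables", and the symbolic names are the ones the TOS target accepts for --set-tos.

```shell
#!/bin/sh
# Dry run: print (do not execute) one PREROUTING TOS rule per symbolic value.
# Assumption: IPTABLES names the iptables binary, as in the firewall script.
IPTABLES="${IPTABLES:-iptables}"

# tos_rule is a hypothetical helper; $1 is a symbolic TOS name.
tos_rule() {
    echo "$IPTABLES -t mangle -A PREROUTING -p tcp --dport 80" \
         "-m state --state NEW,ESTABLISHED,RELATED" \
         "-j TOS --set-tos $1"
}

# The five symbolic values understood by the TOS target.
for tos in Minimize-Delay Maximize-Throughput Maximize-Reliability \
           Minimize-Cost Normal-Service; do
    tos_rule "$tos"
done
```

Each printed line can be run by hand as root, one at a time, deleting the previous rule (the same command with -D instead of -A) before trying the next value.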

The firewall setup script (/sbin/SuSEfirewall2, run via other scripts and with many details changeable through the configuration file) appears to add appropriate TOS settings to the OUTPUT chain in the "mangle" table for a variety of standard ports, but only duplicates those settings on the PREROUTING chain when FW_ROUTE="yes" is set.
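One way to check which chains actually carry TOS rules on a given machine is to list the "mangle" table chain by chain. The following is only a sketch: the helper name show_mangle_chain is hypothetical, IPTABLES is assumed to default to plain "iptables", and the listing itself needs root.

```shell
#!/bin/sh
# Assumption: IPTABLES names the iptables binary, as in the firewall script.
IPTABLES="${IPTABLES:-iptables}"

# show_mangle_chain is a hypothetical helper; $1 is a chain name in the
# "mangle" table. -n skips DNS lookups, -v adds packet/byte counters.
show_mangle_chain() {
    "$IPTABLES" -t mangle -L "$1" -n -v
}
```

Running "show_mangle_chain OUTPUT" and "show_mangle_chain PREROUTING" as root, and looking for TOS targets in each listing, should show whether the two chains differ for a given FW_ROUTE setting.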

I suspect that also explains why WPAD requests from my home PC worked previously, rather than getting ip_rt_bug errors. The home PC is running SuSE 9.2 and that has the same --set-tos option setting for port 80 (on both the PREROUTING and OUTPUT chains - it has FW_ROUTE="yes"), so the test requests from there already had the "magic" option set. If the tests had run for longer, there would probably have been a few successful requests from other clients with similar configurations (but I didn't see any).

HOWEVER - that still leaves me very puzzled. SLES9/SuSE 9.2 do not set up anything like that for TCP port 8080, but the web cache traffic using that port seems to work fine. So why did most clients get their WPAD connections refused until that TOS value was added to the packets?

Although WPAD is now working using the new LVS directors, I am worried that a future kernel upgrade or other change may break WPAD and/or the web cache. If anyone can explain why adding that TOS setting fixed (or worked around) the problem - or indeed, if the underlying problem can now be identified - I would be very grateful!

                                John Line

--
John Line - web & news development, University of Cambridge Computing Service
