Thank you for the comments, and sorry about the delay in replying - I waited
until I had time to try some further tests.
On Wed, 7 Dec 2005, Julian Anastasov wrote:
> Hello,
> On Wed, 7 Dec 2005, John Line wrote:
>> With debug level 12, for WPAD request:
>> IPVS: lookup/in TCP <clientaddr>:2486->131.111.8.68:80 hit
>> IPVS: Incoming packet: TCP <clientaddr>:2486->131.111.8.68:80
>> Enter: ip_vs_dr_xmit, net/ipv4/ipvs/ip_vs_xmit.c line 451
>> ip_rt_bug: <clientaddr> -> 131.111.8.68, eth1
>> Leave: ip_vs_dr_xmit, net/ipv4/ipvs/ip_vs_xmit.c line 487
> May be some rerouting happens in LOCAL_OUT where saddr=client IP,
> do you have any iptables rules in OUTPUT that mangle/modify something in
> packet to real server?
I knew that I had not set up *any* iptables output processing myself, but
your comment prompted me to have another look at SLES9's
/etc/sysconfig/SuSEfirewall2 configuration, and (indirectly) resulted in
me testing some changes to that, one of which was setting FW_ROUTE="yes"
(was "no").
To my surprise, WPAD connections through the LVS director then worked
instead of getting the ip_rt_bug failures seen previously. Still very
strange, as the web cache (Squid) traffic on port 8080 had always worked,
without that setting - only the WPAD traffic on TCP port 80 encountered
the problem.
I then looked at what the firewall setup did differently in the
FW_ROUTE="yes" case, and tried the individual commands to pin down which
were actually needed to make WPAD work.
My conclusion was that only one extra command was needed:
$IPTABLES -A PREROUTING -j TOS -m state --state NEW,ESTABLISHED,RELATED \
-t mangle -p tcp --dport 80 --set-tos Maximize-Throughput
With the original (FW_ROUTE="no") setting and that command added, WPAD now
worked. I tried other --set-tos values (to check if it simply needed to be
different from the default, zero case), but only that specific value
worked.
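
A comparison like that can be scripted roughly as follows. This is only a
sketch, not the exact commands I ran: IPTABLES is set to echo so the rules are
printed rather than applied, the tos_rule helper name is made up for
illustration, and the list is the set of symbolic names that the iptables TOS
target accepts. When testing for real, each rule would need to be deleted
again before trying the next value.

```shell
#!/bin/sh
# Dry run: IPTABLES only echoes the command instead of changing any rules.
# Point it at the real iptables binary to apply the rules.
IPTABLES="echo iptables"

# Symbolic TOS names accepted by the iptables TOS target.
TOS_VALUES="Minimize-Delay Maximize-Throughput Maximize-Reliability Minimize-Cost Normal-Service"

# tos_rule is a hypothetical helper: it prints (or applies) the PREROUTING
# rule for one TOS value, using the same match options as the command above.
tos_rule() {
    $IPTABLES -t mangle -A PREROUTING -p tcp --dport 80 \
        -m state --state NEW,ESTABLISHED,RELATED \
        -j TOS --set-tos "$1"
}

# Show the rule that would be tried for each value in turn.
for tos in $TOS_VALUES; do
    tos_rule "$tos"
done
```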
The firewall setup script (/sbin/SuSEfirewall2, run via other scripts and
with many details changeable through the configuration file) appears to
add appropriate TOS settings to the OUTPUT chain in the "mangle" table for
a variety of standard ports, but only duplicates those settings on the
PREROUTING chain when FW_ROUTE="yes" is set.
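
To make that behaviour concrete, here is a minimal sketch of the logic as I
understand it. It is NOT the actual SuSEfirewall2 code: the add_tos_rules
helper and its arguments are hypothetical, the match options are simplified,
and IPTABLES is set to echo so nothing is changed on the running system.

```shell
#!/bin/sh
# Hypothetical sketch of the behaviour described above; not taken from
# /sbin/SuSEfirewall2. IPTABLES echoes instead of changing any rules.
IPTABLES="echo iptables"

# add_tos_rules FW_ROUTE PORT TOS  (helper name is made up for illustration)
add_tos_rules() {
    fw_route="$1"; port="$2"; tos="$3"
    # Locally generated traffic: the OUTPUT rule is always added.
    $IPTABLES -t mangle -A OUTPUT -p tcp --dport "$port" -j TOS --set-tos "$tos"
    # Incoming traffic: the PREROUTING rule is only added when FW_ROUTE="yes".
    if [ "$fw_route" = "yes" ]; then
        $IPTABLES -t mangle -A PREROUTING -p tcp --dport "$port" -j TOS --set-tos "$tos"
    fi
}

add_tos_rules no 80 Maximize-Throughput    # prints the OUTPUT rule only
add_tos_rules yes 80 Maximize-Throughput   # prints OUTPUT and PREROUTING rules
```

With FW_ROUTE="no", packets arriving at the director from clients never pass
through a rule that sets the TOS value, which matches what I saw.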
I suspect that also explains why WPAD requests from my home PC worked
previously, rather than getting ip_rt_bug errors. The home PC is running
SuSE 9.2 and that has the same --set-tos option setting for port 80 (on
both PREROUTING and OUTPUT chains - it has FW_ROUTE="yes"), so the test
requests from there already had the "magic" option set. If the tests had
run for longer, there would probably have been a few successful requests
from other clients with similar configurations (but I didn't see any).
HOWEVER - that still leaves me very puzzled. SLES9/SuSE 9.2 do not set up
anything like that for TCP port 8080, but the web cache traffic using that
port seems to work fine. So why did most clients get their WPAD
connections refused until that TOS value was added to the packets?
Although WPAD is now working using the new LVS directors, I am worried
that a future kernel upgrade or other change may break WPAD and/or the web
cache. If anyone can explain why adding that TOS setting fixed (or worked
around) the problem - or indeed, if the underlying problem can now be
identified - I would be very grateful!
John Line
--
John Line - web & news development, University of Cambridge Computing Service