Hi Kees / folks,
Kees, I read the message on GoT
(http://gathering.tweakers.net/forum/list_messages/793615//). Your
problem there (TCP reset from load balancer to client) seems to be the
same as my problem with LVS-DR with Squid. I've run 2.6.8 and 2.6.9,
both with this problem.
Regards, Janno.
>>> kees@xxxxxxxxxxxx 18-1-2005 11:56:36 >>>
Hello list,
I've been running LVS on a Slackware 10 / 2.6.8.1 box for a couple of
weeks now. The box isn't in full production yet because I have a small
problem with the wrr scheduling algorithm.
I check my realservers for their load and adjust the weight of the
servers about every 10 seconds. This works fine if there aren't too many
incoming connections, but as soon as there are a lot of connections I
get a connection refused and the client gets an ICMP port unreachable
packet.
To test it I ran apachebench on a client to generate requests. The load
on the realservers rises, so their weight drops; this is normal and
works correctly with two realservers and an apachebench concurrency of 2.
As soon as I set the apachebench concurrency to something like 20 I get
a connection refused after a couple of requests; running tcpdump shows
that the client gets the previously mentioned ICMP packet.
So I turned on debugging in the LVS module, and that showed me that
somehow the wrr algorithm couldn't find a destination:
anteros kernel: IPVS: lookup/in TCP client:42499->vip:80 not hit
anteros kernel: IPVS: lookup service: fwm 0 TCP vip:80 hit
anteros kernel: IPVS: ip_vs_wrr_schedule(): Scheduling...
anteros kernel: IPVS: Schedule: no dest found.
This happens even though there are realservers present and I can edit,
delete and create them. No amount of editing, deleting etc. will make
the service work again, though; I need to clear the table and recreate
the service.
# ipvsadm -l -n
IP Virtual Server version 1.2.0 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  vip:80 wrr
  -> 10.0.1.24:80                 Masq    900    0          69
  -> 10.0.1.34:80                 Masq    250    0          32
By doing some poor man's kernel hacking (i.e. adding some more debug
statements) I figured out that dest is set to NULL on line 191 of
ip_vs_wrr.c, because mark->cl == p.
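
To make the "walked the whole list and came back to the start" case
concrete, here is a toy Python model of a weighted round-robin loop. It
is only a sketch under assumptions, not the kernel code: WrrModel, cw
and set_weights are hypothetical names, the stop-after-one-pass check is
modelled on the mark->cl == p condition above, and the idea that the
current weight threshold is left stale when weights are lowered is a
guess, not a confirmed diagnosis.

```python
from functools import reduce
from math import gcd


class WrrModel:
    """Toy weighted round-robin scheduler (hypothetical, not ip_vs_wrr.c)."""

    def __init__(self, weights):
        self.weights = list(weights)
        self.i = -1   # index of the last server chosen
        self.cw = 0   # current weight threshold

    def set_weights(self, weights):
        # Assumption for illustration: changing the weights leaves the
        # threshold cw and the position i untouched.
        self.weights = list(weights)

    def schedule(self):
        n = len(self.weights)
        if n == 0:
            return None
        mw = max(self.weights)           # maximum weight
        di = reduce(gcd, self.weights)   # step size: gcd of all weights
        start = self.i
        while True:
            self.i = (self.i + 1) % n
            if self.i == 0:
                # Wrapped around the list: lower the threshold one step.
                self.cw -= di
                if self.cw <= 0:
                    self.cw = mw
                    if self.cw == 0:
                        return None      # every weight is zero
            if self.weights[self.i] >= self.cw:
                return self.i
            if self.i == start:
                # Back where this call started without a match.
                return None


if __name__ == "__main__":
    model = WrrModel([10000, 10000])
    print(model.schedule(), model.schedule())
    model.set_weights([900, 250])   # weights drop, cw stays at 10000
    print(model.schedule())
```

In this model a call fails whenever the threshold is above every current
weight, because one pass over the list lowers it by only a single step.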
So somehow my servers are flagged as overloaded (well, they are at that
point: they have quite some load, but they are still responding). The
strange thing is that the servers are still flagged as overloaded long
after they are back to a load average of 0, when they are not overloaded
at all, so every new request fails.
That brings me to my questions: how do I clear that overloaded flag, or
how can I prevent it from ever being set?
Secondly, what is the best way to adjust the weights of the realservers?
Currently I do a check every 10 seconds and adjust the weight, or delete
the server if it is not responding (and add it back when it responds
again). Somehow the overload flag is set when the load is rising rapidly
(at 0 load the realservers have a weight of 10k, at a load of 1 that has
dropped to 1k, at a load of 2 it is 500, at 3 it is 250, and above 4 the
server gets a weight of 1).
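
For reference, the load-to-weight mapping described above can be
sketched in Python as follows. The function names are hypothetical, and
the behaviour between the quoted points (for example a load of 0.5 or
3.5) is an assumption: the sketch treats the mapping as a step function.
The second helper only builds an ipvsadm edit command for a masqueraded
TCP realserver; it does not run it.

```python
def load_to_weight(load):
    """Map a load average to a weight, using the figures quoted above.

    Values between the quoted points are an assumption (step function).
    """
    if load < 1:
        return 10000
    if load < 2:
        return 1000
    if load < 3:
        return 500
    if load <= 4:
        return 250
    return 1


def ipvsadm_edit_command(vip, rip, weight):
    """Build (not execute) an ipvsadm command that edits a realserver.

    -e edits a realserver, -t names the TCP virtual service, -r the
    realserver, -m selects masquerading and -w sets the weight.
    """
    return ["ipvsadm", "-e", "-t", vip, "-r", rip, "-m", "-w", str(weight)]


if __name__ == "__main__":
    # "vip:80" is the placeholder used in the listing above.
    print(" ".join(ipvsadm_edit_command("vip:80", "10.0.1.24:80",
                                        load_to_weight(2.3))))
```

The resulting list could be passed to subprocess.run() by whatever
script performs the 10-second check.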
The overload flag is only set if I change the weights; if they are
static (at, say, 1000) all goes well.
Hope someone can help me out,
Kees Hoekzema