Overload flag is not resetting

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Overload flag is not resetting
From: Kees Hoekzema <kees@xxxxxxxxxxxx>
Date: Tue, 18 Jan 2005 11:56:36 +0100
Hello list,

I have been running LVS on a Slackware 10 / 2.6.8.1 box for a couple of
weeks now. The box isn't in full production yet because I have a small
problem with the wrr scheduling algorithm.

I check my realservers for their load and adjust the weight of the servers
roughly every 10 seconds. This works fine if there aren't too many incoming
connections, but as soon as there are a lot of connections I get a
connection refused and the client gets an ICMP port unreachable packet.
To test it I ran apachebench on a client to generate requests. The load on
the realservers rises, so their weight drops; this is normal, and it works
correctly with two realservers and an apachebench concurrency of 2.
As soon as I set the apachebench concurrency to something like 20 I get a
connection refused after a couple of requests, and tcpdump shows that the
client gets the previously mentioned ICMP packet.

So I turned on debugging in the LVS module, and that showed me that somehow
the wrr algorithm couldn't find a destination:

anteros kernel: IPVS: lookup/in TCP client:42499->vip:80 not hit
anteros kernel: IPVS: lookup service: fwm 0 TCP vip:80 hit
anteros kernel: IPVS: ip_vs_wrr_schedule(): Scheduling...
anteros kernel: IPVS: Schedule: no dest found.

This happens although there are realservers present, and I can edit, delete
and create them. However, no amount of editing/deleting etc. will make the
service work again; I need to clear the table and recreate the service.

# ipvsadm -l -n
IP Virtual Server version 1.2.0 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  vip:80 wrr
  -> 10.0.1.24:80                 Masq    900    0          69
  -> 10.0.1.34:80                 Masq    250    0          32

By doing some poor-man's kernel hacking (i.e. adding some more debug
statements) I figured out that dest is set to NULL on line 191 of
ip_vs_wrr.c, because mark->cl == p.
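
For readers unfamiliar with wrr, here is a minimal Python sketch of the
weighted round-robin selection loop in its textbook form. It is not the
kernel's ip_vs_wrr.c and it ignores the overload flag entirely; the names
WrrState and wrr_pick are made up for illustration. It only shows how the
current-weight threshold walks down from the maximum weight in steps of the
gcd of all weights, using the two weights from the ipvsadm listing.

```python
from functools import reduce
from math import gcd


class WrrState:
    """Hypothetical scheduler state: last picked index and current threshold."""

    def __init__(self):
        self.i = -1   # index of the server picked last (-1: none yet)
        self.cw = 0   # current weight threshold


def wrr_pick(weights, state):
    """Return the index of the next server, or None if none is usable."""
    if not weights or max(weights) == 0:
        return None
    step = reduce(gcd, weights)   # threshold decrement per pass
    max_w = max(weights)
    while True:
        state.i = (state.i + 1) % len(weights)
        if state.i == 0:
            # Wrapped around the list: lower the threshold, reset at zero.
            state.cw -= step
            if state.cw <= 0:
                state.cw = max_w
        if weights[state.i] >= state.cw:
            return state.i


if __name__ == "__main__":
    state = WrrState()
    picks = [wrr_pick([900, 250], state) for _ in range(23)]
    print(picks.count(0), picks.count(1))
```

Over one full cycle of 23 picks the two servers are chosen in the ratio
900:250, i.e. 18 times and 5 times.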

So somehow my servers are flagged as overloaded (well, at that point they
do have quite some load, but they are still responding). The strange thing
is that the servers stay flagged as overloaded long after they are back to
a load average of 0, when they are not overloaded at all, so every new
request fails.

That brings me to my questions: how do I clear that overloaded flag, or how
can I prevent it from ever being set?

Secondly, what is the best way to adjust the weights of the realservers?
Currently I do a check every 10 seconds and adjust the weight, or delete the
server if it is not responding (and add it again when it responds). Somehow
the overload flag is set when the load is rising rapidly (at 0 load the
realservers have a weight of 10k, at a load of 1 that has dropped to 1k,
at a load of 2 it is 500, at 3 it is 250, and above 4 the server gets a
weight of 1).
The overload flag is only set if I change the weights; if they are static
(at, say, 1000) all goes well.
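
The load-to-weight mapping described above could be sketched roughly as
follows. This is an illustration, not the actual check script: the function
names are hypothetical, and the exact band boundaries (for example, that a
load between 3 and 4 still maps to 250) are an assumption, since only the
values at whole load numbers are given. The command line uses the standard
ipvsadm options -e (edit realserver), -t, -r, -m (masquerading) and -w.

```python
def weight_for_load(load):
    """Map a load average to a wrr weight (band edges are assumed)."""
    if load < 1:
        return 10000
    if load < 2:
        return 1000
    if load < 3:
        return 500
    if load <= 4:
        return 250
    return 1


def edit_command(vip, realserver, weight):
    """Build the ipvsadm argument list that sets a realserver's weight."""
    return ["ipvsadm", "-e", "-t", vip, "-r", realserver,
            "-m", "-w", str(weight)]


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.run() to apply it.
    print(" ".join(edit_command("vip:80", "10.0.1.24:80",
                                weight_for_load(0.3))))
```

Run every 10 seconds per realserver, this reproduces the weight changes
described above.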

Hope someone can help me out,
Kees Hoekzema

