Re: Taking out realserver for maintenance

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: Taking out realserver for maintenance
From: Horms <horms@xxxxxxxxxxxx>
Date: Tue, 16 Aug 2005 18:14:00 +0900
On Tue, Aug 16, 2005 at 09:51:14AM +0100, Jan Bruvoll wrote:
> Hi,
> 
> thanks for a helpful and thorough reply. Let me just check if I
> understand this correctly (using our current set-up):
> 
>  - our original persistence was set to 360 seconds, intended to be
> longer than the expected recurring request interval of our application,
> which checks with our server cluster every 300 seconds ("ish")
>  - if I keep the original persistence, any client already known to the
> cluster requesting data from the cluster again -before- the 360-second
> timeout for that particular client "id" expires, will trigger a
> persistence counter reset for this client

Yes. The request could be opening a fresh connection. Or it could be
the end-user sending (for any LVS forwarding mechanism) or receiving (in
the case of LVS-NAT) data for an existing connection.

You can see the persistence templates, and the progress of their
timeouts, in amongst other connection entries if you run ipvsadm -Lcn.
The persistence entries are the ones with a client port of 0.
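
As a rough illustration, a persistence template and an associated
connection might look something like this (the addresses here are made
up; the template is the NONE entry with a client port of 0):

  # ipvsadm -Lcn
  IPVS connection entries
  pro expire state       source             virtual          destination
  TCP 04:32  NONE        10.1.1.25:0        10.0.0.1:80      192.168.1.2:80
  TCP 14:58  ESTABLISHED 10.1.1.25:3201     10.0.0.1:80      192.168.1.2:80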

> However,
>  - if the weight for a particular real-server is set to 0, no -new-
> clients should be allocated to this realserver

Yes, where new clients are ones without a persistence template entry.

>  - any clients not "coming back" within the 360 seconds should be
> removed from the persistence map, and any new requests from same clients
> after being removed should be allocated to one of the other realservers

Yes. Though going away basically means sending no packets for existing
connections, and making no attempt to open a new connection.

> Am I right so far?

Yes

> So, my questions are these (not necessarily directly related to each other):
>  - since writing, I tried resetting the persistence manually to 5
> seconds, in order to try and flush the persistence "map" quicker. This
> hasn't had any perceivable effect, as the number of connections to this
> server as I am writing now, still reflects the original weight (some 18
> hours after setting the weight to 0) - but maybe I am being too impatient
> (not meaning for that to sound the wrong way!)

Ok, obviously waiting 18 hours for connections to flush is impractical.

What you are seeing is probably the result of either a bug in LVS,
or very enthusiastic end-users (I know they are just programs, but hey)
that send packets to or receive packets from the virtual service (and
thus the real servers) at least once every 5 seconds.

Some examination of what is happening using ipvsadm -Lcn should
shed some light on this: watch -n 1 ipvsadm -Lcn
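
If the output is noisy, you can narrow it down to just the quiesced
real server, for example (substituting your real server's address for
the made-up 192.168.1.2):

  watch -n 1 'ipvsadm -Lcn | grep 192.168.1.2'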

>  - how can it be that the number of active connections actually
> increases on the realserver whose weight is 0?

This is quite possible if a known end-user (i.e. one that has a
persistence template because it has sent or received data within the
last 5 seconds, your timeout) opens a second connection, and no
connections are closed. I'm not sure if this is actually what is
happening; again, ipvsadm -Lcn may help to show what is going on.

> I value your assistance highly, as this is a bit of a problem for us.

No problem.

What you are seeing is a bit strange, and hopefully you can diagnose
exactly what is going on. But please consider setting 
/proc/sys/net/ipv4/vs/expire_quiescent_template to 1, as
it should give behaviour that better suits your needs.
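
For example, a minimal maintenance sequence might look like this (the
virtual and real server addresses here are made up; substitute your
own):

  # expire persistence templates pointing at quiesced real servers
  echo 1 > /proc/sys/net/ipv4/vs/expire_quiescent_template

  # quiesce the real server: no new connections via the scheduler
  ipvsadm -e -t 10.0.0.1:80 -r 192.168.1.2:80 -w 0

  # watch the remaining connections drain away
  watch -n 1 'ipvsadm -Lcn | grep 192.168.1.2'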

> Thanks for helping.
> 
> Best regards
> Jan
> 
> Horms wrote:
> 
> >On Mon, Aug 15, 2005 at 10:53:23AM -0700, Joseph Mack NA3T wrote:
> >
> >>On Mon, 15 Aug 2005, Jan Bruvoll wrote:
> >>
> >>>this server have all gone away is expected. However, since I issued the
> >>>command the number of active connections has actually increased,
> >>
> >>:-(
> >>
> >>You'll have to wait for Horms I'm afraid.
> >
> >Hi,
> >
> >This is a fairly simple problem that is unfortunately difficult to
> >explain. Let me try:
> >
> >When you set a real server to be quiescent (weight=0), this means that
> >no new connections will be allocated to that real server using the
> >scheduler. However, if you have persistence in effect (which you do),
> >and a new connection is received from an end-user that recently made a
> >connection, then that connection will be allocated to the same
> >real server as the previous connection. The trick is, this process
> >by-passes the scheduler, and thus by-passes quiescence.
> >
> >So, for a persistent service a new connection is processed a bit like this:
> >
> >  if (same end-user as a recent connection)
> >      use the real-server of that connection
> >  else
> >      choose a non-quiescent real-server using the scheduler
> >
> >Obviously this is a bit of a problem, for the reason you describe in
> >your email. In implementation terms the problem is that when
> >a connection for a persistent service is scheduled, a persistence
> >template is created, with a timeout of the persistence timeout.
> >This template is then used to select the real-server for
> >subsequent connections from the same end-user. It stays
> >in effect until its timeout expires. And its timeout is
> >renewed every time a packet is received for an associated
> >connection. Which means, in the case of quiescence, that as long
> >as end-users that have active persistence templates keep
> >connecting or sending packets within the persistence timeout,
> >the real-server will keep having connections.
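> >
> >You can actually watch this renewal happen. As a rough illustration
> >(the grep pattern just picks out the client-port-0 template entries):
> >
> >  watch -n 1 'ipvsadm -Lcn | grep ":0 "'
> >
> >The expire column of a template entry jumps back up each time the
> >end-user sends a packet for an associated connection.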
> >
> >The solution to this is quite simple. The patch at the URL below, which
> >has been included in recent kernel versions, adds
> >expire_quiescent_template to proc. By default it is set to 0, which
> >gives the behaviour described above, the historical behaviour of LVS
> >(which I might add can be desirable in some situations). However, if you
> >set it to 1, then connection templates associated with a quiesced
> >real-server are expired at lookup time. Which, in a nutshell, means
> >that the "if" condition above will always fall through to the "else"
> >clause, and thus quiescence is not by-passed.
> >
> >To effect this change just run the following as root
> >
> >echo 1 > /proc/sys/net/ipv4/vs/expire_quiescent_template
> >
> >The change affects new connections immediately.
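> >
> >You can confirm the current value with:
> >
> >  cat /proc/sys/net/ipv4/vs/expire_quiescent_template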
> >
> >Or, on systems that have sysctl, add the following line to
> >/etc/sysctl.conf:
> >
> >  net.ipv4.vs.expire_quiescent_template = 1
> >
> >and then run sysctl -p.
> >
> >This will also take effect immediately, and has the advantage that
> >the change will be persistent across reboots.
> >
> >http://archive.linuxvirtualserver.org/html/lvs-users/2004-02/msg00224.html
> >
>

-- 
Horms
