Re: 'Preference' instead 'persistence'?

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: 'Preference' instead 'persistence'?
From: Roberto Nibali <ratz@xxxxxxxxxxxx>
Date: Wed, 09 Oct 2002 00:44:02 +0200
And now _I_ am not sure if I understand this...

One day one of our brains will flourish and pure wisdom will spread over the world, taking it over.

Sure I would, if any of the following is true:
1) the client's IP address is new
2) the realserver that would be picked otherwise (e.g. when using true persistence) is overloaded compared to the others
3) the realserver that would be picked otherwise is down.

Case 1 is already handled with the normal persistence setting and WLC. Case 3 can be handled by enabling the sysctl setting in the new LVS.

No, case 3 can only be solved if you have an intelligent user space daemon running which takes the service out. Otherwise the load balancer doesn't really know about the state of an RS.
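
Roughly speaking, such a user space daemon boils down to something like the sketch below. The addresses are made-up placeholders, and it assumes the sysctl referred to above is expire_nodest_conn (which lets IPVS expire connection entries whose destination has been removed); none of this is the poster's actual setup.

VIP=192.168.1.100         # placeholder virtual IP
RS1=10.0.0.1              # placeholder realserver

# let IPVS expire connection entries whose realserver has disappeared
echo 1 > /proc/sys/net/ipv4/vs/expire_nodest_conn

# primitive health check: if the RS no longer answers on port 80,
# remove it from the virtual service so nothing new is sent there
if ! wget -q -T 5 -O /dev/null "http://$RS1/"; then
    ipvsadm -d -t $VIP:80 -r $RS1:80
fi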

Case 2 remains then. Suppose you have 9 clients and 3 realservers (all clients have a different IP).

For me case 2 is still covered by the WLC scheduler, but I might just be dumb.

In case of true persistency each realserver will serve 3 of these clients, ad infinitum. In case of non-persistency the HTTP requests are spread more or

This is not a realistic case. New clients will come, and if RS1 is overloaded with Client1, Client4 and Client7, then the new client is _not_ going to RS1 but to RS2.

less randomly over the realservers. This is extremely bad for session state, since in the end each realserver has to track (and fetch) session data from all 9 clients, instead of having a balanced set of 3 clients per realserver.

Yes, that's why you have persistence. That was what I tried to explain with my ASCII sketch.
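
For reference, the kind of setup being discussed here looks roughly as follows with ipvsadm; the VIP, the realserver addresses, the NAT forwarding and the 5-minute persistence timeout are illustrative placeholders, not anybody's actual configuration.

VIP=192.168.1.100         # placeholder virtual IP

# one virtual HTTP service, wlc scheduling, persistence templates
# that live for 300 seconds per client IP
ipvsadm -A -t $VIP:80 -s wlc -p 300

# three realservers behind it, NAT forwarding, equal weights
ipvsadm -a -t $VIP:80 -r 10.0.0.1:80 -m -w 1
ipvsadm -a -t $VIP:80 -r 10.0.0.2:80 -m -w 1
ipvsadm -a -t $VIP:80 -r 10.0.0.3:80 -m -w 1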

As long as this status quo holds, persistency works OK and performs better than non-persistent connections.

Ok.

Persistency, however, is not a 'hard' requirement as it is for e.g. HTTPS; it's only a 'soft' requirement because it performs better.

Ok. You call it a soft requirement, or non-true persistency, if you set persistence where you wouldn't really need it but where you gain from it by not having to load session IDs for new requests, right?

Thus, if client 1 turns out to be a masquerading gateway for a NAT-ed network, it will drive realserver 1's load much higher than that of realservers 2 and 3 if we use the 'normal' persistence.

Yes, and if this load is high, response times go up, resulting in longer active connections. Result: every once in a while the wlc scheduler will not choose this RS.
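
This effect can be watched on the director itself; the command below only lists the current IPVS table and is not specific to the poster's setup. With wlc, the realserver whose ActiveConn count is high relative to its weight loses out when a genuinely new (non-templated) connection has to be placed.

# refresh the IPVS table every 5 seconds and watch the
# ActiveConn / InActConn columns per realserver
watch -n 5 'ipvsadm -L -n'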

Therefore it would be nice if LVS could detect that we're only using 'soft persistency' and reassign two clients to the other realservers. Now RS1 has

Ohhhhhhhhhhhhhhhhhh. Now I understand. You mean that the load balancer should realize that the active connections are blasting away the CPU power on one RS and that it should be fair and take those bloody suckers away and put some of them on another RS? How do you plan on maintaining TCP state information across the other nodes?

Did I get it this time? Please, please, please?

the one NAT-ed network, and RS2 and RS3 both serve four other clients, resulting in both a good balance (almost as good as non-persistency) and ALSO a strong preference for a single realserver per IP to avoid hitting the penalty for re-fetching session data.

But in your case established connections would need to reconnect, wouldn't they?

Another advantage is that changing the weight to 0 for maintenance will almost instantly reassign the clients, because it's not a technical problem.

Only new clients. Old clients will stay on the quiesced server.

Since the reassignment is done only once per IP, the performance hit is rather minimal.

Wait a minute, what is being reassigned? Old connections are assigned, according to the existing template, to their appropriate RS. New connections that do not have an entry in the table get assigned to a new RS.
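
In ipvsadm terms, the maintenance scenario above is roughly the following; the addresses are placeholders and this is only a sketch of the quiescing procedure, not the poster's actual commands.

VIP=192.168.1.100         # placeholder virtual IP

# quiesce RS1: weight 0 means no new clients are scheduled to it, but
# existing connections and persistence templates still point at it
ipvsadm -e -t $VIP:80 -r 10.0.0.1:80 -m -w 0

# the leftover connections/templates can be inspected here; the RS is
# only really idle once these entries have expired
ipvsadm -L -n -c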

Actually it only confused me a bit :/

It describes a possible example of a few incoming varying srcIPs and their distribution among 3 RS. But never mind.

That's what we're using now and it works fine during normal operation, but when pulling a machine down for maintenance it's a PITA: it takes at least half an hour before all clients are gone from the web sites, and often longer.

Aehm, so clearly people stay on your website for at least (30 minutes - persistency_timeout). This is tough luck. You could of course use the described procedure with the sysctl to take a server out, but then you lose them. And that's your point, I think. You would like to (this time) reassign them so you can take the RS out more quickly, right?

If I'm right then let me tell you that this is not possible :).

No, you misunderstood me. It's working fine over longer periods of time, but not as well as non-persistency works, and especially for maintenance it's sheer overkill for us.

Ok. Independently of the issue that your clients seem to like your website enough to stay there almost half an hour, what are your persistence timeout settings?
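
The persistence timeout being asked about is the per-service -p value; checking and changing it looks roughly like this. The address and the 5-minute value are placeholders, not the actual settings being discussed.

# the current value shows up as "persistent <seconds>" on the
# virtual service line of the table listing
ipvsadm -L -n

# lowering it, e.g. to 5 minutes, shortens the time a quiesced RS
# keeps receiving its returning clients
ipvsadm -E -t 192.168.1.100:80 -s wlc -p 300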

Three realservers, and currently about 13 GB/day of traffic, but until 2 months ago we had another site that pulled 20 GB/day alone, and we expect it to return shortly. So I'd better anticipate that now while I still have the time :-)

Oh, I thought you were dealing with high-volume traffic. If I understand this correctly, you have somewhere between 1.2 and 1.5 Mbit/s of traffic per realserver:

ratz@laphish:~ > echo "20*1024*8/(12*3600)/3" | bc -l
1.26419753086419753086
ratz@laphish:~ >

The current 13 GB are made almost entirely between 11:00 am and 11:00 pm, with the peak in the ~4 hours after dinner. The realservers can handle the load; it's the database that's getting into trouble, and moving the session state storage there doesn't sound like a good idea...
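
For comparison, the same back-of-the-envelope calculation applied to the current 13 GB/day figure, over the same 12-hour window and 3 realservers, comes out even lower:

# 13 GB/day over 12 hours, split across 3 realservers, in Mbit/s
echo "13*1024*8/(12*3600)/3" | bc -l    # prints roughly 0.82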

Ok. I think I can assume that you have appropriate hardware for the DB.

Best regards,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc


