
Re: max connections to load balancer

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: max connections to load balancer
From: Roberto Nibali <ratz@xxxxxx>
Date: Thu, 19 Jun 2003 13:50:00 +0200
Hello,

Which count? ipvsadm -L -n or ab? What exactly do both tell you at the point of saturation? What kind of NICs do you use? Could you check their speed setting with either mii-tool or ethtool, please?


        Supports auto-negotiation: Yes
        Speed: 100Mb/s
        Duplex: Full

Ok, so this is fine.

The count was from ab; however, I checked while running the same tests and it
was 500 per real server.

So it's an RS limitation. Maybe I didn't read your email carefully enough, but what is the average time to fetch _one_ page and how _big_ is it in bytes? Also, what is the load on the RS during the test?

and I was getting this from ipvsadm

TCP  192.168.55.5:http rr
-> 192.168.55.1:http Route 1 500 14699
-> 192.168.55.3:http Route 1 500 15404

Eek, your RS are ill. Sockets are not closed anymore there. I'm very much interested in the page you're trying to fetch now.
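
For readers who want to spot this pattern automatically, here is a minimal sketch that parses real-server rows in the format shown above and flags servers whose inactive count dwarfs the active count. It assumes the usual ipvsadm column order (Forward, Weight, ActiveConn, InActConn); the function names and the ratio threshold are hypothetical and not part of ipvsadm or LVS.

```python
import re

# Matches one real-server row of `ipvsadm -L -n` style output, e.g.
#   -> 192.168.55.1:http Route 1 500 14699
# Assumed columns: address, forward method, weight, ActiveConn, InActConn.
ROW = re.compile(r"->\s+(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)")


def parse_real_servers(text):
    """Return one dict per real-server row found in text."""
    rows = []
    for match in ROW.finditer(text):
        addr, forward, weight, active, inactive = match.groups()
        rows.append({
            "addr": addr,
            "forward": forward,
            "weight": int(weight),
            "active": int(active),
            "inactive": int(inactive),
        })
    return rows


def suspicious(rows, ratio=10):
    """Addresses whose inactive count exceeds ratio times the active count.

    The ratio is a hypothetical rule of thumb, not an LVS constant.
    """
    return [r["addr"] for r in rows
            if r["inactive"] > ratio * max(r["active"], 1)]


sample = """TCP  192.168.55.5:http rr
-> 192.168.55.1:http Route 1 500 14699
-> 192.168.55.3:http Route 1 500 15404"""

rows = parse_real_servers(sample)
print(suspicious(rows))
```

With the numbers quoted above, both real servers are flagged, which matches the observation that sockets pile up on the RS side.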

I also tried to patch httperf as per their suggestion but when I did
this it just sat and ate cpu time.

;).


Should the patch work or should I be doing something else?

I put a smiley there as I was never able to get meaningful output from httperf myself either, most probably because of my lack of understanding of how it works.

test ----> LVS ----> RS
test --------------> RS
Yea, that is what I have been doing, sorry for not explaining it
clearly.

Thanks for the confirmation; to me, this and the inactive counters from above indicate that your RS application does not close the sockets properly. We'll have to do some more testing then, once I get some more information about the page and its size.

What is also funny is that you have a limit of exactly 500. Bugs or limitations normally don't tend to show up with such an even number ;).

How would I be able to check if this is the case and how would I be able
to solve it?

You could run testlvs [1] but I can derive some numbers as soon as I know the page size and the RTT for one GET.
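
As a rough illustration of the kind of numbers that can be derived from those two inputs, here is a back-of-the-envelope sketch. It is only one possible reading, not necessarily the derivation meant above: it bounds the request rate by the 100Mb/s link and estimates connections in flight as rate times RTT. The page size and RTT values are hypothetical placeholders, and protocol overhead is ignored.

```python
def max_requests_per_second(link_bits_per_s, page_bytes):
    """Upper bound on requests/s if the link is the only bottleneck.

    Ignores TCP/IP and HTTP header overhead, so the real figure is lower.
    """
    return link_bits_per_s / (8 * page_bytes)


def connections_in_flight(requests_per_s, rtt_seconds):
    """Average number of concurrent connections at a given rate (rate * time)."""
    return requests_per_s * rtt_seconds


LINK_BITS_PER_S = 100 * 10**6  # 100Mb/s, from the NIC settings quoted earlier
PAGE_BYTES = 10000             # hypothetical page size
RTT_SECONDS = 0.005            # hypothetical time for one GET

rate = max_requests_per_second(LINK_BITS_PER_S, PAGE_BYTES)
print("link-bound requests/s:", rate)
print("connections in flight:", connections_in_flight(rate, RTT_SECONDS))
```

Plugging in the measured page size and RTT would show whether 500 concurrent connections is anywhere near what the link itself allows.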

Well it was able to take more than apache and I tried setting that to
take the most connections it could.  Do you have any better suggestions
on software I should be using client side, or even another protocol?

I know, we sometimes use thttpd for static content too because it can handle more connections than apache, but it should be able to get a lot more. I wonder if you set a connection limit somewhere, something along the lines of the throttling part of thttpd. Also check your LINGER_TIME and LISTEN_BACKLOG settings.

Well, it is all on its own switch so I doubt that is the issue.

Yes.

I know it should be able to handle more but it appears there is
something wrong with my tests.

Or the app.

However I do get this from both of the app servers

TCP: time wait bucket table overflow

Too many connections in TW for FW2 state and too little memory to keep sockets. Very interesting!! From your 15k TW state entries and the 128MB RAM assumption it would still not make too much sense, because a socket doesn't need 8500 bytes. I think after the next email I will have some tunables for you :)
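
The 8500-byte figure is simply the assumed RAM divided by the number of TW entries. A minimal sketch of that arithmetic, reading 128 MB as decimal megabytes and rounding the entry count to 15k as in the text (the function name is hypothetical):

```python
def bytes_per_entry(ram_bytes, entries):
    """Memory each entry would have to consume to exhaust ram_bytes."""
    return ram_bytes / entries


RAM_BYTES = 128 * 10**6  # assumed 128 MB of RAM, decimal megabytes
TW_ENTRIES = 15000       # roughly the inactive count reported per real server

print(round(bytes_per_entry(RAM_BYTES, TW_ENTRIES)))  # 8533
```

A socket in that state needs far less than this, which is why memory exhaustion alone does not explain the overflow message.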

We'll fiddle with some /proc/sys/net/ipv4/ entries.
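
Before changing anything it helps to record the current values. The sketch below reads a few entries from /proc/sys/net/ipv4/ on a Linux box. Which entries matter here is an assumption on my part: tcp_max_tw_buckets, tcp_fin_timeout and tcp_max_orphans are real sysctls related to TIME_WAIT and FIN_WAIT2 handling, but the thread does not yet say which ones will be tuned. The helper name is hypothetical.

```python
import os

# Assumed shortlist of relevant knobs; the thread has not named any yet.
CANDIDATES = ("tcp_max_tw_buckets", "tcp_fin_timeout", "tcp_max_orphans")


def read_tunables(names=CANDIDATES, base="/proc/sys/net/ipv4"):
    """Return {name: value string, or None if the entry cannot be read}."""
    values = {}
    for name in names:
        path = os.path.join(base, name)
        try:
            with open(path) as handle:
                values[name] = handle.read().strip()
        except OSError:
            # Entry missing (older or newer kernel, non-Linux) or unreadable.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_tunables().items():
        print(name, "=", value)
```

Run it on each real server before and after a test so that any later change can be compared against a known baseline.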

I tried Google and Usenet; however, I could not find anything useful.

Thanks again for your help, take care - RL

[1] http://www.ssi.bg/~ja/testlvs-0.1.tar.gz

Have a nice day,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc
