Re: SV: Introduction and LVS/DR 2.4 realserver questions

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: SV: Introduction and LVS/DR 2.4 realserver questions
From: Jeffrey A Schoolcraft <dream@xxxxxxxxxxxxxx>
Date: Thu, 1 Mar 2001 06:55:12 -0500
* msteele@xxxxxxxxxxxxxxxxxxx (msteele@xxxxxxxxxxxxxxxxxxx) wrote:
> You might want to have a look at the kernel
> httpd accelerator (khttpd). It should be blazing fast
> for serving static content as it intercepts
> requests on port 80 for filetypes you define,
> and serves them. It only passes the connection
> to the webserver when it can't figure out what
> to do with a request. The only drawback is that
> it doesn't yet do virtual hosts. So let's
If you need virtual hosting and want a "blazing fast" web server I
suggest you check out:
        http://people.redhat.com/~mingo/TUX-patches/

You'll need to grab both the kernel patch and the user-space module, but
it's being actively developed (and I'm not sure khttpd is).  TUX also
supports CGI.

> Another thing you might want to consider is
> dropping mysql. I've read that postgresql is
> much faster and handles concurrency much
> better than mysql. I'm currently starting
> to have problems with mysql and I'm going
> to switch to postgres shortly. Postgresql
> will shortly have database replication support,
> which means that you'll be able to distribute
> requests to different database servers
> throughout your network. 

MySQL's biggest issue with concurrency is its lack of row-level
locking.  It can do replication, though, and I think it's being developed
a lot more actively than PostgreSQL.  The option there is to replicate the
database across several machines and even put them behind LVS.  The only
caveat is that you must alter your programs to insert/update/delete only
to master.db.yourdomain.com (the master db server).  Selects, however, can
go to cluster.db.yourdomain.com (and hit any of the replicated servers,
and even the master, depending on your level of comfort and how often you
change the db).
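
As a rough sketch of what that could look like on the director (the
addresses are completely made up and this is untested here),
cluster.db.yourdomain.com would resolve to an LVS virtual IP for port
3306 with the replicated boxes as realservers behind it:

    # cluster.db.yourdomain.com -> virtual IP 10.0.0.100 (made up)
    # round-robin the read-only (select) traffic across the replicas
    ipvsadm -A -t 10.0.0.100:3306 -s rr
    ipvsadm -a -t 10.0.0.100:3306 -r 10.0.0.11:3306 -g
    ipvsadm -a -t 10.0.0.100:3306 -r 10.0.0.12:3306 -g
    # writes keep going straight to master.db.yourdomain.com, not through LVS

Your code then just keeps two connections around, one to master.db for
writes and one to cluster.db for selects.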

> > We have actually considered both options for quite some time and for now
> > httpd+mysql on same host seems to be the best configuration. I'll explain
> > why:
> > Each machine can take up to about 200 http connections with 1 GB RAM (all
> > realservers have 1 GB RAM). Since we use PHP the http processes are at least
> > 5 MB in size, so we have big memory problems if we go any further than that.

I'm serving static files with TUX, so this may or may not be useful.  On
a dual P3 933 with 2 GB RAM and one AceNIC gigabit Ethernet card I was
able to synthetically simulate (through httperf on 8 clients) 7120
successful simultaneous connections (through a 100M port on the switch).
Switching the gigabit card to the gig port, we got all 8000 connections
handled.  Our tests started 1000 simulated requests for a 200K file per
client.  I admit we could have gotten a truer result with more clients
and a more randomized request list.
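
For the curious, each client was driving load with httperf, with an
invocation along these lines (the hostname, file and rate below are
placeholders, not the exact values we used):

    # run on each of the 8 client boxes; ~1000 connections per client
    # --rate here is illustrative, we tuned it per run
    httperf --server www.mydomain.com --port 80 --uri /images/200k.jpg \
            --num-conns 1000 --num-calls 1 --rate 100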

> > At that load the local MySQLd runs fine having about 200 connections.
> > 
> > In a dedicated database setup we have big problems running over 500
> > connections to MySQL. So instead of having, for example, 2 webservers
> > sharing 1 database machine and still just get to handle 400 connections we
> > choose to have 2 web+db machines that can also take 400 connections.
> > That saves us the cost of 1 db machine per 2 web servers. In total we save 8
> > machines, and that's 1600 http connections :)
> > If anyone has comments on this I'd love to hear them.

200 http connections seems low.  In a production environment (dual P3
800, RAID, 1 GB RAM), serving large files (3 - 10 MB), we easily saw 800
simultaneous connections using thttpd (www.acme.com/software/thttpd/)
and likely 500 with Apache.
Again, all of my content is static, but possibly with mod_perl as
suggested, or PHP caching, you could decrease the work a web server has
to do.

> > One thing we've been thinking about is to serve static images from separate
> > webservers running boa or some other small and efficient webserver. The
> > problem with it is that it would make life complicated for the designers and
> > webmasters unless there was a way of doing it behind the scenes.
> > One scenario could be that the load balancer would look at the requests and
> > direct all /images/* to the static virtual server. I know that some
> > loadbalancers can do that, how about LVS?

This would definitely speed things up.  It shouldn't make life that
complicated for people; just use absolute paths for images, something
like <img src="http://images.mydomain.com/images/someimage.png">.  Then
have images.mydomain.com point to an LVS cluster of webservers running
TUX or thttpd serving images, or just a single machine.  As long as your
developers/webmasters get in the habit of using a name
(images.mydomain.com), you can change it from a single server to LVS
through DNS or /etc/hosts and you'll be fine.  Obviously you'll have to
keep your content synced across servers (but you could have a simple
script making that part easier for developers).
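
For instance, on the web boxes you could start out pinning the name to
one machine in /etc/hosts and later just repoint it at the LVS virtual
IP (the addresses below are made up):

    # /etc/hosts -- images served from a single box to start with
    192.168.1.50    images.mydomain.com
    # later, point the same name at the LVS virtual IP instead;
    # no HTML or application changes needed
    #192.168.1.100   images.mydomain.com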

> > > servers. We're running Linux 2.2 with Apache+PHP and MySQL on all
> > > realservers, about 16 in total.

You should also see quite an improvement going from Linux 2.2.x to 2.4.2
(or 2.4.1).

> > > We use custom/scp script for file mirroring but are looking for
> > > better ways

This takes care of the images.yourdomain.com problem.
I've played with this problem a lot though, and if you want an scp-like
tool I'd say use rsync; I'm a little more comfortable making sure
everything is in sync rather than hoping it all got copied.  Another
option is to have a master server somewhere and have all your client
machines wget -m (to mirror) or -N (use timestamping to only get
changes) the files they need.  Then your script could be:

ssh image1.mydomain.com "updateImages.sh"

You could have it record its success to a file or a database and just
have another script check the database for any problems (if it could
report a problem), or for any process that didn't report a finish after
it reported a start.
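
To make that concrete, updateImages.sh (the name, paths and log format
here are made up) could be as small as a wget mirror wrapped in that
start/finish logging:

    #!/bin/sh
    # updateImages.sh -- rough sketch, paths and hostnames are placeholders
    LOG=/var/log/image-sync.log
    echo "start `date`" >> $LOG
    cd /var/www/images || exit 1
    # -m mirrors recursively (and turns on -N, so only changed files come down)
    wget -q -m -nH -np http://master.mydomain.com/images/
    RC=$?
    echo "finish `date` (wget exit $RC)" >> $LOG

The checking script then only has to flag a start with no matching
finish, or a non-zero wget exit.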

From what I've seen wget is a lot faster than rsync, and even when one
of our servers is maxed out and I can't get an rsync to complete, I can
ssh in and wget the needed files.


Not really sure whose email I ended up addressing, but that's my 2 cents
worth.

Jeffrey Schoolcraft


