Re: Gotchas with VS-TUN with Linux Kernel 2.2.12?

To: Michael Sparks <michael.sparks@xxxxxxxxx>
Subject: Re: Gotchas with VS-TUN with Linux Kernel 2.2.12?
Cc: Wensong Zhang <wensong@xxxxxxxxxxxx>, lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: "Stephen D. WIlliams" <sdw@xxxxxxx>
Date: Fri, 17 Sep 1999 20:04:07 +0000
Michael Sparks wrote:

> On Fri, 17 Sep 1999, Wensong Zhang wrote:
> > > OK, I see. *That's* why I haven't seen a reply. There aren't any. (As long
> > > as you remember to put the stuff in the kernel :-)))
> >
> > Look at http://www.linuxvirtual.org/VS-IPTunneling.html for its
> > working principle and instructions.
>
> Read & Printed that out before I started.
>
> > If your squid servers run on kernel 2.2.xx, apply one of the
> > patches posted in the mailing list so the tunnel device does not
> > respond to ARP.
>
> Hmm... I must've missed that comment - I'll have a dig around before I
> start shifting load on next week. (Ramping up slowly as you'd imagine)

I run DR (Direct Routing), since that allows the most bandwidth and the fewest
bottlenecks in the system.
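
For context, a minimal VS-DR service on the director looks roughly like this with
ipvsadm (the virtual and real IPs below are placeholders, and the scheduler choice
is only an example):

    # define the virtual service on the director, weighted-least-connection scheduling
    ipvsadm -A -t 10.0.0.100:80 -s wlc
    # add the real servers in direct-routing ("gatewaying") mode
    ipvsadm -a -t 10.0.0.100:80 -r 10.0.0.11 -g
    ipvsadm -a -t 10.0.0.100:80 -r 10.0.0.12 -g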

I've had to solve a number of problems, including debugging the kernel a little
and writing the patch mentioned.  If you don't use the patch you'll find that the
'active' box bounces from machine to machine, since every real server sends an ARP
reply and whichever is heard last wins.  You will also get TCP resets as connections
that were on one box suddenly start going to another.  Very nasty and unusable.
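
For what it's worth, the usual shape of the real-server side in a VS-DR setup is to
carry the virtual IP on a host-route alias; the addresses and interface names below
are only placeholders:

    # on each real server (not the director): put the VIP on a host-scoped alias
    ifconfig lo:0 10.0.0.100 netmask 255.255.255.255 up
    # a stock 2.2.x kernel will still answer ARP for this address, which is
    # exactly the bouncing described above - hence the patch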

I've also mostly solved the problems of dynamically changing the configuration
that will be needed in our failover setup.
Remember this trick because you'll need it: whenever you want to down or delete an
alias, first set its netmask to 255.255.255.255.
That keeps the kernel from also taking down aliases that are on the same netmask
and are treated as 'secondaries'.
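
A rough sketch of that sequence, with made-up addresses (eth0:1 here is just an
example alias):

    # naive removal may take down 'secondary' aliases on the same netmask too:
    #   ifconfig eth0:1 down
    # safer: narrow the alias to a host netmask first, then down it
    ifconfig eth0:1 netmask 255.255.255.255
    ifconfig eth0:1 down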


> BTW, what's the largest load anyone out there on the list has put on these
> sorts of systems? We're looking to have a sustained level of traffic at
> around 50-70Mbit/s (*) between all the servers - spread over a couple of
> geographic locations for about 10-15 hours per day shortly, and hence the
> interest in this aspect.

I've been testing with httperf and at the moment only with two systems.  I've
been able to sustain 800+ HTTP GET connections per second with Squid in front of
Apache on a pair of boxes.  That's limited mainly by the 1024-descriptor limitation
in httperf at the moment.  On one box I can do about 500/sec.  When I can clean up
httperf I'll find out how much I really lose for each additional box and how much
the routing impacts the test.  In this configuration the boxes are acting both as a
LocalNode LinuxDirector/DR and as real servers.  I have not been testing volume yet,
but since DR return traffic doesn't flow back through the director, it should run
at whatever each raw box can handle.
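
In case it helps reproduce this, an httperf run of that shape looks roughly like the
following (the host, URI and rate are placeholders, not the actual test parameters):

    # drive ~500 requests/sec at the virtual service over 10,000 connections;
    # the ~1024 open-descriptor ceiling mentioned above is what caps the rate
    httperf --server 10.0.0.100 --port 80 --uri /index.html \
            --rate 500 --num-conns 10000 --num-calls 1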

>   (*) Or in terms of user requests 30 Million per day + 20-25 Million per
>       day generated by our servers - to origin servers and their ilk.

Ignoring the impact of actually having to send a significant amount of data, or a
range that can't be cached, the dual-system configuration above can handle about
69M - 86M requests/day (roughly 800 - 1000 connections/sec times 86,400 seconds in a
day).  It's a good benchmark for overhead.

> Also has anyone tried this using 2 or more masters - each master with its
> own IP? (*) From what I can see, theoretically all you should have to do is
> have one master on IP X, tunneled to clients who receive stuff via tunl0,
> and another master on IP Y, tunneled to clients on tunl1 - except when I
> just tried doing that I can't get the kernel to accept the concept of a
> tunl1... Is this a limitation of the IPIP module ???

Do aliasing.  I don't see a need for tunl1.  In fact, I just throw a dummy
address on tunl0 and do everything with tunl0:0, etc.
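
A rough illustration of that aliasing, with placeholder addresses (the dummy and
virtual IPs below are made up):

    # give tunl0 itself a throwaway address...
    ifconfig tunl0 192.168.250.1 netmask 255.255.255.255 up
    # ...and carry each virtual IP as an alias of tunl0
    ifconfig tunl0:0 10.0.0.100 netmask 255.255.255.255 up
    ifconfig tunl0:1 10.0.0.101 netmask 255.255.255.255 up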

We plan to run at least two LinuxDirector/DR systems with failover for moving the
two (or more) public IPs between the systems.
We also use aliased, movable IPs for the real-server addresses so that they can
fail over as well.
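
The takeover step itself is just the aliasing again; loosely, on the standby
director (the address and interface are placeholders, and you'd normally want a
gratuitous ARP afterwards so neighbors learn the move):

    # standby picks up the public VIP the failed director was serving
    ifconfig eth0:0 10.0.0.100 netmask 255.255.255.255 up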

>   (*) This way we give the clients multiple parents in the event of
>       catastrophic failure, which can and does happen.
>
> Michael.
> --
> National & Local Web Cache Support        R: G117
> Manchester Computing                      T: 0161 275 7195
> University of Manchester                  F: 0161 275 6040
> Manchester UK M13 9PL                     M: Michael.Sparks@xxxxxxxxxxxxxxx
>
> ----------------------------------------------------------------------
> LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe, e-mail: lvs-users-unsubscribe@xxxxxxxxxxxxxxxxxxxxxx
> For additional commands, e-mail: lvs-users-help@xxxxxxxxxxxxxxxxxxxxxx

sdw

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@xxxxxxx   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999


