
To: "'lvs-users@xxxxxxxxxxxxxxxxxxxxxx'" <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: RE: lvs with bonding, heartbeat, etc.
From: Paul Lantinga <prl@xxxxxx>
Date: Fri, 8 Mar 2002 15:23:18 -0500
Thanks for the responses, Roberto... see below for mine.  -paul.
> From: Roberto Nibali [mailto:ratz@xxxxxxxxxxxx]
> > Ethernet bonding with LVS can be fun - even without much documentation.
> > Is anyone else out there using LVS with the Linux bonding/ifenslave
> > setup?  I feel a document coming on and would be happy to include any
> > additional wisdom others have gleaned from using and/or setting up
> > linux bonding[1].
> 
> What are the benefits?

Greater than 100Mbit/s throughput and layer 2 failover via link-status
monitoring are at least two of the benefits.  Actually, I modified my
setup so that I bonded 4 ports on one side of the lvs and just the one
gigE port on the other side.  Using various load tools (see comments on
apachebench near the bottom), I've seen >210Mbit/s through the lvs bond0
interface alone.

> > I have yet to get heartbeat to work with it, but it should work given
> > enough elbow grease.  Also, I have yet to get 2 separate bonds (bond0,
> > bond1, etc) to come up.

BTW, heartbeat works flawlessly with the bond0 interface and brings up
the CIP/VIP on bond0:0 and eth6:0 respectively.
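
In case anyone wants to reproduce it, the haresources line is nothing
exotic; roughly along these lines (node name and addresses are made up
here, not our real ones):

lvs1 IPaddr::192.168.1.100/24/bond0 IPaddr::10.0.0.100/24/eth6

Heartbeat then brings the two addresses up as aliases on bond0:0 and
eth6:0 on its own.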

> > Here's what I wanted to do:
> >   client network  lvs server (dell 2550)
> >   cisco 2924      ipvs1.0.0 on 2.4.18
> >   port group 1    interface bond0                
> >   port 1 <------->eepro100 dualNIC port#1
> >   port 2 <------->eepro100 dualNIC port#2       
> > 
> >   server network  lvs server (dell 2550)
> >   cisco 3548      ipvs1.0.0 on 2.4.18
> >   port group 2    interface bond1                
> >   port 3 <------->eepro100 dualNIC port#1
> >   port 4 <------->eepro100 dualNIC port#2
> > (for the sake of clarity in this example, the lvs servers mentioned
> > above both refer to the same box, not two separate lvs servers)
> 
> What is the purpose of this setup?

We're using it as a testbed platform to see how well LVS works under
heavy load and over time.  It will probably run for 4 days or so under
heavy load.  Any memory leaks or latent bugs will hopefully come out
now or forever keep their peace.
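
For anyone unfamiliar with the ipvsadm side, a minimal director config
looks something like the following (addresses made up, and LVS-NAT
assumed just for illustration); the interesting part for us is really
the interfaces underneath it:

# one virtual service, two real servers behind the director
ipvsadm -A -t 192.168.1.100:80 -s wlc
ipvsadm -a -t 192.168.1.100:80 -r 10.0.0.11:80 -m
ipvsadm -a -t 192.168.1.100:80 -r 10.0.0.12:80 -m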
 
> > Interface bond0 comes up fine with eth1 and eth2 no problem.  Bond1
> > fails miserably every time.  I'm going to take that issue up on the
> > bonding mailing list.
> 
> Which patch did you try? Is it the following:
> 
> http://prdownloads.sourceforge.net/bonding/bonding-2.4.18-20020226

Yup, that's the patch.  I posted to the bonding list and got a speedy
response that I was missing a parameter in my modules.conf for the
bonding kernel module.  I'd missed the reference in
<kernel-source-tree>/Documentation/networking/bonding.txt which mentions
it.  Essentially, for additional bonds, you need the '-o' argument in the
options line for each extra bond interface.  Along the lines of:
alias bond0 bonding
alias bond1 bonding
options bond0 miimon=100 mode=0 downdelay=0 updelay=5000
options bond1 -o bonding1 miimon=100 mode=0
probeall bond0 eth1 eth2 bonding
probeall bond1 eth3 eth4 bonding1

and lsmod gives:
Module                  Size  Used by
bcm5700                65776   1 
bonding1               11952   2  (autoclean)
eepro100               18080   4  (autoclean)
bonding                11952   2 

So, what gives with the bonding1, you ask?  I couldn't get it to work any
other way... so I had to go with the rather inelegant method of copying
the bonding.o module to bonding1.o.
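
Once the modules load, bringing the bonds up is just the standard recipe
from bonding.txt; something along these lines (the addresses here are
made up, not our real ones):

# configure each bond interface, then enslave its NICs
ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up
ifenslave bond0 eth1 eth2
ifconfig bond1 10.0.0.1 netmask 255.255.255.0 up
ifenslave bond1 eth3 eth4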

> Did you pass max_bonds=2 when you loaded the bonding.o module? Without
> that you have no chance. Read the source (if you haven't already) to see
> what other fancy parameters you might want to pass.

Yeah, I *did* read the source.  ;)  From what I understand and from
responses I got from the bonding list, the max_bonds parameter is for
the total number of interfaces enslaved.   Of course, I could easily be
wrong.  ;)
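
If it does turn out to mean the number of bond devices rather than
enslaved NICs, then I'd guess a single-module setup would look more like
this (untested on my end, and both bonds would presumably have to share
the same options):

alias bond0 bonding
alias bond1 bonding
options bonding max_bonds=2 miimon=100 mode=0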

> 
> And if you're talking to them you might ask them if they'd like to clean
> up the bonding_init(void) function with the *dev_bond/*dev_bonds mess :)

heh, sure thing.
 
> > Anyhow, here's what I ended up with:
> >   client network  lvs server (dell 2550)
> >   cisco 2924 sw   ipvs1.0.0 on 2.4.18  
> >   port group 1    interface bond0                
> >   port fa0/1 <------->eepro100 dualNIC port#1
> >   port fa0/2 <------->eepro100 dualNIC port#2       
> 
> Cool.
> 
> >   server network  lvs server (dell 2550)
> >   cisco 3548      ipvs1.0.0 on 2.4.18
> >   port g0/1 <---->onboard GigE NIC (Broadcom chipset)
> > 
> > This is driven in part by our desire to see how far we can push lvs.
> > I know it does 100mb/s in and out.  If it can keep 2 channels full,
> > I'll add a third, fourth, fifth, etc. as necessary.
> 
> Read [http://www.sfu.ca/acs/cluster/nic-test.html] to get the impression
> of what happens if you try to bond too many NICs. Been there, done that.
> But again, feel free to excel. You might find out some more and maybe
> even be able to linearly scale the throughput.

I'll do that... thanks.

> > Using apache bench (ab) this time.  ;)
> 
> How do you want to stress LVS with ab? :) You need much stronger tobacco
> for that!

Well, 2 of my clients are running ab and 3 are running webstress from
MS.  Apachebench is a mixed bag.  My first impression of it can be
pretty much summed up as: 'what a piece of crap, this thing barely
works.'  If you run ab for a long time, it quietly stops.  Telling it to
run too long using '-t', or to run until it hits a preset number of hits
using '-n', results in a premature exit or a segfault.  So I found that
running it once a minute via crontab and letting ab run for 57 seconds
works well.  Both of those machines sustain just under 100Mbit/s of
traffic.  The webstress app never crashes, but it only generates about
70Mbit/s of traffic from each machine on average.
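
For the record, the cron entry is about as dumb as it sounds; something
like this (URL, concurrency and log path are made up here):

# run ab for 57 seconds, once a minute
* * * * * /usr/sbin/ab -t 57 -c 50 http://192.168.1.100/index.html >> /var/tmp/ab.log 2>&1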

-regards,

Paul L.

