Thanks for the responses Roberto... see below for my responses. -paul.
> From: Roberto Nibali [mailto:ratz@xxxxxxxxxxxx]
> > Ethernet bonding with LVS can be fun - even without much documentation.
> > Is anyone else out there using LVS with the Linux bonding/ifenslave
> > setup? I feel a document coming on and would be happy to include any
> > additional wisdom others have gleaned from using and/or setting up
> > linux bonding[1].
>
> What are the benefits?
Greater than 100 Mbit/s throughput and layer 2 failover via link-status
monitoring are at least two of the benefits. Actually, I modified my
setup so that I bonded 4 ports on one side of the lvs and just the 1
gigE on the other side. Using various load tools (see comments on
apachebench near the bottom), I've seen >210 Mbit/s through the lvs
bond0 interface alone.
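For anyone who wants to reproduce the numbers: a quick-and-dirty way to
eyeball the bond0 rate is to sample the rx byte counter in /proc/net/dev
a second apart. Just a sketch (the counter layout can differ between
kernels), something like:
R1=$(grep 'bond0:' /proc/net/dev | tr ':' ' ' | awk '{print $2}')
sleep 1
R2=$(grep 'bond0:' /proc/net/dev | tr ':' ' ' | awk '{print $2}')
echo "$(( (R2 - R1) * 8 / 1000000 )) Mbit/s inbound on bond0"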
> > I have yet to get heartbeat to work with it, but it should work given
> > enough elbow grease. Also, I have yet to get 2 separate bonds (bond0,
> > bond1, etc) to come up.
BTW, heartbeat works quite flawlessly with the bond0 interface and
brings up the CIP/VIPs on bond0:0 and eth6:0 respectively.
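For anyone wiring this up themselves, the haresources entry ends up
roughly along these lines (node name and addresses below are
placeholders; heartbeat's IPaddr resource creates the :0 aliases on the
named interfaces for you):
lvs1 IPaddr::192.168.1.100/24/bond0 IPaddr::10.0.0.100/24/eth6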
> > Here's what I wanted to do:
> > client network            lvs server (dell 2550)
> > cisco 2924                ipvs1.0.0 on 2.4.18
> > port group 1              interface bond0
> > port 1     <-------> eepro100 dualNIC port#1
> > port 2     <-------> eepro100 dualNIC port#2
> >
> > server network            lvs server (dell 2550)
> > cisco 3548                ipvs1.0.0 on 2.4.18
> > port group 2              interface bond1
> > port 3     <-------> eepro100 dualNIC port#1
> > port 4     <-------> eepro100 dualNIC port#2
> > (for the sake of clarity in this example, the lvs servers mentioned
> > above both refer to the same box, not two separate lvs servers)
>
> What is the purpose of this setup?
We're using it as a testbed platform to see how well LVS works under
heavy load and over time. It will probably run for 4 days or so. Any
memory leaks or latent bugs, if any, will hopefully come out or forever
hold their peace.
> > Interface bond0 comes up fine with eth1 and eth2 no problem. Bond1
> > fails miserably every time. I'm going to take that issue up on the
> > bonding mailing list.
>
> Which patch did you try? Is it the following:
>
> http://prdownloads.sourceforge.net/bonding/bonding-2.4.18-20020226
Yup, that's the patch. I posted to the bonding list and got a speedy
response that I was missing a parameter in my modules.conf for the
bonding kernel module. I missed the reference in
<kernel-source-tree>/Documentation/networking/bonding.txt which mentions it.
Essentially, for additional bonds, you need to use the '-o' argument in
the options for the extra bond interfaces. Along the lines of:
alias bond0 bonding
alias bond1 bonding
options bond0 miimon=100 mode=0 downdelay=0 updelay=5000
options bond1 -o bonding1 miimon=100 mode=0
probeall bond0 eth1 eth2 bonding
probeall bond1 eth3 eth4 bonding1
and lsmod gives:
Module                  Size  Used by
bcm5700                65776   1
bonding1               11952   2  (autoclean)
eepro100               18080   4  (autoclean)
bonding                11952   2
So, what gives with the bonding1, you ask? I couldn't get it to work any
other way, so I had to go with the rather inelegant method of copying
the bonding.o module to bonding1.o.
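For the record, sanity-checking the second bond by hand goes roughly
like this (the address is a placeholder); afterwards ifconfig should
show bond1 flagged MASTER and eth3/eth4 flagged SLAVE:
modprobe bonding1
ifconfig bond1 10.0.1.1 netmask 255.255.255.0 up
ifenslave bond1 eth3 eth4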
> Did you pass max_bonds=2 when you loaded the bonding.o module? Without
> that you have no chance. Read the source (if you haven't already) to
> see what other fancy parameters you might want to pass.
Yeah, I *did* read the source. ;) From what I understand and from
responses I got from the bonding list, the max_bonds parameter is for
the total number of interfaces enslaved. Of course, I could easily be
wrong. ;)
>
> And if you're talking to them you might ask them if they'd like to
> clean up the bonding_init(void) function with the *dev_bond/*dev_bonds
> mess :)
heh, sure thing.
> > Anyhow, here's what I ended up with:
> > client network            lvs server (dell 2550)
> > cisco 2924 sw             ipvs1.0.0 on 2.4.18
> > port group 1              interface bond0
> > port fa0/1 <-------> eepro100 dualNIC port#1
> > port fa0/2 <-------> eepro100 dualNIC port#2
>
> Cool.
>
> > server network            lvs server (dell 2550)
> > cisco 3548                ipvs1.0.0 on 2.4.18
> > port g0/1  <----> onboard GigE NIC (Broadcom chipset)
> >
> > This is driven in part by our desire to see how far we can push lvs.
> > I know it does 100mb/s in and out. If it can keep 2 channels full,
> > I'll add a third, fourth, fifth, etc. as necessary.
>
> Read [http://www.sfu.ca/acs/cluster/nic-test.html] to get the
> impression of what happens if you try to bond too many NICs. Been
> there, done that. But again, feel free to excel. You might find out
> some more and maybe even be able to linearly scale the throughput.
I'll do that... thanks.
>
> > Using apache bench (ab) this time. ;)
>
> How do you want to stress LVS with ab? :) You need much stronger
> tobacco for that!
Well, 2 of my clients are running ab and 3 are running webstress from
MS. ApacheBench is a mixed bag. My first impressions of it can be
pretty much summed up as: 'what a piece of crap, this thing barely
works.' Run ab for a long time and it quietly stops; telling it to run
for too long using '-t', or to run until it hits a preset number of
hits using '-n', results in a premature exit or a seg fault. So, I
found that running it once a minute via crontab and letting ab run for
57 seconds works well. Both of those machines sustain just under
100 Mbit/s of traffic. The webstress app never crashes, but it only
generates about 70 Mbit/s of traffic from each machine on average.
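In case it saves someone the same trial and error, the crontab hack
boils down to something like this (URL, concurrency and the path to ab
are placeholders for whatever you use):
* * * * * /usr/local/apache/bin/ab -t 57 -c 32 http://192.168.1.100/ >/dev/null 2>&1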
-regards,
Paul L.