
Re: What happened here? strange peaks in traffic

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: What happened here? strange peaks in traffic
From: Todd Lyons <tlyons@xxxxxxxxxx>
Date: Wed, 15 Jun 2005 08:41:05 -0700
Jan Klopper wanted us to know:

>I have a cluster of 5 machines, 2 directors, 3 realservers, and 
>localhost feature.

I've never used the localhost feature, so don't know what kind of impact
it would have on things.

>This morning I saw some strange peaks in the mrtg graphs from my switch.

The peaks are not the problem.

>No  15 is the primary director (and localhost realserver).
>No  16 is the failover director (through heartbeat), and will do 
>localhost realserver.
>No  17 is a realserver
>No  18 is a realserver
>No  19 is a realserver

I'll ignore 16 because it's the failover.  All references to "director"
are the primary director, no 15.

If you line all the graphs up vertically, you will see that at midnight,
the amount of traffic at realserver 17 spiked.  At the same time, it
dropped on the remaining 2 realservers *AND* on the director.  So
realserver 17 is answering arp requests for the VIP, and the router is
sending everything straight to it.  Gotta fix that.

Then at 4 AM, traffic at realserver 19 came back up.  That's good,
right?  No, because at that same time, all traffic at the director was
still gone, and the other 2 realservers had no traffic either.  So
realserver 19 is answering arp requests for the VIP too.  Gotta fix that.

It's a good bet that realserver 18 is also not configured to filter
those arps properly, so you'll probably need to fix that one as well.
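
For what it's worth, here is a rough Python sketch of a check you could
run on each realserver.  It assumes a 2.6 kernel that exposes the
arp_ignore and arp_announce sysctls under /proc/sys/net/ipv4/conf, and
it assumes the commonly recommended LVS-DR values (arp_ignore=1,
arp_announce=2).  Neither the values nor the function names come from
this thread, so check them against the HOWTO before relying on them.

```python
from pathlib import Path

# Assumed target values for an LVS-DR realserver (not taken from this
# thread): reply to ARP only for addresses configured on the receiving
# interface, and pick the best local source address for ARP requests.
RECOMMENDED = {"arp_ignore": 1, "arp_announce": 2}


def read_arp_sysctls(iface="all", root="/proc/sys/net/ipv4/conf"):
    """Read the current arp_ignore / arp_announce values for one interface."""
    values = {}
    for name in RECOMMENDED:
        path = Path(root) / iface / name
        values[name] = int(path.read_text().strip())
    return values


def arp_problems(values):
    """Return one message per setting that differs from RECOMMENDED."""
    problems = []
    for name, wanted in RECOMMENDED.items():
        current = values.get(name, 0)
        if current != wanted:
            problems.append(f"{name}={current}, expected {wanted}")
    return problems
```

Calling arp_problems(read_arp_sysctls()) on realservers 17, 18 and 19
should return an empty list once they are set up this way.  A non-empty
list only says that the values differ from the assumed ones, not that
the box is definitely answering arps for the VIP.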

Finally at 8 AM, the director answered an arp for that IP and started
receiving packets from the router again, so it could properly load
balance again.  At that point, all the traffic levels returned to
normal.

The only disparity I see between the realservers during the "normal"
times is the blue lines.  Is that incoming or outgoing bandwidth?
Why is realserver 17 around 0K most of the time, realserver 18 around
35K most of the time, and realserver 19 around 21K most of the time?
That's just odd.  I would suspect that is traffic from whatever app is
monitoring the realservers.  If that is the case, something isn't
configured right or isn't resolving right because it would seem that
realserver 17's polling is actually going to realserver 18.  I'd look
into that.

Just an observation, not really relevant:  Since an arp request comes
every four hours, your router arp timeout is probably about 14400
seconds.  That seems reasonable.
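
The arithmetic behind that guess, as a small Python sketch.  The three
times are the changes read off the graphs (midnight, 4 AM, 8 AM).  The
calendar date is only a placeholder borrowed from the mail header, and
the names are made up for illustration.

```python
from datetime import datetime

# Times at which a different box started receiving the traffic.
# Only the time of day matters; the date is a placeholder.
takeovers = [
    datetime(2005, 6, 15, 0, 0),
    datetime(2005, 6, 15, 4, 0),
    datetime(2005, 6, 15, 8, 0),
]


def intervals_seconds(times):
    """Seconds between each pair of consecutive timestamps."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


print(intervals_seconds(takeovers))  # [14400, 14400]
```

Two equal gaps of 14400 seconds fit a router that refreshes its arp
entry every four hours, but two samples are not much to go on.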

>These are just the lan ports, and it's a direct routing setup, so the 
>traffic is returned by the realservers on their internet port.
>Does anyone have an idea?

http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.arp_problem.html
-- 
Regards...              Todd
They that can give up essential liberty to obtain a little temporary 
safety deserve neither liberty nor safety.       --Benjamin Franklin
Linux kernel 2.6.11-6mdksmp   3 users,  load average: 0.14, 0.17, 0.38
