RE: No buffer space available

To: "'lvs-users@xxxxxxxxxxxxxxxxxxxxxx'" <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: RE: No buffer space available
Cc: 'Peter Mueller' <pmueller@xxxxxxxxxxxx>
From: Joseph Mack NA3T <jmack@xxxxxxxx>
Date: Mon, 30 Sep 2002 09:34:56 -0700 (PDT)

On Mon, 30 Sep 2002, Jeremy Kusnetz wrote:

> I posted a detailed response to Roberto.  Unfortunately the response was
> over 40K, so I'm waiting for the

Unless you have material that the 500 people on the mailing list need to
see, it would be best to take this offline.

Also, your post below contains a lot of material from previous posts.
I'm reading this over a jammed 28k line and it takes minutes to scroll
through.

 Joe

>
> My primary director box had been running kernel 2.4.7 with the LVS
> version that was current at the time 2.4.7 came out.  It was running
> perfectly stable for at least 6 months, never a single problem.
>
> I'm running LVS-NAT.  There are 53 VIPs on the box, pointing to 6
> realservers.  Each VIP forwards mail, POP, DNS, RADIUS, HTTP and HTTPS
> to virtual interfaces running on the realservers.
>
> The director is an SMP PIII 1 GHz with 512 MB of RAM.  It has two
> built-in Intel NICs: eth0 has the 53 VIPs, eth1 is the gateway to the
> realservers.  There is also a 3Com NIC, which runs heartbeat with our
> secondary director.
>
> About a month ago I got an alert that the primary director was down.  I
> logged in, and it was up.  I could ping out and I could ping the RIPs,
> but I couldn't ping any of its own interfaces, even the loopback.  (The
> realservers could ping those interfaces on the director.)  Pinging these
> interfaces from the director gave me the following error:
>
> ping: sendto: No buffer space available
>
> A reboot fixed the problem.
>
> This problem started happening more and more frequently, until it was
> happening at least once a day, usually in the middle of the night.
>
> I figured it was time to upgrade the kernel and LVS version.  I upgraded
> to 2.4.19 and LVS 1.0.6.  I hoped this would fix the problem, but it did
> not.
>
> Next I swapped out all the hardware, everything but the drives and the
> cables.  The replacement box had the same amount of memory but slightly
> slower CPUs (800 MHz); I figured even those are probably overkill.
>
> This did not help either.
>
> I changed the driver for the Intel NICs from eepro100 to the latest
> e100 from Intel.  I've always had problems with the eepro100 driver,
> but the old version of LVS I was running had problems with the e100
> driver.  With the latest version, e100 seems to work.
>
> But alas this did not fix the problem either.
>
> The only thing that had changed before I started upgrading everything
> was the number of VIPs on the director.  A couple of realservers had
> been added to the mix too, along with more RIPs.  We had also added
> some iptables rules to drop SMTP connections from some external IPs
> that were really bad spammers.  This list grew to about 50 chains.
>
> After changing drivers to e100 did not fix the problem, I changed the
> iptables rules to reject the packets instead of dropping them.  This
> changed the symptoms slightly.  Instead of not being able to ping any
> of the director's own interfaces, I can no longer ping random RIPs, to
> the point where I start losing services because the LVS can't forward
> connections to those IPs.
>
> I've now removed the iptables rules completely, hoping that was the
> cause.  Didn't help.
>
> The only thing that helps, besides rebooting, is bringing eth2 (the
> heartbeat interface) down and up.  Sometimes that clears the problem
> for a few hours or longer, but sometimes it only clears it for a few
> seconds.  I've now resorted to a cron job that brings this interface
> down and up once a minute.  It sort of helps, but I still end up
> having to reboot.  Bringing the loopback down and up sometimes works
> too, for a few seconds.  I tried bringing eth0 and eth1 down and up,
> but that didn't seem to have any effect.
>
> Is there some sort of tuning I need to do in the /proc file system?  I
> have no idea what else to do; I've upgraded everything I can think of.
> Apparently it's not an issue of buggy software.  Please help, being
> paged at all hours of the night for the past few weeks is getting
> really old!  If you can't help, what would be a good list to post this
> question to?
>
> _______________________________________________
> LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
> Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
> or go to http://www.in-addr.de/mailman/listinfo/lvs-users
>
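
The symptom described in the quoted post (pinging the director's own
interfaces fails with "No buffer space available") can be watched for
with a small script.  What follows is only a minimal sketch and not part
of the original post: the function names and the example addresses are
hypothetical, and it assumes a Linux ping that accepts -c (count) and
-W (timeout in seconds).  It detects the symptom; it does not diagnose
or fix the cause.

```python
import subprocess

# Error text quoted in the post above.
ENOBUFS_TEXT = "No buffer space available"


def has_enobufs(ping_output: str) -> bool:
    """Return True if ping output contains the ENOBUFS error text."""
    return ENOBUFS_TEXT in ping_output


def ping_once(address: str) -> str:
    """Send one ping and return stdout and stderr combined.

    Assumes a Linux ping supporting -c and -W.
    """
    proc = subprocess.run(
        ["ping", "-c", "1", "-W", "2", address],
        capture_output=True,
        text=True,
    )
    return proc.stdout + proc.stderr


def check_addresses(addresses, run_ping=ping_once):
    """Return the addresses whose ping reported the ENOBUFS text."""
    return [a for a in addresses if has_enobufs(run_ping(a))]


if __name__ == "__main__":
    # Hypothetical list: the loopback plus one placeholder local address.
    failing = check_addresses(["127.0.0.1", "192.168.1.1"])
    if failing:
        print("ENOBUFS seen when pinging:", ", ".join(failing))
```

Run from cron on the director, something like this would at least
record when the condition starts, which may help correlate it with
load or with configuration changes.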

-- 
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
mailto:jmack@xxxxxxxx azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml


