No buffer space available

To: "'lvs-users@xxxxxxxxxxxxxxxxxxxxxx'" <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: No buffer space available
Cc: Jeremy Kusnetz <JKusnetz@xxxxxxxx>
From: Jeremy Kusnetz <JKusnetz@xxxxxxxx>
Date: Sun, 29 Sep 2002 19:34:03 -0400

Not sure if this is an LVS problem, but maybe someone here is smart enough
to at least get me looking in the right direction.

My primary director box had been running kernel 2.4.7 with the LVS version
that was current at the time 2.4.7 came out.  It was running perfectly
stable for at least 6 months, never a single problem.

I'm running LVS-NAT.  There are 53 VIPs on the box, pointing to 6
realservers.  Each VIP forwards mail, POP, DNS, RADIUS, HTTP, and HTTPS to
virtual interfaces running on the realservers.

The director is an SMP PIII 1 GHz box with 512 MB of RAM.  It has two
built-in Intel NICs: eth0 has the 53 VIPs, and eth1 is the gateway to the
realservers.  There is also a 3Com NIC (eth2), which runs heartbeat with our
secondary director.

About a month ago I got an alert that the primary director was down.  I
logged in and found that it was up.  I could ping out, and I could ping the
RIPs, but I couldn't ping any of its own interfaces, not even the loopback.
(The realservers could ping those interfaces on the director.)  Pinging
these interfaces from the director gave me the following error:

ping: sendto: No buffer space available

A reboot fixed the problem.

This problem started happening more and more frequently, until it was
happening at least once a day, usually in the middle of the night.

I figured it was time to upgrade the kernel and LVS version.  I upgraded to
2.4.19 and LVS 1.0.6.  I hoped this would fix the problem, but it did not.

Next I swapped out all the hardware, everything but the drives and the
cables.  The replacement box has the same amount of memory but slightly
slower CPUs (800 MHz), and I figured even those are probably overkill.

This did not help either.

I changed the driver for the Intel NICs from eepro100 to the latest e100
from Intel.  I've always had problems with the eepro100 driver, but when I
was running the old version of LVS, it had problems with the e100 driver.
Now, with the latest version, e100 seems to work.

But alas this did not fix the problem either.

The only thing that had changed before I started upgrading everything was
the number of VIPs on the director.  A couple of realservers had been added
to the mix too, along with more RIPs.  We had also added some iptables rules
to drop SMTP connections from some external IPs that were really bad
spammers.  This list grew to about 50 chains.

After changing drivers to e100 did not fix the problem, I changed the
iptables rules to reject the packets instead of dropping them.  This changed
the symptoms slightly.  Instead of not being able to ping any of its own
interfaces, the director can no longer ping random RIPs, to the point where
I start losing services because LVS can't forward connections to those IPs.

I've now removed the iptables rules completely, hoping they were the cause.
That didn't help.

The only thing besides rebooting that helps is bringing eth2, the heartbeat
interface, down and back up.  Sometimes that clears the problem for a few
hours or longer, but sometimes it only clears it for a few seconds.  I've
now resorted to a cron job that brings this interface down and up once a
minute.  It sort of helps, but I still end up having to reboot.  Bringing
the loopback down and up sometimes works too, for a few seconds.  I tried
bringing eth0 and eth1 down and up, but that didn't seem to have any effect.
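
For reference, the once-a-minute workaround described above could be
sketched roughly as follows.  This is only an illustration, not the actual
cron job: the use of ifconfig, the function names, and the --run flag are
all assumptions, and only the interface name eth2 comes from the setup
described here.

```python
import subprocess
import sys


def bounce_commands(iface="eth2"):
    """Return the two commands that take an interface down and back up.

    Assumes the classic `ifconfig <iface> down|up` syntax is available.
    """
    return [["ifconfig", iface, "down"], ["ifconfig", iface, "up"]]


def bounce(iface="eth2", run=subprocess.check_call):
    """Run the down/up commands in order; `run` can be swapped for testing."""
    for cmd in bounce_commands(iface):
        run(cmd)


# Only touch the interface when explicitly asked to (hypothetical --run flag),
# so importing or dry-running this file never changes network state.
if __name__ == "__main__" and "--run" in sys.argv:
    bounce("eth2")
```

Saved under a hypothetical name such as bounce_eth2.py, it would be run as
root from a crontab entry scheduled every minute with the --run argument.
This only papers over the symptom; it does not explain why the buffers run
out.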

Is there some sort of tuning I need to do in the /proc file system?  I have
no idea what else to do; I've upgraded everything I can think of.
Apparently it's not an issue of buggy software.  Please help, being paged at
all hours of the night for the past few weeks is getting really old!  If you
can't help, what would be a good list to post this question to?

