
Re: module problem

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: module problem
From: Horms <horms@xxxxxxxxxxxx>
Date: Tue, 20 May 2003 20:00:14 +0900
On Sat, May 17, 2003 at 02:59:11AM +0200, B Metzdorf wrote:
> Hi,
> 
> I have a weird problem with ipvs modules.
> 
> My setup:
> 
> Debian woody
> Linux 2.4.20 (kernel source package from unstable, patches applied: lvm,
> hidden-arp, 3ware)
> linux-2.4.20-ipvs-1.0.8.patch.gz
> ipvsadm 1.0.8
> keepalived 0.6.9
> modprobe version 2.4.15
> rmmod version 2.4.15 (both from modutils-2.4.15-1)
> 
> using ip_vs and ip_vs_wlc.
> 
> I compiled the kernel (ipvs as modules), installed it, ran depmod,
> everything seems fine.
> 
> Loading the modules works perfectly (modprobe ip_vs && modprobe ip_vs_wlc),
> but when it comes to remove the modules on shutdown, rmmod hangs while
> trying to remove ip_vs.o . The shutdown script hangs (it does a plain rmmod
> ip_vs_wlc && rmmod ip_vs), and the rmmod process grabs all cpu time
> available...

Hi,

I believe that I have found the cause of your problem.
The culprit is the following line which was added to
ip_vs_sltimer_init() in ip_vs_timer.c

sltimer_jiffies = jiffies;

Elsewhere in the code sltimer_jiffies is initialised to zero.
This new initialisation overrides that. This, however, creates a problem.
The timers are implemented by inserting them into an array.
(Actually several arrays but that is not relevant).
Which array element to insert a timer into is calculated based on
sltimer_jiffies. 

Periodically (once per second) run_sltimer_list() is called. It works
its way through the array, executing the timers. How far
it works through the array is based on making sltimer_jiffies catch
up to jiffies; the former is incremented for each iteration of the
loop.
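
The catch-up loop described above can be modelled in user space
roughly as follows. This is a minimal sketch, not the ipvs source:
struct wheel, wheel_run(), SHIFT_BITS and SLOT_BITS are names assumed
for illustration, and the values 6 and 8 are taken from the shift and
mask quoted later in this mail.

```c
#include <stdint.h>

#define SHIFT_BITS 6                       /* assumed: 2^6 jiffies per slot */
#define SLOT_BITS  8                       /* assumed: 2^8 slots in the array */
#define SLOT_MASK  ((1u << SLOT_BITS) - 1)

/* Hypothetical model of the slow timer state. */
struct wheel {
    uint32_t sltimer_jiffies;  /* time the array has been advanced to */
    uint32_t index;            /* slot that will be inspected next */
};

/* Advance the array until sltimer_jiffies has caught up with now.
 * Returns the number of slots inspected on this run. */
unsigned wheel_run(struct wheel *w, uint32_t now)
{
    unsigned inspected = 0;

    /* Signed difference so the comparison survives counter wraparound. */
    while ((int32_t)(now - w->sltimer_jiffies) >= 0) {
        /* A real implementation would execute the timers in slot
         * w->index at this point. */
        inspected++;
        w->sltimer_jiffies += 1u << SHIFT_BITS;
        w->index = (w->index + 1) & SLOT_MASK;
    }
    return inspected;
}
```

Note that both fields advance together inside the loop, so they only
stay consistent if they were consistent to begin with.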

Unfortunately, which slot to inspect, and thus which timers to
execute, is determined by an index. It too is incremented
for each iteration of the loop. But it is not initialised in
ip_vs_sltimer_init() when sltimer_jiffies is initialised.
Thus unless ( ( jiffies >> 6 ) & ( (1 << 8) - 1) ) == 0
at the time that LVS is initialised, the index will
not be correctly positioned. Given that this has a probability
of 1/2^8, this isn't so hot. And the larger that value is,
the further out the index will be.
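
To make the 1/2^8 figure concrete, here is a small sketch that counts
how often an index left at zero happens to be correct. It assumes the
slot is chosen by shifting the time right by 6 bits and masking to 8
bits, which is what the stated probability implies; expected_index()
and count_aligned_starts() are hypothetical helper names, not ipvs
functions.

```c
#include <stdint.h>

#define SHIFT_BITS 6                       /* assumed: 2^6 jiffies per slot */
#define SLOT_BITS  8                       /* assumed: 2^8 slots in the array */
#define SLOT_MASK  ((1u << SLOT_BITS) - 1)

/* Slot the index would have to point at for a clock starting at t. */
uint32_t expected_index(uint32_t t)
{
    return (t >> SHIFT_BITS) & SLOT_MASK;
}

/* Count the start times within one full turn of the array
 * (2^14 jiffies) for which an index of zero is already correct. */
unsigned count_aligned_starts(void)
{
    unsigned aligned = 0;
    uint32_t t;

    for (t = 0; t < (1u << (SHIFT_BITS + SLOT_BITS)); t++)
        if (expected_index(t) == 0)
            aligned++;
    return aligned;
}
```

Only the first 64 of the 16384 possible start times map to slot zero,
which is the 1 in 2^8 chance mentioned above.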

The result is twofold.

Firstly, as the index ends up lagging sltimer_jiffies, timers
are executed up to 2^14 jiffies late. On an Intel system
there are 100 jiffies/second, so this means timers
can be executed up to 163 seconds late. This isn't particularly
important, except that it means that entries linger in the
connection table and show up with a really large (actually negative)
timeout. It also isn't so good if the machine is busy, as
it uses up unnecessary resources, particularly memory.
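
The worst-case figure is simple arithmetic and can be checked with a
few lines of C. This is only an illustration: ticks_to_ms() and
worst_case_lag_ms() are hypothetical helpers, and the tick rate is
passed in as a parameter rather than read from the kernel.

```c
/* Convert a tick (jiffies) count to milliseconds for a given tick rate. */
unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
    return ticks * 1000UL / hz;
}

/* Worst-case lag per the text: one full turn of the array,
 * i.e. 2^8 slots of 2^6 jiffies each = 2^14 jiffies. */
unsigned long worst_case_lag_ms(unsigned long hz)
{
    return ticks_to_ms(1UL << (6 + 8), hz);
}
```

With hz = 100 this gives 163840 ms, i.e. the roughly 163 seconds
quoted above.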

Secondly, it appears to have the more severe side effect that
entries are not correctly cleared when an attempt
is made to remove the lvs module from the kernel. This leads
to such an operation hanging if there are entries
in the timer array that should have been expired.

I have attached a patch that should resolve this problem
by initialising the index correctly. After all that,
it is all of one new line :)

You can also resolve it by removing the offending line from
ip_vs_sltimer_init().  

Note that you should use one of these solutions, not both!

-- 
Horms

Attachment: ipvs-1.0.8.sltimer_init.patch
Description: Text document
