Re: [lvs-users] RFC: Forking ldirecterd [PATCH]

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] RFC: Forking ldirecterd [PATCH]
From: Simon Horman <horms@xxxxxxxxxxxx>
Date: Thu, 29 Nov 2007 17:05:39 +0900
On Wed, Nov 28, 2007 at 01:56:18AM -0800, Ryan Castellucci wrote:
> The attached patch modifies ldirectord to fork a process for each
> virtual server to speed up response time with large numbers of virtual
> servers. I am testing this vs multiple instances of ldirectord, two
> virtual servers, three real servers each, and it uses about 25MB less
> ram over that, and starts up a lot quicker.

Hi Ryan,

This patch seems very nice, thanks.

> Other things to note
> 
> $0 is set for the children so you can see what virtual server each is
> managing, and what real server it's checking from ps

This will probably work on Linux, but it probably won't work on
Solaris - they believe that changing $0 and having that reflected
in ps is a security problem because it allows people to hide processes
- i.e. I can hide "fork-bomb" as "/bin/sh". This isn't a big problem
with regard to your patch, just something I thought might
be interesting.

> All children are supervised by the parent, and restarted if they exit.

Nice

> Due to issues with state tracking, when a child starts, it forces all
> of its real servers down until it rechecks them.  This fixed issues
> with the state of the real servers changing between when a child dies
> and when it is restarted.

I think it would be nice if it could be a bit more clever,
leaving the real-servers alone and toggling them only as necessary.
Do you think this is at all possible?
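
A rough sketch of the "toggle only as necessary" idea, in Python rather
than ldirectord's Perl. Everything here is hypothetical: plan_changes,
the dictionaries and the addresses are illustrative names, and it
assumes a restarted child can learn the last known state of each
real-server (for example by reading the current LVS table) before it
touches anything.

```python
def plan_changes(known, checked):
    """Compare last known state with fresh check results.

    known   -- dict mapping real-server -> True (up) / False (down), as it was
               before the child restarted (assumed to be obtainable)
    checked -- dict mapping real-server -> True / False from the new checks

    Returns (to_enable, to_disable). Servers whose state did not change
    appear in neither list, so they are never forced down and back up.
    A server missing from `known` is treated as currently down.
    """
    to_enable = sorted(
        rs for rs, up in checked.items() if up and not known.get(rs, False)
    )
    to_disable = sorted(
        rs for rs, up in checked.items() if not up and known.get(rs, False)
    )
    return to_enable, to_disable


if __name__ == "__main__":
    known = {"10.0.0.1:80": True, "10.0.0.2:80": True, "10.0.0.3:80": False}
    checked = {"10.0.0.1:80": True, "10.0.0.2:80": False, "10.0.0.3:80": True}
    print(plan_changes(known, checked))
```

With something along these lines, 10.0.0.1:80 keeps serving traffic
across the restart, and only the two servers whose state actually
changed are touched.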

> Reloading the config kills all children due to not being able to muck
> about with their state.  Due to reasons stated above, this may cause a
> brief service interruption.

Killing the children seems fine, but interrupting the service
will likely annoy many people.

> I'd like feedback on this patch, and any constructive criticism,
> suggestions, bug fixes, etc are welcome.
> 
> Do please note that I coded this after being sick and awake for about
> 20 hours straight (cough was keeping me up), so it probably isn't my
> best work.
> 
> Standard disclaimer: This is not well tested code.  Don't run it in
> your massive data center.  If you do anyway, I'm not responsible for
> any failures that result from its use.

I wonder if you could make this a configuration option.
We could initially set it to off to give people a chance to test it,
then make it the default later if it is successful.

Another request: would it be possible to make the diff
against the ldirectord.in that is in the linux-ha Mercurial tree?

Lastly, there is some (perhaps overly complex) logic in ldirectord to
only test a real-server once if it appears in multiple virtual services.
I think that your changes will basically disable that code - so perhaps
it could be removed if your code is successful. This would be nice as
the complexity of that code has been a pain to maintain.

On the other hand, that code was added for a real-life case of many, many
virtual services all with the same real-server, which seems like it
would be a pathological case for your change. So perhaps we need to keep
the old way too.
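
To make the trade-off concrete, here is a minimal sketch of the
"test each real-server once" behaviour in a single process. It is
Python, not the actual ldirectord code, and check_all, probe and the
service names are hypothetical. It also assumes the same check is valid
for a real-server in every virtual service that uses it, which is a
simplification.

```python
def check_all(virtual_services, probe):
    """Check every real-server once, even if several services share it.

    virtual_services -- dict mapping virtual service -> list of real-servers
    probe            -- callable taking a real-server and returning True/False

    Returns a dict mapping virtual service -> {real-server: status}.
    """
    results = {}  # cache of probe results, shared across virtual services
    status = {}
    for vs, real_servers in virtual_services.items():
        status[vs] = {}
        for rs in real_servers:
            if rs not in results:
                # First time this real-server is seen in this pass: probe it.
                results[rs] = probe(rs)
            status[vs][rs] = results[rs]
    return status
```

With one forked child per virtual service there is no shared cache like
`results`, so a real-server that appears in N virtual services would be
probed N times per check interval instead of once.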

-- 
Horms


