Re: firewall marks + tunneling + persistence = ERR! state

To: Joseph Mack NA3T <jmack@xxxxxxxx>
Subject: Re: firewall marks + tunneling + persistence = ERR! state
Cc: " users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: Horms <horms@xxxxxxxxxxxx>
Date: Wed, 29 Nov 2006 11:34:17 +0900
On Tue, Nov 28, 2006 at 02:14:33PM -0800, Joseph Mack NA3T wrote:
> On Tue, 28 Nov 2006, Jaroslav Libák wrote:
> >Hello
> >
> >When i run that I get some connections with ERR! state.
> I'll let Horms handle that.

I will look into it and get back to you.

> >When I click refresh in firefox several times while viewing load
> >balanced page, I get a FIN_WAIT connection for every refresh. So I
> >set tcpfin parameter using ipvsadm to 15 seconds to get rid of them
> >fast (is this ok btw?, it was like 2 minutes before which I think is
> >way too long).
> tcp timeouts have the values they do for a good reason. If you
> understand your system and are prepared to deal with the consequences
> of changing the timeouts, then this being a GPL project you can go
> ahead and change anything you like.

There has long been a plan to allow the timeout values to be manipulated
from user space. I think it actually was possible using /proc at some
stage, but the code was removed for various (good) reasons. Then there
was a plan to implement the feature by extending the sysctl interface.
I suspect that this, or using sysfs, is currently the option preferred
by the upstream kernel guys.

A really worthwhile contribution to LVS would be to complete this
code. I can find out from the upstream people what their preferred option
for implementing this is if you are interested in having a crack at it.
I don't imagine the code will be that hard.

> >What is worse, I get "established" connection on the slave for every
> >refresh.  I have read this is due to a simplification in the
> >synchronization code.
> the simplification being that the backup only has to track connections
> that it will take over if it becomes the master.

I understand that your concern is memory pressure on the slave in
the case of a DoS attack. And it is true that the simplification
in the synchronisation protocol can exacerbate that problem.
However, by doing it this way the synchronisation traffic is actually
reduced, including in the case of a DoS attack. So expanding it
may actually just move the problem elsewhere.

Keeping in mind that a connection entry is in the vicinity of 128
bytes, it is my opinion that unless you have an extremely small amount
of memory available on the system to start with, DoSing the machine in
this way is quite hard. I did try once, DoSing a box from itself, and
basically the default timeouts were easily able to keep up with the DoS,
and I think the total memory used never exceeded a few hundred MB.
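
As a rough back-of-the-envelope check of the figures above, the sketch
below estimates connection table memory from an entry size and a number
of tracked entries. The 128-byte entry size is the approximate figure
quoted above; the function names, the example connection counts and the
256 MiB budget are purely illustrative assumptions, not measurements.

```python
# Rough estimate only: assumes ~128 bytes per connection entry (the
# approximate figure mentioned above) and ignores allocator overhead.
ENTRY_BYTES = 128


def conn_table_bytes(connections, entry_bytes=ENTRY_BYTES):
    """Approximate memory used by `connections` tracked entries."""
    return connections * entry_bytes


def connections_for_budget(budget_bytes, entry_bytes=ENTRY_BYTES):
    """How many entries fit in a memory budget of `budget_bytes`."""
    return budget_bytes // entry_bytes


if __name__ == "__main__":
    # Hypothetical connection counts, chosen only for illustration.
    for conns in (100_000, 1_000_000, 2_000_000):
        mib = conn_table_bytes(conns) / 2**20
        print(f"{conns:>9} entries ~ {mib:.0f} MiB")
    print(connections_for_budget(256 * 2**20), "entries fit in 256 MiB")
```

Under these assumptions it takes on the order of a couple of million
simultaneously tracked entries to reach a few hundred MB.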

That said, if you have some ideas on how to improve this, then
this is the right place to discuss them.

> >I'm using hash table size 2^20 (which doesn't limit the maximum
> >number of values in it, it just sets the number of rows, then each
> >row has a linked list). Doesn't it cause some slowdown in the LVS?
> have you found a slowdown?

I would be very surprised if increasing the value would cause a
slowdown. It does, however, increase the memory required for the array
that forms the base of the hash: at 20 you are looking at on the order
of 2^20 = 1M entries for that array. For larger values, like 32 (= 4G
entries), this starts to become ridiculous. Decreasing it can, in
theory, cause a slowdown if you have a lot of connections. But in
practice I don't think it does unless you make it very small.

In short, 20 should be fine, though you can probably get
the same performance with 16. 10 is probably a bit too small.
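
To make the trade-off concrete, here is a small sketch of the
arithmetic: the number of rows is 2^bits, the base array grows with it,
and the average linked-list length is connections divided by rows. The
8-byte row size (a pair of 4-byte pointers, as on a 32-bit system), the
function name and the connection count of 100,000 are assumptions for
illustration only, not values taken from the kernel source.

```python
# Illustrative only: bucket_bytes=8 assumes each row of the hash array
# is a pair of 4-byte pointers; on a 64-bit system it would be larger.
def hash_table_stats(bits, connections, bucket_bytes=8):
    """Return row count, base array size and average chain length."""
    buckets = 2 ** bits
    return {
        "buckets": buckets,
        "array_bytes": buckets * bucket_bytes,
        "avg_chain": connections / buckets,
    }


if __name__ == "__main__":
    # 100_000 concurrent connections is a hypothetical load.
    for bits in (10, 16, 20):
        s = hash_table_stats(bits, connections=100_000)
        print(f"bits={bits:2d} buckets={s['buckets']:>8} "
              f"array={s['array_bytes'] / 1024:.0f} KiB "
              f"avg chain={s['avg_chain']:.2f}")
```

With these assumed numbers, 10 bits leaves roughly a hundred entries per
row to walk on each lookup, while 16 and 20 bits both keep the chains
very short, which is why the larger table buys little beyond extra
memory.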

