
Re: [lvs-users] TCP connection dropping ~7% of the time

To: LinuxVirtualServer.org users mailing list. <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] TCP connection dropping ~7% of the time
From: Jay Faulkner <jay.faulkner@xxxxxxxxxxxxx>
Date: Tue, 21 Jul 2009 16:17:58 -0500
Comments inline.

Jason Faulkner
Linux Engineer, Rackspace Email & Apps
jason.faulkner@xxxxxxxxxxxxx
o: (540) 443-2101 (ex. 505-2101)

> > The problem we're experiencing now is that somewhere
> > between 3% and 7% of all connections are dropping -
> 
> you don't say what "dropping" is. Is it entries disappearing
> from the ipvsadm table? clients getting RST?
> 

I'm not sure, to be honest. This is the "needle" in the haystack -- out of 
maybe 60k connections, 1 will fail, and I've yet to find it on a packet capture.

> > same behavior you'd see with an iptables DROP rule or a
> > missing return route. We aren't seeing the issue when we
> > transition from a LVS to a direct DNAT to a VIP.
> 
> this is a weird one. Looks a bit like arp hopping, so
> let's assume it's a routing problem. Let's look for weird
> reasons.
> 
> o make sure there are no iptables rules on any of the
> machines.
> 

There are iptables rules on the machine, but they didn't change at all between 
it working and it not working.

> o test with local client (not through proxy) connecting to
> VIP
> 
> o make sure there is no physical route from the realserver
> to the client except through the DIP - ie no routes that an
> ICMP redirect could change. check your routes with
> 
> ip route show
> 
> (not `route -n`)
> 
> o test with a simple client. telnet is best (to port 23)
> then next best is telnet to port 80 and do a GET
> 

We've never reproduced the problem hitting the VIP directly from anywhere but 
the server the proxy lives on. From that server, we see basic connection 
failures, similar to an iptables DROP, regardless of client (we wrote an 
automated Python testing tool).
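
For anyone wanting to run a similar test, a connection probe along these lines can be sketched in a few lines of Python. This is a minimal sketch and not the actual tool referred to above: the function names, the attempt count, the timeout, and the 192.0.2.10 address (a documentation-only placeholder for the VIP) are all assumptions.

```python
import socket


def probe(host, port, attempts=100, timeout=3.0):
    """Open `attempts` TCP connections in sequence; return how many failed.

    A refused, reset, or timed-out connect all count as a failure, which is
    roughly what a client sees behind an iptables DROP or a missing route.
    """
    failures = 0
    for _ in range(attempts):
        try:
            # create_connection performs the full TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:  # includes timeouts and connection refused
            failures += 1
    return failures


def failure_rate(failures, attempts):
    """Fraction of attempts that failed (0.0 when nothing was attempted)."""
    return failures / attempts if attempts else 0.0


if __name__ == "__main__":
    # Hypothetical VIP and port; replace with the service under test.
    n = 1000
    failed = probe("192.0.2.10", 80, attempts=n)
    print(f"{failed}/{n} failed ({failure_rate(failed, n):.1%})")
```

Running the same probe from the proxy host and from another machine makes it easy to compare failure rates between the two paths.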

The routes are clean, and the traceroute is the same both ways (so there isn't 
a circular routing problem).

This looks so much like it has to be an issue with the network on the side of 
the proxy server, but as soon as we moved it to a DNAT the problem resolved 
itself. My conjecture has been that perhaps ipvs has some sort of limit on the 
number of active/inactive connections originating from the same IP address?
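
One way to gather evidence for or against that conjecture is to count the director's connection entries per source address. The sketch below is only an illustration: it assumes the connection listing (for example from `ipvsadm -L -n -c`) has six whitespace-separated columns in the order pro, expire, state, source, virtual, destination, and the function name is hypothetical. It handles IPv4 `address:port` entries only.

```python
from collections import Counter


def count_by_source(listing):
    """Count connection entries per (source IP, state).

    `listing` is the text of an IPVS connection listing. Assumed row layout:
    pro expire state source virtual destination. Header lines and anything
    that does not match that shape are skipped.
    """
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] not in ("TCP", "UDP"):
            continue
        state = fields[2]
        # Strip the port from "address:port" (IPv4 only in this sketch).
        source_ip = fields[3].rsplit(":", 1)[0]
        counts[(source_ip, state)] += 1
    return counts
```

If all proxied traffic arrives from one address, that address should dominate the counts, and comparing its totals at the time of a failure against quieter periods would show whether failures line up with a high number of entries.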

Thanks,
Jay

_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
or go to http://lists.graemef.net/mailman/listinfo/lvs-users
