To: linux-ha@xxxxxx, lvs-users@xxxxxxxxxxxxxxxxxxxxxx, "Domsch, Matt" <Matt_Domsch@xxxxxxxx>, "Rusk, Mark" <Mark_Rusk@xxxxxxxx>, "Painter, Shane" <Shane_Painter@xxxxxxxx>, piranha-list@xxxxxxxxxx
Subject: Bug: Persistence doesn't associate TCP-->UDP connections to same server
From: Michael E Brown <Michael_E_Brown@xxxxxxxx>
Date: Mon, 23 Oct 2000 10:48:43 -0500
Hi all,

   I have encountered a showstopper bug while trying to create an LVS
cluster, and I need some help with it.  Basically, we have a process
that uses NFS and FTP that we would like to port to an LVS cluster.

   Before getting into NFS issues, please understand that I realize all
of the drawbacks of using NFS against an LVS cluster, and they are not a
factor in our case.  Again, we have a special process; I understand the
failover issues between two NFS servers with different inode numbers for
the same file, and that is not an issue in our process.

  Our test setup looks something like this:


(1 -- 150) Clients ------> LVS ---+-----> FTP/NFS Server
                                  |-----> FTP/NFS Server
                                  |-----> FTP/NFS Server
                                  |-----> FTP/NFS Server

The Linux clients are running kernel 2.2.12.  The LVS server is running
Red Hat's kernel 2.2.16-4.  I don't have access to the servers right now
(they are in another building, on a separate test LAN), or I would
include the configs as well.  I was trying to use Red Hat Piranha out of
the box, but basically ended up doing the following manually on the LVS
server, of course (no pulse or lvs daemons running):

# mark all packets destined for the virtual IP with firewall mark 1
ipchains -A input -d vir.ip.addr -m 1
# fwmark-based virtual service: round-robin, 60-second persistence
ipvsadm -A -f 1 -s rr -p 60
# realservers added via IP tunneling (-i), equal weights
ipvsadm -a -f 1 -r real.server.1.ip -i -w 1
ipvsadm -a -f 1 -r real.server.2.ip -i -w 1
# ipvsadm -a -f 1 -r real.server.3.ip -i -w 1
# ipvsadm -a -f 1 -r real.server.4.ip -i -w 1
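
(I can't pull the real configs right now, but as a sanity check the
resulting table can be listed with:

ipvsadm -L -n    # list the virtual service table, numeric addresses

which should show the fwmark-1 service with its realservers and the
60-second persistence timeout.)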


What would end up happening with a two-node cluster is that the TCP and
UDP portions of the client's mount request got routed to two separate
servers about half the time.  It looks like this:

Realserver 1
  portmapper  TCP 111
  rpc.mountd  UDP 743   (or other random port)
  rpc.nfsd    UDP 2049

Realserver 2
  portmapper  TCP 111
  rpc.mountd  UDP 789   (or other random port)
  rpc.nfsd    UDP 2049
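
(You can confirm which random port each rpc.mountd grabbed by querying
each realserver's portmapper directly; the address below is a
placeholder for your realserver:

rpcinfo -p real.server.1.ip    # lists registered RPC programs and ports

rpc.mountd shows up there on whatever port it took at startup.)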

Client --TCP--> LVS Server, port 111 (portmapper):
                    "Which port is rpc.mountd running on?"
                |-----> redirected to Realserver 1

Client --UDP--> LVS Server, port 743:
                    "I want to mount X"
                |-----> redirected to Realserver 2, which sends an
                        ICMP port unreachable back to the client


About half of the clients would work correctly with a two-node cluster,
and half would fail.  And because the connections are persistent, once
a client fails, it is forever after failed.  With a four-node cluster,
about three-quarters of the clients would fail.  (That matches what
you'd expect: the TCP and UDP assignments are made independently, so
with N realservers they only land on the same one roughly 1/N of the
time.)  I have run tcpdump on the LVS server and on each realserver and
seen exactly the exchange above.
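
(If anyone wants to reproduce the capture, something like the following
should show both halves of the exchange; the interface name and address
are placeholders for whatever your setup uses:

tcpdump -n -i eth0 'host vir.ip.addr and (tcp port 111 or udp)'
)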

Possible workarounds:
  1)  Tell the client to use UDP to contact the portmapper, instead of
TCP.  I don't know how to do this; I could find no mention in any
documentation of how to do it.

  2)  Tell the rpc.mountd on each machine to listen on the same port.
Same issue as above; I don't see anything in the rpc.mountd source code
that would indicate that you can tell it what port to listen on.

  3)  Tell LVS to redirect both UDP and TCP connections from the same
client to the same server... This would be the best solution, I think
(a rough sketch of what I mean follows below).
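
To make (3) concrete: the ipchains rule above already gives both the
TCP and UDP packets for the virtual IP the same firewall mark, so the
rules themselves wouldn't need to change; what would need to change is
the persistence lookup, so that it keyed on client address and fwmark
only, instead of on protocol as well:

# TCP and UDP to the VIP both get fwmark 1 already...
ipchains -A input -d vir.ip.addr -m 1
# ...so if the persistent template for a fwmark service ignored the
# protocol, the client's portmapper (TCP) and mount (UDP) requests
# would both end up on the same realserver
ipvsadm -A -f 1 -s rr -p 60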

Thank you,
Michael Brown.