[lvs-users] LVS not failing over properly

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: [lvs-users] LVS not failing over properly
From: James Martin <jmartin@xxxxxxxxxxxxxxxxxxx>
Date: Tue, 26 Aug 2008 05:16:15 -0400
I have an LVS-NAT implementation in the lab that sort of works. I have
a primary and a hot-backup LVS node, with two web servers behind them, all
running RHEL/CentOS 5.2. I can happily point my web browser at the
virtual IP and I get the Apache test page just fine. I check the httpd
access logs on the two real web servers and see that the load is being
distributed.
The problem appears when I test failover of the LVS nodes. I
shut the primary node down, and I see that the backup at least attempts
to take over, and seems to do so successfully:

Aug 25 18:21:44 lb2 pulse[5064]: partner dead: activating lvs
Aug 25 18:21:44 lb2 lvs[5083]: starting virtual service glassfish active: 80
Aug 25 18:21:44 lb2 avahi-daemon[3136]: Registering new address record for 10.11.12.10 on eth1.
Aug 25 18:21:44 lb2 avahi-daemon[3136]: Withdrawing address record for 10.11.12.10 on eth1.
Aug 25 18:21:44 lb2 avahi-daemon[3136]: Registering new address record for 10.11.12.10 on eth1.
Aug 25 18:21:44 lb2 avahi-daemon[3136]: Registering new address record for 10.100.13.220 on eth0.
Aug 25 18:21:44 lb2 avahi-daemon[3136]: Withdrawing address record for 10.100.13.220 on eth0.
Aug 25 18:21:44 lb2 avahi-daemon[3136]: Registering new address record for 10.100.13.220 on eth0.
Aug 25 18:21:44 lb2 lvs[5083]: create_monitor for glassfish/gf1 running as pid 5094
Aug 25 18:21:44 lb2 nanny[5094]: starting LVS client monitor for 10.100.13.220:80
Aug 25 18:21:44 lb2 nanny[5095]: starting LVS client monitor for 10.100.13.220:80
Aug 25 18:21:44 lb2 lvs[5083]: create_monitor for glassfish/gf2 running as pid 5095
Aug 25 18:21:44 lb2 nanny[5094]: making 10.11.12.1:80 available
Aug 25 18:21:44 lb2 nanny[5095]: making 10.11.12.2:80 available
Aug 25 18:21:49 lb2 pulse[5085]: gratuitous lvs arps finished


The problem is that after the failover, refreshing the page in my web
browser fails. The lvs.cf is synchronized between the LVS nodes. Here's
a copy of the config:


serial_no = 49
primary = 10.100.13.96
primary_private = 10.11.12.8
service = lvs
backup_active = 1
backup = 10.100.13.87
backup_private = 10.11.12.9
heartbeat = 1
heartbeat_port = 539
keepalive = 6
deadtime = 10
network = nat
nat_router = 10.11.12.10 eth1:1
nat_nmask = 255.255.255.0
debug_level = NONE
monitor_links = 1
virtual glassfish {
    active = 1
    address = 10.100.13.220 eth0:1
    vip_nmask = 255.255.255.0
    port = 80
    send = "GET / HTTP/1.0\r\n\r\n"
    expect = "HTTP"
    use_regex = 0
    load_monitor = none
    scheduler = wlc
    protocol = tcp
    timeout = 6
    reentry = 15
    quiesce_server = 0
    server gf1 {
        address = 10.11.12.1
        active = 1
        weight = 1
    }
    server gf2 {
        address = 10.11.12.2
        active = 1
        weight = 1
    }
}
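
For a config like the one above, one thing that can be ruled out mechanically is an addressing mismatch: the floating NAT router address should sit in the same private subnet as the real servers and the directors' private addresses, and the virtual IP should sit in the same subnet as the directors' public addresses. The following is only a minimal sketch in Python, not part of piranha or pulse; parse_lvs_cf and check_nat_layout are hypothetical helper names, and the parser handles only the simple "key = value" and "name {" layout shown here.

```python
import ipaddress


def parse_lvs_cf(text):
    """Split lvs.cf-style text into global, virtual-service and per-server settings.

    Hypothetical helper: assumes a single 'virtual' block and the simple
    'key = value' layout shown in the post.
    """
    top, virtual, servers = {}, {}, {}
    depth = 0
    current_server = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("{"):
            depth += 1
            parts = line.split()
            if parts[0] == "server":
                current_server = parts[1]
                servers[current_server] = {}
            continue
        if line == "}":
            depth -= 1
            current_server = None
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if current_server is not None:
            servers[current_server][key] = value
        elif depth == 0:
            top[key] = value
        else:
            virtual[key] = value
    return top, virtual, servers


def check_nat_layout(top, virtual, servers):
    """Return a list of human-readable addressing problems (empty if none found)."""
    problems = []

    # nat_router is "ADDRESS DEVICE", e.g. "10.11.12.10 eth1:1".
    router_ip = top["nat_router"].split()[0]
    private_net = ipaddress.ip_network(
        f"{router_ip}/{top['nat_nmask']}", strict=False
    )
    for name, settings in servers.items():
        if ipaddress.ip_address(settings["address"]) not in private_net:
            problems.append(f"real server {name} is outside {private_net}")
    for key in ("primary_private", "backup_private"):
        if ipaddress.ip_address(top[key]) not in private_net:
            problems.append(f"{key} is outside {private_net}")

    # The virtual address is also "ADDRESS DEVICE".
    vip = virtual["address"].split()[0]
    public_net = ipaddress.ip_network(
        f"{vip}/{virtual['vip_nmask']}", strict=False
    )
    for key in ("primary", "backup"):
        if ipaddress.ip_address(top[key]) not in public_net:
            problems.append(f"{key} is outside {public_net}")
    return problems
```

Calling check_nat_layout(*parse_lvs_cf(open("/etc/sysconfig/ha/lvs.cf").read())) on the config posted here should return an empty list, since every address falls in the expected /24. What this cannot check is the real servers themselves: with LVS-NAT their default gateway has to be the floating nat_router address (10.11.12.10 here) rather than either director's own private address, and that is worth confirming by hand on gf1 and gf2.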


I believe the problem lies with ARP, but I'm not sure how to diagnose
it. There are no firewalls between my browser and the LVS nodes, and I'm
using a fairly dumb 100 Mbit switch (I also tried a smarter switch).
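
One way to test the ARP theory is to look at which MAC address the client (or the router in front of it, if the browser is on another subnet) has cached for the virtual IP after the failover. The sketch below is only an illustration in Python: it assumes a Linux client where the output of "ip neigh show" is available, vip_owner and neighbour_table are hypothetical helper names, and the director names and MAC addresses in the usage note are placeholders, not values from this setup.

```python
import re

# Matches lines such as:
#   10.100.13.220 dev eth0 lladdr 00:16:3e:aa:bb:01 REACHABLE
# Entries without a resolved MAC (e.g. "... dev eth0  FAILED") are skipped.
NEIGH_RE = re.compile(
    r"^(?P<ip>\S+)\s+dev\s+(?P<dev>\S+)"
    r"(?:\s+lladdr\s+(?P<mac>[0-9a-fA-F:]{17}))?"
)


def neighbour_table(output):
    """Map IP address -> lower-case MAC from 'ip neigh show' style output."""
    table = {}
    for line in output.splitlines():
        match = NEIGH_RE.match(line.strip())
        if match and match.group("mac"):
            table[match.group("ip")] = match.group("mac").lower()
    return table


def vip_owner(output, vip, director_macs):
    """Name the director whose MAC is cached for the VIP.

    Returns None if the VIP has no resolved entry, and "unknown" if the
    cached MAC belongs to neither director.
    """
    mac = neighbour_table(output).get(vip)
    if mac is None:
        return None
    for name, director_mac in director_macs.items():
        if director_mac.lower() == mac:
            return name
    return "unknown"
```

With director_macs set to something like {"lb1": "<eth0 MAC of the primary>", "lb2": "<eth0 MAC of the backup>"}, a result of "lb1" for 10.100.13.220 after the primary has been shut down would suggest the gratuitous ARPs from lb2 were not received or not honored by that host. A result of "lb2" would point away from ARP and toward the NAT path instead, for example IP forwarding on lb2 or the real servers' default gateway.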

Any help would be greatly appreciated.

Thanks,

James



