Re: persistence

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: persistence
From: Casey Zacek <cz@xxxxxxxxxxxx>
Date: Wed, 30 Mar 2005 21:41:04 -0600
Graeme Fowler wrote (at Wed, Mar 30, 2005 at 10:05:29PM +0100):
> Output from "ipvsadm -L -n" would be useful (you sent that already), as
> would your keepalived configs (suitably anonymised).

VIP1 is the live site; VIP2 was used to test LVS compatibility for the
site before cutover from another load balancer (which was serving
VIP1), and will be removed in the near future (assuming we can iron
out this bug and don't have to cut back to another LB solution).

global_defs [snipped -- nothing relevant]

vrrp_instance vi_130 {
    interface eth1.19
    lvs_sync_daemon_interface eth1.19
    smtp_alert
    virtual_router_id 130
    priority 100
    authentication {
        auth_type PASS
        auth_pass <passwd>
    }
    virtual_ipaddress {
        VIP1 dev eth0
        VIP2 dev eth0
    }
}

virtual_server_group 130_CustomerName {
    VIP1 0
    VIP2 0
}

virtual_server group 130_CustomerName {
    delay_loop 20
    lb_algo wlc
    lb_kind TUN
    persistence_timeout 7200
    protocol TCP

    real_server RIP1 0 {
        weight 50
        HTTP_GET {
            url {
                path /healthcheckpath.cfm
                digest 30a491d5c7060d2e9d16124950334b57
                status_code 200
            }
            connect_port 80
            connect_timeout 120
            nb_get_retry 4
            delay_before_retry 3
        }
    }
    real_server RIP2 0 {
        weight 50
        HTTP_GET {
            url {
                path /healthcheckpath.cfm
                digest e3fe176953a4e2d4b7a88735e23dead3
                status_code 200
            }
            connect_port 80
            connect_timeout 120
            nb_get_retry 4
            delay_before_retry 3
        }
    }
}

> I'm guessing that
> the patch you mention was actually sent to the keepalived-users list,
> not this one, and was the one for healthchecking Win2K/IIS+Coldfusion;

Yeah, that's the one -- I'm on too many damn lists.  It's rather
amazing to me that this is the same customer who brought that bug to
light as well.  I have lots of other customers on the same LVS servers
who have never reported a problem (aside from The MTU Thing).

> not that healthchecking has an enormous amount to do directly with the
> way your LVS gets set up...

Agreed.

> One thing worth bearing in mind there is your use of "port 0". I'm using
> specific application ports in my LVS (80 and 443 in this case) and the
> persistence works perfectly, all the time, every time. Have you tried it
> with specific ports?

Nope.

> Does the application hand state across sessions on
> different ports (say via a backend DB), or does the persistence really
> only need to be on a single session (on a single port)?

The application is basically just a ColdFusion website with a mixture
of HTTP and HTTPS.  I need each client IP to reach the same realserver
every time (within the timeframe set by the persistence timeout),
regardless of the client port and regardless of the realserver
(== virtualserver in my case) port.

I shouldn't need to create another service entry, splitting port 80
from port 443 virtually, should I?
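
My (possibly wrong) mental model of the port-0 persistence template,
as a toy sketch -- this is not the real ip_vs code, the addresses are
made up, and the "vport 0 matches any port" wildcard rule is my
assumption about the behaviour:

/* Toy model of a persistence template for a port-0 virtual service.
 * Not the actual IPVS structures -- just illustrates why one wildcard
 * template keeps a client's HTTP and HTTPS connections on the same
 * realserver, while separate port 80/443 services would each create
 * their own template. */
#include <stdio.h>
#include <stdint.h>

struct persist_template {
    uint32_t caddr;   /* client IP */
    uint32_t vaddr;   /* virtual IP */
    uint16_t vport;   /* 0 = any destination port on the VIP */
    const char *rip;  /* realserver chosen when the template was made */
};

/* find an existing template for this client; vport 0 is a wildcard */
static const char *lookup(struct persist_template *tab, int n,
                          uint32_t caddr, uint32_t vaddr, uint16_t vport)
{
    for (int i = 0; i < n; i++)
        if (tab[i].caddr == caddr && tab[i].vaddr == vaddr &&
            (tab[i].vport == 0 || tab[i].vport == vport))
            return tab[i].rip;
    return "no template (scheduler picks again)";
}

int main(void)
{
    /* one wildcard template created by the client's first connection */
    struct persist_template tab[] = {
        { 0x0A000001, 0xC0A80001, 0, "RIP1" },   /* made-up addresses */
    };

    /* later port-80 and port-443 hits from the same client both match
     * the single wildcard template, so both stay on RIP1 */
    printf("port 80  -> %s\n", lookup(tab, 1, 0x0A000001, 0xC0A80001, 80));
    printf("port 443 -> %s\n", lookup(tab, 1, 0x0A000001, 0xC0A80001, 443));
    return 0;
}

If that model is right, splitting 80 from 443 would give each port its
own template and defeat exactly the cross-port stickiness I'm after.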

I looked at ip_vs_conn.c a little bit, and I'm tempted to just always
set cp->*port to 1 (because 0 is pseudo-special in some way I haven't
fully grasped yet) for connection hashing/lookup purposes only.  That
should, at the very least, cut out quite a bit of overhead that I
don't need for my desired "sticky" persistence level anyway.  I really
don't want to have to maintain a personal kernel patch, though, and I
don't have the kernel-hacking kung-fu to do it the right way, which
would be to introduce different connection hashing method options.
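
As a toy illustration of the hashing idea (nothing like the actual
ip_vs_conn.c code -- the XOR-fold hash and table size here are
stand-ins I made up):

/* Sketch: leave the client port out of the effective hash key so every
 * connection from one client IP lands on the same key, which is the
 * only granularity I care about for stickiness.  Not kernel code. */
#include <stdio.h>
#include <stdint.h>

#define CONN_TAB_BITS 8
#define CONN_TAB_MASK ((1 << CONN_TAB_BITS) - 1)

/* stand-in hash: fold protocol, address, and port into the table mask */
static unsigned int conn_hashkey(uint16_t proto, uint32_t addr, uint16_t port)
{
    return (proto ^ addr ^ (addr >> 16) ^ port) & CONN_TAB_MASK;
}

int main(void)
{
    uint32_t client = 0xC0A80001;   /* made-up client IP */

    /* per-tuple hashing: every new client port hashes to its own bucket */
    printf("cport 40001 -> bucket %u\n", conn_hashkey(6, client, 40001));
    printf("cport 40002 -> bucket %u\n", conn_hashkey(6, client, 40002));

    /* pinning the port to a constant for hashing/lookup only (what I was
     * tempted to do with cp->*port) collapses them onto one key */
    printf("pinned      -> bucket %u\n", conn_hashkey(6, client, 1));
    printf("pinned      -> bucket %u\n", conn_hashkey(6, client, 1));
    return 0;
}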

I ran this afternoon with /proc/sys/net/ipv4/vs/debug_level set to 10
(I haven't found any docs on what the possible values do), and
hopefully tomorrow I'll coordinate session-failure timeframes with the
customer and compare them against the 5GB log.

-- 
Casey Zacek
Senior Engineer
NeoSpire, Inc.
