Hey everyone. I implemented an LVS-NAT/ldirectord setup for a web site and I'm
having some strange problems. What appears to be happening is that somehow the
real servers are not always able to send data back to the client that made the
request. When I view apache's status page (/server-status/) I'll see many
connections in the "W" (Sending Reply) state, and they'll stay there until they
either time out or I restart apache.
From the user's perspective it appears as though the server is not going to
reply, so they click again, starting a new request. The PHP script still
running from the previous attempt eventually gives up, writes the session to
the database, and quits. This overwrites any session updates that were made
after that earlier request started.
If I trace the php scripts I can see that the hangups always occur at points
where data would be flushed to the client, which is what leads me to believe
that the server is somehow not able to get the data back to the client.
I've tried with and without persistence, and it has made no difference.
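To put numbers on the "W" build-up described above, something like the following could be polled against each real server. It is only a sketch: it assumes mod_status's machine-readable output is reachable at /server-status?auto, and the host name in the comment is just an example taken from my pool.

```python
from collections import Counter
from urllib.request import urlopen


def count_scoreboard(status_text):
    """Count worker states from the 'Scoreboard:' line of mod_status ?auto output."""
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            # Each character is one worker slot, e.g. 'W' = sending reply.
            return Counter(line.split(":", 1)[1].strip())
    return Counter()  # no scoreboard line found


def fetch_status(url, timeout=5):
    """Fetch the status page, e.g. http://web01/server-status?auto (assumed URL)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("ascii", "replace")


# Offline example with a made-up scoreboard line:
sample = "Total Accesses: 10\nScoreboard: __WW_K.W\n"
print("workers in W:", count_scoreboard(sample)["W"])
```

Logging that count per real server every few seconds should show whether the stuck workers pile up on all nine web servers or only on some of them.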
Here's my setup:
2 directors - running heartbeat, LVS-NAT (kernel is 2.6.18-gentoo-r6), and
ldirectord
11 real servers, 9 in one pool, 2 in the other
ldirectord config:
logfile="local0"
autoreload = yes
checkinterval = 5
checktimeout = 15
quiescent = no
fallback = 127.0.0.1
virtual = xxx.xxx.xxx.1:80
    real = web01:80 masq 1000
    real = web02:80 masq 1000
    real = web03:80 masq 1000
    real = web04:80 masq 1000
    real = web05:80 masq 1000
    real = web06:80 masq 1000
    real = web07:80 masq 1000
    real = web08:80 masq 1000
    real = web09:80 masq 1000
    scheduler = wlc
    persistent = 60
    checktype = 3
    protocol = tcp
    request = "/testpage_web.html"
    receive = "Web server seems to be up."
virtual = xxx.xxx.xxx.1:443
    real = web01:443 masq 1000
    real = web02:443 masq 1000
    real = web03:443 masq 1000
    real = web04:443 masq 1000
    real = web05:443 masq 1000
    real = web06:443 masq 1000
    real = web07:443 masq 1000
    real = web08:443 masq 1000
    real = web09:443 masq 1000
    scheduler = wlc
    persistent = 60
    protocol = tcp
    request = "/testpage_web.html"
    receive = "Web server seems to be up."
virtual = xxx.xxx.xxx.2:80
    real = media01:80 masq 1000
    real = media02:80 masq 1000
    scheduler = wlc
    checktype = 3
    protocol = tcp
    request = "/testpage_media.html"
    receive = "Media server seems to be up."
virtual = xxx.xxx.xxx.2:443
    real = media01:443 masq 1000
    real = media02:443 masq 1000
    scheduler = wlc
    protocol = tcp
    request = "/testpage_media.html"
    receive = "Media server seems to be up."
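For comparison with what ldirectord reports, the request/receive check above can be approximated by hand against a single real server. This is a rough sketch only: the function names are made up, it does a plain substring match (ldirectord's own matching may differ), and it only covers the port 80 services.

```python
from urllib.request import urlopen


def body_matches(body, receive):
    """True if the expected 'receive' text appears in the response body."""
    return receive in body


def check_real_server(host, port, request, receive, timeout=15):
    """Fetch `request` from one real server over HTTP and look for `receive`.

    timeout=15 mirrors checktimeout in the config above.
    """
    url = "http://%s:%d%s" % (host, port, request)
    try:
        with urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
    except OSError:
        # Connection refused, timeout, HTTP error, etc. count as a failed check.
        return False
    return body_matches(body, receive)


# Example call, commented out because it needs network access to a real server:
# check_real_server("web01", 80, "/testpage_web.html", "Web server seems to be up.")
```

Running it from the director and from another machine on the real-server network might help separate a health-check problem from a return-path problem.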
As this is my first LVS setup I'm not really sure where to look next, so any
suggestions would be fantastic.
Thanks in advance.
-Brian