The problem has been solved. It was related to iptables.
Stopping iptables on the director raises the connection rate from 200
to N x 2000 connections/s, where N is the number of real servers. After
that, I tried to figure out which iptables rule conflicts with ipvs and
found that the default rule generated by system-config-securitylevel
causes this. Replacing "-m state --state NEW -m tcp -p tcp --dport 80"
with just "-m tcp -p tcp --dport 80" makes everything work perfectly.
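For anyone who wants to try the same change, here is a minimal sketch of the edit as a stream filter. It only rewrites rule text and does not touch the running firewall. The function name strip_state_new is made up for this example, and the chain name RH-Firewall-1-INPUT and the file /etc/sysconfig/iptables are assumptions about a typical CentOS 4 install, not something checked on the machines above.

```shell
#!/bin/sh
# Hypothetical helper: remove the conntrack state match from port-80 rules.
# Reads iptables-save style rules on stdin and writes the edited rules to
# stdout. Rules for other ports are passed through unchanged.
strip_state_new() {
    sed -E '/--dport 80( |$)/s/-m state --state NEW //'
}

# Demo on a sample rule (the chain name is an assumed default).
printf '%s\n' \
    '-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT' \
    | strip_state_new
# prints: -A RH-Firewall-1-INPUT -m tcp -p tcp --dport 80 -j ACCEPT

# Possible use on a real box (assumed path; review the result before loading):
#   strip_state_new < /etc/sysconfig/iptables > /tmp/iptables.new
```

Writing to a separate file first makes it easy to diff the result against the original before restarting iptables.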
Roberto Nibali wrote:
Hello,
I want to set up a web cluster on CentOS 4.2 (RHEL4u2). Somehow the
benchmark results show that using LVS is extremely slow compared to a
single-machine web server. It can handle only about 100-200
connections/s, while a single machine can handle up to 2000/s. The
experiment was done using the same machine as the single web server and
as a real server in the LVS pool. My setup details are below.
Hardware Setup
===========
3 nodes web cluster + 1 LVS director
All use the same hardware (ACER with P4 3.0GHz + HyperThreading,
1GB memory and onboard gigabit Ethernet (tg3 module)).
Network: 3Com 8-port gigabit switch (can't remember the model,
sorry). The switch should work fine since it was used in a
working compute cluster until yesterday.
Do you also have a gateway node in your setup where all the clients'
requests come in and are forwarded to the cluster?
Software Setup
==========
Distro: CentOS 4.2 (RHEL 4u2), ipvsadm 1.2.1, kernel version
2.6.9-22.0.2.ELsmp
Benchmark tools: WebStone 2.5 using 50 clients on 15 machines.
Networking: Each machine has only a single Ethernet card. All are
configured with real IPs.
LVS: Direct Routing + WLC (I've tried RR but the result is about the
same)
Director setup: using iproute2 "ip addr add xx" command
Real server setup: using iproute2 on the loopback "ip addr add
xx/32 dev lo scope host", plus arp_announce = 2, arp_ignore = 3,
ip_conntrack_max = 262144
Disable ip_conntrack completely when doing such tests, especially on
the director. Please share the whole command set and output here. Also
the output of ethtool on the involved NICs.
Web server: Apache httpd
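The real server setup listed above can be sketched roughly as follows. This is a dry run that only prints the commands. The function name rs_setup_cmds and the address 192.0.2.10 are placeholders, and applying the ARP sysctls to "all" interfaces is an assumption, since the post does not say which interface they were set on.

```shell
#!/bin/sh
# Sketch only: prints the real-server commands instead of running them.
# The VIP passed in is a placeholder, not an address from the post.
rs_setup_cmds() {
    vip="$1"
    # VIP on the loopback, host scope, as described in the post.
    echo "ip addr add ${vip}/32 dev lo scope host"
    # ARP settings from the post; the "all" scope is an assumption.
    echo "sysctl -w net.ipv4.conf.all.arp_announce=2"
    echo "sysctl -w net.ipv4.conf.all.arp_ignore=3"
}

rs_setup_cmds 192.0.2.10
```

As far as I understand the kernel's ip-sysctl documentation, arp_ignore = 3 makes the kernel skip ARP replies for addresses configured with scope host, which fits adding the VIP to lo with "scope host".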
Am I doing anything wrong? This is the second time I have tried LVS;
the first time was many years ago, when kernel 2.4 had just come out.
Any suspicious kernel log entries (dmesg -s 1000000)? I currently get
12000 HTTP/1.0 requests/s onto a 2.4.x RS using 2.4.x kernel and
/dev/epoll interface.
Please give us some more input, including your exact test procedure.
Best regards,
Roberto Nibali, ratz