To: "'Julian Anastasov'" <ja@xxxxxx>
Subject: Re: [lvs-users] ipvs connections sync and CPU usage
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: "Aleksey Chudov" <aleksey.chudov@xxxxxxxxx>
Date: Thu, 12 Jan 2012 18:23:45 +0200
Hello Julian,

I successfully patched Linux kernel 2.6.39.4 with the "port 0", "HZ/10" and
"sync" patches.
After rebooting and switching the Backup server to the Master state I see an
increase in sync traffic and CPU load.
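
For reference, the sync setup under test is the standard IPVS master/backup
sync daemon pair, started with something like the following (the multicast
interface and syncid here are just placeholders, not our exact values):

# on the Master (sends sync messages)
ipvsadm --start-daemon master --mcast-interface eth1 --syncid 1

# on the Backup (receives sync messages)
ipvsadm --start-daemon backup --mcast-interface eth1 --syncid 1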


1. "port 0", "HZ/10" patches on Master, "port 0", "HZ/10" patches on Backup

Master # sysctl -a | grep net.ipv4.vs.sync
net.ipv4.vs.sync_version = 1
net.ipv4.vs.sync_threshold = 3  10

Results: sync traffic 60 Mbit/s, 4000 packets/sec, 40 %sys CPU on Backup, 40
%soft CPU on Master

Backup counters as a percentage of Master's (here and below):
PersistConn: 93.5305%
ActiveConn: 98.1211%
InActConn: 99.4691%
(all less on the Backup server)
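
For reference, these sync parameters are runtime-tunable through the usual
proc interface (sync_refresh_period and sync_retries exist only with the
"sync" patch applied); the values below are just an example:

echo "0 0" > /proc/sys/net/ipv4/vs/sync_threshold
echo 10    > /proc/sys/net/ipv4/vs/sync_refresh_period
echo 0     > /proc/sys/net/ipv4/vs/sync_retries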


2. "port 0", "HZ/10" and "sync" patches on Master, "port 0", "HZ/10" patches
on Backup

Master # sysctl -a | grep net.ipv4.vs.sync
net.ipv4.vs.sync_version = 1
net.ipv4.vs.sync_threshold = 3  10
net.ipv4.vs.sync_refresh_period = 0
net.ipv4.vs.sync_retries = 0

Results: sync traffic 300 Mbit/s (rose from 200 to 300 Mbit/s over the first
10 minutes after start), 25000 packets/sec,
98 %sys CPU on Backup, 70 - 90 %soft CPU on Master (all cores)

PersistConn: 99.6622%
ActiveConn: 250.664%
InActConn: 35.802%

Yes, the number of ActiveConn entries on the Backup server is more than twice
that on the Master! Memory usage on the Backup server also increased.
And this is the only test in which the load on the Master server increased.
So my impression is that something went wrong here.


3. "port 0", "HZ/10" and "sync" patches on Master, "port 0", "HZ/10" patches
on Backup

Master # sysctl -a | grep net.ipv4.vs.sync
net.ipv4.vs.sync_version = 1
net.ipv4.vs.sync_threshold = 0  0
net.ipv4.vs.sync_refresh_period = 10
net.ipv4.vs.sync_retries = 0

Results: sync traffic 90 Mbit/s, 8000 packets/sec, 70 %sys CPU on Backup, 40
%soft CPU on Master

               PersistConn  ActiveConn  InActConn
Master:            4897491     7073690    7663812
Backup:            5057332     7073285    7625001
Backup/Master:     103.26%      99.99%     99.49%
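
(The percentage row is just Backup divided by Master for each counter, e.g.
5057332 / 4897491 = 103.26% for PersistConn.)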


4. "port 0", "HZ/10" and "sync" patches on Master, "port 0", "HZ/10" patches
on Backup

Master # sysctl -a | grep net.ipv4.vs.sync
net.ipv4.vs.sync_version = 1
net.ipv4.vs.sync_threshold = 0  0
net.ipv4.vs.sync_refresh_period = 100
net.ipv4.vs.sync_retries = 0

Results: sync traffic 60 Mbit/s, 5000 packets/sec, 50 %sys CPU on Backup, 40
%soft CPU on Master

               PersistConn  ActiveConn  InActConn
Master:            5170205     7270767    7808097
Backup:            5036484     7244686    7716304
Backup/Master:      97.41%      99.64%     98.82%


5. "port 0", "HZ/10" and "sync" patches on Master, "port 0", "HZ/10" patches
on Backup

Master # sysctl -a | grep net.ipv4.vs.sync
net.ipv4.vs.sync_version = 1
net.ipv4.vs.sync_threshold = 0  0
net.ipv4.vs.sync_refresh_period = 1000
net.ipv4.vs.sync_retries = 0

Results: sync traffic 45 Mbit/s, 4000 packets/sec, 40 %sys CPU on Backup, 40
%soft CPU on Master

               PersistConn  ActiveConn  InActConn
Master:            5226648     7691901    8251039
Backup:            5100281     7576195    8159248
Backup/Master:      97.58%      98.50%     98.89%

Of course, these are quick results; more time is needed to get accurate
counters.

Do I understand correctly that the maximum safe sync_refresh_period depends on
the persistence timeout?

Have you thought about the possibility of distributing the sync load across
multiple processors?


I looked closely at the ipvsadm -lnc output and found that we have the
following connections:

pro expire state       source              virtual            destination
IP  05:02  NONE        Client IP:0         0.0.0.1:0          Real IP:0
TCP 14:10  ESTABLISHED Client IP:50610     Virtual IP:80      Real IP:80
TCP 14:10  ESTABLISHED Client IP:50619     Virtual IP:443     Real IP:443
...

Do we really need the "Virtual IP:Port" information for fwmark services? Can
we use "0.0.0.1:80", or better "0.0.0.1:0"?
1. With "0.0.0.1:80" we can sync connections to LVS servers with different
VIPs (in different data centers, for example) - very useful for scalability.
2. With "0.0.0.1:0" we can reduce the number of connection entries by
aggregating them.
Are there any potential problems?
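
For context, a minimal sketch of the kind of fwmark-based service behind the
output above (addresses, mark value, scheduler and persistence timeout are
placeholders, not our real configuration):

# mark traffic to the VIP in the mangle table
iptables -t mangle -A PREROUTING -d 192.0.2.10/32 -p tcp --dport 80 -j MARK --set-mark 1
iptables -t mangle -A PREROUTING -d 192.0.2.10/32 -p tcp --dport 443 -j MARK --set-mark 1

# define the IPVS service by fwmark instead of VIP:port, with persistence
ipvsadm -A -f 1 -s wlc -p 300
ipvsadm -a -f 1 -r 10.0.0.11 -g
ipvsadm -a -f 1 -r 10.0.0.12 -g

The 0.0.0.1:0 entry in the ipvsadm -lnc output is the persistence template
for fwmark 1 (IPVS stores the mark value as the template's virtual address),
which is why the per-connection VIP:port looks redundant for fwmark services.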



Best regards,
Aleksey

