Re: super slow speeds from director

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: super slow speeds from director
From: Roberto Nibali <ratz@xxxxxxxxxxxx>
Date: Thu, 22 Feb 2007 08:48:08 +0100
Hello Matthew,

 Recently we've had this affliction where if you go to www.omnovia.com,
everything is super ass slow. But if you go to wwwdb1.omnovia.com (or
wwwdb2) everything is blazing fast.

 I'm talking huge differences here. People on a T1 downloading a file
from www are getting around 10KB/sec and that same file from wwwdb1 is
around 110KB/sec.

I reckon www is mapped to the VIP, wwwdb[12] are mapped to the RS? And we're talking about one file only, correct? Is there a traffic shaper in between your clients and your servers?

 This problem started this morning for the 2nd time. It happened about
2 weeks ago but we did nothing to the setup and the problem seemed to
fix itself. But now that it's happened again we need some answers.

Our config as of right now: (ip addys changed to protect the innocent)

.35  VIP (ip that www.omnovia.com points to)
.50  RS1
.130 RS2

[root@lb1 ~]# ipvsadm -L -n
IP Virtual Server version 1.2.0 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  75.52.166.35:80 rr
  -> 75.52.166.50:80              Tunnel  1      12         95
  -> 75.52.166.130:80             Tunnel  1      12         107
TCP  75.52.166.35:443 rr
  -> 75.52.166.50:443             Tunnel  1      18         1331
  -> 75.52.166.130:443            Tunnel  1      28         1351
TCP  75.52.166.35:3306 rr
  -> 75.52.166.130:3306           Tunnel  1      4          0
  -> 75.52.166.50:3306            Tunnel  1      5          0

We have 1 iptables rule on both RSs to combat the POST packet size issue:

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
TCPMSS     tcp  --  75.52.166.35         0.0.0.0/0           tcp flags:0x16/0x12 TCPMSS set 1440
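
The numbers in that rule can be checked with a little arithmetic. The sketch below assumes IPv4 with 20-byte IP and TCP headers (no options) and one extra 20-byte outer IP header for the IPIP encapsulation used by the Tunnel forwarding method. That the value 1440 was picked for this reason is an assumption, not something stated in the thread, and the function names are only illustrative.

```python
# Header sizes in bytes, assuming IPv4 and no IP or TCP options.
IPV4_HEADER = 20
TCP_HEADER = 20
IPIP_OVERHEAD = 20  # outer IPv4 header added by IPIP encapsulation (assumed)

# TCP flag bits as laid out in the TCP header.
TCP_FLAGS = {
    "FIN": 0x01, "SYN": 0x02, "RST": 0x04,
    "PSH": 0x08, "ACK": 0x10, "URG": 0x20,
}


def max_mss(path_mtu, tunnel_overhead=0):
    """Largest TCP payload that fits in one packet of size path_mtu."""
    return path_mtu - tunnel_overhead - IPV4_HEADER - TCP_HEADER


def flag_names(bits):
    """Names of the TCP flags set in a bit mask such as 0x16."""
    return [name for name, bit in TCP_FLAGS.items() if bits & bit]


if __name__ == "__main__":
    print("MSS on a plain 1500-byte path:", max_mss(1500))
    print("MSS with IPIP encapsulation:  ", max_mss(1500, IPIP_OVERHEAD))
    print("flags examined (mask 0x16):   ", flag_names(0x16))
    print("flags that must be set (0x12):", flag_names(0x12))
```

Read this way, the rule matches SYN/ACK packets sent from the VIP and advertises an MSS that leaves room for one extra IP header on the client-to-director-to-RS path.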

I've been pretty passive on the LVS list for a couple of months, so what exactly did I miss with regard to POST packets? Could you send me a link where I can update myself in this matter?

We've had this in place since Jan 3rd so I don't see how suddenly this
could be causing a problem.

Well, it's all dynamic and certain subtle bugs only show up after some amount of time; for example memory leaks or some such.

Can anyone offer any suggestions on what to check, look for, diagnose,
etc on what this problem is?

netstat -i
netstat -s
dmesg -s 1000000
grep . /proc/sys/net/ipv4/*
cat /proc/slabinfo

And of course: real time tcpdumps of one flow when it happens.
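
When comparing the director against the real servers, it can help to reduce the interface counters to just the non-zero error and drop columns. The following is a minimal sketch that reads the standard Linux /proc/net/dev table (the per-interface counters that `netstat -i` summarizes); the function name and the choice of which counters to watch are assumptions made for illustration.

```python
def interface_problems(proc_net_dev_text):
    """Return {iface: {counter: value}} for non-zero error/drop counters."""
    problems = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in proc_net_dev_text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        fields = [int(value) for value in counters.split()]
        if len(fields) < 16:
            continue  # unexpected layout, skip rather than guess
        # Receive columns are fields 0-7, transmit columns are fields 8-15.
        watched = {
            "rx_errs": fields[2],
            "rx_drop": fields[3],
            "tx_errs": fields[10],
            "tx_drop": fields[11],
            "tx_colls": fields[13],
        }
        nonzero = {key: value for key, value in watched.items() if value}
        if nonzero:
            problems[name.strip()] = nonzero
    return problems


if __name__ == "__main__":
    try:
        with open("/proc/net/dev") as handle:
            print(interface_problems(handle.read()))
    except OSError as error:
        print("could not read /proc/net/dev:", error)
```

Running it on the director and on each RS while the slowdown is happening, and again when things are fast, gives two small dictionaries that are easy to diff.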

I'd also add that my connection from home doesn't have this problem.
 From home, www, wwwdb1, wwwdb2 are all blazing fast. But we just had a
customer call from Chicago who was getting slow speeds, and here in our
office it's slow as well to www but not to db1 or db2.

Is this reproducible? If so, could you check your MSS sizes in your routing cache? BTW, from here in Switzerland, access to all three (VIP, RS1 and RS2) is not too fast either. And from China it's dog slow, and from my account in the US it's rather fast:

(CH) # tracepath www.omnovia.com
 1:  192.168.1.32 (192.168.1.32)                            0.366ms pmtu 1500
 1:  192.168.1.1 (192.168.1.1)                              1.736ms
 2:  212.55.210.209 (212.55.210.209)                      asymm  3   2.252ms
 3:  zhalb-gw1-fe00-1.cyberlink.ch (195.226.12.1)          11.587ms
 4:  zhalb-cr1.cyberlink.ch (212.55.192.145)               31.591ms
 5:  glbix-br1.cyberlink.ch (212.55.192.198)               94.863ms
 6:  pos5-0.gw4.zur4.alter.net (139.4.71.37)               47.533ms
 7:  so-3-0-0.XR2.ZUR4.ALTER.NET (146.188.4.193)          asymm  8  95.024ms
 8:  so-1-0-0.TR2.ZUR3.ALTER.NET (146.188.5.133)          asymm  9  47.408ms
 9:  so-2-0-0.IR2.NYC12.ALTER.NET (146.188.8.178)         asymm 10 141.743ms
10:  0.so-1-0-0.IL2.NYC9.ALTER.NET (152.63.23.69)         asymm 11  99.661ms
11:  0.so-7-0-0.XL4.NYC4.ALTER.NET (152.63.17.97)         asymm 12 100.214ms
12:  0.ge-5-1-0.BR2.NYC4.ALTER.NET (152.63.3.122)         asymm 13  99.517ms
13:  204.255.173.54 (204.255.173.54)                      asymm 12 104.101ms
14:  ae-32-56.ebr2.NewYork1.Level3.net (4.68.97.190)      asymm 12 120.824ms
15:  ae-1-100.ebr1.NewYork1.Level3.net (4.69.132.25)      asymm 12 112.435ms
16:  ae-1-100.ebr1.Washington1.Level3.net (4.69.132.29)   asymm 12 116.605ms
17:  ae-2.ebr1.Atlanta2.Level3.net (4.69.132.85)          asymm 12 131.241ms
18:  ae-14-53.car4.Dallas1.Level3.net (4.68.122.80)       asymm 12 148.308ms
19:  ae-14-55.car4.Dallas1.Level3.net (4.68.122.144)      asymm 12 140.875ms
20:  THE-PLANET.car4.Dallas1.Level3.net (4.71.122.2)      asymm 13 177.893ms
21:  te7-2.dsr02.dllstx3.theplanet.com (70.87.253.26)     asymm 14 173.259ms
22:  vl2.car02.dllstx6.theplanet.com (12.96.160.55)       asymm 14 180.037ms
23:  vl2.car02.dllstx6.theplanet.com (12.96.160.55)       asymm 15 191.469ms
24:  23.a6.344a.static.theplanet.com (74.52.166.35)       asymm 15 176.687ms reached
     Resume: pmtu 1500 hops 24 back 15

Please note that my PMTU is set to 1500 for all 24 hops!

(US) # /usr/sbin/tracepath www.omnovia.com
 1?: [LOCALHOST]     pmtu 1500
 1:  virt9.johncompanies.com (69.55.226.161)                0.287ms
 2:  69-55-233-156.in-addr.arpa.johncompanies.com (69.55.233.156) asymm  3   0.803ms
 3:  69-55-233-161.in-addr.arpa.johncompanies.com (69.55.233.161) asymm  2   1.270ms
 4:  69.43.129.83 (69.43.129.83)                          asymm  5   1.356ms
 5:  ge0-0-ext-4.castleaccess.com (69.43.169.68)          asymm  6   1.894ms
 6:  ge-5-1-123.hsa1.SanDiego1.Level3.net (4.79.33.253)   asymm 14   7.518ms
 7:  so-6-1-0.mp2.SanDiego1.Level3.net (4.68.113.37)      asymm 15   7.288ms
 8:  ae-0-0.bbr2.Dallas1.Level3.net (64.159.1.110)        asymm 13  39.331ms
 9:  ae-24-56.car4.Dallas1.Level3.net (4.68.122.176)      asymm 13  56.716ms
10:  THE-PLANET.car4.Dallas1.Level3.net (4.71.122.2)      asymm 15  47.689ms
11:  te7-2.dsr02.dllstx3.theplanet.com (70.87.253.26)     asymm 14  39.762ms
12:  vl22.dsr02.dllstx2.theplanet.com (70.85.127.76)      asymm 15  40.307ms
13:  vl2.car02.dllstx6.theplanet.com (12.96.160.55)       asymm 16  40.022ms
14:  23.a6.344a.static.theplanet.com (74.52.166.35)       asymm 17  40.060ms reached
     Resume: pmtu 1500 hops 14 back 17
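
If traces from several vantage points are collected, the summary lines can be compared mechanically. The sketch below pulls the path MTU, hop counts and final round-trip time out of tracepath output shaped like the two traces above. The regular expressions are written against that layout only, and other tracepath versions may print it differently, so treat the patterns and the function name as assumptions.

```python
import re

# Patterns matched against the layout shown in the traces above (assumed).
RESUME_RE = re.compile(r"Resume:\s+pmtu\s+(\d+)\s+hops\s+(\d+)\s+back\s+(\d+)")
REACHED_RE = re.compile(r"([\d.]+)ms\s+reached")


def summarize_tracepath(output):
    """Return pmtu, hop counts and final RTT, or None if no Resume line."""
    resume = RESUME_RE.search(output)
    if resume is None:
        return None
    pmtu, hops, back = (int(group) for group in resume.groups())
    reached = REACHED_RE.search(output)
    return {
        "pmtu": pmtu,
        "hops": hops,
        "back": back,
        # Different forward and return hop counts hint at asymmetric routing.
        "asymmetric": hops != back,
        "rtt_ms": float(reached.group(1)) if reached else None,
    }


if __name__ == "__main__":
    import sys
    # Example: tracepath www.omnovia.com | python3 this_script.py
    print(summarize_tracepath(sys.stdin.read()))
```

A pmtu below 1500 in any of the summaries would be the first thing to look at, given the tunnel and the MSS rule.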

Cheers,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc
