Re: RedHat ES3 LVS-Nat - Arp issues

To: Michael Sztachanski <michael.sztachanski@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: RedHat ES3 LVS-Nat - Arp issues
Cc: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: Roberto Nibali <ratz@xxxxxxxxxxxx>
Date: Tue, 28 Sep 2004 11:37:53 +0200
Dear Michael,

Do you use Hubs or Switches?

Cisco Switches, not sure of their config as they are handled by our network 
services department.

Ok, so you definitely have a secluded collision domain. I'm only asking because if you were still using hubs, your L2 collision domain would span further than anticipated and could thus cause the massive number of ARP entries you're seeing in the routing cache.

I'm getting copious amounts of ARP traffic and caching at eth0 on both LVS
routers. I'm expecting 4000 users to go through this LVS; will that much
ARP traffic on the eth0 side kill connections? I have already increased
the ARP cache size to 4096, but I'm still getting overflows.

Which settings did you perform exactly?

Adjusted the gc_thresh from 1024 in  
/proc/sys/net/ipv4/neigh/default/gc_thresh3 to 4096.

Just to make sure, how do the other values in this directory look? Also note that setting ../default/<key>=<value> does not help the current situation: the {default} entries in proc-fs are only used when a new device is created, which is not the case in your setup. So I reckon your /proc/sys/net/ipv4/neigh/{eth0,eth1}/gc_thresh3 values are still as low as they were at boot time. You would need to adjust those as well.
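
A quick sketch of what I mean (the 1024/2048/4096 values are only an example, adjust them to whatever headroom you need):

   # what do the per-device values currently look like?
   grep . /proc/sys/net/ipv4/neigh/eth0/gc_thresh* \
          /proc/sys/net/ipv4/neigh/eth1/gc_thresh*

   # raise the limits on the live interfaces, not only on "default"
   for dev in eth0 eth1; do
       echo 1024 > /proc/sys/net/ipv4/neigh/$dev/gc_thresh1
       echo 2048 > /proc/sys/net/ipv4/neigh/$dev/gc_thresh2
       echo 4096 > /proc/sys/net/ipv4/neigh/$dev/gc_thresh3
   done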

What are your gc_thresh* settings? How big is your neighbour table?

there are over 1900 entries

Either it's the thing I mentioned in the last paragraph, or the dst cache GC doesn't kick in for some reason, which I would be very interested in debugging :).
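
To see how close you actually get to the limit, something like this should do (both views ought to roughly agree):

   # count the current IPv4 neighbour/ARP entries
   ip -4 neigh show | wc -l
   awk 'NR > 1' /proc/net/arp | wc -l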

nat_router = 172.24.24.1 eth1:1
nat_nmask = 255.255.255.0
debug_level = NONE
virtual gnetest {
    active = 1
    address = 10.0.1.99 eth0:1
    vip_nmask = 255.255.248.0

why not 255.255.255.255?
Are you asking about the vip_netmask or the nat_netmask?

The vip_nmask.

The netmasks shown are our internal masks.

Yes, and this is also correct.

The values are as per the RH documentation.
There was no mention of the value you suggested.

Strange. It's not an absolute must, but it's an advantage because the box then only answers the ARP probe for the VIP itself instead of an overlapping probe range.
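
In lvs.cf that is only a one-line change in the virtual block you posted (the rest of the block stays as it is):

   virtual gnetest {
       active = 1
       address = 10.0.1.99 eth0:1
       vip_nmask = 255.255.255.255
       ...
   }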

    port = 80
    persistent = 3600

do you need such a high persistency?

The users talk to a web app on IIS web servers that talk to a database
that requires a minimum of 1 hr persistence.

Ok, I also see from your GUI output that either your application or your webservers are extremely busy serving a request. Must be a complex site.

The VIP should have 255.255.255.255 as a mask. The RH Doco had 255.255.255.0. Sorry for my ignorance. What is the reason for this?

The reason is that the netmask for the VIP overlaps with the primary IP on your physical interface, which then sends ARP replies for both IPs. It would be wise to have only a VIP/32, which would not make the stack reply for the whole {eth0}/21 range.
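
Just to illustrate what the alias should end up looking like (pulse/piranha normally brings it up for you once vip_nmask is set to /32, so this is not something you would need to type by hand in production):

   ifconfig eth0:1 10.0.1.99 netmask 255.255.255.255 broadcast 10.0.1.99 up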

eth0:88   Link encap:Ethernet  HWaddr 00:0D:60:9C:08:86
          inet addr:10.0.1.88  Bcast:10.0.7.255  Mask:255.255.248.0

eth0:89   Link encap:Ethernet  HWaddr 00:0D:60:9C:08:86
          inet addr:10.0.1.89  Bcast:10.0.7.255  Mask:255.255.248.0

What are eth0:88 and eth0:89 for?
These are the external addresses 10.0.1.88 and .89 that I've NATed in
the iptables rules to 172.24.24.2 and .3.
This is so the developers can RDP to each box individually.

Why? It's your internal network. Why can't they be reached over the 10.0.0.0/21 net? Your network setup is rather confusing to me ;). And why do you need to NAT at all to reach a local address?
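
Just so I'm sure I understand your setup, I assume the rules look roughly like this (the RDP port and the exact mapping are guesses on my part, since you didn't post the rules themselves):

   # DNAT the per-server entry points to the real servers for RDP
   iptables -t nat -A PREROUTING -d 10.0.1.88 -p tcp --dport 3389 \
            -j DNAT --to-destination 172.24.24.2
   iptables -t nat -A PREROUTING -d 10.0.1.89 -p tcp --dport 3389 \
            -j DNAT --to-destination 172.24.24.3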

Taken at 16:25 from the RH Web GUI.
IP Virtual Server version 1.0.8 (size=65536)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.0.1.99:80 rr persistent 360000 FFFFFFFF
                                 ^^^^^^
                      just to make sure: this is what you want?

-> 172.24.24.21:80 Masq 1 540 8
-> 172.24.24.22:80 Masq 1 639 1

Those are really big numbers of active connections; I wonder what kind of application takes so long to serve a request.
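
If the 360000 s persistence timeout is not what you intended, you can fix the live table with ipvsadm as a stopgap (sketch only; pulse will reapply whatever is in lvs.cf on the next restart, so correct the config as well):

   # set persistence to 3600 s with per-client (/32) granularity
   ipvsadm -E -t 10.0.1.99:80 -s rr -p 3600 -M 255.255.255.255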

Take care,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc