I'm working on a system that doesn't quite do what I expect. I'll lay
out the setup with the external IP addresses obfuscated. It's not that
it's super secret, I'm just paranoid enough that I don't like handing
out too much info about the internal configuration of the network.
We're using LVS-DR.
The problem that we are having seems to be with the pop and imap
services being load balanced across the same set of machines. Here's
how the system is laid out, then I'll get more specific with my question
at the end.
We're load balancing smtp across 2 machines (sendmail), pop and imap
across 2 machines (courier-imap), and www across 2 machines (apache).
The webmail box uses imap for authentication, so we want it to connect
to the load balanced VIP (the external address) rather than to a
specific RIP, which would defeat some of the goals of load balancing.
All machines are Gentoo boxen with a 2.6.5 kernel.
The configuration on the director is as follows:
miniip root # ipvsadm --list -n
IP Virtual Server version 1.2.0 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port      Forward Weight ActiveConn InActConn
TCP  64.xxx.xxx.31:25 rr
  -> 10.1.1.240:25           Route   1      0          0
  -> 10.1.1.241:25           Route   1      0          0
TCP  64.xxx.xxx.32:110 rr
  -> 10.1.1.242:110          Route   1      0          0
  -> 10.1.1.243:110          Route   1      0          0
TCP  64.xxx.xxx.34:80 rr
  -> 10.1.1.245:80           Route   1      0          0
  -> 10.1.1.244:80           Route   1      0          0
TCP  64.xxx.xxx.33:143 rr
  -> 10.1.1.242:143          Route   1      0          0
  -> 10.1.1.243:143          Route   1      0          0
miniip root # ifconfig 2>&1 | egrep -v "(RX|TX|collisions)"
eth0 Link encap:Ethernet HWaddr 00:90:27:E0:1E:81
inet addr:64.xxx.xxx.6 Bcast:64.14.201.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth0:1 Link encap:Ethernet HWaddr 00:90:27:E0:1E:81
inet addr:64.xxx.xxx.33 Bcast:64.14.201.255 Mask:255.255.255.255
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth0:2 Link encap:Ethernet HWaddr 00:90:27:E0:1E:81
inet addr:64.xxx.xxx.32 Bcast:64.14.201.255 Mask:255.255.255.255
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth0:3 Link encap:Ethernet HWaddr 00:90:27:E0:1E:81
inet addr:64.xxx.xxx.31 Bcast:64.14.201.255 Mask:255.255.255.255
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth0:4 Link encap:Ethernet HWaddr 00:90:27:E0:1E:81
inet addr:64.xxx.xxx.34 Bcast:64.14.201.255 Mask:255.255.255.255
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth1 Link encap:Ethernet HWaddr 00:50:DA:7C:54:F9
inet addr:10.1.1.15 Bcast:10.1.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:18 Base address:0x1080
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
The 64.xxx.xxx.6 address is not part of the load balanced system. We
have not put static arp entries in the router for the load balanced
IPs. We would rather not, and we think it should not be necessary, but
we will if it comes to that.
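For reference, a table like the ipvsadm listing above can be built with
commands along the following lines. This is a sketch, not the exact
commands that were run: VIP_PREFIX is a placeholder for the obfuscated
external prefix, add_service and lvs_commands are names made up for
illustration, and the script only prints the commands so they can be
reviewed before being run on the director.

```shell
#!/bin/sh
# Sketch only: prints the ipvsadm commands instead of running them.
# VIP_PREFIX is a placeholder; the real external prefix is obfuscated.
VIP_PREFIX="${VIP_PREFIX:-64.xxx.xxx}"

# add_service VIP_HOST PORT RIP...
# Emits one virtual service (-A, round robin scheduler) and one
# direct-routing (-g) realserver entry per RIP, each with weight 1.
add_service() {
    vip="$VIP_PREFIX.$1"
    port="$2"
    shift 2
    echo "ipvsadm -A -t $vip:$port -s rr"
    for rip in "$@"; do
        echo "ipvsadm -a -t $vip:$port -r $rip:$port -g -w 1"
    done
}

lvs_commands() {
    add_service 31 25  10.1.1.240 10.1.1.241   # smtp
    add_service 32 110 10.1.1.242 10.1.1.243   # pop
    add_service 33 143 10.1.1.242 10.1.1.243   # imap
    add_service 34 80  10.1.1.245 10.1.1.244   # www
}

lvs_commands
```

Note that pop and imap are two separate virtual services (two VIPs)
that happen to point at the same pair of realservers.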
Here is what the pop and imap realservers look like:
mail1 root # ifconfig
eth0 Link encap:Ethernet HWaddr 00:0B:DB:95:1B:50
inet addr:10.1.1.242 Bcast:10.255.255.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:737079 errors:0 dropped:0 overruns:0 frame:0
TX packets:697588 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:143763814 (137.1 Mb) TX bytes:135652668 (129.3 Mb)
Interrupt:16 Memory:fcf30000-fcf40000
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING NOARP MTU:16436 Metric:1
RX packets:822 errors:0 dropped:0 overruns:0 frame:0
TX packets:822 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:66285 (64.7 Kb) TX bytes:66285 (64.7 Kb)
lo:0 Link encap:Local Loopback
inet addr:64.xxx.xxx.32 Mask:255.255.255.255
UP LOOPBACK RUNNING NOARP MTU:16436 Metric:1
RX packets:822 errors:0 dropped:0 overruns:0 frame:0
TX packets:822 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:66285 (64.7 Kb) TX bytes:66285 (64.7 Kb)
lo:1 Link encap:Local Loopback
inet addr:64.xxx.xxx.33 Mask:255.255.255.255
UP LOOPBACK RUNNING NOARP MTU:16436 Metric:1
RX packets:822 errors:0 dropped:0 overruns:0 frame:0
TX packets:822 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:66285 (64.7 Kb) TX bytes:66285 (64.7 Kb)
To try and avoid the arp problem:
echo '2' > /proc/sys/net/ipv4/conf/lo/arp_announce
echo '1' > /proc/sys/net/ipv4/conf/lo/arp_ignore
ifconfig lo -arp
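As a sketch only (this is not what was run on these boxes), here is a
variant of the realserver setup that sets the ARP knobs on "all" and on
eth0 instead of only on lo. The assumption behind it is the kernel's
documented behaviour that the effective arp_ignore and arp_announce
value is the max of conf/all and the conf entry for the interface the
ARP traffic actually uses, which here would be eth0. VIPS holds the
obfuscated placeholders from above, realserver_commands is a made-up
name, and the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the realserver commands instead of running them.
# VIPS holds the two obfuscated pop/imap VIPs (placeholders).
VIPS="${VIPS:-64.xxx.xxx.32 64.xxx.xxx.33}"

realserver_commands() {
    # Assumption: set the knobs on "all" and on eth0 (the interface that
    # receives ARP requests), not only on lo.
    for ifc in all eth0; do
        echo "echo 1 > /proc/sys/net/ipv4/conf/$ifc/arp_ignore"
        echo "echo 2 > /proc/sys/net/ipv4/conf/$ifc/arp_announce"
    done
    # One loopback alias per VIP, with a host netmask.
    n=0
    for vip in $VIPS; do
        echo "ifconfig lo:$n $vip netmask 255.255.255.255 up"
        n=$((n + 1))
    done
}

realserver_commands
```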
mail1 root # route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
64.xxx.xxx.34   10.1.1.15       255.255.255.255 UGH   0      0        0 eth0
64.xxx.xxx.31   10.1.1.15       255.255.255.255 UGH   0      0        0 eth0
10.1.1.0        0.0.0.0         255.255.255.0   U     0      0        0 eth0
127.0.0.0       127.0.0.1       255.0.0.0       UG    0      0        0 lo
0.0.0.0         10.1.1.1        0.0.0.0         UG    0      0        0 eth0
Note that we had to add host routes for the VIPs of the other LVS'd
services (www and smtp), pointing at the director. I do not know why
this was necessary, but once we added them, connections to and from the
smtp and www realservers started working consistently.
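Host routes like the two in the routing table above can be added with
net-tools commands along these lines. This is a reconstruction from the
resulting table, not a record of what was typed; the variable and
function names are made up for illustration, and the script only prints
the commands.

```shell
#!/bin/sh
# Sketch only: prints the route commands instead of running them.
DIRECTOR_RIP="10.1.1.15"                    # director's eth1 address
OTHER_VIPS="64.xxx.xxx.34 64.xxx.xxx.31"    # www and smtp VIPs (obfuscated)

host_route_commands() {
    # One host route per VIP, sent via the director on the 10.1.1.0 net.
    for vip in $OTHER_VIPS; do
        echo "route add -host $vip gw $DIRECTOR_RIP dev eth0"
    done
}

host_route_commands
```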
The www and smtp realservers look similar except for IP addresses and
that they each only have 1 VIP.
In summary we're load balancing 64.xxx.xxx.31 across two smtp machines,
64.xxx.xxx.32 and 64.xxx.xxx.33 across two pop/imap machines, and
64.xxx.xxx.34 across two www machines. I can draw an ASCII picture of
this if necessary, though it will be tight.
Question: Load balancing across the www and smtp machines works great
from the outside AND from the other load balanced machines. Load
balancing across the pop/imap machines works fine from the outside,
inconsistently from the other load balanced machines, and never from my
workstation. Can anyone explain why? Can anyone suggest a fix? Is
there any more information required to try and pinpoint this problem?
I just ssh'd to the two imap machines and ran tests at the time of this
message. It load balanced incoming imap requests from an external IP,
and from one of the www boxen, but did NOT load balance from my
workstation (a 192.168.100.* address, across two Cisco routers). Using
pop, it load balanced properly for both external and internal IP
addresses. Repeating the test with imap, it only worked properly from
the external IP and the www box (but not from my workstation).
I have a theory that load balancing two different services across the
same two machines is causing the arp issue that (I think) I am seeing.
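One way to test the arp theory, sketched under assumptions: if a
realserver were answering ARP for a VIP, a host or router on the VIP's
subnet would hold a MAC for that VIP other than the director's eth0
HWaddr shown above. classify_mac is a made-up helper that just does the
comparison; the arp lookup itself is left as a commented usage line
because it has to run on a machine in that subnet.

```shell
#!/bin/sh
DIRECTOR_MAC="00:90:27:E0:1E:81"   # director eth0 HWaddr from ifconfig above

# classify_mac MAC
# Prints "director" if MAC matches the director's eth0 (case-insensitive),
# otherwise "other" (e.g. a realserver answering ARP for the VIP).
classify_mac() {
    mac=$(printf '%s' "$1" | tr 'a-f' 'A-F')
    if [ "$mac" = "$DIRECTOR_MAC" ]; then
        echo director
    else
        echo other
    fi
}

# Usage on a host in the VIP's subnet (not run here):
#   mac=$(arp -n 64.xxx.xxx.33 | awk '/ether/ {print $3}')
#   classify_mac "$mac"

classify_mac "00:0B:DB:95:1B:50"   # mail1's eth0 HWaddr; prints "other"
```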
Any comments and suggestions would be appreciated.
--
Regards... Todd
They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety. --Benjamin Franklin
Linux kernel 2.6.3-8mdkenterprise 2 users, load average: 0.02, 0.03, 0.00