so, i have a pretty complex (for me, that is) setup on this one machine
that acts as a nameserver and mail server and some other stuff and answers
to a handful of ips. it's also a "real server" behind an lvs director.
the machine in question is running a modified redhat 6.2 with a 2.2.17ext3
kernel (stock 2.2.17 + ext3 patches + nfs patches).
let me try to describe this as best i can.
our external network is 64.211.224.160/28. 161 is the router/gateway to
the rest of the world. 162 is an auth nameserver. 163 is an auth
nameserver. 164 is the ip used for outgoing connections from behind
masquerading. 165 is for web traffic. 166 is for incoming mail. and i
just put 169 in as a standalone machine.
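to make the address plan above a bit more concrete, here is a minimal python sketch of the /28 and the roles listed so far. the `ROLES` dict and the `describe` helper are names made up for illustration; only the addresses and roles come from the description above.

```python
import ipaddress

# the external network described above
EXTERNAL_NET = ipaddress.ip_network("64.211.224.160/28")

# roles keyed by last octet, as listed above (illustrative structure only)
ROLES = {
    161: "router/gateway",
    162: "auth nameserver",
    163: "auth nameserver",
    164: "masquerading source",
    165: "web",
    166: "incoming mail",
    169: "standalone machine",
}


def describe(last_octet):
    """Return (address, role) after checking the address is a usable host in the /28."""
    addr = ipaddress.ip_address(f"64.211.224.{last_octet}")
    reserved = (EXTERNAL_NET.network_address, EXTERNAL_NET.broadcast_address)
    if addr not in EXTERNAL_NET or addr in reserved:
        raise ValueError(f"{addr} is not a usable host in {EXTERNAL_NET}")
    return addr, ROLES.get(last_octet, "unassigned")


if __name__ == "__main__":
    print("netmask:  ", EXTERNAL_NET.netmask)
    print("broadcast:", EXTERNAL_NET.broadcast_address)
    for octet in sorted(ROLES):
        addr, role = describe(octet)
        print(addr, role)
```

the netmask and broadcast it computes should line up with the 255.255.255.240 and 64.211.224.175 values used in the configs further down.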
the 164 masquerading server allows the nameserver/mailserver to send
requests to the outside world:
MASQ all ------ 192.168.1.21 0.0.0.0/0 n/a
the lvs director basically handles all incoming traffic and forwards it to
the right place:
IP Virtual Server version 1.0.0-beta1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 64.211.224.165:443 lc persistent 360
-> 192.168.1.101:443 Route 1 0 0
-> 192.168.1.102:443 Route 1 0 0
UDP 64.211.224.162:53 lc
-> 192.168.1.11:53 Route 1 0 349
UDP 64.211.224.163:53 lc
-> 192.168.1.12:53 Route 1 0 183
TCP 64.211.224.163:53 lc
-> 192.168.1.12:53 Route 1 0 0
TCP 64.211.224.162:53 lc
-> 192.168.1.11:53 Route 1 0 0
TCP 64.211.224.166:22 lc
-> 192.168.1.21:22 Route 1 0 0
TCP 64.211.224.168:22 lc
-> 192.168.1.21:22 Route 1 16 0
TCP 64.211.224.166:25 lc
-> 192.168.1.21:25 Route 1 0 0
TCP 64.211.224.165:80 lc
-> 192.168.1.101:80 Route 1 0 3
-> 192.168.1.102:80 Route 1 0 1
then there's the "phl" machine which handles dns and mail:
[root@phl /root]# /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:D0:B7:65:EC:48
inet addr:192.168.1.21 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:24535885 errors:0 dropped:0 overruns:0 frame:0
TX packets:24655159 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
Interrupt:11 Base address:0x2800
eth0:0 Link encap:Ethernet HWaddr 00:D0:B7:65:EC:48
inet addr:192.168.1.11 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:11 Base address:0x2800
eth0:1 Link encap:Ethernet HWaddr 00:D0:B7:65:EC:48
inet addr:192.168.1.12 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:11 Base address:0x2800
eth0:2 Link encap:Ethernet HWaddr 00:D0:B7:65:EC:48
inet addr:192.168.1.13 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:11 Base address:0x2800
eth0:3 Link encap:Ethernet HWaddr 00:D0:B7:65:EC:48
inet addr:192.168.1.14 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:11 Base address:0x2800
eth0:4 Link encap:Ethernet HWaddr 00:D0:B7:65:EC:48
inet addr:192.168.1.10 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:11 Base address:0x2800
eth1 Link encap:Ethernet HWaddr 00:C0:95:E2:85:40
inet addr:192.168.2.21 Bcast:192.168.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:20102464 errors:0 dropped:0 overruns:0 frame:0
TX packets:19892838 errors:6 dropped:0 overruns:3 carrier:6
collisions:0 txqueuelen:100
Interrupt:11 Base address:0x3000
eth1:0 Link encap:Ethernet HWaddr 00:C0:95:E2:85:40
inet addr:192.168.2.13 Bcast:192.168.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:11 Base address:0x3000
eth1:1 Link encap:Ethernet HWaddr 00:C0:95:E2:85:40
inet addr:192.168.2.14 Bcast:192.168.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:11 Base address:0x3000
eth1:2 Link encap:Ethernet HWaddr 00:C0:95:E2:85:40
inet addr:192.168.2.10 Bcast:192.168.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:11 Base address:0x3000
eth2 Link encap:Ethernet HWaddr 00:C0:95:E2:85:41
inet addr:192.168.3.21 Bcast:192.168.3.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:74336 errors:0 dropped:0 overruns:0 frame:0
TX packets:111705 errors:16 dropped:0 overruns:2 carrier:28
collisions:0 txqueuelen:100
Interrupt:10 Base address:0x3080
eth2:0 Link encap:Ethernet HWaddr 00:C0:95:E2:85:41
inet addr:192.168.3.13 Bcast:192.168.3.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:10 Base address:0x3080
eth2:1 Link encap:Ethernet HWaddr 00:C0:95:E2:85:41
inet addr:192.168.3.14 Bcast:192.168.3.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:10 Base address:0x3080
eth2:2 Link encap:Ethernet HWaddr 00:C0:95:E2:85:41
inet addr:192.168.3.10 Bcast:192.168.3.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:10 Base address:0x3080
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:3924 Metric:1
RX packets:191349 errors:0 dropped:0 overruns:0 frame:0
TX packets:191349 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
lo:0 Link encap:Local Loopback
inet addr:64.211.224.162 Mask:255.255.255.240
UP LOOPBACK RUNNING MTU:3924 Metric:1
lo:1 Link encap:Local Loopback
inet addr:64.211.224.163 Mask:255.255.255.240
UP LOOPBACK RUNNING MTU:3924 Metric:1
lo:2 Link encap:Local Loopback
inet addr:64.211.224.166 Mask:255.255.255.240
UP LOOPBACK RUNNING MTU:3924 Metric:1
lo:3 Link encap:Local Loopback
inet addr:64.211.224.168 Mask:255.255.255.240
UP LOOPBACK RUNNING MTU:3924 Metric:1
[root@phl /root]# /sbin/route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
64.211.224.166  0.0.0.0         255.255.255.255 UH    0      0        0 lo
192.168.2.10    0.0.0.0         255.255.255.255 UH    0      0        0 eth1
192.168.2.13    0.0.0.0         255.255.255.255 UH    0      0        0 eth1
192.168.1.21    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.3.21    0.0.0.0         255.255.255.255 UH    0      0        0 eth2
64.211.224.162  0.0.0.0         255.255.255.255 UH    0      0        0 lo
64.211.224.163  0.0.0.0         255.255.255.255 UH    0      0        0 lo
192.168.2.14    0.0.0.0         255.255.255.255 UH    0      0        0 eth1
192.168.1.11    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.1.10    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.3.10    0.0.0.0         255.255.255.255 UH    0      0        0 eth2
192.168.1.13    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.3.13    0.0.0.0         255.255.255.255 UH    0      0        0 eth2
192.168.2.21    0.0.0.0         255.255.255.255 UH    0      0        0 eth1
192.168.1.12    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
64.211.224.168  0.0.0.0         255.255.255.255 UH    0      0        0 lo
192.168.1.14    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.3.14    0.0.0.0         255.255.255.255 UH    0      0        0 eth2
64.211.224.160  0.0.0.0         255.255.255.240 U     0      0        0 eth0
192.168.3.0     0.0.0.0         255.255.255.0   U     0      0        0 eth2
192.168.2.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
[root@phl /root]# cat /etc/sysctl.conf
# Disables packet forwarding
net.ipv4.ip_forward = 1
# Enables source route verification
net.ipv4.conf.all.rp_filter = 1
# Disables automatic defragmentation (needed for masquerading, LVS)
net.ipv4.ip_always_defrag = 0
# Disables the magic-sysrq key
kernel.sysrq = 1
# -tcl.
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.eth0.send_redirects = 0
net.ipv4.conf.all.hidden = 1
net.ipv4.conf.lo.hidden = 1
#
[root@phl /root]# tail --lines 30 /etc/rc.d/rc.local
#
# -tcl.
#
# the whole static-routes / network scripts / lo:# / gateway being on a
# different device than ips on the same network / bl ah blah lah sajdhsd.
# totally flaky. let's just do it all here.
#
/sbin/sysctl -p
/sbin/route add -net 64.211.224.160 netmask 255.255.255.240 dev eth0
#/sbin/route add default gw 64.211.224.161 dev eth0
##/sbin/arp -s 64.211.224.161 00:30:B6:67:00:40
/sbin/arp -s 64.211.224.161 00:30:B6:67:00:AA
#/sbin/ip rule add prio 100 from 192.168.1.0/24 table 100
#/sbin/ip route add table 100 0/0 via 192.168.1.1 dev eth0
/sbin/ifconfig lo:0 64.211.224.162 netmask 255.255.255.240 broadcast 64.211.224.175 up
/sbin/route add -host 64.211.224.162 dev lo:0
/sbin/ifconfig lo:1 64.211.224.163 netmask 255.255.255.240 broadcast 64.211.224.175 up
/sbin/route add -host 64.211.224.163 dev lo:1
/sbin/ifconfig lo:2 64.211.224.166 netmask 255.255.255.240 broadcast 64.211.224.175 up
/sbin/route add -host 64.211.224.166 dev lo:2
/sbin/ifconfig lo:3 64.211.224.168 netmask 255.255.255.240 broadcast 64.211.224.175 up
/sbin/route add -host 64.211.224.168 dev lo:3
#/sbin/ip rule add prio 33000 from 192.168.1.0/24 table 100
/sbin/ip route add table 100 0/0 via 192.168.1.1 dev eth0
#/sbin/ip rule add prio 34000 from 0/0 table 200
/sbin/ip route add table 200 0/0 via 64.211.224.161 dev eth0
/sbin/ip rule add prio 33000 from 64.211.224.160/28 table 200
/sbin/ip rule add prio 34000 from 0/0 table 100
#
[root@phl /root]# ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup 253
33000: from 64.211.224.160/28 lookup 200
34000: from all lookup 100
[root@phl /root]# ip route
64.211.224.166 dev lo scope link src 64.211.224.166
192.168.2.10 dev eth1 scope link src 192.168.2.10
192.168.2.13 dev eth1 scope link src 192.168.2.13
192.168.1.21 dev eth0 scope link
192.168.3.21 dev eth2 scope link
64.211.224.162 dev lo scope link src 64.211.224.162
64.211.224.163 dev lo scope link src 64.211.224.163
192.168.2.14 dev eth1 scope link src 192.168.2.14
192.168.1.11 dev eth0 scope link src 192.168.1.11
192.168.1.10 dev eth0 scope link src 192.168.1.10
192.168.3.10 dev eth2 scope link src 192.168.3.10
192.168.1.13 dev eth0 scope link src 192.168.1.13
192.168.3.13 dev eth2 scope link src 192.168.3.13
192.168.2.21 dev eth1 scope link
192.168.1.12 dev eth0 scope link src 192.168.1.12
64.211.224.168 dev lo scope link src 64.211.224.168
192.168.1.14 dev eth0 scope link src 192.168.1.14
192.168.3.14 dev eth2 scope link src 192.168.3.14
64.211.224.160/28 dev eth0 scope link
192.168.3.0/24 dev eth2 proto kernel scope link src 192.168.3.21
192.168.2.0/24 dev eth1 proto kernel scope link src 192.168.2.21
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.21
127.0.0.0/8 dev lo scope link
[root@phl /root]# ip route list table 100
default via 192.168.1.1 dev eth0
[root@phl /root]# ip route list table 200
default via 64.211.224.161 dev eth0
[root@phl /root]#
the end result of all this is that, for example, a nameservice query gets
directed through the lvs director to the phl real server, which answers it
via direct routing. phl can also reach the outside world to deliver mail
and make dns queries of its own via the masquerading server. the policy
routing says that traffic with a source ip in 64.211.224.160/28 gets sent
via 64.211.224.161 (direct routing instead of nat/masq), whereas traffic
with any other source ip goes through 192.168.1.1 and gets masqueraded.
the 192.168.2 and 192.168.3 networks (and any others on there) can be
ignored.
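here is a rough python model of how i understand that rule selection to work: rules are tried in priority order, a rule applies only if the source ip matches its selector, and the first table that has a route for the destination wins. it is a simplification under stated assumptions, not what the kernel actually runs: the `local` table (prio 0) and table 253 are left out, `main` only carries two of the on-link routes, and 198.51.100.7 is just a documentation address standing in for "somewhere on the internet".

```python
import ipaddress


def net(cidr):
    return ipaddress.ip_network(cidr)


# routing tables: (destination prefix, gateway), gateway None means on-link.
# "main" is trimmed to two of the on-link routes from the `ip route` output.
TABLES = {
    "main": [(net("64.211.224.160/28"), None), (net("192.168.1.0/24"), None)],
    "200": [(net("0.0.0.0/0"), "64.211.224.161")],
    "100": [(net("0.0.0.0/0"), "192.168.1.1")],
}

# rules: (priority, source selector, table), taken from the `ip rule` output.
RULES = [
    (32766, net("0.0.0.0/0"), "main"),
    (33000, net("64.211.224.160/28"), "200"),
    (34000, net("0.0.0.0/0"), "100"),
]


def lookup(src, dst):
    """Return (table, gateway) for the first rule whose table can route dst."""
    src = ipaddress.ip_address(src)
    dst = ipaddress.ip_address(dst)
    for _prio, selector, table in sorted(RULES, key=lambda rule: rule[0]):
        if src not in selector:
            continue  # rule does not apply to this source address
        matches = [(prefix, gw) for prefix, gw in TABLES[table] if dst in prefix]
        if matches:
            # longest prefix wins inside a table
            _prefix, gw = max(matches, key=lambda m: m[0].prefixlen)
            return table, gw
    return None


if __name__ == "__main__":
    print(lookup("64.211.224.162", "198.51.100.7"))
    print(lookup("192.168.1.21", "198.51.100.7"))
    print(lookup("64.211.224.162", "64.211.224.169"))
```

in this model a reply sourced from 64.211.224.162 leaves via table 200, traffic sourced from 192.168.1.21 falls through to table 100, and anything destined for the /28 itself is already answered by `main` before the policy rules are reached.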
/me breathes.
ok. so all that has been working perfectly for months. then i added a
machine on 64.211.224.169 to handle mail for our employees, plus a few
other things. for example, mail to @mybiz-inc.com gets delivered to
64.211.224.169, while mail to @mybiz.com gets directed to 64.211.224.166
(through the lvs director and on to phl).
the problem is that phl can't send traffic to 64.211.224.169 -- phl seems
to think that 64.211.224.169 is on its loopback interface. 64.211.224.169
tries to make nameservice queries for 169.160-175.224.211.64.in-addr.arpa
and *.mybiz.com to 64.211.224.162 and 64.211.224.163 (the auth nameservers
for those zones, both handled by phl), but phl never responds. phl also
tries to deliver mail to 64.211.224.169, but it can't send traffic there.
check out:
[root@phl /root]# tcpdump -n host 64.211.224.169 and not port 53 &
[1] 20668
User level filter, protocol ALL, datagram packet socket
tcpdump: listening on all devices
[root@phl /root]# ping -n -c 5 64.211.224.169
PING 64.211.224.169 (64.211.224.169) from 64.211.224.169 : 56(84) bytes of data.
14:04:36.653475 lo > 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:36.653475 lo < 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:36.653506 lo > 64.211.224.169 > 64.211.224.169: icmp: echo reply
14:04:36.653506 lo < 64.211.224.169 > 64.211.224.169: icmp: echo reply
64 bytes from 64.211.224.169: icmp_seq=0 ttl=255 time=63 usec
14:04:37.649412 lo > 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:37.649412 lo < 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:37.649430 lo > 64.211.224.169 > 64.211.224.169: icmp: echo reply
14:04:37.649430 lo < 64.211.224.169 > 64.211.224.169: icmp: echo reply
64 bytes from 64.211.224.169: icmp_seq=1 ttl=255 time=34 usec
14:04:38.649446 lo > 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:38.649446 lo < 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:38.649462 lo > 64.211.224.169 > 64.211.224.169: icmp: echo reply
14:04:38.649462 lo < 64.211.224.169 > 64.211.224.169: icmp: echo reply
64 bytes from 64.211.224.169: icmp_seq=2 ttl=255 time=28 usec
14:04:39.649495 lo > 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:39.649495 lo < 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:39.649516 lo > 64.211.224.169 > 64.211.224.169: icmp: echo reply
14:04:39.649516 lo < 64.211.224.169 > 64.211.224.169: icmp: echo reply
64 bytes from 64.211.224.169: icmp_seq=3 ttl=255 time=37 usec
14:04:40.649527 lo > 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:40.649527 lo < 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:40.649545 lo > 64.211.224.169 > 64.211.224.169: icmp: echo reply
14:04:40.649545 lo < 64.211.224.169 > 64.211.224.169: icmp: echo reply
64 bytes from 64.211.224.169: icmp_seq=4 ttl=255 time=31 usec
--- 64.211.224.169 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/mdev = 0.028/0.038/0.063/0.014 ms
[root@phl /root]# fg
tcpdump -n host 64.211.224.169 and not port 53
158 packets received by filter
[root@phl /root]#
when 169 tries to telnet to 166 port 25 (which gets directed to phl):
[root@phl /root]# tcpdump -n host 64.211.224.169 and not port 53
User level filter, protocol ALL, datagram packet socket
tcpdump: listening on all devices
14:05:20.460200 eth0 B arp who-has 64.211.224.169 tell 64.211.224.162
14:05:50.883915 eth0 B arp who-has 64.211.224.166 tell 64.211.224.169
14:05:50.884155 eth0 < 64.211.224.169.1058 > 64.211.224.166.smtp: S 4151665104:4151665104(0) win 32120 <mss 1460,sackOK,timestamp 25658644 0,nop,wscale 0> (DF)
14:05:53.879424 eth0 < 64.211.224.169.1058 > 64.211.224.166.smtp: S 4151665104:4151665104(0) win 32120 <mss 1460,sackOK,timestamp 25658944 0,nop,wscale 0> (DF)
725 packets received by filter
no response is ever sent.
when phl tries to send mail to mybiz-inc.com:
[root@phl /root]# dnsmx mybiz-inc.com
0 mail.mybiz-inc.com
[root@phl /root]# dnsip mail.mybiz-inc.com
64.211.224.169
[root@phl /root]# telnet 64.211.224.169 25
Trying 64.211.224.169...
Connected to inc.mybiz.com (64.211.224.169).
Escape character is '^]'.
220 phl.usa.mybiz ESMTP
^]q
Connection closed.
[root@phl /root]#
14:07:39.001323 lo > 64.211.224.169.1549 > 64.211.224.169.smtp: S 4291120419:4291120419(0) win 31072 <mss 3884,sackOK,timestamp 441773751 0,nop,wscale 0> (DF)
14:07:39.001323 lo < 64.211.224.169.1549 > 64.211.224.169.smtp: S 4291120419:4291120419(0) win 31072 <mss 3884,sackOK,timestamp 441773751 0,nop,wscale 0> (DF)
14:07:39.001367 lo > 64.211.224.169.smtp > 64.211.224.169.1549: S 200723:200723(0) ack 4291120420 win 31072 <mss 3884,sackOK,timestamp 441773751 441773751,nop,wscale 0> (DF)
14:07:39.001367 lo < 64.211.224.169.smtp > 64.211.224.169.1549: S 200723:200723(0) ack 4291120420 win 31072 <mss 3884,sackOK,timestamp 441773751 441773751,nop,wscale 0> (DF)
14:07:39.001390 lo > 64.211.224.169.1549 > 64.211.224.169.smtp: . 1:1(0) ack 1 win 31072 <nop,nop,timestamp 441773751 441773751> (DF)
14:07:39.001390 lo < 64.211.224.169.1549 > 64.211.224.169.smtp: . 1:1(0) ack 1 win 31072 <nop,nop,timestamp 441773751 441773751> (DF)
14:07:39.007531 lo > 64.211.224.169.smtp > 64.211.224.169.1549: P 1:26(25) ack 1 win 31072 <nop,nop,timestamp 441773752 441773751> (DF)
14:07:39.007531 lo < 64.211.224.169.smtp > 64.211.224.169.1549: P 1:26(25) ack 1 win 31072 <nop,nop,timestamp 441773752 441773751> (DF)
14:07:39.007570 lo > 64.211.224.169.1549 > 64.211.224.169.smtp: . 1:1(0) ack 26 win 31047 <nop,nop,timestamp 441773752 441773752> (DF)
14:07:39.007570 lo < 64.211.224.169.1549 > 64.211.224.169.smtp: . 1:1(0) ack 26 win 31047 <nop,nop,timestamp 441773752 441773752> (DF)
Connected to inc.mybiz.com (64.211.224.169).
it tries to send to itself.
does anyone have any idea why phl would think 64.211.224.169 is on its lo?
it seems to think that for all of 64.211.224.160/28. if i telnet to port
25 on any ip in that range, phl directs the request to itself on lo just
like 169.
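one thing that might be worth checking (a guess, not something verified on phl): the lo:N aliases in rc.local are brought up with netmask 255.255.255.240, so each alias nominally covers the whole /28 rather than a single host. the python sketch below only does the prefix arithmetic; whether the 2.2.17 kernel really treats everything inside a loopback alias's prefix as local is an assumption here, and `ip route list table local` on phl should confirm it or rule it out. the `LO_ALIASES` dict and both helper names are made up for illustration.

```python
import ipaddress

# lo aliases as configured in rc.local: name -> (address, netmask)
LO_ALIASES = {
    "lo:0": ("64.211.224.162", "255.255.255.240"),
    "lo:1": ("64.211.224.163", "255.255.255.240"),
    "lo:2": ("64.211.224.166", "255.255.255.240"),
    "lo:3": ("64.211.224.168", "255.255.255.240"),
}


def covered_prefix(addr, netmask):
    """Network implied by an address/netmask pair."""
    return ipaddress.ip_interface(f"{addr}/{netmask}").network


def aliases_covering(target, aliases=LO_ALIASES):
    """Names of the aliases whose configured prefix contains target."""
    target = ipaddress.ip_address(target)
    return sorted(
        name
        for name, (addr, mask) in aliases.items()
        if target in covered_prefix(addr, mask)
    )


if __name__ == "__main__":
    # with the /28 netmask, every alias's prefix contains 64.211.224.169
    print(aliases_covering("64.211.224.169"))
    # with a host netmask (255.255.255.255) the alias covers only itself
    host_only = {"lo:0": ("64.211.224.162", "255.255.255.255")}
    print(aliases_covering("64.211.224.169", host_only))
```

if that assumption holds, it would fit the symptom that every ip in 64.211.224.160/28 ends up on lo, and a host netmask on the aliases would be the obvious thing to try.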
anyone even understand this? heh. i'm seriously confused myself.
i'd love to hear any ideas.
-tcl.