Re: dns + lvs dr.

To: Julian Anastasov <ja@xxxxxx>
Subject: Re: dns + lvs dr.
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx, tc lewis <tcl@xxxxxxxxx>
From: tc lewis <tcl@xxxxxxxxx>
Date: Sun, 12 Nov 2000 18:17:51 -0500 (EST)


> >     ICMP error => no listener?
> > 
> >     Before I fully understand your message (the ARP talks for
> > example) one question:
> > 
> >     Is the DNS server in the real server(s) started before the VIPs
> > are configured (in the real servers). Look at the logs (/var/log/messages?)
> > whether the DNS server is listening on VIP:53. There can be a problem if
> > you start your network scripts (VIPs, routes, etc) in rc.local while the
> > DNS server was started long before from the rc.d levels. If you
> > use "bind" I assume there is no listener(s) for the VIPs you add after
> > starting the server.
> > 
> >     For the telnet service there is no problem. It just listens for
> > 0.0.0.0 and when you add new VIP later it just works.
> 
> yeah, that stuff is all ok (it actually might not have been on that
> tcpdump--not sure).  so i think it all comes back to the priority
> routing thing.  cause i can't nslookup from the real server (which is the
> name server) for outside domains.  but even with priority routing off
> strange things happen.  ok ignore that.  check the following out.  it
> looks like some odd udp problem, but tcp is ok.
> 
> director:
> UDP  64.211.224.164:53 lc
>   -> 192.168.1.11:53             Route   1      0          2         
> TCP  64.211.224.164:53 lc
>   -> 192.168.1.11:53             Route   1      0          1         
> 
> tcp to port 53:
> 12:18:51.332533 eth1 < 208.219.36.76.61236 > 64.211.224.164.domain: S
> 857678275:857678275(0) win 32120 <mss 1460,sackOK,timestamp 131143386
> 0,nop,wscale 0> (DF)
> 12:18:51.332582 eth1 > 208.219.36.76.61236 > 64.211.224.164.domain: S
> 857678275:857678275(0) win 32120 <mss 1460,sackOK,timestamp 131143386
> 0,nop,wscale 0> (DF)
> 12:18:51.343790 eth1 < 208.219.36.76.61236 > 64.211.224.164.domain: .
> 857678276:857678276(0) ack 721543635 win 32120 <nop,nop,timestamp
> 131143387 251713> (DF)
> 12:18:51.343838 eth1 > 208.219.36.76.61236 > 64.211.224.164.domain: .
> 0:0(0) ack 1 win 32120 <nop,nop,timestamp 131143387 251713> (DF)
> 12:18:53.606644 eth1 < 208.219.36.76.61236 > 64.211.224.164.domain: F
> 0:0(0) ack 1 win 32120 <nop,nop,timestamp 131143613 251713> (DF)
> 12:18:53.606688 eth1 > 208.219.36.76.61236 > 64.211.224.164.domain: F
> 0:0(0) ack 1 win 32120 <nop,nop,timestamp 131143613 251713> (DF)
> 12:18:53.617657 eth1 < 208.219.36.76.61236 > 64.211.224.164.domain: .
> 1:1(0) ack 2 win 32119 <nop,nop,timestamp 131143614 251940> (DF)
> 12:18:53.617694 eth1 > 208.219.36.76.61236 > 64.211.224.164.domain: .
> 1:1(0) ack 2 win 32119 <nop,nop,timestamp 131143614 251940> (DF)
> 12:18:56.330262 eth1 > arp who-has 192.168.1.11 tell 192.168.1.2
> (0:c0:95:e2:a8:b1)
> 12:18:56.330399 eth1 < arp reply 192.168.1.11 is-at 0:d0:b7:65:ec:48
> (0:c0:95:e2:a8:b1)
> (successful)
> 
> udp to port 53 (nslookup):
> 12:19:09.385187 eth1 < 208.219.36.76.61214 > 64.211.224.164.domain: 50126+
> PTR? 164.224.211.64.in-addr.arpa. (45)
> 12:19:09.385222 eth1 > 208.219.36.76.61214 > 64.211.224.164.domain: 50126+
> PTR? 164.224.211.64.in-addr.arpa. (45)
> (unsuccessful)
> 
> 
> 
> real server:
> [root@phl /root]# /sbin/ipchains -L -n
> Chain input (policy ACCEPT):
> target     prot opt     source                destination           ports
> REDIRECT   tcp  ------  0.0.0.0/0            64.211.224.164        * ->
> 53 => 53
> REDIRECT   udp  ------  0.0.0.0/0            64.211.224.164        * ->
> 53 => 53
> Chain forward (policy ACCEPT):
> Chain output (policy ACCEPT):
> 
> (i wasn't using the horms/ipchains method before, but i am now).
> 
> tcp to port 53:
> 12:18:48.170150 eth0 < 208.219.36.76.61236 > 64.211.224.164.domain: S
> 857678275:857678275(0) win 32120 <mss 1460,sackOK,timestamp 131143386
> 0,nop,wscale 0> (DF)
> 12:18:48.170349 eth0 > 64.211.224.164.domain > 208.219.36.76.61236: S
> 721543634:721543634(0) ack 857678276 win 32120 <mss 1460,sackOK,timestamp
> 251713 131143386,nop,wscale 0> (DF)
> 12:18:48.181404 eth0 < 208.219.36.76.61236 > 64.211.224.164.domain: .
> 1:1(0) ack 1 win 32120 <nop,nop,timestamp 131143387 251713> (DF)
> 12:18:50.444385 eth0 < 208.219.36.76.61236 > 64.211.224.164.domain: F
> 1:1(0) ack 1 win 32120 <nop,nop,timestamp 131143613 251713> (DF)
> 12:18:50.444411 eth0 > 64.211.224.164.domain > 208.219.36.76.61236: .
> 1:1(0) ack 2 win 32120 <nop,nop,timestamp 251940 131143613> (DF)
> 12:18:50.444499 eth0 > 64.211.224.164.domain > 208.219.36.76.61236: F
> 1:1(0) ack 2 win 32120 <nop,nop,timestamp 251940 131143613> (DF)
> 12:18:50.455392 eth0 < 208.219.36.76.61236 > 64.211.224.164.domain: .
> 2:2(0) ack 2 win 32119 <nop,nop,timestamp 131143614 251940> (DF)
> 12:18:53.168113 eth0 < arp who-has 192.168.1.11 tell 192.168.1.2
> 12:18:53.168143 eth0 > arp reply 192.168.1.11 (0:d0:b7:65:ec:48) is-at
> 0:d0:b7:65:ec:48 (0:c0:95:e2:a8:b1)
> (successful)
> 
> udp to port 53 (nslookup):
> 12:19:06.223900 eth0 < 208.219.36.76.61214 > 64.211.224.164.domain: 50126+
> PTR? 164.224.211.64.in-addr.arpa. (45)
> 12:19:06.224186 eth0 > 192.168.1.21.domain > 208.219.36.76.61214: 50126
> NXDomain 0/1/0 (116)
> (unsuccessful)
> 
> 
> i ctrl-c'd the nslookups cause they were obviously stalling, hence the
> short udp tcpdump outputs.  i don't know if i can actually do a
> nameservice request over tcp, as i don't really know what one looks like
> to type it in, and nslookup doesn't seem to support tcp, but the port
> connection itself was successful.  the udp one looks odd to me--as if the
> real server is putting its rip as the source ip of the return packets,
> instead of using the vip, but it does use the vip with that tcp
> connection.
> 
> i built these kernels myself, so it's possible i missed something, but i
> don't know what that something would be.
> 
> director is 2.2.17 with ipvs 1.0.0beta and ext3 0.0.3b (i think).
> real server is 2.2.17 with ext3 0.0.3b (ditto).
> 
> comments?  i'm confused.


bah.  the ipchains redirect method won't work here as i can't get named to
bind to an ip that isn't configured on any local interface, so i'm trying
hidden dummy interfaces now.  it's doing all sorts of wack shit.  this is so
frustrating.  i should get my queries working before trying this with lvs
tho, so once again let's ignore all the above.  and this part has little
to do with lvs, but i have no clue where else to turn.
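
for the udp capture quoted above, the symptom (the reply leaving with the rip 192.168.1.21 as source instead of the vip) can be spotted mechanically by pairing each query with its reply and comparing addresses.  below is a rough python sketch, not a finished tool.  it assumes the old tcpdump "addr.port > addr.port:" layout seen in this message, with the mail-wrapped lines already rejoined; the names FLOW, parse_flow and mismatched_replies are made up for illustration.

```python
import re

# "src.port > dst.port:" as printed by the tcpdump output in this message.
FLOW = re.compile(r"(\d+\.\d+\.\d+\.\d+)\.(\w+) > (\d+\.\d+\.\d+\.\d+)\.(\w+):")
DNS_PORTS = {"domain", "53"}

def parse_flow(line):
    """Return (src_ip, src_port, dst_ip, dst_port), or None if no flow is found."""
    m = FLOW.search(line)
    return m.groups() if m else None

def mismatched_replies(lines):
    """List (queried_ip, replying_ip) pairs where the reply source differs."""
    asked = {}  # (client ip, client port) -> address the client sent its query to
    bad = []
    for line in lines:
        flow = parse_flow(line)
        if flow is None:
            continue
        src_ip, src_port, dst_ip, dst_port = flow
        if dst_port in DNS_PORTS:
            asked[(src_ip, src_port)] = dst_ip
        elif src_port in DNS_PORTS:
            expected = asked.get((dst_ip, dst_port))
            if expected is not None and expected != src_ip:
                bad.append((expected, src_ip))
    return bad

# The two udp lines from the real server capture, rejoined onto single lines.
SAMPLE = [
    "12:19:06.223900 eth0 < 208.219.36.76.61214 > 64.211.224.164.domain: 50126+ PTR? 164.224.211.64.in-addr.arpa. (45)",
    "12:19:06.224186 eth0 > 192.168.1.21.domain > 208.219.36.76.61214: 50126 NXDomain 0/1/0 (116)",
]

print(mismatched_replies(SAMPLE))  # [('64.211.224.164', '192.168.1.21')]
```

a client that sent its query to the vip will normally discard an answer arriving from a different address, which would fit the stalled nslookup.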


masquerading machine:

[root@lga /root]# /sbin/route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.1     0.0.0.0         255.255.255.255 UH    0      0        0 eth1
64.211.224.162  0.0.0.0         255.255.255.255 UH    0      0        0 eth2
192.168.0.1     0.0.0.0         255.255.255.255 UH    0      0        0 eth2
64.211.224.160  0.0.0.0         255.255.255.240 U     0      0        0 eth2
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 eth2
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         64.211.224.161  0.0.0.0         UG    0      0        0 eth2
[root@lga /root]# /sbin/ipchains -L -n
Chain input (policy ACCEPT):
Chain forward (policy DENY):
target     prot opt     source                destination           ports
MASQ       all  ------  192.168.1.0/24       0.0.0.0/0             n/a
Chain output (policy ACCEPT):
[root@lga /root]# 


normal router/gateway to the outside is 64.211.224.161


"real server" / client machine / whatever.

[root@phl /root]# /sbin/ipchains -L -n
Chain input (policy ACCEPT):
Chain forward (policy ACCEPT):
Chain output (policy ACCEPT):
[root@phl /root]# /sbin/route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.21    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.3.21    0.0.0.0         255.255.255.255 UH    0      0        0 eth2
192.168.1.11    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.2.21    0.0.0.0         255.255.255.255 UH    0      0        0 eth1
192.168.1.12    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
64.211.224.160  0.0.0.0         255.255.255.240 U     0      0        0 eth0
192.168.3.0     0.0.0.0         255.255.255.0   U     0      0        0 eth2
192.168.2.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         64.211.224.161  0.0.0.0         UG    0      0        0 eth0
[root@phl /root]# tail /etc/rc.d/rc.local 
    cp -f /etc/issue /etc/issue.net
    echo >> /etc/issue
fi

# -tcl.
/sbin/sysctl -p
/sbin/arp -s 64.211.224.161 00:30:B6:67:00:40
/sbin/ip rule add prio 100 from 192.168.1.0/24 table 100
/sbin/ip route add table 100 0/0 via 192.168.1.1 dev eth0
#
[root@phl /root]# cat /etc/sysctl.conf 
# Disables packet forwarding
net.ipv4.ip_forward = 1
# Enables source route verification
net.ipv4.conf.all.rp_filter = 1
# Disables automatic defragmentation (needed for masquerading, LVS)
net.ipv4.ip_always_defrag = 0
# Disables the magic-sysrq key
kernel.sysrq = 1
[root@phl /root]# /sbin/ip route show table 100
default via 192.168.1.1 dev eth0 
[root@phl /root]# /sbin/ip rule show
0:      from all lookup local 
100:    from 192.168.1.0/24 lookup 100 
32766:  from all lookup main 
32767:  from all lookup 253 
[root@phl /root]# 
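
to make the rule order above concrete, here is a small python sketch of which table supplies the default route for a given source address.  it is a simplification under stated assumptions: it only looks at the "from" selector and at default routes, and ignores destination matching inside each table, which the real kernel lookup does not.  RULES and DEFAULTS are copied by hand from the outputs above; the function names are hypothetical.

```python
import ipaddress

# (priority, source selector, table), as shown by "ip rule show" above.
RULES = [
    (0, "0.0.0.0/0", "local"),
    (100, "192.168.1.0/24", "100"),
    (32766, "0.0.0.0/0", "main"),
    (32767, "0.0.0.0/0", "253"),
]

# Default gateways per table, from "ip route show table 100" and "route -n".
DEFAULTS = {"100": "192.168.1.1", "main": "64.211.224.161"}

def tables_for(src):
    """Tables consulted, in priority order, for packets with this source."""
    addr = ipaddress.ip_address(src)
    return [table for _, selector, table in sorted(RULES)
            if addr in ipaddress.ip_network(selector)]

def default_gateway(src):
    """First default route found while walking the matching tables."""
    for table in tables_for(src):
        if table in DEFAULTS:
            return DEFAULTS[table]
    return None

print(default_gateway("192.168.1.21"))    # 192.168.1.1
print(default_gateway("64.211.224.164"))  # 64.211.224.161
```

so in this model anything sourced from the rip goes out via the masq box, while anything sourced from the vip goes straight to the normal gateway.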


is net.ipv4.ip_always_defrag supposed to be 0 or 1, and on which machines
(director, masq box, real server)?  i'm assuming 0 is correct for basically
everywhere in my network.

feel free to ignore all those other devices on this client machine.


the priority routing and masq work fine for ping and telnet:
[root@phl /root]# ping -n -c 5 206.245.168.220
PING 206.245.168.220 (206.245.168.220) from 192.168.1.21 : 56(84) bytes of
data.
64 bytes from 206.245.168.220: icmp_seq=0 ttl=244 time=80.577 msec
64 bytes from 206.245.168.220: icmp_seq=1 ttl=244 time=78.180 msec
64 bytes from 206.245.168.220: icmp_seq=2 ttl=244 time=79.224 msec
64 bytes from 206.245.168.220: icmp_seq=3 ttl=244 time=78.135 msec
64 bytes from 206.245.168.220: icmp_seq=4 ttl=244 time=78.227 msec

--- 206.245.168.220 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/mdev = 78.135/78.868/80.577/0.994 ms
[root@phl /root]# ping -n -c 5 64.208.32.100
PING 64.208.32.100 (64.208.32.100) from 192.168.1.21 : 56(84) bytes of
data.
64 bytes from 64.208.32.100: icmp_seq=0 ttl=60 time=636 usec
64 bytes from 64.208.32.100: icmp_seq=1 ttl=60 time=3.190 msec
64 bytes from 64.208.32.100: icmp_seq=2 ttl=60 time=420 usec
64 bytes from 64.208.32.100: icmp_seq=3 ttl=60 time=414 usec
64 bytes from 64.208.32.100: icmp_seq=4 ttl=60 time=488 usec

--- 64.208.32.100 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/mdev = 0.414/1.029/3.190/1.083 ms
[root@phl /root]# telnet 206.245.168.220 23
Trying 206.245.168.220...
Connected to 206.245.168.220 (206.245.168.220).
Escape character is '^]'.

Red Hat Linux release 5.2 (Apollo)
Kernel 2.2.14 on an i686
login: 
telnet> q
Connection closed.
[root@phl /root]# 


but not for domain queries:

[root@phl /root]# nslookup www.linuxvirtualserver.org
Server:  phl.usa.mybiz
Address:  0.0.0.0

*** phl.usa.mybiz can't find www.linuxvirtualserver.org: Non-existent
host/domain
[root@phl /root]# 

tcpdump on phl / client / real server:

15:08:44.650610   lo > 127.0.0.1.1025 > 127.0.0.1.domain: 59219+ A?
www.linuxvirtualserver.org. (44)
15:08:44.650610   lo < 127.0.0.1.1025 > 127.0.0.1.domain: 59219+ A?
www.linuxvirtualserver.org. (44)
15:08:44.651059 eth0 > 192.168.1.21.1024 > 128.8.10.90.domain: 42360 A?
www.linuxvirtualserver.org. (44)
15:08:45.005379 eth0 > 192.168.1.21.1024 > 198.41.0.4.domain: 4432 NS? .
(17)
15:08:48.005380 eth0 > 192.168.1.21.1024 > 193.0.14.129.domain: 42360 A?
www.linuxvirtualserver.org. (44)
15:08:49.005379 eth0 > 192.168.1.21.1024 > 192.203.230.10.domain: 4432 NS?
. (17)
15:08:49.645346   lo > 127.0.0.1.1025 > 127.0.0.1.domain: 59219+ A?
www.linuxvirtualserver.org. (44)
15:08:49.645346   lo < 127.0.0.1.1025 > 127.0.0.1.domain: 59219+ A?
www.linuxvirtualserver.org. (44)
15:08:52.005381 eth0 > 192.168.1.21.1024 > 192.112.36.4.domain: 42360 A?
www.linuxvirtualserver.org. (44)
15:08:53.005378 eth0 > 192.168.1.21.1024 > 198.41.0.10.domain: 4432 NS? .
(17)
15:08:56.005380 eth0 > 192.168.1.21.1024 > 192.33.4.12.domain: 42360 A?
www.linuxvirtualserver.org. (44)
15:08:57.005379 eth0 > 192.168.1.21.1024 > 192.36.148.17.domain: 4432 NS?
. (17)
15:08:59.645402   lo > 127.0.0.1.1025 > 127.0.0.1.domain: 59220+ A?
www.linuxvirtualserver.org.usa.mybiz. (54)
15:08:59.645402   lo < 127.0.0.1.1025 > 127.0.0.1.domain: 59220+ A?
www.linuxvirtualserver.org.usa.mybiz. (54)
15:08:59.645571   lo > 127.0.0.1.domain > 127.0.0.1.1025: 59220 NXDomain*
0/1/0 (120)
15:08:59.645571   lo < 127.0.0.1.domain > 127.0.0.1.1025: 59220 NXDomain*
0/1/0 (120)
15:09:00.005392 eth0 > 192.168.1.21.1024 > 198.41.0.4.domain: 42360 A?
www.linuxvirtualserver.org. (44)
15:09:01.005381 eth0 > 192.168.1.21.1024 > 198.32.64.12.domain: 4432 NS? .
(17)


tcpdump on the masq machine showed absolutely 0 traffic.


it doesn't even work when querying a remote nameserver directly:

[root@phl /root]# nslookup www.linuxvirtualserver.org 206.245.168.220
*** Can't find server name for address 206.245.168.220: No response from server
*** Default servers are not available
[root@phl /root]# 

tcpdump on phl:

15:10:53.005436 eth0 > 192.168.1.21.1024 > 193.0.14.129.domain: 4432 NS? .
(17)
15:10:57.824356 eth0 > 192.168.1.21.1025 > 206.245.168.220.domain: 45849+
PTR? 220.168.245.206.in-addr.arpa. (46)
15:11:01.005417 eth0 > 192.168.1.21.1024 > 192.112.36.4.domain: 4432 NS? .
(17)
15:11:02.815348 eth0 > 192.168.1.21.1025 > 206.245.168.220.domain: 45849+
PTR? 220.168.245.206.in-addr.arpa. (46)
15:11:09.005382 eth0 > 192.168.1.21.1024 > 192.33.4.12.domain: 4432 NS? .
(17)


again, no traffic on the masq machine (lga).


what could i possibly be missing here?  does the ip program (the priority
routing rules) not jibe with udp?  i can't see how it would make a
difference, but...i'm at a total loss here.
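
one way to take nslookup and named out of the picture is to build a dns query by hand and send it from a plain socket.  the python sketch below follows the message layout in RFC 1035; the function names, the query id and the 3 second timeout are arbitrary choices for illustration.  it also shows what a query over tcp looks like: the same message with a 2-byte length in front.

```python
import socket
import struct

def build_query(name, qtype=1, query_id=0x1234):
    """Minimal DNS query: 12-byte header plus one question, RD flag set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # Root label, then QTYPE (1 = A by default) and QCLASS (1 = IN).
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)
    return header + question

def frame_for_tcp(message):
    """DNS over TCP prefixes each message with its length as 2 bytes."""
    return struct.pack("!H", len(message)) + message

def ask_udp(server, message, timeout=3.0):
    """Send over UDP and return (address that answered, raw reply)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(message, (server, 53))
        data, peer = s.recvfrom(512)
        return peer[0], data

query = build_query("www.linuxvirtualserver.org")
print(len(query))                 # 44, same as the "(44)" in the tcpdump lines
print(len(frame_for_tcp(query)))  # 46
```

calling ask_udp("206.245.168.220", query) from phl should either time out or return the address that actually answered, which can then be compared with the address that was queried.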

i offer my soul to whoever has the answer i need here.  or maybe just a
bag of chips or something.

-tcl.


