Hello,
I'm setting up a cluster with 3 LVS nodes and am trying to find the
forwarding mode that works best for me. A masqueraded cluster (all
machines 2.2.17, director with kernel patch) was not a problem, but
since it doesn't scale well and my tests showed a very high load on the
director, I decided to give the VS-DR scheme a try. I have been working
on it for the past three days without a satisfying result, and after
reading the instructions and the ARP problem pages again and again I'm
giving the list a shot; maybe one of the more experienced cluster users
can give me the hint I'm missing.
I set up the following net (192.168.121.61 is VIP):
        | 192.168.121.0 network
        |
---------------------------------------------
| DIRECTOR ganga                            |
| eth1 192.168.121.71, eth1:1 192.168.121.61|
| eth0 192.168.124.100                      |
---------------------------------------------
        |
        | 192.168.124.0 network, directly switched
        |
------------------------
| NODE ganesh          |
| eth0 192.168.124.101 |
| lo0:1 192.168.121.61 |
| eth1 192.168.121.73  |
------------------------
eth1 on ganesh is currently not plugged into the 192.168.121.0 network.
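To make the diagram concrete, the addressing above corresponds roughly
to the commands below. This is only a sketch, not a copy of any actual
init script: the /24 netmasks on the two networks, the /32 netmask on
the VIP aliases, the use of lo:1 as the loopback alias name and the
"run" dry-run helper are assumptions added for illustration. By default
it only prints the commands; with RUN=1 it would execute them.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print each command unless RUN=1 is set.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }

VIP=192.168.121.61

# Director (ganga): real address and VIP alias on the client-facing
# network, plus the address on the network shared with the real server.
# Netmasks are assumed, they are not stated in the diagram.
director_setup() {
    run ifconfig eth1 192.168.121.71 netmask 255.255.255.0 up
    run ifconfig eth1:1 "$VIP" netmask 255.255.255.255 broadcast "$VIP" up
    run ifconfig eth0 192.168.124.100 netmask 255.255.255.0 up
}

# Real server (ganesh): VIP on a loopback alias so the host accepts
# packets addressed to the VIP.
realserver_setup() {
    run ifconfig eth0 192.168.124.101 netmask 255.255.255.0 up
    run ifconfig lo:1 "$VIP" netmask 255.255.255.255 broadcast "$VIP" up
}

# Pick a role with the first argument; with no argument both are printed.
case "${1:-all}" in
    director)   director_setup ;;
    realserver) realserver_setup ;;
    *)          director_setup; realserver_setup ;;
esac
```

In practice each function would be run on its own machine, for example
"RUN=1 sh setup.sh realserver" on ganesh (the script name is made up).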
IP forwarding and all the kernel options are enabled on the director,
and since the real servers have no direct connection to the network the
VIP lives on, I don't think I've got an ARP problem here. All other
options are enabled on the real servers, too. I tried both variants on
the real server: the VIP on lo:1 and the VIP on eth1 (not plugged in).
What I see is that the director passes the packets correctly to the
real server, but the real server never answers them. I have added a
tcpdump snippet for further information:
director:
13:46:43.765414 obiwan.johoho.1217 > 192.168.121.61.telnet: S
2448685530:2448685530(0) win 32120 <mss 1460,sackOK,timestamp
25530192[|tcp]> (DF)
13:46:46.762534 obiwan.johoho.1217 > 192.168.121.61.telnet: S
2448685530:2448685530(0) win 32120 <mss 1460,sackOK,timestamp
25530492[|tcp]> (DF)
13:46:48.755522 arp who-has ganesh.joho.ho tell ganga.joho.ho
13:46:48.755679 arp reply ganesh.joho.ho is-at 0:50:4:3c:27:f3
and here the info from the node:
ganesh:~# tcpdump
eth0: Setting promiscuous mode.
tcpdump: listening on eth0
13:46:45.848093 192.168.121.17.1217 > 192.168.121.61.telnet: S
2448685530:2448685530(0) win 32120 <mss 1460,sackOK,timestamp
25530192[|tcp]> (DF)
13:46:48.845260 192.168.121.17.1217 > 192.168.121.61.telnet: S
2448685530:2448685530(0) win 32120 <mss 1460,sackOK,timestamp
25530492[|tcp]> (DF)
13:46:50.838276 arp who-has ganesh.joho.ho tell ganga.joho.ho
13:46:50.838330 arp reply ganesh.joho.ho is-at 0:50:4:3c:27:f3
Both machines are synchronized via NTP, so the timestamps should be
correct. obiwan (192.168.121.17) is the machine from which I tried to
telnet.
Here's my ipvsadm output:
IP Virtual Server version 1.0.2 (size=131072)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port          Forward Weight ActiveConn InActConn
TCP  192.168.121.61:telnet wlc
  -> ganesh.joho.ho:telnet       Route   1      0          0
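A table like the one above would normally result from ipvsadm commands
along these lines. This is a reconstruction from the listing, not the
exact command history: the numeric port 23 for telnet and the address
192.168.124.101 for ganesh.joho.ho are assumptions taken from the
diagram, and the "run" helper is a hypothetical dry-run wrapper that
only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print each command unless RUN=1 is set.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }

VIP=192.168.121.61
RIP=192.168.124.101   # ganesh.joho.ho, assumed from the diagram
PORT=23               # telnet, assumed standard port

# Add the virtual TCP service on the VIP with the wlc
# (weighted least-connection) scheduler.
run ipvsadm -A -t "$VIP:$PORT" -s wlc

# Add the real server using direct routing (-g), weight 1;
# shown as "Route" in the Forward column of the listing.
run ipvsadm -a -t "$VIP:$PORT" -r "$RIP:$PORT" -g -w 1
```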
I hope someone can give me a clue about what I'm doing wrong. Thanks in
advance,
--
Regards,
Wiktor Wodecki