Re: [lvs-users] LdirectorD LVS and CentOS/Fedora/RedHat

To: partysoft@xxxxxxxxx
Subject: Re: [lvs-users] LdirectorD LVS and CentOS/Fedora/RedHat
Cc: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: "L.S. Keijser" <leon@xxxxxxxx>
Date: Thu, 22 Oct 2009 23:22:15 +0200
Hi,

I'm replying to you and to the lvs-users list so everyone can
participate and help. It's a fairly long reply with detailed
instructions on how to get it working. I could just point you to the
HOWTO (as I already did), but I'm in a good mood. Since it's so long
and detailed, I hope I got it right, or else it's a waste of time :P

Oh, and I'm going to make a lot of assumptions while replying, so please
correct me if I'm wrong:

On Thu, 2009-10-22 at 12:38 -0700, partysoft@xxxxxxxxx wrote:
> Thank you Léon for the reply. I apreciate it so much.
> Yes i actually have 3 public ip's like (none are with 192..)
> XX.XX.XX.234 (this is the lvs..)

By 'the lvs' I assume you mean 'the director'.

> XX.XX.XX.235 real server (web ngix)
> YYY.YYY.YYY.163  real server (web apache) - I don't really care about this 
> one, i can move it into the

Into the .. ? Cat got your tongue? :P

> we will use XX.XX.XX.236 as virtual..(there isn't an ip on the net with that 
> number up). of course it will be simpler to use a 192.. but, i have tried 
> that also ,and no luck

I assume that by 'virtual' you mean the Virtual IP (VIP) configured on
the director.

> same subnet as the first 2 ones, i just want to make it work from the 234 -> 
> 235..but it gives me a timeout on the browser...

You mean from .236 -> .235 because clients won't connect to the IP of
the director. Instead they'll connect to the VIP.

> Here's what i did:
> 
> [root@linux ~]# cat /etc/ha.d/ldirectord.cf
> checktimeout=3
> checkinterval=10
> autoreload=yes
> logfile="/var/log/ldirectord.log"
> quiescent=no

'quiescent=no' is a valid ldirectord option: when a realserver fails its
check, ldirectord removes it from the LVS table instead of setting its
weight to 0. It shouldn't be related to your timeout.

> virtual=XX.XX.XX.236:80
>         fallback=127.0.0.1:80
>         real=XX.XX.XX.235:80 gate
>         real=YYY.YYY.YYY.235:80 gate

Where does the YYY.YYY.YYY.235 come from? Assuming X != Y, this will
never work, as the two realservers are in different subnets and LVS-DR
needs the director and the realservers on the same network segment. I'm
assuming this because of the ifconfig output later in this mail. Either
move the 2nd realserver into the subnet, leave it out, or extend your
subnet to include it (probably not possible).

>         service=http
>         request="test.html"
>         receive="Still alive"
>         scheduler=rr #here i've tried with172.18.24.15 wlr as well

The scheduler is irrelevant for now. Let's first just get it working,
period. Anyway, 'wlr'? That isn't a valid scheduler. Don't you mean
'wlc' or 'wrr'? :)

>         protocol=tcp
>         checktype=negotiate
> 
> [root@linux ~]# /usr/sbin/ldirectord -d /etc/ha.d/ldirectord.cf start

It's better (to learn LVS) if you don't use ldirectord for now, but ok,
let's try it anyway :P

-snip heartbeat output-

> If i go to the webserver i can see that ldirector is actually testing the 
> test.html..every 10 seconds like in the conf

Good, at least the director can access one of the realservers. 

> I am sure that is because of the configurations of the IPs and the aditional 
> eth0:0 and lo:0 and that's why it doesn't work, i will paste everything that 
> i did, maybe, just maybe you can help me out on this one, i'm really 
> stuck..probably because i don't know lots of stuff on how the OSI layer is 
> build and how arp works

Okay, tiny summary of what's supposed to happen in LVS-DR:

1) client sends request to VIP
2) director receives packet, checks LVS table for available realserver
and forwards the packet to it without changing destination_ip (it only
rewrites the destination MAC address to that of the realserver)
3) realserver receives packet and sees destination_ip matches the
configured ip on its loopback device 
4) realserver handles request and replies to src_address (the original
client) thereby using its default gateway, bypassing the director
5) client receives reply

What probably happened is that you didn't solve the ARP problem and
instead of receiving a reply from the IP configured on the realserver's
loopback device (that's the VIP, the same one configured on the
director), the realserver's RIP replies to the client. Your client never
sent a request to _that_ IP so it drops the packet, endlessly waiting
for a reply from the VIP.

> Aditional Network conf..
> XX.XX.XX.234 (this is the lvs..) is spawned on eth1..
> so i spawned another eth1:0
> [root@linux ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth1:0
> DEVICE=eth1:0
> IPADDR=XX.XX.XX.236  # this is from the same subnet and i doesn't conflict 
> with an existent one
> NETMASK=255.255.255.0
> NETWORK=XX.XX.XX.0
> BROADCAST=XX.XX.XX.255
> GATEWAY=XX.XX.XX.233 # i've put the same gateway as the default ip 
> XX.XX.XX.234 had
> ONBOOT=yes
> 
> i also enabled port forwarding

IP forwarding (net.ipv4.ip_forward) isn't required on the director for
LVS-DR. It's only needed for LVS-NAT.

> net.ipv4.ip_forward = 1
> net.ipv4.conf.default.rp_filter = 1
> net.ipv4.conf.default.accept_source_route = 0
> kernel.sysrq = 0
> kernel.core_uses_pid = 1
> net.ipv4.tcp_syncookies = 1
> kernel.msgmnb = 65536
> kernel.msgmax = 65536
> kernel.shmmax = 68719476736
> kernel.shmall = 4294967296
> 
> And i did the modprobe with all those modules for IPV
> 
> modprobe ip_vs_dh
> modprobe ip_vs_ftp
> modprobe ip_vs_dh
> modprobe ip_vs_ftp
> modprobe ip_vs
> modprobe ip_vs_lblc
> modprobe ip_vs_lblcr
> modprobe ip_vs_lc
> modprobe ip_vs_nq
> modprobe ip_vs_rr
> modprobe ip_vs_sed
> modprobe ip_vs_sh
> modprobe ip_vs_wlc
> modprobe ip_vs_wrr

IPVS will load the module it requires for the configured scheduler
automatically. It's not necessary to load any module manually.
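
If you want to confirm which IPVS modules actually got loaded, a sketch
like the one below lists them from /proc/modules. The function name and
the MODULES_FILE override are made up for illustration; they are not
part of any LVS tool.

```sh
# List loaded kernel modules whose name starts with "ip_vs".
# MODULES_FILE is a hypothetical override so the function can be
# pointed at a saved copy; by default it reads /proc/modules.
loaded_ipvs_modules() {
    awk '$1 ~ /^ip_vs/ { print $1 }' "${MODULES_FILE:-/proc/modules}"
}
```

Once a virtual service with the rr scheduler has been added, you would
typically expect at least ip_vs and ip_vs_rr to show up in the output.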

> and that's all i did for the LVS server..
> 
> Now for the webserver
> XX.XX.XX.235 (this is the lvs..) is spawned on eth0..
> root@linux ~]# cat /etc/sysconfig/network-scripts/ifcfg-lo:0
> DEVICE=lo:0
> IPADDR=XX.XX.XX.236
> NETMASK=255.255.255.255
> NETWORK=XX.XX.XX.XX.0
> BROADCAST=XX.XX.XX.255
> ONBOOT=yes
> NAME=loopback

AFAIK it's best not to use the sysconfig scripts to create the loopback
device on the realserver. RedHat does an ARPING to determine if the
address you're trying to configure is already in use, and the director
already answers for the VIP, so it might confuse things. You could just
as easily configure everything from /etc/rc.local. See below, after the
next bit:

> and
> 
> net.ipv4.ip_forward = 0
> net.ipv4.conf.lo.arp_ignore = 1   #here i have tried with eth0 instead of lo, 
> no luck..
> net.ipv4.conf.lo.arp_announce = 2  #here i have tried with eth0 instead of 
> lo, no luck..
> net.ipv4.conf.all.arp_ignore = 1
> net.ipv4.conf.all.arp_announce = 2

Okay, you've made a start on solving the ARP problem, but you're not
quite there yet. Configure arp_ignore/arp_announce and the loopback
device from within /etc/rc.local like this:

# solve the 'ARP problem'
echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo 1 > /proc/sys/net/ipv4/conf/eth0/arp_ignore
echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce
echo 2 > /proc/sys/net/ipv4/conf/eth0/arp_announce
/sbin/ifconfig lo:0 XX.XX.XX.236 netmask 255.255.255.255 up

(reboot your realserver after this, or bring down lo:0 and
run /etc/rc.local)
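
To double-check that the values stuck after running /etc/rc.local, a
small helper like this can print them back. It's only a sketch:
show_arp_settings and the CONF_DIR override are made-up names, and it
assumes the realserver's interface is eth0 as in the commands above.

```sh
# Print arp_ignore and arp_announce for each interface named on the
# command line. CONF_DIR is a hypothetical override; by default the
# values are read from /proc/sys/net/ipv4/conf.
show_arp_settings() {
    conf_dir=${CONF_DIR:-/proc/sys/net/ipv4/conf}
    for iface in "$@"; do
        for key in arp_ignore arp_announce; do
            printf '%s %s=%s\n' "$iface" "$key" "$(cat "$conf_dir/$iface/$key")"
        done
    done
}
```

Running `show_arp_settings lo eth0` on the realserver should report
arp_ignore=1 and arp_announce=2 for both interfaces if the echo
commands above took effect.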

> net.ipv4.conf.default.rp_filter = 1
> net.ipv4.conf.default.accept_source_route = 0
> kernel.sysrq = 0
> kernel.core_uses_pid = 1
> net.ipv4.tcp_syncookies = 1
> kernel.msgmnb = 65536
> kernel.msgmax = 65536
> kernel.shmmax = 68719476736
> kernel.shmall = 4294967296
> 
> here i've tried with ip forward 0 and 1 , no luck the requests simply don't 
> reach this server only the direct ones

You don't configure ip_forward on the realservers. 

> 
> Now i've understand that this is an ARP problem, and as CentOS doesn't 
> support the arp hidden flag on sysctl , i tried with /etc/init.d/arptables_jf

-snip arptables output-

Again, assuming you have a fairly recent kernel, you don't need
arptables. Just for fun, please post your kernel version.

> -----------------------------------------------
> 
> i've probed with ipvsadm or something like it, to see the active connections, 
> and they are always to 0
> 
> i've modprobed here the same modules, no luck..
> 
> So from this point i'm really stuck and don't know what to do...
> 
> Here's the ifconfig from both servers if that helps
> from the LVS (xx.234)
> 
> eth1      Link encap:Ethernet  HWaddr 00:1B:21:46:3E:A9  
>           inet addr:XX.XX.XX.234  Bcast:XX.XX.XX.239  Mask:255.255.255.248

Okay, this pretty much rules out the possibility that both your
realservers are in the same subnet: with a 255.255.255.248 mask the
subnet only covers XX.XX.XX.232 through XX.XX.XX.239. So fix that first
by either removing the one that isn't in the same subnet as the
director, or by moving it into the subnet.
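
If you want to check the subnet arithmetic yourself, here is a minimal
sketch in plain sh. The function names are made up, and since the real
addresses are masked in this thread, the usage example uses
documentation addresses (192.0.2.x and 198.51.100.x) as stand-ins.

```sh
# Convert a dotted-quad IPv4 address to an integer. The body runs in a
# subshell (note the parentheses) so the IFS change stays local.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if address $1 and address $2 fall in the same network
# when both are masked with netmask $3.
same_subnet() {
    [ $(( $(ip_to_int "$1") & $(ip_to_int "$3") )) -eq \
      $(( $(ip_to_int "$2") & $(ip_to_int "$3") )) ]
}
```

For example, `same_subnet 192.0.2.234 192.0.2.235 255.255.255.248`
succeeds, while `same_subnet 192.0.2.234 198.51.100.163 255.255.255.248`
fails, which is the situation your second realserver appears to be in.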

> eth1:0    Link encap:Ethernet  HWaddr 00:1B:21:46:3E:A9  
>           inet addr:XX.XX.XX.236  Bcast:XX.XX.XX.255  Mask:255.255.255.0

Your VIP is configured with a different netmask (255.255.255.0) than
your real IP (255.255.255.248). It probably doesn't matter much in this
case, but still..

-snip rest of output-

> here's the one from the webserver
> 
> eth0      Link encap:Ethernet  HWaddr 00:24:1D:72:61:AB  
>           inet addr:XX.XX.XX.235  Bcast:XX.XX.XX.239  Mask:255.255.255.248

-snip rest of eth0 and lo output-

> lo:0      Link encap:Local Loopback  
>           inet addr:XX.XX.XX.236  Mask:255.255.255.255
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1

Good, it has the VIP configured.

So after all these changes/checks/reboots, run this command from a
client (not the director or any of the realservers!) :

$ arping XX.XX.XX.236

You should get a reply from 00:1B:21:46:3E:A9 (MAC on the director). If
you get any reply from 00:24:1D:72:61:AB (MAC on the realserver) you
haven't solved the ARP problem yet. Assuming (here we go again) that you
have now configured the lo:0 on the realserver correctly (and removed it
from /etc/sysconfig/network-scripts/ifcfg-lo:0 !!), clear the arp cache
on the client:

$ arp -d XX.XX.XX.236

and try the arping command again.
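
To make it easier to see who is answering, you could filter the arping
output down to the unique MAC addresses. This is only a sketch: the
reply_macs name is made up, and it assumes your grep supports -o (GNU
grep does) and that arping prints the MACs in the usual colon-separated
form.

```sh
# Read arping output on stdin and print each distinct MAC address
# found in it once, normalised to upper case.
reply_macs() {
    grep -oiE '([0-9a-f]{2}:){5}[0-9a-f]{2}' | tr 'a-f' 'A-F' | sort -u
}
```

Used as `arping -c 3 XX.XX.XX.236 | reply_macs`, the only line you want
to see is 00:1B:21:46:3E:A9. If 00:24:1D:72:61:AB appears as well, the
realserver is still answering ARP for the VIP.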

Also, disable any firewalls on both director and realservers while
setting things up, so they can't interfere with LVS. You can always
lock things down later (while keeping an eye on functionality).

And if you'd rather not use ldirectord (which I'd recommend for
first-time usage), set up the LVS table by hand, as root on the
director:

$ service ldirectord stop
$ ipvsadm -A -t XX.XX.XX.236:80 -s rr
$ ipvsadm -a -t XX.XX.XX.236:80 -r XX.XX.XX.235:80 -g -w 1
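
If you later move the second realserver into the subnet, the same
pattern simply repeats per realserver. As a rough sketch (dr_commands
is a made-up helper, and it only prints the commands so you can review
them before running anything):

```sh
# Print, but do not run, the ipvsadm commands for one LVS-DR virtual
# service using the rr scheduler and weight 1 for every realserver.
# Usage: dr_commands VIP:PORT RIP:PORT [RIP:PORT ...]
dr_commands() {
    vip=$1
    shift
    echo "ipvsadm -A -t $vip -s rr"
    for rip in "$@"; do
        echo "ipvsadm -a -t $vip -r $rip -g -w 1"
    done
}
```

`dr_commands XX.XX.XX.236:80 XX.XX.XX.235:80` prints the same two
ipvsadm lines as above. After running them as root, `ipvsadm -L -n`
lists the resulting table.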

Then from a client (not the director or realserver) try it out:

$ telnet XX.XX.XX.236 80

If you get a response like this:

Trying XX.XX.XX.236...
Connected to XX.XX.XX.236.
Escape character is '^]'.

Then it's working.
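
The telnet test only proves that the TCP connection gets through. To
check that an actual HTTP reply comes back via the VIP, you could
mirror the ldirectord check (request test.html, expect "Still alive")
from the client. A sketch, assuming curl is installed on the client;
check_alive and the FETCH override are made-up names:

```sh
# Fetch test.html below the given base URL and succeed only if the
# body contains "Still alive", the string from the ldirectord.cf above.
# FETCH is a hypothetical override for the download command.
check_alive() {
    ${FETCH:-curl -s --max-time 3} "$1/test.html" | grep -q 'Still alive'
}
```

For example: `check_alive http://XX.XX.XX.236 && echo "VIP answers"`.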

Good luck :)

-- 
Léon

