On Wed, 24 Oct 2007, John Donath wrote:
> Hi,
>
> I have setup a 2 node HA cluster based on the Streamline High
> availability and Load Balancing concept.
>
> The weird thing is that it works fantastic for tcp/80 but it doesn't
> work properly for a udp service like radius (udp/1812).
There are conceptual problems with loadbalancing UDP, as
there is no connection (see UDP in the HOWTO; there are
solutions, but all have problems). Also, do you understand
the many-reader/single-writer problem when loadbalancing
databases?
> Assume we have both the http and radius service down on the failover
> director (grind12):
>
> [root@grind11 ~]# ipvsadm
> IP Virtual Server version 1.2.0 (size=4096)
> Prot LocalAddress:Port Scheduler Flags
> -> RemoteAddress:Port Forward Weight ActiveConn InActConn
> UDP 172.31.1.10:radius rr
> -> 172.31.1.11:radius Local 1 0 0
> TCP 172.31.1.10:http rr persistent 600
> -> 172.31.1.11:http Local 1 0 0
(Not related to your problem) persistence has problems. You
could look at the sh (source hashing) scheduler instead.
> I now can access the webserver but I don't get any response from the
> radius service.
How can you access a service when the service is down?
Is Radius listening on the VIP? (it should be, see writeup
for LocalNode)
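
To show why the listening address matters for UDP, here is a
minimal Python sketch (not Radius itself, and not from the
HOWTO). The helper name reply_source and the loopback
addresses are assumptions for illustration only. It starts a
one-shot UDP echo server bound to a given address and reports
the source IP that the client sees on the reply.

```python
import socket
import threading


def reply_source(bind_addr: str, dest_addr: str, timeout: float = 2.0) -> str:
    """Return the source IP a client sees on a UDP reply.

    bind_addr: address the (hypothetical) server socket is bound to.
    dest_addr: address the client sends its request to.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind((bind_addr, 0))  # port 0: let the kernel pick a free port
    port = server.getsockname()[1]
    server.settimeout(timeout)

    def serve_once() -> None:
        # Answer a single datagram, then shut down.
        try:
            data, peer = server.recvfrom(2048)
            server.sendto(data, peer)
        except socket.timeout:
            pass
        finally:
            server.close()

    worker = threading.Thread(target=serve_once, daemon=True)
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(timeout)
    try:
        client.sendto(b"ping", (dest_addr, port))
        _, (src_ip, _src_port) = client.recvfrom(2048)
    finally:
        client.close()
        worker.join()
    return src_ip


if __name__ == "__main__":
    # Loopback stands in for the VIP here; on a director you would
    # pass the VIP as both bind_addr and dest_addr.
    print(reply_source("127.0.0.1", "127.0.0.1"))
```

A socket bound to one specific address sends its replies from
that address. A socket bound to the wildcard (0.0.0.0) leaves
the choice of source address to the kernel's routing, which
on a machine with several addresses need not be the address
the request was sent to. That would be consistent with a
reply coming from the realserver IP rather than the VIP, but
I have not confirmed that this is what is happening here.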
> Here are results from tcpdump on both nodes when a radius request is
> initiated:
> [root@grind11 ~]# tcpdump -ni any -p udp and host 83.162.10.97
> 14:41:10.069858 IP 83.162.10.97.32843 > 172.31.1.10.radius: RADIUS,
> Access Request (1), id: 0xdb length: 65
> 14:41:10.069891 IP 172.31.1.11.radius > 83.162.10.97.32843: RADIUS,
> Access Accept (2), id: 0xdb length: 26
>
> As you will note the wrong source address is used !!
> It's responding with the realnode IP instead of the VIP and that's
> causing the problem.
No idea. I assume that Radius is listening on x.x.x.11
(instead of x.x.x.10), in which case I can't imagine how
Radius is getting packets at all.
> I am puzzled why this problem does not exist when testing http (tcp/80)
> as you can see from this:
> 14:43:53.399206 IP 83.162.10.97.41143 > 172.31.1.10.http: F 553:553(0)
> ack 268 win 1728 <nop,nop,timestamp 496389562 507325571>
> 14:43:53.399224 IP 172.31.1.10.http > 83.162.10.97.41143: . ack 554 win
> 1724 <nop,nop,timestamp 507325582 496389562>
>
> Might this be UDP related?
possibly (since I have no idea what's wrong yet).
> [root@grind12 ~]# tcpdump -ni any -p udp and host 83.162.10.97
> ** nothing of course **
I'm sorry, this went over my head. Why "of course"?
> If I reverse the situation - bringing down both services on the primary
> director node (grind11) and starting them up on the failover director
> (grind12) then both services are accessible.
Hmm. Let's leave this till later.
Joe
--
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!