Re: HTTP issue part 2

To:	"LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject:	Re: HTTP issue part 2
From:	"Matthew Story" <matthewstory@xxxxxxxxx>
Date:	Wed, 30 Aug 2006 14:24:31 -0500

Some additional debugging:

I installed apache2 on the director box and took ipvs down for a
minute.  I set up the eth0 interface to carry both the VIP and the RIP
of the director box, and I was able to connect to the VIP via HTTP this
way.  I then took down eth0:0 and put ipvs back online, and the request
failed with the same error as before.  So the director is not trying to
handle the incoming HTTP requests by itself, though I never really
thought that was the case.  In any case, I'm guessing something is going
wrong with the forwarding, which again is Direct Routing (gate).

On 8/30/06, Matthew Story <matthewstory@xxxxxxxxx> wrote:

So I've debugged quite a bit of this and have narrowed the issue down
to just HTTP for now.  It is working within the cluster now: the
director is communicating properly with the Apache real servers, and I
can bring a node online or offline by moving the file that the director
requests.  It is also finally sending test packets every 2 seconds as
configured.

But . . .

I still can't get a connection to the VIP via an HTTP request from an
outside box.  The director is representing the VIP properly.  I am
using DR (gate) packet forwarding, and representing the VIP on a lo:0
interface on the real server.  This is working, because if I take lo:0
down, the node goes offline on the cluster.

I have done some debugging on this and narrowed the problem down a bit:

1. The real server is up and running.  I can make an HTTP request to
the RIP of the real server and get the desired response.

2. The packets are never getting to the real server.  I figured this
out by putting up a firewall on the real server that blocks requests
from my IP on port 80.  The error was still "connection refused", not
the timeout error that should have occurred.  (The director was not
blocked by the firewall, only the IP I was requesting from.)
Conversely, when I put up a firewall on port 80 against the same IP on
the director box, the request fails with the expected timeout error,
not the "connection refused" error.

3. My first instinct at this point was to make sure that forwarding is
set up properly.  I checked my sysctl.conf file and, sure enough,

net.ipv4.ip_forward = 1

is set properly.  I checked the sysctl.conf on the real server too,
and everything appears to be in order, but that isn't the concern yet:
when I firewalled that server, the request should have timed out
regardless of the sysctl settings.
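
The distinction used in step 2, a refused connection versus a timeout,
can be reproduced from any outside box with a short script.  What
follows is only a minimal sketch in Python, not something taken from the
setup described here; the function name classify_connect and the
3-second default timeout are made up for illustration.  A "refused"
result typically means some host answered the connection attempt with a
reset, while a "timeout" means the attempt was silently dropped
somewhere along the way.

```python
import errno
import socket


def classify_connect(host, port, timeout=3.0):
    """Attempt one TCP connection and report how it ended.

    Returns "open", "refused", "timeout", or "error:<ERRNO NAME>".
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        # The peer (or something in front of it) actively rejected us.
        return "refused"
    except socket.timeout:
        # Nothing answered within the timeout, e.g. a firewall dropped it.
        return "timeout"
    except OSError as exc:
        # Anything else, such as "no route to host".
        return "error:" + errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()
```

Calling classify_connect("64.34.209.34", 80) from the outside box would
show which of the two failure modes the VIP produces at that moment.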

Next steps:

My next idea is to install apache on the director to see if it's trying
to handle HTTP requests to the VIP by itself and not forwarding them,
but this is a bit of a hassle, and I don't know if it would show me
anything useful.

Given all that, does anyone have any thoughts?  Has anyone overcome a
similar error?

Many thanks in advance.

On 8/29/06, Matthew Story <matthewstory@xxxxxxxxx> wrote:
> So I've got a better handle on the HTTP error now.  First, the setup:
> two dual-core AMD Athlon 64 servers are serving as the director
> boxes, and I'm running ultramonkey 3 on each of them.  All of the
> servers are at a data center and are running directly on the WAN.
> What seems to be happening is that when I start heartbeat and
> ldirectord on one of the directors, it makes an HTTP request to the
> Apache real servers.  However, when I run tcpdump on the Apache real
> servers, it only seems to make a request the first time ipvsadm is run
> on the director box.  After that it makes no HTTP requests to
> the machine at all, and though the box appears to be clustered when I
> do an ipvsadm -L -n, the connection is refused.  I have no firewalls
> running right now, so that is not the issue.  Here is the section of
> the ldirectord.cf file pertaining to the HTTP services:
>
> virtual=64.34.209.34:80
>         fallback=127.0.0.1:80
>         real=64.34.174.215:80 gate
>         real=64.34.180.165:80 gate
>         service=http
>         request="/update/index.html"
>         receive="Test Page"
>         scheduler=rr
>         #persistent=600
>         protocol=tcp
>         checktype=negotiate
>
> As you can see, the two webservers are on different subnets from
> each other, and also on a different subnet from both of the
> ultramonkey directors, though the director boxes are on the same
> subnet (170) and share a common default gateway.
>
> The ha.cf file sets up a ucast between the two servers, and the names
> are properly configured using the uname -n output as the names of the
> two hosts.
>
> The haresources file looks like this:
>
> Server06.example.com    \
>         ldirectord::ldirectord.cf \
>         LVSSyncDaemonSwap::master \
>         IPaddr2::64.34.209.34/24/eth0/64.34.209.255 \
>         IPaddr2::64.34.183.97/24/eth0/64.34.209.255 \
>         IPaddr2::64.34.209.50/24/eth0/64.34.209.255
>
> Any ideas as to why it is behaving in this weird way?
>
> --
> regards,
> matt
>
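
The haresources entries quoted above give every address a /24 prefix
and the same broadcast address, 64.34.209.255.  As a quick sanity check,
the broadcast address implied by each address/prefix pair can be
compared with the configured one.  This is a sketch only, using Python's
standard ipaddress module; the helper names are invented here, and
whether a mismatch has anything to do with the HTTP problem is not
established in this thread.

```python
import ipaddress

# Address/prefix pairs copied from the haresources lines above.
ENTRIES = ["64.34.209.34/24", "64.34.183.97/24", "64.34.209.50/24"]

# The broadcast address that every haresources line specifies.
CONFIGURED_BROADCAST = ipaddress.ip_address("64.34.209.255")


def derived_broadcast(cidr):
    """Broadcast address of the network that an address/prefix pair belongs to."""
    return ipaddress.ip_interface(cidr).network.broadcast_address


def mismatches(entries, configured):
    """List (entry, derived broadcast) for entries that disagree with the configured one."""
    return [
        (cidr, str(derived_broadcast(cidr)))
        for cidr in entries
        if derived_broadcast(cidr) != configured
    ]
```

With the values above, mismatches(ENTRIES, CONFIGURED_BROADCAST) returns
[('64.34.183.97/24', '64.34.183.255')], so the second IPaddr2 line is
the only one whose stated broadcast does not follow from its own
address and prefix.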

--
regards,
matt



