
Re: help (web clustering/bandwidth limiting on a particular group of URL

To: Joseph Mack <mack@xxxxxxxxxxx>
Subject: Re: help (web clustering/bandwidth limiting on a particular group of URLS)
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: Lynn Winebarger <webmaster@xxxxxxxxxxxxxx>
Date: Thu, 12 Apr 2001 01:21:00 -0600 (MDT)
On Sun, 8 Apr 2001, Lynn Winebarger wrote:
>       Using a cluster with an ability to throttle particular parts of our
> website.  Actually, I'd prefer to be able to put a throttle on each
> virtual host we serve individually as well, which the below process-based
> attempt wouldn't manage.  For that, I'd have to write an Apache module.
> Which is feasible, but not as fast as having a pre-existing solution...
> 
    As far as I can tell, what I would need to do is:
(a) write an Apache module that lets me specify a particular local IP
    address/port to send responses from (rather than the incoming port).
(b) set up a tunnel device on each real server that tunnels to the
    outgoing router.
(c) adapt the traffic shaping module to limit on individual ports of a
    device.
(d) set limits on each incoming port as desired (essentially, each port
    represents a virtual host of the web server config in this
    application).
(e) forward the incoming ports to port 80 on the outgoing interface.

Then there's (d1) write a daemon to obtain ports/limits from a central
directory/database either by polling or getting signaled in some way (and
then reading from the database).
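
To make (d) and (d1) a little more concrete, here is a minimal sketch of
what the daemon might do on each poll: compare the port/limit table it
currently has against the one read from the central database, and work out
which limits to install, change or remove.  The hostnames, port numbers,
and the names VhostLimit and diff_limits are all made up for illustration;
how the table is actually fetched and how limits get applied to the
shaper is left out.

```python
from typing import Dict, List, NamedTuple, Tuple


class VhostLimit(NamedTuple):
    vport: int      # hypothetical marker port standing for one virtual host
    rate_kbit: int  # outgoing bandwidth cap for traffic using that port


def diff_limits(
    current: Dict[str, VhostLimit], desired: Dict[str, VhostLimit]
) -> Tuple[Dict[str, VhostLimit], List[str]]:
    """Return (entries to add or update, hostnames to remove)."""
    changed = {
        host: limit
        for host, limit in desired.items()
        if current.get(host) != limit
    }
    removed = sorted(host for host in current if host not in desired)
    return changed, removed


# What the daemon holds now (assumed values).
current = {
    "www.example.com": VhostLimit(20001, 512),
    "files.example.com": VhostLimit(20002, 128),
}
# What the central directory/database says after the next poll (assumed).
desired = {
    "www.example.com": VhostLimit(20001, 256),
    "new.example.com": VhostLimit(20003, 64),
}

changed, removed = diff_limits(current, desired)
```

Keeping the port in the database entry, rather than assigning it on each
server, means every real server and the outgoing router agree on which
port stands for which virtual host.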

   As I understand it, LVS/DR has each real server set up to use the
"virtual IP address" on the ethernet interface connected to the LVS
network.  The incoming LVS router does its "rewriting" trick at the
ethernet level (not really rewriting, more like bypassing the IP layer and
performing IP connection level routing at the data link layer).  Then
the real server sends out its packet as originating from the virtual IP
address, because the real servers don't "realize" (without ARP info) that
many machines on that ethernet link send out packets as that machine.
   In this case I really don't need the outgoing tunneling, but I do need
the port rewriting at the application layer.  So as I see it the sequence
of rewrites would be (with the real server's default gateway set to the
outgoing router):
<client IP:cport|server VIP:sport> incoming routed to real server A's mac
                                   address.
<client IP:cport|server VIP:sport> accepted by application, response
                                   calculated (file lookup,CGI,whatever)
<server VIP:vport|client IP:cport> response packet sent by real server,
                                   sent to outgoing router (default gw)
                                   [*]
<server VIP:vport|client IP:cport> received by outgoing router, bandwidth
                                   shaping applied, port forwarded to
<server VIP:sport|client IP:cport> any final bandwidth filtering applied
                                   and sent out to the outgoing router's
                                   default gateway.
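
The same sequence can be written out as a small model that just tracks the
<source|destination> pairs.  This is only a sketch for checking the
bookkeeping, not real packet handling: the addresses are documentation
addresses, port 20001 and the 512 kbit limit are invented, and the
function names are mine.

```python
from typing import Dict, NamedTuple, Tuple


class Packet(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


VIP = "192.0.2.10"  # assumed virtual IP (documentation range)
SPORT = 80          # the real service port clients connect to


def real_server_reply(request: Packet, vport: int) -> Packet:
    """Swap the endpoints, but send from the marker port instead of sport."""
    return Packet(request.dst_ip, vport, request.src_ip, request.src_port)


def outgoing_router(
    reply: Packet, vport_limits: Dict[int, int], sport: int = SPORT
) -> Tuple[Packet, int]:
    """Pick the limit from the marker port, then restore the service port."""
    rate_kbit = vport_limits[reply.src_port]
    return reply._replace(src_port=sport), rate_kbit


request = Packet("198.51.100.7", 40000, VIP, SPORT)
reply = real_server_reply(request, vport=20001)
restored, rate_kbit = outgoing_router(reply, {20001: 512})
```

After the router step the client sees a reply from VIP:80 again, which is
the only reason the marker port can stay invisible to it.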

[*] seems the trickiest piece.  The TCP handshake would be performed
below the application layer, so none of the application rewriting would
apply to it (or be subject to bandwidth limiting, other than the outgoing
aggregate).  However, it seems to me the TCP or IP layer might not like
having the source port arbitrarily rewritten in this way - it only works
if we know no other connections of <client IP:cport|server VIP:vport> have
actually been established, so we can abuse this field in the header to act
as a marker with extra semantics.  And if we can't just modify the data
packet's IP header, we're in trouble because we don't really have a
connection with that quadruple.  The outgoing router shouldn't present any
trouble because there's no connection with it; it's just looking at parts
of the IP headers to see where it should send the packet next.  The TCP
layer would still respond to the bandwidth limits because packets would be
dropped without feedback, so the application doesn't need to keep any
bandwidth control information (other than the socket).
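
For the "dropped without feedback" part, a token bucket is one common way
a router decides which packets to drop; I am not claiming it is what the
existing traffic shaping module does.  A minimal sketch, with one bucket
per marker port, and with the rate and burst numbers chosen only to make
the arithmetic easy:

```python
class TokenBucket:
    """Per-port policer: packets that exceed the rate are silently dropped."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float) -> None:
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0            # time of the previous packet, in seconds

    def allow(self, size: int, now: float) -> bool:
        """Return True to forward a packet of `size` bytes, False to drop it."""
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never above the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
first = bucket.allow(1500, now=0.0)   # bucket is full, so this is forwarded
second = bucket.allow(1500, now=0.5)  # only 500 bytes refilled, so dropped
third = bucket.allow(1500, now=1.5)   # 1500 bytes available again
```

The sender's TCP sees the missing segment as loss and slows down on its
own, which is why nothing has to be reported back to the application.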

Does anyone have any comments or recommendations for this (particularly
whether this IP header rewriting is a practical problem and if so, how
to work around it)?
   I think this could also work on a non-clustered machine, using a fake
network device.  

Thanks,
Lynn


