
Re: fastest responce

To: Tyrel Beede <tb90@xxxxxxxxxxxxxxxxx>
Subject: Re: fastest responce
Cc: <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: Julian Anastasov <ja@xxxxxx>
Date: Sun, 28 Jan 2001 16:26:17 +0000 (GMT)
        Hello,

On Sat, 27 Jan 2001, Tyrel Beede wrote:

> > > exchange types which do not resemble bulk ftp.  I also wonder if it would be
> > > possible to create a framework within the lvs to enable balancing at a packet
> > > granularity rather than at a connection granularity?  Would it be possible for
> >
> >         Are you trying to balance a ftp traffic?
> >
> >         Hm, what do you mean: balancing at a packet/connection
> > granularity. I don't understand. Scheduling of independent packets?
> > What service needs this?
>
> For example, within a cluster there could be two nodes.  Assuming these nodes each
> had three established tcp connections, it would be possible that one of the two nodes
> could have three established connections which were not transmitting data.  Therefore,
> when we schedule according to a connection granularity, load is only shared at a level
> where the number of connections between machines is balanced.  With tcp this does
> not mean that the amount of data transmitted is going to be the same per connection
> and thus, in total, it would not be the same per node in the cluster.  Now, what I
> wonder is whether such a thing would be possible within the current tcp implementation.
> As you indicated, I'm not sure which services would benefit from this the most, but it
> wouldn't be hard to characterize a type of data transaction which would benefit the
> most.  From this special case it would be possible to evaluate whether or not any real
> performance gains could be made.  This, however, is getting a little bit further away
> from the topic than my original question had envisioned.  I was just wondering if it
> would be possible and, if possible, how it would be done on paper.

        I don't see what can be done here. In this example one of the
connections may transfer 10MB/sec while the other connections
transfer only 1KB/sec. In this case we have communication between two
ends and I don't see a way to load the network traffic equally. The
first goal is to connect both ends; loading the links equally is only
the second goal, in that order. Splitting a connection across
different real servers is logically incorrect: we cannot be sure that
the two real servers will forward the traffic to the same host, i.e. we
assume the real servers are one of the connection ends. The other end is
the client. The balancing effect is achieved when many connections
are scheduled. This is "load informed connection scheduling", not
"load informed balancing", because the second term is too ambiguous.
So, LVS schedules connections, not packets.
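
        To illustrate "connections, not packets", here is a toy sketch
in Python. It is not the LVS kernel code: the class name, the packet
dictionaries and the plain round-robin choice are assumptions made for
the example. The point is only that the scheduling decision is taken
once per connection and every later packet follows the stored entry.

```python
from itertools import cycle


class ConnectionScheduler:
    """Toy director (hypothetical): schedules connections, not packets."""

    def __init__(self, real_servers):
        self._next_server = cycle(real_servers)  # plain round robin
        self._table = {}                         # connection -> real server

    def route(self, packet):
        # A connection is identified by client addr/port and virtual addr/port.
        key = (packet["src"], packet["sport"], packet["dst"], packet["dport"])
        if key not in self._table:
            # The scheduler runs only for the first packet of a connection.
            self._table[key] = next(self._next_server)
        # Every other packet is forwarded to the server already chosen.
        return self._table[key]


director = ConnectionScheduler(["rs1", "rs2"])
conn_a = {"src": "10.0.0.1", "sport": 40000, "dst": "192.168.0.1", "dport": 80}
conn_b = {"src": "10.0.0.2", "sport": 40001, "dst": "192.168.0.1", "dport": 80}

# Five packets from two connections, interleaved.
routes = [director.route(p) for p in (conn_a, conn_b, conn_a, conn_a, conn_b)]
```

Here all packets of conn_a reach rs1 and all packets of conn_b reach
rs2, no matter how much data each connection carries.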

> If the lvs could keep a record of all transactions between a server and host, and if
> the connection were to be closed at the server end, it would have the ability to
> regenerate the connection on another server, providing that the two servers were able
> to serve identical content.  Now the idea that the lvs could store all the
> information for each transaction through it would most likely be impossible.  But
> would just the control information (acks and such) be enough to regenerate the
> connection?  A good visualization of this would be something like a proxy server
> designed to work at a protocol level instead of the application layer.

        Yes, in theory this is possible for LVS/NAT. But there are some
questions:

- how does the director know which part of the connection can be continued
from another real server?

- how will the second real server agree to start a connection in the middle
and continue transferring the data started by the failed real
server? This leads to big changes in the real server TCP stack and
of course in the applications. You would need an accept() syscall with
support for the initial connection position :) This way the internal
web server would know at which position to start sending static content.

        This sounds like a Layer 7 job, for example a virtual web server.
I'm not sure whether this theory can be applied to TCP. The solution
is to make the application protocol robust, so that a drop of one
connection is not fatal, for example FTP/HTTP reget. Everything else
is very complex and breaks many standards.
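
        As a sketch of the HTTP reget idea: the client asks for the
missing part itself, using the HTTP Range header, so a new connection
(possibly to another real server with the same content) can continue
the download. The function name and the URL below are made up for the
example; only the standard urllib calls are real.

```python
import urllib.request


def resume_request(url, bytes_received):
    """Build a GET request asking only for the bytes not yet received."""
    request = urllib.request.Request(url)
    # "bytes=N-" means: from offset N to the end of the resource.
    request.add_header("Range", f"bytes={bytes_received}-")
    return request


# Hypothetical download that broke after the first 1 MB.
req = resume_request("http://www.example.com/file.iso", 1048576)
```

A server that supports ranges answers such a request with
"206 Partial Content"; one that does not simply sends the whole file
again, so the client has to check the status code before appending.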

> >         IMHO, the director needs information from the real servers to
> > balance the load. There are many parameters we can monitor and we can
> > make different expressions based on these parameters: packet rate,
> > cpu usage, free memory. In this way, we can select different expressions
> > for the different services. There is a reason for this: each service
> > loads differently the real host or may be other hosts too, for example
> > databases, etc.
>
> What do you mean by "the director needs information from the real servers to balance
> the load"?  Should this information be a direct result of a platform/application
> specific modification?  How should it get this information?

        There are agents on the real servers that report information.
The director uses this information to control the connection scheduling.
There is a WRR method in LVS that needs good cluster software to
achieve this goal. Yes, the agents retrieve OS-specific information.
But this is an application level solution only, in the cluster software;
nobody touches the lower layers. I have some postings on this issue
on the mailing list, you can search for them. I'm also preparing a preview
version but it is delayed again: I was busy creating a healthcheck
program, which is now completed.
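
        A rough Python sketch of this agent/weight idea follows. It is
not the cluster software mentioned above: the report fields, the weight
expression and the function names are all assumptions for illustration,
and the selection loop is a generic "smooth" weighted round robin, not
the LVS wrr implementation.

```python
def weight_from_report(report, max_weight=100):
    """Hypothetical expression: more idle CPU and free memory -> higher weight."""
    idle = 1.0 - report["cpu_usage"]      # cpu_usage is a fraction, 0.0..1.0
    free = report["free_mem_fraction"]    # fraction of memory free, 0.0..1.0
    return max(1, round(max_weight * idle * free))


def weighted_round_robin(weights, n):
    """Pick real servers for n new connections in proportion to their weights."""
    current = {server: 0 for server in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        for server, weight in weights.items():
            current[server] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total          # push the chosen server back
        picks.append(chosen)
    return picks


# Reports as the agents on two real servers might send them (made-up values).
reports = {
    "rs1": {"cpu_usage": 0.5, "free_mem_fraction": 0.5},
    "rs2": {"cpu_usage": 0.25, "free_mem_fraction": 1.0},
}
weights = {server: weight_from_report(r) for server, r in reports.items()}
picks = weighted_round_robin(weights, 4)
```

With these values rs2 gets three of every four new connections. A
different service could use a different expression over the same
reports, which is the reason to keep this logic in user space.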

> >         I don't believe in your theory about the fastest response
> > scheduling but you can surprise us with more specific details and
> > maybe results :) Is this scheduler for NAT only?
>
> If I was able to figure out the details and implement something of this nature, it
> would be done in NAT to prove the idea

        OK, you know the source code well, but if you have any
questions you can post them to the mailing list for discussion.

> Thanks, Tyrel


Regards

--
Julian Anastasov <ja@xxxxxx>


