Re: Release new code: Scheduler for distributed caching

To: Julian Anastasov <ja@xxxxxx>
Subject: Re: Release new code: Scheduler for distributed caching
Cc: Joe Cooper <joe@xxxxxxxxxxxxx>, lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: Thomas Proell <Thomas.Proell@xxxxxxxxxx>
Date: Thu, 26 Oct 2000 12:13:40 +0200 (MET DST)
Hi!

>       My thoughts are about the load split. I don't believe in
> the main rule on which is based this scheduling, i.e. the assumption
> that the director is authoritative master for the load information.

Load split isn't easy. You can't measure the load of the caches from
the director. Even if you have a program running on the caches, it's
hard to tell what "load" really is.
Would it be a good idea to measure the time a simple "ping" needs, in
order to estimate the load of the cache/server?
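
A rough sketch of that idea, just to make it concrete (it times a TCP
connect to the cache port instead of a real ICMP ping, so no root
rights are needed; the hostnames and the port are invented):

import socket, time

CACHES = [("cache1.example.com", 3128), ("cache2.example.com", 3128)]

def connect_time(host, port, timeout=1.0):
    # time how long the cache needs to accept a TCP connection
    start = time.time()
    try:
        s = socket.create_connection((host, port), timeout)
        s.close()
        return time.time() - start
    except OSError:
        return float("inf")   # unreachable counts as "infinitely loaded"

def least_loaded():
    # pick the cache that answered fastest
    return min(CACHES, key=lambda hp: connect_time(*hp))

Of course the connect time mostly reflects network latency and the
accept queue, so it is only a very coarse proxy for the real load.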

> Even the WLC scheduler has a simple load balancing. IMO, blindly

Yes, it counts the active connections. But that is not the real load
in every case. Think about a cache that has 2000 open connections for
the same file while the origin server sends 20 bytes/sec. This cache
isn't loaded at all, whereas the other cache with 1999 open connections
for the already cached Linux kernel will break down soon.
Nevertheless, WLC sends the next request to that cache, because it
appears less loaded.
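
To make the example concrete, here is a toy model of the WLC decision
(simplified: if I remember right, the real ip_vs_wlc also counts
inactive connections):

servers = [
    {"name": "cache-slow-origin", "active": 2000, "weight": 1},  # 20 bytes/sec each, nearly idle
    {"name": "cache-hot-kernel",  "active": 1999, "weight": 1},  # saturated by kernel downloads
]

def wlc_pick(servers):
    # the smallest active/weight ratio wins
    return min(servers, key=lambda s: s["active"] / s["weight"])

print(wlc_pick(servers)["name"])   # -> cache-hot-kernel, the one about to break down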

We have only two possibilities:
-running software on the caches/servers that measures the actual load
 accurately and sends it to the director, which uses it for balancing, OR
-trying to find a more or less exact estimation of the actual load on
 the caches/servers, such as "number of open connections (lc)" or
 "statistically balanced (ch)"

All current implementations use the second approach. And you'll always
find an unfair situation, because they only estimate the load. The
question is whether the estimation is close enough to reality to work.

> selecting the real servers in a cluster environment can lead to more
> problems. 

BTW, the hot spot solution will be implemented this year. It will cut
off the peaks: most of the sites will be distributed with consistent
hashing, and the few hot spots will be distributed with least
connections or something like that.
So, most of the sites will be distributed without replication, and
most of the load/requests (which go to only a few sites) will be
distributed among the least-connected caches.

That's the advantage of both approaches, while taking on their
disadvantages only slightly: only very few sites are replicated.
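
Just to illustrate how such a combined scheduler could look (this is
not the actual patch; the threshold is invented and a plain modulo
hash stands in for the real consistent-hashing ring):

import hashlib

CACHES = ["cache1", "cache2", "cache3"]
active = {c: 0 for c in CACHES}   # open connections per cache, maintained by the director
hits = {}                         # requests seen per URL
HOT_THRESHOLD = 2500              # invented cut-off for "hot"

def pick_cache(url):
    hits[url] = hits.get(url, 0) + 1
    if hits[url] > HOT_THRESHOLD:
        # hot spot: allow replication, balance by least connections
        return min(CACHES, key=lambda c: active[c])
    # normal case: deterministic hash, no replication
    h = int(hashlib.md5(url.encode()).hexdigest(), 16)
    return CACHES[h % len(CACHES)]

So the bulk of the URLs stays cache-affine, and only the few URLs
above the threshold get spread by load.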

BTW: the weighting is implemented. I'm just not very keen on making
a new patch for such small changes.

> Because if the swapping is started it is very difficult
> to escape from this situation. 

Swapping? From RAM to disk or what? I don't understand that.

> I have a setup where the real server's
> load is monitored and the problems are isolated in seconds.

How is the load monitored?

> Without
> such monitoring the servers are simply killed from requests. But only
> the test results with the new scheduler can give more information about
> the load balancing. And it depends on the load.

Sure. The requests for a single company follow a very extreme
Zipf distribution, with only 10 sites requested more often than
2500 times a day. If three of these sites are unfortunately mapped
to the same cache, it'll have a bad time.
But with 10 companies together, there's a good chance of getting a
Zipf distribution that is wider. We may have 30 or 50 "hot" sites,
and the probability of an imbalance is far smaller.
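
A quick numerical check of that argument (just a sketch; the
popularity model is invented: each company contributes 10 hot sites
with 1/rank popularity, and every site is hashed to one of 5 caches
at random):

import numpy as np

def imbalance(n_companies, n_caches=5, trials=200, seed=1):
    rng = np.random.default_rng(seed)
    # pooled hot sites of all companies, popularity ~ 1/rank (Zipf-like)
    popularity = np.concatenate([1.0 / np.arange(1, 11)] * n_companies)
    ratios = []
    for _ in range(trials):
        caches = rng.integers(0, n_caches, size=popularity.size)  # hash site -> cache
        load = np.bincount(caches, weights=popularity, minlength=n_caches)
        ratios.append(load.max() / load.mean())
    return sum(ratios) / trials

print(imbalance(1))    # one company, 10 hot sites: clearly skewed
print(imbalance(10))   # ten companies, 100 hot sites: much more even

The max/mean load ratio drops clearly when more independent hot sites
are pooled, which is exactly the "wider distribution" argument.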

There are several studies that have worked with this already, at
least my Prof. says so...

But this won't be an issue with the hot spot solution!


> > Can we implement tunneling with a netfilter module?
> 
>       I don't understand. All features from 2.2 are present
> in the 2.4 port.

My fault. Didn't get the point.


Thomas


