Re: Jabber Scalability and LVS

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: Jabber Scalability and LVS
From: Clint Byrum <cbyrum@xxxxxxxxxxx>
Date: Thu, 03 Feb 2005 09:20:03 -0800
On Thu, 2005-02-03 at 08:51 -0700, kwijibo@xxxxxxxxxx wrote:
> Jon Phillips wrote:
> 
> > Exactly...we are limiting users of our jabber service to our jabber 
> > servers, however, users will be spread throughout the world. So thus, 
> > they would connect to the jabber server (a collection of LVS 
> > realservers) through the load server, right? The jabberd2 server then 
> > has a db in mysql that maintains information about who is logged in, and 
> > then all the users. Where would this data exist? Would this be synced 
> > with the other servers? Or, would a large RAID be needed to store the 
> > MYSQL db for all the LVS realservers to read and write to? This is where 
> > my knowledge of the subject gets a little hazy.
> 
> I think it would be easier to keep your db in a central location.  I am
> not sure how you would handle the db syncing if each node had its own
> db.  What would happen if a user jumped to another realserver before the
> db had a chance to sync?  That would be a nightmare to maintain.
> I think a central backend db would be the way to go.  You then have a
> single point of failure, but if you are just doing this for
> distributing the load and not necessarily for redundancy it shouldn't
> be that much of a problem.
> 

Read up on MySQL Cluster. This is what it was designed for. It is not
replication; it gives synchronous read-write access across all nodes. It
does this by storing each row on more than one server. So if you have 2
servers, each server has all the data. If you have 4 servers, each
server has half the data, and so on (the system is only efficient if you
grow it in powers of 2, so 6 is not good, but 4 and 8 are). When you do
an insert, it uses a hash function on the ID to decide which server to
send the row to. Primary-key lookups are routed the same way. Non-ID
lookups are done by sending the query to all servers, and then
combining/filtering the results at the requesting node.
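
To make the routing idea concrete, here is a rough Python sketch of it.
This is only a toy model of the scheme described above, not MySQL
Cluster's actual code: the ToyCluster class, the nodes_for_key helper,
the use of CRC32 as the hash, and the 2-copies-per-row default are all
assumptions made for illustration.

```python
import zlib


def nodes_for_key(primary_key, num_nodes=4, replicas=2):
    """Return the servers assumed to hold the row with this primary key.

    Servers are grouped in sets of `replicas`; the hash picks the group.
    CRC32 is just a convenient stable hash, not what MySQL Cluster uses.
    """
    num_groups = num_nodes // replicas
    group = zlib.crc32(str(primary_key).encode("utf-8")) % num_groups
    return [group * replicas + i for i in range(replicas)]


class ToyCluster:
    """In-memory stand-in: one dict per server, keyed by primary key."""

    def __init__(self, num_nodes=4, replicas=2):
        self.num_nodes = num_nodes
        self.replicas = replicas
        self.nodes = [dict() for _ in range(num_nodes)]

    def insert(self, primary_key, row):
        # Write the row to every server in the group the hash selects.
        for n in nodes_for_key(primary_key, self.num_nodes, self.replicas):
            self.nodes[n][primary_key] = row

    def get(self, primary_key):
        # Primary-key lookup: the same hash says which server to ask.
        n = nodes_for_key(primary_key, self.num_nodes, self.replicas)[0]
        return self.nodes[n].get(primary_key)

    def scan(self, predicate):
        # Non-key lookup: ask every server, then merge the results at the
        # requester, dropping the duplicate copies of each row.
        found = {}
        for node in self.nodes:
            for key, row in node.items():
                if predicate(row):
                    found[key] = row
        return list(found.values())


cluster = ToyCluster(num_nodes=4, replicas=2)
for user_id in range(100):
    cluster.insert(user_id, {"id": user_id, "online": user_id % 2 == 0})

print(cluster.get(7))
print(len(cluster.scan(lambda row: row["online"])), "users online")
```

With 2 servers and 2 copies there is a single group, so both servers
hold every row; with 4 servers there are two groups and each server
holds roughly half.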

It's pretty darn cool, but because of the ambitious nature of the
project, they had to drop some of the nice things about MySQL. It's a
little slow for single-threaded type queries, though it's really fast for
tons of little transactions and queries. There is no FULLTEXT support,
nor is there Foreign Key support. The biggest loss is that there is no
disk storage. Each node keeps all of its data in RAM. This makes it
extremely fast in some circumstances, but extremely expensive for large
amounts of data. They've identified this as the number one user need
that it does not fulfill, and are responding by implementing disk
storage in the next version.
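
For a back-of-the-envelope feel for the RAM cost, the arithmetic can be
sketched like this. The ram_per_node_gb function is hypothetical, it
assumes 2 copies of every row as in the example above, and it ignores
index and per-row overhead, so treat the result as a lower bound.

```python
def ram_per_node_gb(total_data_gb, num_nodes, replicas=2):
    """Rough RAM needed on each server if all data lives in memory.

    Assumes every row is stored `replicas` times and the copies are
    spread evenly over the servers. Overheads are not counted.
    """
    if num_nodes % replicas != 0:
        raise ValueError("num_nodes should be a multiple of replicas")
    return total_data_gb * replicas / num_nodes


# A 10 GB database on 2, 4 and 8 servers:
for servers in (2, 4, 8):
    print(servers, "servers:", ram_per_node_gb(10, servers), "GB each")
```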

http://www.mysql.com/products/cluster/

-- 
Clint Byrum <cbyrum@xxxxxxxxxxx>

