Zachariah Mully <zmully@xxxxxxxxxxxxxx> said:
> Hello all-
> I am having a debate with one of my software developers about how to
> most efficiently sync content between realservers in an LVS system.
> The situation is this... Our content production software that we'll be
> putting into active use soon will enable our marketing folks to insert
> the advertising into our newsletters without the tech and launch teams
> getting involved (this isn't necessarily a good thing, but I'm willing
> to remain open-minded ;). This will require that the images they put
> into newsletters be synced between all the webservers... The problem
> though is that the web/app servers running this software are
> load-balanced, so I'll never know which server the images are being
> copied to.
> Obviously loading the images into the database backend and then out to
> the servers would be one method, but the software guys are convinced
> that there is a way to do it with rsync. I've looked over the
> documentation for rsync and I don't see any way to set up cron jobs on
> the servers to run an rsync job that will look at the other servers'
> content, compare it, and then either upload or download content to that
> server. Perhaps I am missing an obvious way of doing this, so can anyone
> give me some advice as to the best way of pursuing this problem?
You can use rsync, rsync over ssh or scp.
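A minimal sketch of the rsync-over-ssh route, assuming you designate one box as
the master copy (the hostnames and paths here are invented for illustration):

    # crontab on the master: push the image tree to each real-server
    # every 10 minutes over ssh (needs passwordless ssh keys set up)
    */10 * * * * rsync -az -e ssh --delete /var/www/images/ web1:/var/www/images/
    */10 * * * * rsync -az -e ssh --delete /var/www/images/ web2:/var/www/images/

-a preserves permissions and timestamps, -z compresses over the wire, and
--delete removes files the master no longer has. Note that rsync only moves
content one way per run; since your uploads can land on any real-server, you'd
need either a pull-then-push pass (without --delete) or a staging host like the
one described below.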
You can also use partition syncing with a network/distributed filesystem such
as Coda, OpenAFS, or drbd (drbd is still too experimental for my taste). Such
a setup creates partitions that are mirrored in real time, i.e., changes to
one are reflected on all of them.
We use a common NFS share on a RAID array. In our particular setup, users
connect to a "staging" server and make changes to the content on the RAID; the
real-servers immediately serve the changed content. The staging server accepts
FTP uploads from authenticated users, but none of the real-servers accept any
FTP uploads. No content is kept locally on the real-servers, so they never
need to be synced, except for config changes like adding a new vhost to
Apache.
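As a sketch of how that can be wired up (hostname and paths invented for
illustration), the real-servers can even mount the share read-only, which
enforces the no-uploads rule at the filesystem level:

    # /etc/fstab on each real-server
    nfsbox:/export/www   /var/www   nfs   ro,hard,intr   0 0

    # /etc/fstab on the staging server
    nfsbox:/export/www   /var/www   nfs   rw,hard,intr   0 0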
I'm currently building a cluster for a client that uses a pair of NFS servers
which use OpenAFS to stay synced, with linux-ha to make sure that one of them
is always available. One thing to note about such a system is that the synced
partitions are not "backups" of each other. This is really a "meme" (a way of
thinking about something): the distinction is simply that you cannot roll back
changes made to a synced filesystem (because the change is made to both
copies), whereas with a backup you can roll back. So, if a user deletes a
file, you must reload it from backup. I mention this because many people I've
talked to think that if you have a synced filesystem, then you have backups.
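If you want rollback on top of a synced tree, a nightly rsync snapshot onto
separate storage is one cheap approach (paths invented for illustration):

    # nightly from cron: mirror the tree, but shunt anything that would be
    # deleted or overwritten into a dated directory instead of losing it
    rsync -a --delete --backup --backup-dir=/backup/$(date +%F) \
        /var/www/ /backup/current/

A file a user deleted yesterday can then be recovered from /backup/<date>
rather than being gone for good.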
What I'm wondering is why you would want to do this at all. From your
description, your marketing people are creating newsletters with embedded
advertising. If they are embedding a call for a banner (called a "creative" in
adspeak), then normally that call would grab the creative/clickthrough link
from the ad server, not the web servers. For tracking the results of the
advertising, this is a superior solution. Any decent ad server will have an
interface that the marketing dept. can access without touching the servers at
all.
-=dwh=-