On Wed, 7 Aug 2002, Doug Schasteen wrote:
> Hi everyone.
greetings. it's been a while since i've used lvs, so anything i say in
this email could potentially be flawed. please don't sue me!!1
> I'm looking at LVS as a cost effective way to have high uptime and
> scalability. I'm going to start out with a simple setup of 1 load
> balancer and 2 real servers. I've tried to read as much of the LVS
> documentation as I can without actually having the servers already to
> try stuff out on. My brain can only hold so much raw info without
> frying. Hopefully I can get my servers soon and start unloading some of
> this info by testing things out. But before I order my servers I want to
> make sure this is going to work for me. There are some questions that I
> have that I wasn't able to answer by reading the docs.
it will likely make a lot more sense to you once you start trying it out.
it's difficult to understand what lvs does without seeing a demonstration,
perhaps watching some tcpdumps, etc etc.
> One thing I'm very confused about is how to set up Apache when using
> LVS. We host multiple websites with IP based hosting (not name based
> because SSL only works with IP based) so each hostname that we host has
> its own IP address. Since the real servers will have real IPs of
> 192.168.1.2 and 192.168.1.3, how will I set up my apache.conf? Should
> apache.conf still be set up as if it was the main IP for the domain, or
> should it use its new internal IP? What about for multiple domains and
> multiple IPs? Do you understand what I'm confused about? Because trying
> to explain my confusion just causes more confusion for me. :)
i believe it depends on the lvs methods that you use.
if you use a VS-NAT method of balancing, the load balancer will rewrite
the destination address (and port, perhaps) of the packets sent to the
real server. in this case, you'd want your application to listen on the
real server ips / internal ips / whatever ips you're translating to.
alternatively, if you use VS-DR, the destination address and port will NOT
be rewritten by the balancer, so your application (web server, whatever)
should listen on the public ips.
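as a rough python sketch of that rule (not tested against a real lvs box): the function names and the 203.0.113.10 public ip are made up for illustration, only the 192.168.1.x real ips come from your mail, and the Listen line is just an example of what you'd hand apache, not a full config.

```python
def listen_address(method, public_ip, real_ip):
    """Return the address the web server should bind to on a real server.

    VS-NAT: the balancer rewrites the destination, so bind to the real ip.
    VS-DR:  the destination is left alone, so bind to the public ip.
    """
    method = method.upper()
    if method == "VS-NAT":
        return real_ip
    if method == "VS-DR":
        return public_ip
    raise ValueError("unhandled forwarding method: %s" % method)


def apache_listen_line(method, public_ip, real_ip, port=80):
    # Build an apache "Listen ip:port" line for this real server.
    return "Listen %s:%d" % (listen_address(method, public_ip, real_ip), port)


if __name__ == "__main__":
    # 203.0.113.10 is a hypothetical public ip for one hosted site.
    print(apache_listen_line("VS-NAT", "203.0.113.10", "192.168.1.2"))
    print(apache_listen_line("VS-DR", "203.0.113.10", "192.168.1.2"))
```

with ip based hosting you'd repeat this once per hosted site, and your virtualhost addresses would need to match whatever this hands back.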
how you get around the arp problem, if applicable, may also influence the
logistics of this -- i'm not entirely sure offhand.
note that in apache, your "listen" directives may not always be in sync
with, for example, your "virtualhost" lines, depending on your setup. i
foolishly spent too much time figuring out that error once myself. there's
a thread in the lvs archive about it, take a look at:
http://marc.theaimsgroup.com/?l=linux-virtual-server&m=97613293904400&w=2
> What problems will I run into when trying to run a mail server on both
> of the real servers? Should I forward pop3/smtp ports to just one server
> instead trying to load balance it? Should I be concerned about this?
the problem with balancing something like pop3 is that you'd want both of
the real servers to have all the mail messages accessible for the clients.
also, when a client, say, deletes a message in his pop3 mailbox, that
deletion needs to carry across to both/all of your real servers. if
you're using something like nfs so that both of your real servers are
accessing the same data, then you're set (well, maybe. if the pop3
applications are nfs-safe, you're probably set. if not...). without shared
storage, keeping real server 1's data in sync with real server 2's data is
going to be difficult.
depending on the details of what you're trying to do, this may or may not
be an issue for you.
> I already know I'm going to run into trouble with MySQL since I do a lot
> of updates and inserts on the tables. I think my solution with this is
> going to be to rewrite all of my php scripts to use just one server as
> the master and the other will act as the slave and receive the updates
> through mysql's own replication features.
balancing writes to a mysql database with lvs probably isn't worth your
effort to even try. balancing READS might work, however. to the best of
my knowledge, 2 mysql instances using the same storage (nfs filesystem or
whatever) is asking for trouble. mysql has some replication and
master/slave settings of its own these days -- those may or may not better
suit your needs. replicating data via other means, like rsync, could suit
your needs as well (probably not if the data is rapidly changing).
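if you go the master/slave route, the "writes to one box, reads to either" idea could look roughly like this in python. it's only a sketch: the QueryRouter class, the READ_VERBS set and the server names are all made up, and it picks a server name rather than opening any real mysql connection. keep in mind that a slave can lag behind the master, so a read right after a write may see old data.

```python
import itertools

# Statements treated as reads; everything else goes to the master.
READ_VERBS = {"SELECT", "SHOW"}


class QueryRouter:
    """Send writes to the master and rotate reads across the read pool."""

    def __init__(self, master, read_pool):
        self.master = master
        # Fall back to the master if no read servers were given.
        self._reads = itertools.cycle(read_pool or [master])

    def pick(self, sql):
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        if verb in READ_VERBS:
            return next(self._reads)
        return self.master


if __name__ == "__main__":
    router = QueryRouter("db-master", ["db-master", "db-slave"])
    for statement in ("SELECT * FROM users",
                      "select 1",
                      "UPDATE users SET name = 'x'"):
        print(statement, "->", router.pick(statement))
```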
you could also look into distributing mysql load with some sort of
application-level awareness, instead of using lvs. like, "ok, records a-e
are on mysql server ONE, f-p on mysql server TWO, and so on", but your
application would have to be aware of that and manage its sql connections
appropriately, which could be a very involved task. if your database
server is overloaded, the best option might very well simply be to get a
more powerful single machine to run the db server applications. maybe
not. you decide!
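to give a feel for what "application-level awareness" means, here's a tiny python sketch of the a-e / f-p split. the third range, the server names and the shard_for function are invented for the example; a real application would also have to deal with moving records around when the ranges change.

```python
# (first letter low, first letter high, server name); names are hypothetical.
SHARDS = [
    ("a", "e", "mysql-one"),
    ("f", "p", "mysql-two"),
    ("q", "z", "mysql-three"),
]


def shard_for(key):
    """Pick the mysql server that holds the record with this key."""
    first = key[:1].lower()
    for low, high, server in SHARDS:
        if low <= first <= high:
            return server
    raise KeyError("no shard covers key: %r" % key)


if __name__ == "__main__":
    for name in ("Doug", "lvs", "tcl"):
        print(name, "->", shard_for(name))
```

every query in the application then has to go through something like shard_for before it connects, which is where the "very involved task" part comes in.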
again, it depends on what you're trying to accomplish, but generally, this
kind of balancing (lvs with mysql writes, etc) doesn't work well and can
cause problems. you already seem to concede this, which is probably a
good thing.
> I'd like to be able to access all of the real servers independently from
> the internet for administration purposes. So I'm going to buy servers
> with dual network cards and have one of the network cards plugged into
> the hub/lan and one of them accessing the internet. I don't assume there
> will be any problems with this as the public IP is only going to be used
> for SSH access and I won't be using it for apache.
hmm, i don't foresee any problems at a glance, but i may be missing
something.
you may not even need a 2nd network card for this. you could put multiple
ips on the same ethernet interface, or have this traffic be directed
through the load balancer (in a manner where you can still specify going
to a specific real server, instead of being balanced), or other methods
too i'm sure.
not to sound like a broken record, but whether 1 interface will work or if
2 are required, and what's easiest, depends on what kind of setup you're
going to have. for example, if you'll be using VS-DR, your interfaces
will have to be on the same physical segment as the next hop outgoing
router, which is probably the same router for your incoming traffic,
meaning the interfaces are already "outside" and therefore adding public
ips to them is doable. if you're using VS-NAT and physically segmenting
your "outside" network and the network between your balancer and real
servers, then simply adding public ips to the real server network
interfaces won't be enough. or, as mentioned, you could skip the extra
public ips altogether and forward your ssh traffic through the load
balancer. you could direct, say, port 7001/tcp to real server 1's port
22/tcp, and the public ip's port 7002/tcp to real server 2's port 22/tcp,
and so on. no balancing, but the traffic is still being switched by the
load balancer linux machine. there are other options along those lines as
well. so many options for everything!!1 one catch with sending ssh through
the balancer: if the balancer machine crashes or becomes inoperable for
some reason, you won't be able to remotely access your real server
machines until the balancer's issue is resolved. you get the idea.
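the port 7001 / 7002 idea could be sketched like this in python, assuming a VS-NAT style setup where each forwarded port is its own virtual service with exactly one real server behind it. the function name and the 203.0.113.10 public ip are made up, and the script only builds ipvsadm command strings (it doesn't run them), so treat the output as a starting point to check against the ipvsadm man page rather than something i've verified.

```python
def ssh_forward_commands(public_ip, real_servers, base_port=7001):
    """Build ipvsadm commands mapping public_ip:7001, :7002, ... to port 22.

    Each virtual service gets a single real server, so nothing is balanced;
    the director just switches the traffic.
    """
    commands = []
    for offset, real_ip in enumerate(real_servers):
        service = "%s:%d" % (public_ip, base_port + offset)
        # -A adds a tcp (-t) virtual service; a scheduler still has to be named.
        commands.append("ipvsadm -A -t %s -s rr" % service)
        # -a adds the real server (-r); -m selects masquerading, i.e. NAT.
        commands.append("ipvsadm -a -t %s -r %s:22 -m" % (service, real_ip))
    return commands


if __name__ == "__main__":
    for line in ssh_forward_commands("203.0.113.10",
                                     ["192.168.1.2", "192.168.1.3"]):
        print(line)
```

you'd then reach real server 1 with something like "ssh -p 7001" pointed at the public ip, real server 2 on 7002, and so on.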
> Anyway, I think that covers most of the issues that are bugging me.
> Thanks for any comments/advice anyone can give me.
sorry i couldn't give more specific help. if you're looking for more info
from the list, give some more details about the nature of your needs and
wants and how the backend applications should be functioning and all that
jazz and i'm sure you'll get some opinions and, oh yes, even more options
in response.
good luck.
-tcl.