Hello,
Well, I'm only concerned about two cases anyway: 1. I manually set the weight
to 0 because of maintenance work and 2. The machine goes down because of
problems. Upgrading LVS to the new version with the sysctl setting covers #2,
so I hope my boss assigns some time in my TODO list to upgrade LVS.
You can prepare everything and when the next software upgrade is due,
you simply upgrade a bit more ;).
Which leaves me with option 1. Ideally there's no downtime because of
maintenance, but reality is that you often can't apply service packs and
hotfixes to the realservers without bringing them down (the realservers run
win2k, which is unfortunate from a sysadmin point of view, but implied by a
rather large ASP codebase).
I don't understand why? Let's assume you want to apply SP9239834.233a-1
to your RS. Now since you have multiple RS doing the work for you, you
can simply quiesce one and wait half an hour and then perform the
upgrade with SP9239834.233a-1. Then you put it back in with the old
weight. Then play the game with the next RS. No downtime, no customer
problems.
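
The quiesce, wait, upgrade, restore cycle described above can be sketched
as a plan generator. This is only an illustration under assumptions: the
addresses are made up, the forwarding flag "-g" (direct routing) is a guess
about the setup and must match how the realservers were originally added,
and the 1800 second drain time is just the "half an hour" mentioned above.
The code only builds ipvsadm argument lists and does not run anything.

```python
from typing import Dict, List, Tuple, Union

Step = Tuple[str, Union[List[str], int, str]]


def weight_cmd(vip: str, rip: str, weight: int, forward_flag: str = "-g") -> List[str]:
    """Build an ipvsadm command that edits the weight of one realserver.

    forward_flag is an assumption; use the method (-g, -i or -m) the
    realserver was added with, since -e also sets the forwarding method.
    """
    if weight < 0:
        raise ValueError("weight must be >= 0")
    return ["ipvsadm", "-e", "-t", vip, "-r", rip, forward_flag, "-w", str(weight)]


def rolling_maintenance_plan(
    vip: str, realservers: Dict[str, int], drain_seconds: int = 1800
) -> List[Step]:
    """Return the ordered steps for upgrading each realserver in turn.

    realservers maps "RIP:port" to its normal weight.
    """
    plan: List[Step] = []
    for rip, weight in realservers.items():
        plan.append(("run", weight_cmd(vip, rip, 0)))       # quiesce
        plan.append(("wait", drain_seconds))                # let sessions drain
        plan.append(("upgrade", rip))                       # apply the SP by hand
        plan.append(("run", weight_cmd(vip, rip, weight)))  # back to old weight
    return plan


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    for step in rolling_maintenance_plan(
        "192.0.2.10:80", {"10.0.0.1:80": 1, "10.0.0.2:80": 2}
    ):
        print(step)
```

Only one RS is at weight 0 at any time, so the remaining ones keep taking
new clients while the quiesced one drains.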
That's what our developers tell me. They claim that the IIS default session
handler (which keeps session state in RAM and only in RAM) is by far the
best performing option and that using custom session save handlers degrades
performance a lot. I don't have the ASP/IIS knowledge to question that, but
it sounds reasonable enough to me.
Aha, does it. Did they do any reasonable performance tests with
different frameworks? Apart from that, this is only feasible if you have
a very low traffic web site because otherwise the memory consumption
will kill you or you will end up thrashing memory.
(Currently we don't use shared storage of session data at all, because of the
potential performance problems, so we stick with pure persistency. But I
personally dislike this choice and would rather prefer to move on to
something more professional and reliable.)
Maybe the approach for your site is the right one. I can't tell you and
I guess your programmers are doing a good enough job of exploiting the
best techniques applicable in solving these issues.
It's a bit complex to reassign parts of a client IP, but not the whole IP, but
if you notice that, say, IP 1.2.3.4 causes a lot of traffic on RS1, why not
reassign all _other_ IPs that are currently using RS1 to the other
realservers?
Yes, you could do that too.
It sounds to me that you only need to track the amount of activity per IP
and base the weighting on that and reassign the assigned RS from the
_least_ active IPs, since those are best used for balancing.
But then it would take longer to get the proper balance again. What
about a policy selector where you can simply choose which approach you
want to take?
But again, it's not imbalance during normal business that really worries me,
as it has never hit me until now. It's the imbalance penalty when doing
maintenance to the site that annoys me most.
Hmm, I never had imbalance when doing maintenance work because the load
simply distributes on the remaining RS.
At least you don't need medical care for lack of humour ;-)
Thanks.
You made it a bit more complex than what I thought about, but overengineering
is what every programmer does, no? ;-)
I'd call it careful design with possible extensibility in future, but
yes, overengineering is not a bad expression either ... I guess.
I was thinking along the lines of
- We monitor each client IP for activity
What is we and what is the monitoring metric?
- If the realserver is imbalanced by more than a few percent, or if the target
balance is 0 we start reassigning each client IP to a new realserver, by
modifying the template, starting on the first _new_ socket connection.
I don't know if this is a good idea. Try to imagine the worst case
situation. You have 3 RS. AOL connects to RS1, some other proxy connections
come from a provider in the UK and are stuck to RS3. Now we have of course a
load imbalance which will soon show up as a few percent. And the
schedulers (because they run in parallel and asynchronously) will both give
away clients to RS2, which in turn will be the overloaded one. Trust me,
the Internet is so dynamic that the latency in reaction to a network
anomaly causing load imbalance which tries to equalize it will result in
a chaotic oscillation of network load distribution.
- If the target weight for the realserver is 0 we always reassign
Already done, yes.
- If the realserver is only seriously imbalanced, but the weight is nonzero we
reassign only if a given client's activity is smaller than the average
activity per client IP for this realserver (or something similar), thereby
balancing the cluster using the single hosts and leaving the big NAT-ed
networks for what they are if possible. Even if multiple NAT ranges end up
on a single realserver the disappearing smaller IPs make the average
activity higher for each subsequent reassign run, so in the end even the
NAT-blocks will be reassigned, if needed, but only if really needed.
I think you should have more patience. The Internet is slow and things
will equalize. You cannot expect to reach a load equilibrium within the
first half a day after an AOL burst, especially when complexity and
dynamics of the site vary. Your approach simply sharpens the bursts and
tries to modulate to a mean earlier but I'm not so sure if this will work.
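
For reference, the "reassign only clients below the per-client average"
rule from the proposal above could look roughly like the following sketch.
It is not part of LVS. The activity metric is left open (the proposal does
not define one), and the function name and the "excess" parameter are
invented here for illustration. The sketch only picks candidates and says
nothing about whether doing so avoids the oscillation problem.

```python
from statistics import mean
from typing import Dict, List


def pick_clients_to_reassign(activity: Dict[str, float], excess: float) -> List[str]:
    """Choose client IPs to move off one overloaded realserver.

    activity maps client IP -> activity on this realserver (any metric).
    excess is how much activity the realserver should shed.

    Each pass considers only clients below the current per-client average,
    least active first. As small clients leave, the average rises, so a
    big NAT/proxy client becomes eligible only if moving the small ones
    was not enough. The last remaining client is never moved.
    """
    remaining = dict(activity)  # work on a copy, leave the input alone
    moved: List[str] = []
    shed = 0.0
    while shed < excess and len(remaining) > 1:
        avg = mean(remaining.values())
        below = sorted(
            (ip for ip, a in remaining.items() if a < avg),
            key=lambda ip: remaining[ip],
        )
        if not below:
            break  # all remaining clients are equally active
        for ip in below:
            if shed >= excess:
                break
            shed += remaining.pop(ip)
            moved.append(ip)
    return moved


if __name__ == "__main__":
    # One NAT-ed network and three single hosts (made-up numbers).
    acts = {"1.2.3.4": 900, "10.0.0.1": 10, "10.0.0.2": 20, "10.0.0.3": 30}
    print(pick_clients_to_reassign(acts, 25))
```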
I hope you can run fast ;-)
No problem with that.
I'm inclined to tell you not to use persistency and upgrade your DB ;)
Tell the people deciding over the money and I'm all for it ;-)
Give me the phone number of your boss or project manager and I will talk
with him about it.
(amongst which the website's very company itself...) got connected to the
broken server. I had to turn on quiescence in ldirectord.
                        ^^^^^^^
Make that "off", not "on". Oops.
Ok.
What ldirectord does if it detects a realserver failure is set the weight to 0
if quiescence is turned on. That's nice for transient errors and/or for
non-persistent connections, but when the connections are persistent that
simply means clients are never redirected at all to another RS until the
timeout setting. Needless to say that's unwanted behaviour :-)
Needless to say that it is completely broken behaviour. If a service is
not available anymore you _mustn't_ set the weight to 0. Never, ever,
it's a big nononononono. Take the service template out and put it back
in when the healthcheck says so. There are only two cases where you need
weight 0.
a) You want to do maintenance work and instead of pissing off your
potential customers by killing their sessions you quiesce the RS
until the template timeout expires.
b) You use the per RS threshold limitation patch that will put a RS
into 'quiesced/cripple' mode until the amount of sessions is below
the lower threshold.
And you're completely sure that ldirectord does show this behaviour when
using persistency and the RS goes down (and quiesce option is on)?
Turning the quiescence option off avoids this problem btw.
I'm stunned.
If a service on a RS is down, your user space core detection engine should
take the template out and before that make sure you have set
/proc/sys/net/ipv4/vs/expire_nodest_conn.
ldirectord with quiescence doesn't do that by default, but that's what I did
configure it to do indeed.
It's none of the quiesce functionality's business. If a service on a RS
is not available anymore, take it out. End of story. No hard feelings
about it, just rip it out of the connection template because that RS
ain't gonna give you any warm feelings anyway ;).
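
A rough sketch of that failure handling, for a home-grown healthcheck
wrapper: set expire_nodest_conn once, then remove the RS when the check
fails and add it back when the check passes again. The function names, the
addresses and the "-g" forwarding flag are assumptions made for this
example. The code only builds the ipvsadm argument lists; running them
(as root) is left to the caller.

```python
from pathlib import Path
from typing import List, Optional

EXPIRE_NODEST = Path("/proc/sys/net/ipv4/vs/expire_nodest_conn")


def enable_expire_nodest_conn(path: Path = EXPIRE_NODEST) -> None:
    """Make IPVS expire connections whose realserver has been removed."""
    path.write_text("1\n")


def remove_realserver_cmd(vip: str, rip: str) -> List[str]:
    """ipvsadm command that takes a failed realserver out of the service."""
    return ["ipvsadm", "-d", "-t", vip, "-r", rip]


def add_realserver_cmd(vip: str, rip: str, weight: int, forward_flag: str = "-g") -> List[str]:
    """ipvsadm command that puts a recovered realserver back in.

    forward_flag is an assumption; use whatever the setup really uses.
    """
    return ["ipvsadm", "-a", "-t", vip, "-r", rip, forward_flag, "-w", str(weight)]


def on_healthcheck(
    vip: str, rip: str, weight: int, healthy: bool, was_healthy: bool
) -> Optional[List[str]]:
    """Return the command to run on a state change, or None if unchanged."""
    if healthy == was_healthy:
        return None
    if healthy:
        return add_realserver_cmd(vip, rip, weight)
    return remove_realserver_cmd(vip, rip)


if __name__ == "__main__":
    # Hypothetical addresses: the RS just failed its check.
    print(on_healthcheck("192.0.2.10:80", "10.0.0.1:80", 1,
                         healthy=False, was_healthy=True))
```

Weight 0 never appears here; it stays reserved for planned maintenance.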
Indeed :-) I read the docs as well, and it seems to me it doesn't do what I
need. But as you correctly state, the docs are a bit sparse...
Written by a genius, you know how this is ...
Ever had to admin win2k realservers? ;-)
Sure and I love it. I mean, come on, read it from my lips: I'm the born
Windows administrator! Ok, seriously, here's what I do when a customer
wants W2K as RS:
1.) get a decent box with lots of RAM (1-2GB)
2.) install Linux on it (doesn't matter which one)
3.) Now: Install vmware on it!
4.) install W2K into the vmware
5.) setup bridged networking and a terminal service access
6.) give the customer the login and passwd and tell him the IP address
7.) if W2K stalls in a way the customer can't do anything about it:
pkill -9 vmware
8.) if the customer wants backup:
dd if=/var/vmware/nt.disk of="/nfs/backup/bck-$(date +%Y%m%d)-cust"
9.) go into the pub to have a few beers because I don't need to spend
time with support
10.) sleep well because I know backup is there, we can easily powercycle
W2k and because of the amount of beer.
Note to our customers: All those 10 points are not true, it's just a
fairy tale. It would never work that way. [/me runs again like hell]
Besides, most of the maintenance downtime is formed by code updates, because
the sites are still evolving. And copying over new code from the beta to the
live is not something you do on an active RS...
Ok, I'm not so sure about your business but in 4 years of doing
e-commerce projects I've seen some pretty funny stories and failures and
one thing that I learned was: Make an exact copy of your product
framework in a pilot network setup. Have it in-house and do your
software tests and SP upgrades on the exact same setup in-house. How can
you make sure that a new SP doesn't all of a sudden disable the MS
loopback adapter or move it to a different place in the registry? How
can you make sure that the session ID fetching from disk to RAM still
works? I mean, we're talking about big applications here but even if
yours is not so big, you might convince your boss to spend a few bucks
on a decent pilot setup.
Not really, as RS1 has half the weight of RS2 and RS3 (the backoffice and some
other stuff runs on RS1, so that machine is loaded enough without LVS
activity :-)
So RS1 to RS3 are not providing the same load balanced service? May I
ask you to share your 'ipvsadm -n -L' with us, please?
Fine. So we're talking about a site with very low bandwidth constraints. I
just checked one of our customers' sites and they have between 4 and
13Mbit/s.
I could only dream about adminning such boxes :-)
Well, there is not a lot to do there, you know ... Linux and things like
that :).
Then again, for a first job after graduation it's not bad at all. This setup
                                  ^^^^^^^^^^
                                  still working on that one
is more than ambitious enough before I think I know how all of it works...
Absolutely.
Hmm, actually I have no idea how to measure that. Will ask the database admin
if he can come up with something.
Very good.
DB tuning always needs advanced tricks and most of the time you need a
Russian guy to do this :)
We have a Russian developer, would that qualify? :-P
See, my work experience with Russians tells me that there are (besides
thousands of other nice things) 3 things they produce for sure:
o vodka in all flavours and colours
o excellent mathematicians (Hello NSA, do you copy?)
o fully fledged (Oracle) DB admins with in-depth Delphi knowledge
I haven't found the conjunction of those three items yet.
Yes, I am well aware of this and so is the rest of the sysadmin team. But we
don't do the budgets :(
I understand. Good luck anyway.
Best regards,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc