pongsit@xxxxxxxxxxxx wrote:
>
> i am thinking of using 2 or 3 squid servers for our internal users (about
> 1000 users).
> then i came across LVS and these are my questions.
Balancing Squids is one of the more common uses of LVS.
see
http://wwwcache.ja.net/JanetService/PilotService.html
I've included part of the next version of the LVS-HOWTO on Squid.
Michael Sparks said he would rewrite it but I haven't heard from
him since he changed jobs.
Joe
--
Joseph Mack PhD, Senior Systems Engineer, Lockheed Martin
contractor to the National Environmental Supercomputer Center,
mailto:mack.joseph@xxxxxxx ph# 919-541-0007, RTP, NC, USA (part of the LVS-HOWTO (C) Joseph Mack 2000)
Squid
What is Squid?
Squid (http://www.squid-cache.org/) is a program which caches http/ftp
requests. You configure the squid box/program to listen for your http
requests on a high port number (default 3128) and, if the contents of the
URL are not found in the cache, to forward the request to the real URL.
Squid is useful for ISPs or large sites (eg the corporate world) where the
same URLs will be accessed by many people.
You then configure your browser (netscape - edit/preferences/advanced/proxies/manual)
to use the proxy at squidbox.foo.bar:3128.
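You can also test a squid from the command line, without a browser, by
pointing the http_proxy environment variable at it. A minimal sketch, assuming
wget is installed and squidbox.foo.bar:3128 is your squid (the URL is just
an example):
export http_proxy=http://squidbox.foo.bar:3128/
wget http://www.foo.bar/index.html
The fetch should show up in squid's access.log as a TCP_MISS the first time
and as a TCP_HIT if you fetch the same page again.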
Squid has filtering capabilities (to remove advertisements), or you can
use httproute (http://people.qualcomm.com/karn/code/, which comes with a
starter list of annoying sites to filter out).
Note that httproute replaces annoying
ads with a 1x1 gif rather than blocking delivery of the gif. If you block
the url:/directory of the gif with squid, the whole page won't load and
you'll get a timeout.
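(For comparison, blocking a site outright with squid is done with an acl -
a hypothetical squid.conf fragment, where ads.annoying.com stands for
whatever server you want to block:
acl ads dstdomain ads.annoying.com
http_access deny ads
This is the sort of squid-side blocking referred to above.)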
The user/squid/httproute/url setup is
____________________
| |
| httpd:80 |
|___________________|
|
|
____________________
| talks to 80 |
| |
| squid |
| |
| listens on 3128 |
|___________________|
|
|
____________________
| talks to 3128 |
| |
| httproute |
| |
| listens on 1080 |
|___________________|
|
|
____________________
| talks to 1080 |
| |
| netscape |
|___________________|
(Note: netscape, squid and httproute can all be running on the same box)
There comes a time when 1 squid is not enough. You can add extra squids,
and arrange for them to share information (with ICP
http://squid.nlanr.net/Doc/FAQ/FAQ-12.html#ss12.2). If the squid you are
connected to does not have the URL in its cache, and another squid box
does, then your squid will fetch the URL from the other squid box. The
problem then is - how do you decide which users connect to which squid
box? Presumably you'd want users spread evenly over the squids to
maximise the chance that a URL has already been fetched into one of them.
(ICP is the Internet Cache Protocol, which squids use to
communicate with each other. You can find a brief definition at
http://squid.nlanr.net/Doc/FAQ/FAQ-12.html#ss12.2
12.2 What is the ICP protocol?
ICP is a protocol used for communication among squid caches.
The ICP protocol is defined in two Internet RFC's. RFC 2186
describes the protocol itself, while RFC 2187 describes the
application of ICP to hierarchical Web caching.
ICP is primarily used within a cache hierarchy to locate
specific objects in sibling caches. If a squid cache does
not have a requested document, it sends an ICP query to
its siblings, and the siblings respond with ICP replies
indicating a ``HIT'' or a ``MISS.'' The cache then uses
the replies to choose from which cache to resolve its
own MISS.
ICP also supports multiplexed transmission of multiple
object streams over a single TCP connection. ICP is
currently implemented on top of UDP. Current versions of
Squid also support ICP via multicast.
)
> From: Craig Sanders <cas@xxxxxxxxxxxxx>
> As with a web server, the point is to load balance the requests
> across multiple machines. That can be done with DNS round-robin
> or with an LVS box or with expensive hardware like an ACE Director.
> DNS round-robin is probably adequate unless you want to force
> all users to use the proxy by adding redirect rules for port
> 80 on your border router.
Or you could let a computer switch the users between
the squids...
Setting up an LVS Squid farm.
An LVS squid farm is set up like this...
1. add entries for squid and icp to /etc/services
optionally, on the director and realservers (= squid boxes), add
the following to /etc/services:
squid 3128/tcp #squid cache
icp 3130/udp #Internet Cache Protocol
this will allow substitution of the string "squid" for "3128" and
"icp" for "3130" in the LVS configure script and ipvsadm commands.
2. set up the hardware
____________________
| |
| httpd:80 | (on internet)
|___________________|
|
router
|
| LAN
------------------------------------------
| | | |
| | | |
____________ | | |
| | | | |
| proxies to | | | |
| VIP:squid | | | |
| | | | |
| user box | | | |
|____________| | | |
| | |
| | |
| | |
| | |
DIP eth0 RIP1 eth0 RIP2 eth0
VIP eth0:1 VIP lo:0 VIP lo:0
___________ ____________ ____________
| director | | realserver | | realserver |
| VS-DR | | | | |
| LVS | | squid | | squid |
| listens on| |listens on | |listens on |
| VIP:squid | | VIP:squid | | VIP:squid |
|___________| | | | |
| connect to | | connect to |
| outside:80 | | outside:80 |
| from RIP | | from RIP |
| | | |
| ICP via VIP| | ICP via VIP|
|____________| |____________|
The setup of the LVS to port 3128 is standard VS-DR and can be set up
using the configure script with the LVS serving port 3128 (squid) or
with ipvsadm.
Squid on the realservers must be able to connect to URLs in the outside
world (i.e. must have a route to the outside world). This route is via the
RIP, as the VIP is only used for communication within the LVS.
(If the LVS was serving telnet, nothing special is required on the
realserver as telnet is self contained; if the LVS is serving httpd,
then the realserver also needs html files on the disk; if the LVS is
serving squid then the realservers must also be able to connect to
outside URLs).
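A quick sanity check on a realserver might look like this (a sketch - the
URL is just an example, and output will vary with your network):
route -n                        # the default route should point at the LAN's router
ping -c 1 www.squid-cache.org   # the realserver itself must be able to reach outside hosts
Fetching an outside URL with wget from the realserver's command line is an
equally good test.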
In the normal LVS the client is on the internet and the replies come
from the VIP on the realservers. In the squid LVS the client(s) are on
the LAN and the replies come from the VIP on the realservers all of which
are on the LAN. (The information that the squids deliver comes from the
internet, but that isn't of concern to the LVS).
The director will connect a user to the next available squid realserver.
The director doesn't know whether that squid has the URL you want in its
cache; the squids will sort this out among themselves using ICP, and will
eventually present you with the URL data.
Nothing unusual so far. Here's an example of it working.
From: Craig Sanders <cas@xxxxxxxxxxxxx> (highly edited - Joe)
hi, i've just set up my first LVS - a squid proxy farm, using 3 machines
(kernel 2.2.14 with the ipvs-0.9.7 patch). realservers have the VIP on dummy0,
and i've enabled the hidden arp feature for dummy0 with the following script
fragment:
ifconfig dummy0 x.x.x.8 netmask 255.255.255.0 broadcast x.x.x.255
echo 1 > /proc/sys/net/ipv4/conf/all/hidden
echo 1 > /proc/sys/net/ipv4/conf/dummy0/hidden
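(You can confirm that the hidden flag took effect on a realserver with
cat /proc/sys/net/ipv4/conf/all/hidden
cat /proc/sys/net/ipv4/conf/dummy0/hidden
both should print 1. These proc entries only exist on kernels carrying the
LVS/hidden arp patch - Joe)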
IP addresses are as follows:
VIP - x.x.x.8
proxy1 - x.x.x.215 (director & real server)
proxy2 - x.x.x.216 (realserver)
proxy3 - x.x.x.217 (realserver)
ipvsadm rules (running VS-DR)(rules from memory):
# HTTP requests
ipvsadm -A -t x.x.x.8:squid -s wlc
ipvsadm -a -t x.x.x.8:squid -r x.x.x.215
ipvsadm -a -t x.x.x.8:squid -r x.x.x.216
ipvsadm -a -t x.x.x.8:squid -r x.x.x.217
# ICP requests
ipvsadm -A -u x.x.x.8:icp -s wlc
ipvsadm -a -u x.x.x.8:icp -r x.x.x.215
ipvsadm -a -u x.x.x.8:icp -r x.x.x.216
ipvsadm -a -u x.x.x.8:icp -r x.x.x.217
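(You can check that the rules took effect with "ipvsadm -L -n" on the
director; it should list both the tcp x.x.x.8:3128 and udp x.x.x.8:3130
services with all three realservers under each. The exact output format
varies with the ipvsadm version - Joe)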
(Question - Is the icp entry needed here for the LVS to work with the above
client, or is it only for the squid on the WS later?)
this worked. i did a simple test of setting $http_proxy to point
to "http://x.x.x.8:squid/" and then ran wget to mirror our main web site.
all requests were smoothly load balanced over all 3 real-servers.
all 3 proxies are configured to use each other as siblings (using their
real IPs, not the VIP)
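(In squid.conf terms the sibling arrangement looks something like this -
a hypothetical fragment for proxy1 (x.x.x.215); 3128 and 3130 are the
default http and icp ports:
cache_peer x.x.x.216 sibling 3128 3130
cache_peer x.x.x.217 sibling 3128 3130
- Joe)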
> From: Michael Sparks <zathras@xxxxxxxxxxxxxxxxxx>
>
> The real servers need the following line in their squid config unless
> you're using NAT:
>
> udp_incoming_address x.x.x.8
> ie
> udp_incoming_address VIP
> Or else client caches that talk ICP will get confused, and run really
> slowly.
>
i then tried configuring the squid on my workstation (which i use so
that i can filter banner ads with my redirector script) to use the VIP
(or, as a control, the individual realservers) as a parent.
(Here's the new setup; the change is the addition of a squid in the
user's box)
____________________
| |
| httpd:80 | (on internet)
|___________________|
|
router
|
| LAN
------------------------------------------
| | | |
____________ | | |
|local squid | | | |
| proxies to | | | |
| VIP:squid | | | |
| | | | | |
| proxies to | | | |
|local squid | | | |
| | | | |
| user box | | | |
|____________| | | |
| | |
| | |
DIP eth0 RIP1 eth0 RIP2 eth0
VIP eth0:1 VIP lo:0 VIP lo:0
___________ ____________ ____________
| director | | realserver | | realserver |
| VS-DR | | | | |
| LVS | | squid | | squid |
| listens on| |listens on | |listens on |
| VIP:squid | | VIP:squid | | VIP:squid |
|___________| | | | |
| connect to | | connect to |
| outside:80 | | outside:80 |
| from RIP | | from RIP |
| | | |
| ICP via VIP| | ICP via VIP|
|____________| |____________|
When i configured squid on my WS to use the 3 RIP addresses (.215,
.216, and .217) as parents, everything worked perfectly. this at least
established that the realservers were all talking to my WS when
used without the director.
When i configured the squid on the WS to use the VIP (and hence the
LVS) as the parent, some requests would just fail. i'd get a message
from the squid on my WS saying "unable to forward request to a parent".
it seemed like about one out of every 3 requests failed. clicking reload
or shift-reload in netscape didn't help unless i waited a while.
the squid log on my WS looked like this:
first failure:
948786582.237 4 203.16.167.2 TCP_MISS/503 1232 GET *URL* -
multiple reloads and shift-reloads in netscape (this was probably due to
squid briefly caching the TCP_MISS/503 result):
S/503 1232 GET *URL* - NONE/- -
948786589.663 11 *WS* TCP_MISS/503 1232 GET *URL* - NONE/- -
948786590.509 17 *WS* TCP_MISS/503 1232 GET *URL* - NONE/- -
.
.
948786601.867 15 *WS* TCP_MISS/503 1232 GET *URL* - NONE/- -
948786602.768 17 *WS* TCP_MISS/503 1232 GET *URL* - NONE/- -
.
.
finally, the page is fetched:
948786629.368 2228 *WS* TCP_MISS/200 6055 GET *URL* -
TIMEOUT_DEFAULT_PARENT/x.x.x.8 text/html
(for editing, *URL* replaces the real URL and *WS* is the IP of my
workstation)
> Take a look in your cache.log rather than your access.log - if you see
> info along the lines of "unexpected ICP reply from IP x.x.x.215" then
> that's possibly the root cause of your problem.
> My guess is this is just a squid thing rather than anything else - UDP
> based services balanced using VS-DR or VS-TUN like ICP need to be bound to
> the virtual service address or else everything goes a bit screwy. eg with
> bind 8 you need a line like
>
> listen-on { VIP; }
>
> The reason for this is down to the fact that UDP's a connectionless
> protocol, so unless the server was bright enough to notice which IP it
> received the packet on, it'll just choose a default local IP, which
> probably won't be the one you want.
>
> >
> > 948786629.368 2228 203.16.167.2 TCP_MISS/200 6055 GET *URL* -
> TIMEOUT_DEFAULT_PARENT/x.x.x.8 text/html
>
> This is the sort of symptom I'd expect for the above problem.
> > .. or it might be something to do with squid's CACHE_DIGEST
>
> See above. Digests are transferred using normal HTTP, so that's not a
> problem here. There's some wider issues in squid clustering which we're
> working to address as part of our work on the JWCS, which we can discuss
> if you like. Essentially it boils down to this: the accuracy of ICP &
> digests is normal accuracy/N for N servers in a Layer 4 balanced
> situation.
>
> For an excerpt of a detailed discussion I had with someone on this,
> please feel free to take a look at
> http://epsilon3.wwwcache.ja.net/~zathras/ICP-service.txt
> (Won't be there permanently, but I don't want to clutter up the list)
(included at the bottom of this section - Joe)
one question: would i be better off using an old pentium box as a
director? (Joe - rather than the 600MHz, 1G memory screamer he is using now)
> Probably - given squid can be a bit of a beast under high load, and fail
> just when you don't want it to, and the LVS code seems to be as stable as
> a very stable thing indeed, putting the director on a machine that's
> unlikely to fail is a Very Good Thing (tm).
it would also allow me to have all the real-servers identically
configured. IMO, that's a Good Thing. i like consistency - makes it
easier to isolate and track down problems....and it's easier to scale up
as required.
i've found that squid is as stable as the underlying operating system.
it's one of the reasons i always use debian for squid boxes rather
than RH or freebsd. i've reformatted several RH or FreeBSD squid boxes
which were quite flaky and unreliable - installed debian and they run
perfectly stable on exactly the same hardware. admittedly, i'm not as
good at tuning freebsd machines as i am at tuning linux boxes, and i've
heard lots of people say that freebsd does make a good squid box if you
tune it right.
> Take a look in your cache.log rather than your access.log - if you see
> info along the lines of "unexpected ICP reply from IP x.x.x.215" then
> that's possibly the root cause of your problem.
yep...
2000/01/25 18:46:25| WARNING: Ignored 1 replies from non-peer x.x.x.217
2000/01/25 18:48:32| Failed to select source for
'http://www.luv.asn.au/overheads.html'
2000/01/25 18:48:32| always_direct = -1
2000/01/25 18:48:32| never_direct = 1
2000/01/25 18:48:32| timedout = 1
2000/01/25 18:48:39| Detected DEAD Parent: x.x.x.8/squid/icp
here's where i encountered the first problem. some requests would
just fail. i'd get a message from the squid on my WS saying "unable
to forward request to a parent".
Conclusion:
the specific problem that i had was that another squid cache
(completely unrelated to the LVS squids) needed to talk ICP to the LVS
squid.
this wasn't working because the individual realservers were sending ICP
responses from their real IP addresses rather than from the VIP. squid
ignores ICP responses sent from unknown hosts.
the solution is to configure squid to use the VIP for ICP. e.g. in
/etc/squid.conf
udp_incoming_address VIP
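For the workstation squid that uses the LVS as its parent, the matching
peer line would be something like this (a sketch - x.x.x.8 is the VIP from
the example above, 3128 and 3130 are the default ports):
cache_peer x.x.x.8 parent 3128 3130
With udp_incoming_address set to the VIP on the realservers, the ICP replies
now come from x.x.x.8 and the workstation squid accepts them.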
Squid Appendix
ICP-service.txt:
Without going into implementation details (which are likely to change
anyway), the options we have regarding ICP are the following - especially
since we don't use it at all, as it's a very bad protocol:
1) Load balance ICP the same way we're load balancing HTTP.
2) Return HIT for every request.
3) Collate cache digests, do a lookup for each incoming request, and
return HIT if it's a cache digest hit.
All three will be tested in the coming months. Before the pros/cons of
each, it's useful to review what a normal ICP hit or Cache Digest hit
means, and then correlate this to a clustered scenario, to show whether
there will be any biasing or not. (All 3 are useful in different ways)
Normal ICP, site-cache -> parent (which is part of a cluster):
the parent responds whether it has the object without querying its peers.
Since popular content is spread across N machines, the hit rate the site
cache sees is closer to (potential hit rate)/N.
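(For example, with the N=3 farm above: a URL that is cached somewhere in
the cluster sits on any particular realserver only about a third of the
time, so an ICP query that the LVS hands to one realserver comes back HIT
only about 1/3 as often as a query against a single cache holding all the
content - Joe)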
Normal Cache Digest: precisely the same effect - except the digest lookup
takes the place of the ICP request-response pair.
Clear advantage to Digests here, eliminating the timeouts inherent in the
ICP protocol.
1) Load balance ICP the same way we're load balancing HTTP.
LVS balanced Cache Digest - like normal cache digest, the downstream cache
gets a digest representative of just one machine. Effective change for
users? None.
LVS balanced ICP - like normal ICP, the downstream cache gets a response
based on just one cache's content.
Effect of LVS balancing on normal peering arrangements - none.
eg peer X is one machine
peer Y is an LVS service
When we change the peering from X to Y, there is no change as far as the
client is concerned. For the cluster we see better loading. (Assume X can
ask all realservers in Y for pages too - since it can peer with them directly)
Result - no bias, except for network transit times.
*Protocol* Problem: Neither digests nor ICP takes local peers into account.
Hence a digest from any machine only represents the disk capacity of that
machine - in real terms this is around 3 million objects, falling very
short of the total capacity.
The same goes for ICP, except ICP has the secondary problem of
susceptibility to packet loss.
This requires no code modification and is being tested - and is definitely
the 'zero change from current' option. (Including bringing old problems
with it)
2) Return HIT for every request that would return MISS.
Sounds bad initially, however our service is prepared to service all
misses, has a very high capacity, and has a significantly lower ICP hit
rate than reality would indicate. Secondly, most clients already send
whatever traffic is likely to be cacheable our way (eg they don't send
POSTs, https, hotmail, stuff with question marks in, or searches), and
what they're really using ICP for is to find out which parent is more
likely to have the object.
Since ULCC has lower capacity than MCC, this skews the traffic towards
where the capacity is. So on the whole, the system works OK. Clearly this
is bad for people who are just in a sibling relationship, but if that's
the case, they should be using cache digests with no-query.
This also means that the ICP service doesn't need to run on a box with
caching disk - eg it could run on the load balancer, limiting the change
to one machine. This is simple to write and is being tested.
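(The "cache digests with no-query" arrangement mentioned above maps onto
a peer line roughly like this - a hypothetical squid.conf fragment, assuming
the peer squid has cache digests compiled in; peer.cache.example is a
placeholder:
cache_peer peer.cache.example sibling 3128 3130 no-query
no-query suppresses the ICP queries, so object location relies on the digest
alone - Joe)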
3) Collate cache digests, do a lookup for each incoming request, and
return HIT if it's a cache digest hit for any machine in the cluster.
Sounds good on the surface of it - it is simple to write, and is being
tested. Digest accuracy is a problem however, so again, it's not
perfect by a long shot. Couple a 5% failure ratio with a bad protocol
and it's not great. It's probably the best option for an ICP service. But on
the whole people shouldn't be using ICP if they can help it. Essentially,
as noted before, ICP is the real problem here.
Again, this also means that the ICP service doesn't need to run on a box
with caching disk - eg it could run on the load balancer, limiting the
change to one machine.
All three are testable individually without affecting service (other than
in the way intended) due to the fact that we can use the LVS to front-end
different versions, and transparently change from one to another.
Since all 3 have their relative merits and disadvantages, all 3 are being
tested.
#---squid-LVS-HOWTO