I belong to the North Carolina sysadmins assoc, which puts on
monthly talks on topics of interest to members.
Last night's talk was by Herbie Pearthree, who heads the team that
sets up and tears down IBM's websites for transitory sporting events
(the Olympics, the PGA, Wimbledon...). They are responsible for the
hardware, software and monitoring of the websites (for performance and
security). The content is provided for them.
The webfarms consist of replicated fileservers, webservers (AIX boxes),
2 layers of caching, net dispatchers and routers. (The net dispatchers
were copied by Wensong to make VS-DR).
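(For LVS readers, the net dispatcher plays the director role. A minimal
LVS-DR sketch with ipvsadm, using made-up addresses rather than anything
IBM runs, would look like:

    # one virtual HTTP service, two realservers reached by direct routing (-g)
    ipvsadm -A -t 10.0.0.100:80 -s rr
    ipvsadm -a -t 10.0.0.100:80 -r 10.0.0.11:80 -g
    ipvsadm -a -t 10.0.0.100:80 -r 10.0.0.12:80 -g
)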
For any event, several identical webfarms are set up at different
NAPs (network access points, the large sites where ISPs feed into the
backbone; these are in New York and San Francisco, for instance).
The fileservers and webservers are all loaded with content and configured
(and reconfigured on the fly) remotely. The webfarm sites are called
"dark sites" because no one is there. Replication of file systems and
network configuration is all automatic, and new sites are put up and pulled
down in a matter of a few hours. After an event, a small machine
is left to serve the content for the few people who still come to
the site.
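(To give an idea of the remote loading - the tool, host names and paths
here are my guesses, not what IBM actually uses:

    # push new content from a staging box to a dark site's fileserver
    rsync -az --delete /stage/htdocs/ fileserver.darksite-ny:/export/htdocs/
)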
The independent (but identical in content) webfarms present the same 8 IPs to
the outside world, and clients are directed to the nearest webfarm
by selective advertising of routes at the NAPs. Coarse regulation of load
(to prevent overload) is done by deleting one of the IPs at, say, the San
Francisco site, reducing that site's load by 12.5% (one IP of eight);
requests to that (deleted) IP are then forwarded to the New York site.
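(I imagine the load shedding looks something like this in the open-source
zebra/vtysh style of router configuration - IBM's routers will be something
else, and the AS number and address here are invented:

    # withdraw one of the eight advertised service addresses at this site,
    # so traffic to it drains to the other NAP
    vtysh -c 'configure terminal' \
          -c 'router bgp 64512' \
          -c 'no network 192.0.2.5/32'
)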
As much web content as possible (95%) is flat files. There are no CGIs
(except for things like guest books), as they are slow and a security hazard.
What appears to be dynamic content (e.g. scoreboards, inventory lists
for shopping carts) is actually flat files regenerated whenever anything in
the underlying database(s) changes.
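(A toy version of the regenerate-on-change idea - the generator script,
hosts and paths are hypothetical:

    #!/bin/sh
    # rebuild the scoreboard page as a flat file whenever the scores
    # database changes, then push it out to the webservers
    make_scoreboard /data/scores.db > /tmp/scoreboard.html &&
    for ws in web1 web2 web3; do
        rsync -a /tmp/scoreboard.html $ws:/www/htdocs/scoreboard.html
    done
)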
The caches are diskless, storing everything in RAM. Wimbledon has about
250MB of content and the caches have 200MB of RAM. All content has minimal
expiry times (scoreboards 6 minutes, photos 1 day) to keep the caches
flushed with new material.
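(The short expiry times would show up in the HTTP headers; you could check
from outside with something like the following - the values shown are only
my illustration, not measured from their site:

    # ask for the headers only; a short Expires keeps old copies
    # from lingering in the caches
    lynx -head -dump http://scoreboard.example.com/current.html
    #   HTTP/1.0 200 OK
    #   Expires: (a time 6 minutes in the future for a scoreboard page)
)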
Each TCP/IP layer of the webfarm is individually tweaked (TTL, MTU) to
increase speed.
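(On Linux this kind of per-layer tuning is the usual sysctl/ifconfig
business; the values below are only placeholders, and I believe the AIX
equivalent knobs are set with the "no" command:

    sysctl -w net.ipv4.ip_default_ttl=64    # TTL on locally generated packets
    ifconfig eth0 mtu 1500                  # MTU on this hop
)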
In front of the caches are the net dispatchers. The most interesting point
of the talk for me was that there is no default route on the net
dispatchers (for an LVS, the director would have no default route).
A default route is not needed in a normal LVS, as the director
never replies to anyone. Not having a default route
stops people from port scanning you - you don't appear to exist.
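(On a director set up this way the routing table holds only the directly
connected networks; the addresses below are invented:

    route -n
    #   Destination    Gateway    Genmask          Flags  Iface
    #   10.0.0.0       0.0.0.0    255.255.255.0    U      eth0
    #   192.168.1.0    0.0.0.0    255.255.255.0    U      eth1
    # note: no 0.0.0.0 (default) entry
)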
While 8 IPs are presented to the outside world, a typical webfarm has 40,000
internal IPs. There are LANs for disk access, LANs for administration, LANs
for monitoring... If anyone breaks in, they are stuck in their own layer.
Everything is duplicated, even the NICs, which are channel-bonded
through separate switches in case one NIC dies. (With AIX, communication
continues if one channel-bonded NIC dies; with Linux, the network has to
re-ARP to start again. It takes a minute or two, but it is automatic.)
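(A rough sketch of channel bonding on the Linux side, in active-backup mode
since the two NICs go to separate switches - device names, addresses and
module options are only an example:

    modprobe bonding mode=1 miimon=100           # mode 1 = active-backup
    ifconfig bond0 10.0.1.5 netmask 255.255.255.0 up
    ifenslave bond0 eth0 eth1                    # enslave both NICs
)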
Load balancing in the routers outside the net dispatchers is done using BGP,
with the routers made into an AS.
Open-source tools are used: perl, spong and RRDtool, the latter two being
network monitoring tools. In the spirit of independent duplication, IBM's own
network monitoring tools are also used.
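(RRDtool keeps fixed-size, round-robin databases of time series; typical
usage is along these lines, with made-up names and values:

    # sample a counter every 5 minutes, keep two days of averages
    rrdtool create hits.rrd --step 300 \
            DS:hits:COUNTER:600:0:U \
            RRA:AVERAGE:0.5:1:576
    rrdtool update hits.rrd N:123456
)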
During the Olympics, they saw every attack they knew of in the firewall logs.
(They said nothing got through.) Apparently they were hammered with attacks
through the whole Olympics.
Joe
--
Joseph Mack PhD, Senior Systems Engineer, Lockheed Martin
contractor to the National Environmental Supercomputer Center,
mailto:mack.joseph@xxxxxxx ph# 919-541-0007, RTP, NC, USA