Summary of talk by Herbie Pearthree of IBM

To: Joseph Mack <mack.joseph@xxxxxxxxxxxxxxx>, mack@xxxxxxxxxxx
Subject: Summary of talk by Herbie Pearthree of IBM
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: Joseph Mack <mack.joseph@xxxxxxx>
Date: Tue, 10 Oct 2000 13:44:54 -0400
I belong to the North Carolina sysadmins association, which puts on
monthly talks on topics of interest to members.

Last night's talk was by Herbie Pearthree, who heads the team of people
who set up and tear down IBM's websites for transitory sporting events
(like the Olympics, PGA, Wimbledon...). They are responsible for the
hardware, software and monitoring of the websites (for performance and
security). The content is provided for them.

The webfarms consist of replicated fileservers, webservers (AIX boxes),
2 layers of caching, net dispatchers and routers. (The net dispatchers
were copied by Wensong to make VS-DR.)

For any event, several identical webfarms are set up at different
NAPs (network access points: large sites where ISPs feed into the
backbone, for instance at New York and San Francisco).
The fileservers and webservers are all loaded with content and configured
(and reconfigured on the fly) remotely. The webfarm sites are called
"dark sites" because no one is there. Replication of file systems and
network configuration is all automatic, and new sites are put up and pulled
down in a few hours. After an event, a small machine is left to serve the
content to the small number of people who still come to the site.

The independent (but identical content) webfarms present the same 8 IPs to
the outside world, and clients are directed to the nearest webfarm
by selective advertising of routes at the NAP. Coarse regulation of load
(to prevent overload) is done by deleting one of the IPs at, say, the
San Francisco site, which reduces its load by 12.5% (one IP in eight).
Requests to that (deleted) IP are then forwarded to the New York site.
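
To make the 12.5% figure concrete, here is a minimal sketch of the
bookkeeping. It is only a model of which farm advertises which IP, not
IBM's tooling and not real BGP. The addresses come from the documentation
range, the names are made up, and it assumes clients spread evenly over
the 8 IPs.

```python
from fractions import Fraction

# Hypothetical model: each farm advertises some of the 8 shared service IPs.
# Addresses are from the documentation range, not the real ones.
SERVICE_IPS = [f"192.0.2.{n}" for n in range(1, 9)]


def withdraw(advertised, farm, ip):
    """Return a new table in which `farm` no longer advertises `ip`."""
    updated = {name: set(ips) for name, ips in advertised.items()}
    updated[farm].discard(ip)
    return updated


def local_share(advertised, farm):
    """Fraction of nearby traffic the farm still takes, assuming
    clients spread evenly over the 8 IPs."""
    return Fraction(len(advertised[farm]), len(SERVICE_IPS))


def serving_farm(advertised, preferred, ip):
    """Farm that answers `ip`: the nearest one if it still advertises
    the IP, otherwise another farm that does (None if nobody does)."""
    if ip in advertised[preferred]:
        return preferred
    for name in sorted(advertised):
        if ip in advertised[name]:
            return name
    return None


advertised = {
    "san_francisco": set(SERVICE_IPS),
    "new_york": set(SERVICE_IPS),
}
after = withdraw(advertised, "san_francisco", SERVICE_IPS[0])
shed = 1 - local_share(after, "san_francisco")
print(float(shed))  # 0.125
print(serving_farm(after, "san_francisco", SERVICE_IPS[0]))  # new_york
```

Each withdrawn IP moves load in steps of one eighth, which is why this is
only a coarse control.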

As much web content as possible (95%) is flat files. There are no CGIs
(except for things like guest books), as they are slow and a security hazard.
What appears to be dynamic content (e.g. scoreboards, inventory lists
for shopping carts) is flat files, regenerated whenever anything in the
underlying database(s) changes.
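
The regenerate-on-change idea can be sketched as below. This is my own
illustration, not what IBM runs: the dict standing in for the database,
the function names and the file name are all hypothetical. The page is
rewritten on every update, so the webserver only ever serves a static file.

```python
import html
import os
import tempfile


def render_scoreboard(scores):
    """Render a dict of player -> score as a small static HTML page."""
    rows = "\n".join(
        f"<tr><td>{html.escape(player)}</td><td>{score}</td></tr>"
        for player, score in sorted(scores.items())
    )
    return f"<html><body><table>\n{rows}\n</table></body></html>\n"


def publish(path, content):
    """Write to a temp file, then rename it into place, so a reader
    never sees a half-written page."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(content)
    os.replace(tmp, path)


def update_score(scores, player, score, path):
    """Change the stand-in 'database' and regenerate the flat file."""
    scores[player] = score
    publish(path, render_scoreboard(scores))


# Hypothetical document root in a scratch directory.
docroot = tempfile.mkdtemp()
page = os.path.join(docroot, "scoreboard.html")
scores = {}
update_score(scores, "Player A", 3, page)
update_score(scores, "Player B", 5, page)
```

The cost of building the page is paid once per change to the data, not
once per request.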

The caches are diskless, storing everything in RAM. Wimbledon has about
250M of content and the caches have 200M of RAM. All content has minimal
expiry times (scoreboards 6 minutes, photos 1 day), to keep the caches
flushed with new material.
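
A toy version of a RAM-only cache with per-type expiry looks like this.
The two expiry times are the ones quoted in the talk; the class, the
content-type names and the URLs are invented for illustration, and a real
cache would also have to evict entries when RAM runs short.

```python
import time

# Expiry times from the talk; the content-type names are made up.
EXPIRY_SECONDS = {"scoreboard": 6 * 60, "photo": 24 * 60 * 60}


class RamCache:
    """In-memory cache that drops an entry once its expiry time passes."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # url -> (expires_at, body)

    def put(self, url, kind, body):
        self._entries[url] = (self._clock() + EXPIRY_SECONDS[kind], body)

    def get(self, url):
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            del self._entries[url]  # stale: force a refetch from the webserver
            return None
        return body


# A fake clock so the example does not have to wait 6 minutes.
now = [0.0]
cache = RamCache(clock=lambda: now[0])
cache.put("/scores.html", "scoreboard", b"old scores")
cache.put("/court1.jpg", "photo", b"jpeg bytes")
now[0] = 7 * 60  # seven minutes later
print(cache.get("/scores.html"))  # None
print(cache.get("/court1.jpg"))   # b'jpeg bytes'
```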

Each TCP/IP layer of the webfarm is individually tweaked (TTL, MTU) to
increase speed.

In front of the caches are the net dispatchers. The most interesting point
of the talk, for me, was that there is no default route on the net
dispatchers (for LVS, the director would have no default route).
A default route is not needed in a normal LVS, as the director
never replies to anyone. Not having a default route
stops people from port scanning you: you don't appear to exist.
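
The effect of dropping the default route can be seen in a toy
longest-prefix lookup. This only models a routing table and is not how
you would configure a director. The networks, interface name and gateway
below are hypothetical.

```python
import ipaddress


def lookup(routes, destination):
    """Longest-prefix match. Returns the next hop, or None if no route
    matches, in which case the host cannot send a packet there."""
    dest = ipaddress.ip_address(destination)
    best = None
    for network, next_hop in routes:
        net = ipaddress.ip_network(network)
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Hypothetical director table: only the internal LAN, no 0.0.0.0/0 entry.
director_routes = [("10.1.0.0/16", "eth1")]
# The same table with a default route added, for comparison.
with_default = director_routes + [("0.0.0.0/0", "192.0.2.254")]

scanner = "203.0.113.9"  # some outside host probing the director
print(lookup(director_routes, scanner))     # None
print(lookup(with_default, scanner))        # 192.0.2.254
print(lookup(director_routes, "10.1.2.3"))  # eth1
```

With no matching route the director has nowhere to send a reply to the
scanner, while traffic to the internal LAN is unaffected.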

While 8 IPs are presented to the outside world, a typical webfarm has 40,000
internal IPs. There are LANs for disk access, LANs for administration, LANs
for monitoring... If anyone breaks in, they are stuck in their layer.
Everything is duplicated, even the NICs, which are channel-bonded
through separate switches in case one NIC dies. (With AIX, communication
continues if one channel-bonded NIC dies. With Linux, the network has to
re-arp to start again. It takes a minute or two, but is automatic.)

Load balancing in the routers outside the net dispatchers is done using BGP
and making the routers an AS.

Open source tools are used: perl, spong and RRDtool, the latter two being
network monitoring tools. In the spirit of independent duplication, IBM's own
network monitoring tools are also used.

During the Olympics, they saw every attack they knew of in the firewall logs.
(They said nothing got through.) Apparently they were hammered with attacks
through the whole Olympics.

Joe
--
Joseph Mack PhD, Senior Systems Engineer, Lockheed Martin
contractor to the National Environmental Supercomputer Center, 
mailto:mack.joseph@xxxxxxx ph# 919-541-0007, RTP, NC, USA

