Re: Busted Cluster

To:	"LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject:	Re: Busted Cluster
From:	Rob <ipvsuser@xxxxxxxxxxxxxxxx>
Date:	Sun, 13 Mar 2005 04:40:15 -0800

Since no one else is on right now, I'll offer this from my own experience.

I had a high number of inactive connections when apache was set up not to use keepalive at all. After activating keepalive in apache (LVS was already persistent), the number of inactive connections went way down.

So in my case at least, it was connections that were set up, used for a single GET for a gif, button, jpeg, js script, or other page component, and then closed by the server, only for another to be opened for the next gif, etc.

You might be able to use something like multilog to watch a bunch of the logs at the same time to get an idea of whether the traffic looks like real people (get page 1, get page 1 images, get page 2, get page 2 images) or random hammering from a DoS attack.

I wrote a small shell script that pulled the recent log entries, counted the hits per IP address for certain requests, and then created an iptables rule on the director (or some machine in front of the director) to tarpit requests from that IP. This worked in my situation because we knew that certain URLs were only hit a small number of times during a legit use session (a login page shouldn't be hit 957 times in an hour by the same external IP). This could help stem the tide of requests if you are actually encountering a (D)DoS. I ran it every 12 minutes or so. If you are getting DDoS'd, the tarpit function of iptables (http://www.securityfocus.com/infocus/1723) or the standalone tarpit can be a great help. Also, Felix and his company seem to have helped some large companies deal with high-traffic DDoS attacks: http://www.fefe.de/
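
The original script is lost, but the idea can be sketched roughly like this. It is a minimal sketch, not what I actually ran: the function name, the "/login" pattern and the threshold are made up for illustration, the client IP is assumed to be in field 2 of each log line, and it only prints the iptables commands instead of executing them. The TARPIT target is not in stock iptables (it came from patch-o-matic), so check that your director has it before piping the output to sh.

```shell
#!/bin/sh
# Sketch only: print (not run) TARPIT rules for IPs that hit one URL too often.
# Assumptions: client IP is field 2 of each log line; the URL pattern and
# threshold are illustrative values, not taken from the original script.

tarpit_rules() {
    pattern=$1      # fixed string to look for, e.g. "/login"
    threshold=$2    # IPs with more hits than this get a rule
    grep -F -- "$pattern" |
        awk '{print $2}' |
        sort | uniq -c |
        awk -v max="$threshold" '$1 > max {
            print "iptables -A INPUT -s " $2 " -p tcp --dport 80 -j TARPIT"
        }'
}

# Example (log path is a placeholder): check the last 5000 lines
# tail -n 5000 /path/to/access_log | tarpit_rules "/login" 50
```

Printing the rules first lets you eyeball the list for proxies or your own monitoring hosts before anything gets blocked.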

BTW, you might be interested in http://www.backhand.org/mod_log_spread/ for centralized and redundant logging. That way you can run different kinds of real-time analysis with no extra load on the webservers or the normal logging hosts, just by having an additional machine join/subscribe to the multicast spread group with the log data.

Rob

OK, I can't find my script, but this was the start of it. It is hardly a shell script, but someone may find it useful. Add a "grep blah" command just before the awk '{print $2}' if you want just certain requests or other filtering.


multidaychk.sh
#!/bin/sh
# look for multiday patterns
# $1 is how many days back to search
# $2 is how many high usage IPs to list

ls -1tr /usr/local/apache2/logs/access_log.200*0 | tail -n "${1}" | xargs -n 1 cat | awk '{print $2}' | sort | uniq -c | sort -nr | head -n "${2}"


byhrchk.sh
#!/bin/sh
# looks for IPs hitting during a certain hr of the day
# $1 is how many days back to search
# $2 is how many high usage IPs to list
# $3 is which hour of the day

ls -1tr /usr/local/apache2/logs/access_log.200*0 | tail -n "${1}" | xargs -n 1 cat | grep -F "2005:${3}" | awk '{print $2}' | sort | uniq -c | sort -nr | head -n "${2}"


recentchk.sh
#!/bin/sh
# This just checks the latest X lines from the newest log file
# $1 is how many lines from the file
# $2 is how many high usage IPs to list

ls -1tr /usr/local/apache2/logs/access_log.200*0 | tail -n 1 | xargs -n 1 tail -n "${1}" | awk '{print $2}' | sort | uniq -c | sort -nr | head -n "${2}"
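
As an example of the "grep blah" filtering mentioned above, here is a rough sketch of the same counting pipeline with the filter step put in before the awk. The function name and its arguments are made up for illustration, and it still assumes the client IP is field 2. It reads log lines on stdin, so it can sit on the end of any of the ls/tail/xargs front ends.

```shell
#!/bin/sh
# Sketch: count hits per IP, restricted to lines containing a fixed string.
# Assumes the client IP is field 2 of each log line.

top_ips() {
    filter=$1   # fixed string to keep, e.g. "login"
    count=$2    # how many high-usage IPs to list
    grep -F -- "$filter" |
        awk '{print $2}' |
        sort | uniq -c | sort -nr |
        head -n "$count"
}

# Example (log path is a placeholder):
# tail -n 5000 /path/to/access_log | top_ips "login" 10
```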


HTH

nigel@xxxxxxxxxxx wrote:

Hi,

      Now the bad news. This weekend the web service we run came under 
increased load --- about an extra 10,000,000 queries per day ---- and we now 
have a busted cluster. Here is what IPVS looks like:

IP Virtual Server version 1.0.10 (size=65536)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  66.98.x.y:80 rr
  -> 66.98.x.y:80                 Tunnel  1      37         337
  -> 67.15.x.y:80                 Tunnel  1      14         382
  -> 66.98.x.y:80                 Tunnel  1      6          131
  -> 207.44.x.y:80                Tunnel  1      21         325
  -> 66.98.x.y:80                 Tunnel  1      57         422
  -> 207.44.x.y:80                Tunnel  1      12         354
  -> 69.57.x.y:80                 Tunnel  1      33         355
  -> 67.15.x.y:80                 Tunnel  1      71         274
  -> 67.15.x.y:80                 Tunnel  1      12         378
  -> 207.44.x.y:80                Tunnel  1      5          345
  -> 66.98.x.y:80                 Tunnel  1      59         301
  -> 67.15.x.y:80                 Tunnel  1      2          347
  -> 67.15.x.y:80                 Tunnel  1      19         375
  -> 69.57.x.y:80                 Tunnel  1      10         132
  -> 69.57.x.y:80                 Tunnel  1      3          128
  -> 67.15.x.y:80                 Tunnel  1      15         361
  -> 69.57.x.y:80                 Tunnel  1      8          128
  -> 67.15.x.y:80                 Tunnel  1      229        303
  -> 67.15.x.y:80                 Tunnel  1      16         372
  -> 67.15.x.y:80                 Tunnel  1      125        317
  -> 67.15.x.y:80                 Tunnel  1      12         367
  -> 207.44.x.y:80                Tunnel  1      13         333
  -> 207.44.x.y:80                Tunnel  0      144        5
  -> 66.98.x.y:80                 Tunnel  1      10         404
  -> 207.44.x.y:80                Tunnel  0      0          0
  -> 207.44.x.y:80                Tunnel  1      132        277

At this point the service works but is too slow. But in the next 60 seconds
the InActConn count grows to over 2000 per real server and the whole thing
locks up.

* What precisely do the InActConn figures show?

Is this symptomatic of simply an overloaded cluster, or could it be a DoS
problem?

Any insights or similar experiences would be much appreciated.

Kind regards,


Nigel
