Re: [lvs-users] LVS servers slowly draining traffic causing outages

To:	"LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject:	Re: [lvs-users] LVS servers slowly draining traffic causing outages
From:	LDB <thesource@xxxxxxxxxxx>
Date:	Thu, 06 Mar 2008 07:24:13 -0500

Any help would be great ... I still cannot figure out why LVS just
arbitrarily drains the web traffic to zero and then a restart
fixes the problem.

LDB

LDB wrote:
> I hope the community can help me ...
> 
> My problem is that, whichever server holds the resources, over a period
> of 6 to 9 hours the web traffic gradually drains to zero. I am monitoring
> it using the command:
> 
>    ipvsadm -L --rate
> 
> I do not get any errors in /var/log/ha-debug or /var/log/ha-log. Something is
> wrong with the configuration, but I cannot determine what is wrong with it.
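Since the drain takes 6 to 9 hours, it may help to record the rate output periodically instead of watching it by hand. Below is a minimal sketch, assuming the usual column layout of `ipvsadm -L -n --rate` (Prot, LocalAddress:Port, CPS, ...). The sample text is illustrative only and is not output from the affected directors; the function names are hypothetical.

```python
# Illustrative sample only: the layout mimics `ipvsadm -L -n --rate`,
# the numbers are made up and do not come from lvs1 or lvs2.
SAMPLE = """\
Prot LocalAddress:Port                 CPS    InPPS   OutPPS    InBPS   OutBPS
  -> RemoteAddress:Port
TCP  192.168.10.70:80                   12       85        0     9120        0
  -> 192.168.10.71:80                    6       43        0     4600        0
  -> 192.168.10.72:80                    6       42        0     4520        0
TCP  192.168.10.80:80                    0        0        0        0        0
  -> 192.168.10.83:80                    0        0        0        0        0
"""


def parse_cps(text):
    """Map each virtual service (address:port) to its CPS column as a string."""
    cps = {}
    for line in text.splitlines():
        fields = line.split()
        # Virtual service rows start with the protocol; real server rows
        # start with "->" and header rows start with "Prot".
        if len(fields) >= 3 and fields[0] in ("TCP", "UDP"):
            cps[fields[1]] = fields[2]
    return cps


def drained_services(text):
    """Return the virtual services whose connection rate is exactly zero."""
    return sorted(vs for vs, value in parse_cps(text).items() if value == "0")
```

Feeding this the captured output of the real command once a minute (from cron, for example) and logging the result with a timestamp would show whether all virtual services drain together or one at a time.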
> 
> Also, failover works only sometimes, and then for some reason LVS has an
> affinity with LVS2 and not LVS1.
> 
> 
> Any help would be much appreciated ...
> 
> 
> 
> 
> 
> 
> 
> uname -a = Linux lvs1.example.org 2.4.21-27.0.2.EL.um.1smp #1 SMP Wed Jan 19 17:21:01 JST 2005 i686 i686 i386 GNU/Linux
> 
> uname -a = Linux lvs2.example.org 2.4.21-27.0.2.EL.um.1smp #1 SMP Wed Jan 19 17:21:01 JST 2005 i686 i686 i386 GNU/Linux
> 
> 
> 
> 
> 
> 
> 
> ls -al /etc/ha.d/resource.d:
> 
> drwxr-xr-x    2 root  root  2048 Feb 22 17:27 .
> drwxr-xr-x    8 root  root  2048 Mar  4 08:22 ..
> -rwxr-xr-x    1 root  root  8301 Jan 26  2004 apache
> -rwxr-xr-x    1 root  root  2094 Jan 26  2004 AudibleAlarm
> -rwxr-xr-x    1 root  root  5661 Jan 26  2004 db2
> -rwxr-xr-x    1 root  root  1379 Jan 26  2004 Delay
> -rwxr-xr-x    1 root  root  7347 Jan 26  2004 Filesystem
> -rwxr-xr-x    1 root  root  4131 Jan 26  2004 ICP
> -rwxr-xr-x    1 root  root 22077 Jan 26  2004 IPaddr
> -rwxr-xr-x    1 root  root  5258 Jan 26  2004 IPsrcaddr
> lrwxrwxrwx    1 root  root    20 Aug 22  2006 ldirectord -> /usr/sbin/ldirectord
> -rwxr-xr-x    1 root  root  5401 Jan 26  2004 LinuxSCSI
> -rwxr-xr-x    1 root  root  3512 Jan 26  2004 LVM
> -rwxr-xr-x    1 root  root  1841 Jan 26  2004 MailTo
> -rwxr-xr-x    1 root  root  6849 Jan 26  2004 Raid1
> -rwxr-xr-x    1 root  root  9002 Jan 26  2004 ServeRAID
> -rwxr-xr-x    1 root  root  9620 Jan 26  2004 WAS
> -rwxr-xr-x    1 root  root  1901 Jan 26  2004 WinPopup
> 
> 
> 
> 
> 
> 
> 
> 
> lvs1:/etc/ha.d/ha.cf:
> lvs2:/etc/ha.d/ha.cf:
> 
> #
> #       There are lots of options in this file.  All you have to have is a set
> #       of nodes listed {"node ...}
> #       and one of {serial, bcast, mcast, or ucast}
> #
> #       ATTENTION: As the configuration file is read line by line,
> #                  THE ORDER OF DIRECTIVE MATTERS!
> #
> #                  In particular, make sure that the timings and udpport
> #                  et al are set before the heartbeat media are defined!
> #                  All will be fine if you keep them ordered as in this
> #                  example.
> #
> #
> #       Note on logging:
> #       If any of debugfile, logfile and logfacility are defined then they
> #       will be used. If debugfile and/or logfile are not defined and
> #       logfacility is defined then the respective logging and debug
> #       messages will be logged to syslog. If logfacility is not defined
> #       then debugfile and logfile will be used to log messages. If
> #       logfacility is not defined and debugfile and/or logfile are not
> #       defined then defaults will be used for debugfile and logfile as
> #       required and messages will be sent there.
> #
> #       File to write debug messages to
> debugfile /var/log/ha-debug
> #
> #
> #       File to write other messages to
> #
> logfile /var/log/ha-log
> #
> #
> #       Facility to use for syslog()/logger
> #
> logfacility     local0
> #
> #
> #       A note on specifying "how long" times below...
> #
> #       The default time unit is seconds
> #               10 means ten seconds
> #
> #       You can also specify them in milliseconds
> #               1500ms means 1.5 seconds
> #
> #
> #       keepalive: how long between heartbeats?
> #
> keepalive 1
> #
> #       deadtime: how long-to-declare-host-dead?
> #
> deadtime 15
> #
> #       warntime: how long before issuing "late heartbeat" warning?
> #       See the FAQ for how to use warntime to tune deadtime.
> #
> warntime 10
> #
> #
> #       Very first dead time (initdead)
> #
> #       On some machines/OSes, etc. the network takes a while to come up
> #       and start working right after you've been rebooted.  As a result
> #       we have a separate dead time for when things first come up.
> #       It should be at least twice the normal dead time.
> #
> initdead 30
> #
> #
> #       nice_failback:  determines whether a resource will
> #       automatically fail back to its "primary" node, or remain
> #       on whatever node is serving it until that node fails.
> #
> #       The default is "off", which means that it WILL fail
> #       back to the node which is declared as primary in haresources
> #
> #       "on" means that resources only move to new nodes when
> #       the nodes they are served on die.  This is deemed as a
> #       "nice" behavior (unless you want to do active-active).
> #
> nice_failback off
> #
> #       hopfudge maximum hop count minus number of nodes in config
> #hopfudge 1
> #
> #
> #       Baud rate for serial ports...
> #       (must precede "serial" directives)
> #
> #baud   19200
> #
> #       serial  serialportname ...
> #serial /dev/ttyS0      # Linux
> #serial /dev/cuaa0      # FreeBSD
> #serial /dev/cua/a      # Solaris
> #
> #       What UDP port to use for communication?
> #               [used by bcast and ucast]
> #
> udpport 694
> #
> #       What interfaces to broadcast heartbeats over?
> #
> #bcast  eth0            # Linux
> #bcast  eth1 eth2       # Linux
> #bcast  le0             # Solaris
> #bcast  le1 le2         # Solaris
> #
> #       Set up a multicast heartbeat medium
> #       mcast [dev] [mcast group] [port] [ttl] [loop]
> #
> #       [dev]           device to send/rcv heartbeats on
> #       [mcast group]   multicast group to join (class D multicast address
> #                       224.0.0.0 - 239.255.255.255)
> #       [port]          udp port to sendto/rcvfrom (no reason to differ
> #                       from the port used for broadcast heartbeats)
> #       [ttl]           the ttl value for outbound heartbeats.  This affects
> #                       how far the multicast packet will propagate.  (1-255)
> #       [loop]          toggles loopback for outbound multicast heartbeats.
> #                       if enabled, an outbound packet will be looped back and
> #                       received by the interface it was sent on. (0 or 1)
> #                       This field should always be set to 0.
> #
> #
> #mcast eth0 225.0.0.1 694 1 0
> #
> #       Set up a unicast / udp heartbeat medium
> #       ucast [dev] [peer-ip-addr]
> #
> #       [dev]           device to send/rcv heartbeats on
> #       [peer-ip-addr]  IP address of peer to send packets to
> #
> #ucast eth0 192.168.1.2
> 
> # DNC Specific (to LVS2)
> #ucast eth1 192.168.10.16
> #ucast eth2 192.168.20.16  # 100MB Ethernet Port
> 
> bcast eth1 eth2
> 
> #
> #
> #       Watchdog is the watchdog timer.  If our own heart doesn't beat for
> #       a minute, then our machine will reboot.
> #
> #watchdog /dev/watchdog
> #
> #       "Legacy" STONITH support
> #       Using this directive assumes that there is one stonith
> #       device in the cluster.  Parameters to this device are
> #       read from a configuration file. The format of this line is:
> #
> #         stonith <stonith_type> <configfile>
> #
> #       NOTE: it is up to you to maintain this file on each node in the
> #       cluster!
> #
> #stonith baytech /etc/ha.d/conf/stonith.baytech
> #
> #       STONITH support
> #       You can configure multiple stonith devices using this directive.
> #       The format of the line is:
> #         stonith_host <hostfrom> <stonith_type> <params...>
> #         <hostfrom> is the machine the stonith device is attached
> #              to or * to mean it is accessible from any host.
> #         <stonith_type> is the type of stonith device (a list of
> #              supported drives is in /usr/lib/stonith.)
> #         <params...> are driver specific parameters.  To see the
> #              format for a particular device, run:
> #           stonith -l -t <stonith_type>
> #
> #
> #       Note that if you put your stonith device access information in
> #       here, and you make this file publically readable, you're asking
> #       for a denial of service attack ;-)
> #
> #
> #stonith_host *     baytech 10.0.0.3 mylogin mysecretpassword
> #stonith_host ken3  rps10 /dev/ttyS1 kathy 0
> #stonith_host kathy rps10 /dev/ttyS1 ken3 0
> #
> #       Tell what machines are in the cluster
> #       node    nodename ...    -- must match uname -n
> node    lvs1.example.org
> node    lvs2.example.org
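To rule out the heartbeat timings, the values can be checked against the rule stated in the file's own comments (initdead at least twice deadtime). The sketch below does that for the four timing directives as posted. The ordering check keepalive < warntime < deadtime is my own assumption about sensible values, not something ha.cf states, and the function names are hypothetical.

```python
# The four timing directives exactly as they appear in the ha.cf above.
HA_CF = """\
keepalive 1
deadtime 15
warntime 10
initdead 30
"""


def to_seconds(value):
    """Convert a heartbeat time value ('15' or '1500ms') to seconds."""
    value = value.strip()
    if value.endswith("ms"):
        return int(value[:-2]) / 1000.0
    return float(value)


def read_timings(text):
    """Collect the timing directives from ha.cf text, ignoring comments."""
    wanted = ("keepalive", "warntime", "deadtime", "initdead")
    timings = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) == 2 and fields[0] in wanted:
            timings[fields[0]] = to_seconds(fields[1])
    return timings


def check_timings(t):
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    # Rule taken from the ha.cf comments: initdead >= 2 * deadtime.
    if t["initdead"] < 2 * t["deadtime"]:
        problems.append("initdead is less than twice deadtime")
    # Assumption, not from ha.cf: the three intervals should be ordered.
    if not t["keepalive"] < t["warntime"] < t["deadtime"]:
        problems.append("expected keepalive < warntime < deadtime")
    return problems
```

With the values posted here the check reports no problems, so by these two rules the timings are not the obvious culprit.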
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> lvs1:/etc/ha.d/ldirectord.cf:
> lvs2:/etc/ha.d/ldirectord.cf:
> 
> # Global Directives
> checktimeout=10
> checkinterval=5
> # fallback=192.168.10.15:80
> autoreload=yes
> logfile="/var/log/ldirectord.log"
> #logfile="local0"
> 
> # If quiescent=yes, set weight to 0 instead of removing from lvs
> # quiescent + persistent = bad mojo
> quiescent=no
> 
> # Virtual Server for Redirectors redir.democrats.org
> virtual=192.168.10.90:80
>         fallback=192.168.10.91:80 gate
>         real=192.168.10.92:80 gate
>         real=192.168.10.93:80 gate
>         service=http
>         request="ultramonkey.html"
>         receive="Redirector is up"
>         scheduler=rr
>         protocol=tcp
>         checktype=negotiate
> 
> # Virtual Server for proxy work in progress (rdr2,rdr3)
> virtual=192.168.10.180:80
>         #fallback=192.168.10.91:80 gate
>         #real=192.168.10.92:80 gate
>         real=192.168.10.93:80 gate
>         service=http
>         request="ultramonkey.html"
>         receive="Redirector is up"
>         scheduler=rr
>         protocol=tcp
>         checktype=negotiate
> 
> # Virtual Server for search.democrats.org
> virtual=192.168.10.46:80
>         #fallback=192.168.10.91:80 gate
>         real=192.168.10.92:80 gate
>         real=192.168.10.93:80 gate
>         service=http
>         scheduler=rr
>         protocol=tcp
>         #checktype=connect
>         checktype=negotiate
>         request="ultramonkey.html"
>         receive="Redirector is up"
> 
> # Virtual Service for new website (HTTP) www.democrats.org:80
> virtual=192.168.10.70:80
>         #fallback=192.168.10.79:80 gate
>         real=192.168.10.71:80 gate
>         real=192.168.10.72:80 gate
>         real=192.168.10.73:80 gate
>         real=192.168.10.75:80 gate
>         #real=192.168.10.74:80 gate REBUILDER
>         #real=192.168.10.76:80 gate
>         #real=192.168.10.77:80 gate
>         #real=192.168.10.78:80 gate
>         service=http
>         request="page/signup"
>         receive="communication is not authorized"
>         scheduler=rr
>         persistent=10800
>         protocol=tcp
>         checktype=negotiate
> 
> # Virtual Service for new website (HTTPS) www.democrats.org:443
> virtual=192.168.10.70:443
>         #fallback=192.168.10.79:443 gate
>         #real=192.168.10.21:443 gate
>         real=192.168.10.71:443 gate
>         real=192.168.10.72:443 gate
>         real=192.168.10.73:443 gate
>         #real=192.168.10.23:443 gate
>         real=192.168.10.75:443 gate
>         #real=192.168.10.24:443 gate
>         service=https
>         request="page/signup"
>         receive="communication is not authorized"
>         scheduler=rr
>         persistent=10800
>         protocol=tcp
>         checktype=negotiate
> 
> # Virtual Service for DCCC site (HTTP) www.dccc.org
> virtual=192.168.10.80:80
>         fallback=192.168.10.143:80 gate
>         real=192.168.10.83:80 gate
>         real=192.168.10.84:80 gate
>         service=http
>         #request="index.html"
>         #receive="Democratic Congressional Campaign Committee"
>         #receive=""
>         scheduler=rr
>         # persistent=3600
>         protocol=tcp
>         checktype=connect
>         #checktype=negotiate
> 
> # Virtual Service for DCCC site (HTTPS) www.dccc.org
> virtual=192.168.10.80:443
>         fallback=192.168.10.143:443 gate
>         real=192.168.10.83:443 gate
>         real=192.168.10.84:443 gate
>         service=https
>         #request="index.html"
>         #receive=""
>         scheduler=rr
>         # persistent=600
>         protocol=tcp
>         #checktype=negotiate
>         checktype=connect
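Because ldirectord drops a real server when its health check fails (quiescent=no), it is worth confirming by hand that each `request`/`receive` pair above still matches. This is a rough sketch of what a `checktype=negotiate` HTTP check amounts to: fetch the `request` path and look for the `receive` pattern in the body. It is not ldirectord's actual code. The function names and the injectable `fetch` parameter are hypothetical, added so the logic can be tried without touching the network.

```python
import re
import urllib.request


def http_fetch(host, port, path, timeout):
    """Fetch http://host:port/path and return the body as text."""
    url = "http://%s:%d/%s" % (host, port, path.lstrip("/"))
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def negotiate_ok(host, port, request, receive, timeout=10, fetch=http_fetch):
    """Return True if the response to `request` matches the `receive` pattern.

    Connection failures, timeouts and HTTP error statuses count as a
    failed check, as they would take the real server out of rotation.
    """
    try:
        body = fetch(host, port, request, timeout)
    except OSError:  # URLError, HTTPError and socket timeouts derive from it
        return False
    return re.search(receive, body) is not None
```

Calling, for example, `negotiate_ok("192.168.10.71", 80, "page/signup", "communication is not authorized")` from the active director at the time traffic starts to drop would show whether the real servers are still passing the check.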
> 
> 
> 
> 
> 
> 
> 
> 
> 
> lvs1:/etc/ha.d/haresources:
> lvs2:/etc/ha.d/haresources:
> 
> lvs1.example.org IPaddr::192.168.10.46/24 ldirectord::/etc/ha.d/ldirectord.cf
> lvs1.example.org IPaddr::192.168.10.70/24 ldirectord::/etc/ha.d/ldirectord.cf
> lvs1.example.org IPaddr::192.168.10.90/24 ldirectord::/etc/ha.d/ldirectord.cf
> lvs1.example.org IPaddr::192.168.10.80/24 ldirectord::/etc/ha.d/ldirectord.cf
> lvs1.example.org IPaddr::192.168.10.180/24 ldirectord::/etc/ha.d/ldirectord.cf
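Each haresources line is a separate resource group, so as posted the same ldirectord resource is named in five groups. Whether that has anything to do with the drain is not established here, but it is easy to make visible. A minimal sketch that counts how often each resource appears across lines (the function name is hypothetical):

```python
from collections import Counter

# The haresources file exactly as posted above.
HARESOURCES = """\
lvs1.example.org IPaddr::192.168.10.46/24 ldirectord::/etc/ha.d/ldirectord.cf
lvs1.example.org IPaddr::192.168.10.70/24 ldirectord::/etc/ha.d/ldirectord.cf
lvs1.example.org IPaddr::192.168.10.90/24 ldirectord::/etc/ha.d/ldirectord.cf
lvs1.example.org IPaddr::192.168.10.80/24 ldirectord::/etc/ha.d/ldirectord.cf
lvs1.example.org IPaddr::192.168.10.180/24 ldirectord::/etc/ha.d/ldirectord.cf
"""


def repeated_resources(text):
    """Return {resource: count} for resources listed on more than one line."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        # The first field is the preferred node; the rest are resources.
        counts.update(fields[1:])
    return {res: n for res, n in counts.items() if n > 1}
```

For the file above this reports `ldirectord::/etc/ha.d/ldirectord.cf` five times, while each IPaddr resource appears once.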
> 
> 
> 
> 
> 
> 
