Answering my own question:
I was trying to use a non-default port for the mysql service, and I was
not able to make ldirectord use it for the mysql query (even using
checkport). With the standard port 3306 everything works great.
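
For anyone hitting the same thing: a quick way to confirm that MySQL is actually listening on the non-default port, independently of ldirectord, is a plain TCP connect test. Below is a minimal Python sketch. The function name `port_is_open` is my own invention, and it only checks TCP reachability, not that a login or query would succeed.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when the port cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

In my setup I would call it as `port_is_open("192.168.1.123", 3307)` and `port_is_open("192.168.1.124", 3307)` from the director.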
Thanks to those who helped me,
Samuel.
2005/12/16, samuel <samu60@xxxxxxxxx>:
> Thanks a lot!
>
> Unfortunately it did not solve the problem: ldirectord still does not
> assign weight=0 to the failing nodes.
> Might it be related to the fact that when ldirectord starts, one
> node from the ldirectord.cf file is already down?
>
> Reading docs,
> Samuel.
>
> 2005/12/16, techp@xxxxxxxxxxx <techp@xxxxxxxxxxx>:
> > Hi,
> >
> > Try with the option:
> > quiescent=no
> >
> > and read the doc to see implications!
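
To spell out the implication, as I understand it from the ldirectord documentation: with quiescent=yes a real server that fails its check stays in the LVS table with weight 0, while with quiescent=no it is removed from the table. Here is a toy Python model of that difference. The function name and the dict-based table are made up for illustration; this is not ldirectord code.

```python
from typing import Dict


def apply_failed_check(table: Dict[str, int], server: str,
                       quiescent: bool) -> Dict[str, int]:
    """Return a new server -> weight table after `server` fails its check."""
    updated = dict(table)  # leave the caller's table untouched
    if server not in updated:
        return updated
    if quiescent:
        # quiescent=yes: entry stays, weight 0 means no new connections
        updated[server] = 0
    else:
        # quiescent=no: entry is removed from the table
        del updated[server]
    return updated
```
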
> >
> > Laurent
> >
> >
> > -----Original Message-----
> > From: lvs-users-bounces@xxxxxxxxxxxxxxxxxxxxxx
> > [mailto:lvs-users-bounces@xxxxxxxxxxxxxxxxxxxxxx] On behalf of samuel
> > Sent: Friday, 16 December 2005 11:56
> > To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: ultramonkey's "Streamline Highly Availability and Load
> > Balancing"
> >
> > Hi all!!!
> >
> > I've just started playing around with HA systems, so please forgive me
> > if the answers have already been provided on the list or somewhere
> > else (in that case, could you please provide a link?). I've looked
> > through older threads unsuccessfully...
> >
> > I have followed the instructions on the www.ultramonkey.org site for
> > setting up a Streamline High Availability and Load Balancing system
> > with a mysql cluster as the real servers. I know it's better to start
> > with simpler setups, but I ran out of machines, so I had to put the load
> > balancer and the replicated real servers on the same machines.
> > The config is the following:
> > Virtual IP=192.168.1.125
> > node1=192.168.1.123  node2=192.168.1.124
> >
> >                Virt. IP=.125
> >  -----------------    |    -----------------
> >  |  ldirectord1  |    |    |  ldirectord2  |
> >  |  mysqlAPI1    |---------|  mysqlAPI2    |
> >  -----------------         -----------------
> >    node1 IP=.123             node2 IP=.124
> >
> > The problem is that when a node fails, the surviving ldirectord does
> > not remove the failed node from the routing table. The funny thing is
> > that one out of every two requests succeeds (scheduler wrr) and the
> > other fails with a mysql connection error.
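
The "every second request fails" pattern is what I would expect from a rotation over two servers of equal weight when one of them is dead. Here is a small Python sketch of that idea. It is a simplified weight-proportional rotation, not the kernel's actual wrr implementation, and the function name is hypothetical.

```python
from itertools import cycle, islice
from typing import Dict, List


def simulate_requests(weights: Dict[str, int], alive: Dict[str, bool],
                      n: int) -> List[bool]:
    """Dispatch n requests in rotation; True means the request succeeded."""
    # Each server appears in the rotation as many times as its weight,
    # so a server with weight 0 never receives a request.
    rotation = [srv for srv, w in weights.items() for _ in range(w)]
    if not rotation:
        return [False] * n
    return [alive[srv] for srv in islice(cycle(rotation), n)]
```

With both nodes at weight 1 and 192.168.1.123 down, the results alternate between success and failure. Setting the dead node's weight to 0 makes every request succeed.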
> >
> > I've included as much output as I have at the bottom, so please take a
> > look and help me find my mistake (I hope not to exceed the list's limit).
> >
> > Thanks a lot,
> > Samuel.
> >
> >
> >
> > My config files are adaptations from the ultramonkey web site:
> >
> > ha.cf:
> > mcast eth0 225.255.255.2 695 1 0
> > auto_failback off
> > node cmysql_mysqld_1 #return from uname -n
> > node cmysql_mysqld_2
> > ping 192.168.1.254
> > respawn hacluster /usr/lib/heartbeat/ipfail
> >
> > haresources:
> > node1 \
> > ldirectord::ldirectord.cf \
> > LVSSyncDaemonSwap::master \
> > IPaddr2::192.168.1.125
> >
> >
> > ldirectord.cf:
> > checktimeout=10
> > checkinterval=2
> > autoreload=no
> > logfile="var/log/ldirectord.log"
> > logfile="local0"
> > quiescent=yes
> >
> > virtual=192.168.1.125:3307
> >         real=192.168.1.123:3307 gate
> >         real=192.168.1.124:3307 gate
> >         fallback=127.0.0.1:3307 gate
> >         checktype=negotiate
> >         login="ser"
> >         passwd="heslo"
> >         database="ser"
> >         request="SELECT * from version"
> >         scheduler=wrr
> >
> > Output of ping followed by ipvsadm: the weight remains 1 although the
> > node is unreachable!
> > cmysql_mysqld_2:/etc/ha.d# ping 192.168.1.123
> > PING 192.168.1.123 (192.168.1.123) 56(84) bytes of data.
> > From 192.168.1.124 icmp_seq=1 Destination Host Unreachable
> > From 192.168.1.124 icmp_seq=2 Destination Host Unreachable
> > From 192.168.1.124 icmp_seq=3 Destination Host Unreachable
> >
> > --- 192.168.1.123 ping statistics ---
> > 5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4046ms, pipe 3
> > cmysql_mysqld_2:/etc/ha.d# ipvsadm -L -n
> > IP Virtual Server version 1.0.11 (size=4096)
> > Prot LocalAddress:Port Scheduler Flags
> >   -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
> > TCP  192.168.1.125:3307 wrr
> >   -> 192.168.1.124:3307           Local   1      0          0
> >   -> 192.168.1.123:3307           Route   1      0          0
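
To check this repeatedly without reading the table by eye, the weights can be pulled out of the `ipvsadm -L -n` text. Here is a rough Python sketch that assumes the column layout shown above. The function name is made up, and it does not distinguish between multiple virtual services.

```python
from typing import Dict


def real_server_weights(ipvsadm_output: str) -> Dict[str, int]:
    """Map each real server (address:port) to its weight."""
    weights = {}
    for line in ipvsadm_output.splitlines():
        fields = line.split()
        # Real-server rows: -> addr:port Forward Weight ActiveConn InActConn
        # The header row is skipped because its fourth field is not a number.
        if len(fields) >= 4 and fields[0] == "->" and fields[3].isdigit():
            weights[fields[1]] = int(fields[3])
    return weights
```
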
> >
> >
> > Extract from /var/log/messages: it restores the failed node with
> > weight=1 and does not remove it later.
> > Dec 16 08:19:16 localhost heartbeat[3732]: info: Received shutdown
> > notice from 'cmysql_mysqld_1'.
> > Dec 16 08:19:16 localhost heartbeat[3732]: info: Resources being
> > acquired from cmysql_mysqld_1.
> > Dec 16 08:19:16 localhost heartbeat[3793]: info: acquire local HA
> > resources (standby).
> > Dec 16 08:19:17 localhost heartbeat[3794]: info: No local resources
> > [/usr/lib/heartbeat/ResourceManager listkeys cmysql_mysqld_2] to acquire.
> > Dec 16 08:19:17 localhost heartbeat[3793]: info: local HA resource
> > acquisition completed (standby).
> > Dec 16 08:19:17 localhost heartbeat[3732]: info: Standby resource
> > acquisition done [all].
> > Dec 16 08:19:17 localhost heartbeat: info: Running /etc/ha.d/rc.d/status
> > status
> > Dec 16 08:19:17 localhost heartbeat: info: Taking over resource group
> > ldirectord::ldirectord.cf
> > Dec 16 08:19:17 localhost heartbeat: info: Acquiring resource group:
> > cmysql_mysqld_1 ldirectord::ldirectord.cf LVSSyncDaemonSwap::master
> > IPaddr2::192.168.1.125
> > Dec 16 08:19:18 localhost ldirectord[3847]: ldirectord is stopped for
> > /etc/ha.d/conf/ldirectord.cf
> > Dec 16 08:19:18 localhost ldirectord[3847]: Exiting with exit_status
> > 3: Exiting from ldirectord status
> > Dec 16 08:19:18 localhost heartbeat: info: Running
> > /etc/ha.d/resource.d/ldirectord ldirectord.cf start
> > Dec 16 08:19:19 localhost ldirectord[3867]: Starting Linux Director
> > v1.77.2.32 as daemon
> > Dec 16 08:19:19 localhost ldirectord[3869]: Added virtual server:
> > 192.168.1.125:3307
> > Dec 16 08:19:19 localhost ldirectord[3869]: Added fallback server:
> > 127.0.0.1:3307 ( x 192.168.1.125:3307) (Weight set to 1)
> > Dec 16 08:19:20 localhost ldirectord[3869]: Quiescent real server:
> > 192.168.1.123:3307 mapped from 192.168.1.123:3307
> > ( x 192.168.1.125:3307) (Weight set to 0)
> > Dec 16 08:19:20 localhost heartbeat: info: Running
> > /etc/ha.d/resource.d/LVSSyncDaemonSwap master start
> > Dec 16 08:19:20 localhost ldirectord[3869]: Quiescent real server:
> > 192.168.1.124:3307 mapped from 192.168.1.124:3307
> > ( x 192.168.1.125:3307) (Weight set to 0)
> > Dec 16 08:19:20 localhost ldirectord[3869]: Restored real server:
> > 192.168.1.123:3307 ( x 192.168.1.125:3307) (Weight set to 1)
> > Dec 16 08:19:20 localhost kernel: IPVS: stopping sync thread 3393 ...
> > Dec 16 08:19:20 localhost kernel: IPVS: sync thread stopped!
> > Dec 16 08:19:20 localhost heartbeat: info: ipvs_syncbackup down
> > Dec 16 08:19:20 localhost ldirectord[3869]: Deleted fallback server:
> > 127.0.0.1:3307 ( x 192.168.1.125:3307)
> > Dec 16 08:19:20 localhost kernel: IPVS: sync thread started.
> > Dec 16 08:19:21 localhost heartbeat: info: ipvs_syncmaster up
> > Dec 16 08:19:21 localhost heartbeat: info: ipvs_syncmaster obtained
> > Dec 16 08:19:21 localhost ldirectord[3869]: Restored real server:
> > 192.168.1.124:3307 ( x 192.168.1.125:3307) (Weight set to 1)
> > Dec 16 08:19:21 localhost heartbeat: info: Running
> > /etc/ha.d/resource.d/IPaddr2 192.168.1.125 start
> > Dec 16 08:19:21 localhost heartbeat: info: Removing conflicting loopback
> > lo.
> > Dec 16 08:19:21 localhost heartbeat: info: /bin/ip -f inet addr delete
> > 192.168.1.125 dev lo
> > Dec 16 08:19:21 localhost heartbeat: info: /bin/ip -o -f inet addr show
> > lo
> > Dec 16 08:19:21 localhost heartbeat: info: /bin/ip route delete
> > 192.168.1.125 dev lo
> > Dec 16 08:19:21 localhost heartbeat: info: /bin/ip -f inet addr add
> > 192.168.1.125/24 brd 192.168.1.255 dev eth0
> > Dec 16 08:19:21 localhost heartbeat: info: /bin/ip link set eth0 up
> > Dec 16 08:19:21 localhost heartbeat: /usr/lib/heartbeat/send_arp -i
> > 200 -r 5 -p /var/lib/heartbeat/rsctmp/send_arp/send_arp-192.168.1.125
> > eth0 192.168.1.125 auto 192.168.1.125 ffffffffffff
> > Dec 16 08:19:22 localhost heartbeat: info:
> > /usr/lib/heartbeat/mach_down: nice_failback: foreign resources
> > acquired
> > Dec 16 08:19:22 localhost heartbeat[3732]: info: mach_down takeover
> > complete.
> > Dec 16 08:19:22 localhost heartbeat: info: mach_down takeover complete
> > for node cmysql_mysqld_1.
> > Dec 16 08:19:47 localhost heartbeat[3732]: WARN: node cmysql_mysqld_1:
> > is dead
> > Dec 16 08:19:47 localhost heartbeat[3732]: info: Dead node
> > cmysql_mysqld_1 gave up resources.
> > Dec 16 08:19:47 localhost heartbeat[3732]: info: Link
> > cmysql_mysqld_1:eth0 dead.
> > Dec 16 08:19:47 localhost ipfail[3741]: info: Status update: Node
> > cmysql_mysqld_1 now has status dead
> > Dec 16 08:19:47 localhost ipfail[3741]: info: NS: We are still alive!
> > Dec 16 08:19:47 localhost ipfail[3741]: info: Link Status update: Link
> > cmysql_mysqld_1/eth0 now has status dead
> > Dec 16 08:19:47 localhost ipfail[3741]: info: Asking other side for
> > ping node count.
> > Dec 16 08:19:47 localhost ipfail[3741]: info: Checking remote count of
> > ping nodes.
> > _______________________________________________
> > LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
> > Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
> > or go to http://www.in-addr.de/mailman/listinfo/lvs-users
> >
>