To:	lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject:	Re: VRRP & sync_instance & low-level NIC monitoring (Was: Re: Redirector project for FreeBSD)
From:	Roberto Nibali <ratz@xxxxxxxxxxxx>
Date:	Sat, 30 Mar 2002 15:43:55 +0100

Hello Alexandre,

The source of the userguide is somewhere on my company SAN... I need to correct some English problems :)

I agree :) I could proofread it for you if you want me to. Also, I thought about giving a talk about LVS/HA/keepalived at the next Swiss Linux developer conference. You're invited too, of course.

I worked last night on a netlink fetcher for IF events.


Cool, does it work?

Ok, we are in sync :), this kind of state machine is a little hard to explain :/

Yes, but I haven't really read your documentation thoroughly up to that point, so it is not your fault. I should be reading the documentation.

Now I get it, I hope. This is a 4-bit state diagram with the bits being: LVS1(eth0), LVS2(eth0), LVS1(eth1) and LVS2(eth1). And FAULT_STATE is the result of a test: either MII beat failure, IFF_DOWN, routing changes or fwrules. According to the state transition table, which I haven't seen yet (but I will draw), you know what happens. Thank you Alexandre, my slow brain is starting to work now.
Exactly, you got it. Adding a "sync_instance" VRRP extension introduces a side effect in the state machine, resulting in protocol instability. The solution is adding a new state, FAULT_STATE, to work around the instability.
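
To make the 4-bit picture concrete, here is a minimal Python sketch. The bit order, the meaning of a set bit (the fault test failed on that interface) and the rule that one failed interface drags the whole synced director into FAULT_STATE are assumptions made for illustration; this is not keepalived's real state table.

```python
from itertools import product

# Assumed bit order: bit 0 = LVS1(eth0), bit 1 = LVS2(eth0),
# bit 2 = LVS1(eth1), bit 3 = LVS2(eth1).
INTERFACES = ("LVS1(eth0)", "LVS2(eth0)", "LVS1(eth1)", "LVS2(eth1)")


def encode(faults):
    """Pack four booleans (True = fault test failed) into a 4-bit integer."""
    state = 0
    for bit, faulty in enumerate(faults):
        if faulty:
            state |= 1 << bit
    return state


def director_in_fault(state, director):
    """Hypothetical sync_instance rule: eth0 and eth1 of one director
    move together, so a fault on either one puts the director in
    FAULT_STATE."""
    mask = 0b0101 if director == "LVS1" else 0b1010
    return bool(state & mask)


def all_states():
    """Enumerate the 16 possible combinations of the four test results."""
    return [encode(bits) for bits in product((False, True), repeat=4)]
```

With this encoding, a fault on LVS1(eth1) alone is state 4: LVS1 as a whole is in fault, LVS2 is not.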


I'm glad my brain still works.

Well, that's why the HA folks invented the concept of a heartbeat. And that's what you need to implement too in your framework. I think I understand your approach now and I have to tell you that the heartbeat is crucial. Your software needs to be capable of sending adverts through physically independent heartbeat VRRP instances, two each for L1x and L2x. They do the STOMITH monitoring to avoid the protocol loops you're mentioning. I would go as far as to exchange our HA solution for your framework, if you can provide FS interaction from HBs to the L1x/L2x transition, user-defined healthchecks and service reload on demand.
With your first approach, FS(x) is a function which has the following set of interactions to derive its new state:
o MII beat state changes
o routing changes
o IFF_UP & IFF_RUNNING

Normal desired path:
[L1x=M|L2x=B] ---> FS(L1x,link failure) ---> [L1x=B|L2x=M]

Unwanted path:
[L1x=M|L2x=B] ---> FS(L2x,cable cut)    ---> [proto loop noise, L2x=M]
With my approach you have the interaction of the above-mentioned 3 plus the status information of the HBs, which are physically separated. This cuts out the unwanted path and leaves you with the desired failover/failback path or state transition.
wow ... very nice! Just to be in sync with you => HB VRRP <=> MII probe + IFF_UP|RUNNING + Routing update?
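
A rough Python sketch of how FS(x) could combine the three local checks with the HB status. All names here (LinkChecks, fs, peer_hb_state) are hypothetical, FAULT stands in for "not master" on a broken link, and priority-based election is left out entirely; it only illustrates why the HB information removes the unwanted path.

```python
from dataclasses import dataclass

MASTER, BACKUP, FAULT = "M", "B", "FAULT"


@dataclass
class LinkChecks:
    """Local test results for one interface (hypothetical names)."""
    iff_up_running: bool = True   # IFF_UP and IFF_RUNNING are both set
    route_ok: bool = True         # no harmful routing change was seen
    mii_ok: bool = True           # MII beat result, an optional input

    def healthy(self):
        return self.iff_up_running and self.route_ok and self.mii_ok


def fs(current, checks, peer_hb_state):
    """Derive the next state of one VRRP instance.

    peer_hb_state is the state the peer reports over the physically
    separate HB link, or None if the peer is silent there.
    """
    if not checks.healthy():
        # Own link is broken: never claim mastership over it.
        return FAULT
    if peer_hb_state in (FAULT, None):
        # Peer reported a fault or is gone: take over.
        return MASTER
    # Peer is healthy: keep the current role, no takeover attempt.
    return MASTER if current == MASTER else BACKUP
```

Desired path: L1x loses its link and goes to FAULT, L2x sees that over the HB and becomes master. Cable cut on L2x: L2x goes to FAULT by itself and L1x simply stays master, so there is no protocol loop.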

Yes, exactly. Only that the MII probe should be optional since it is the only thing that is not generally applicable. I'm very happy that you like my framework and I hope that we don't reinvent SGI failover. Lars would know it. But AFAICR SGI's failover was more about application monitoring and clustering and not network failover/failback.

I'd rather have HB instances as a pool of resources. This is NIC independent and easy to implement.
Ok, so for you, during VRRP bootstrap we register a HB thread (performing MII probe + IFF_..... + ...) and keep a global interface struct in sync with the NIC states? ... This is what I was thinking when I started coding.

Yes, just make the MII probe optional. The rest provides enough HA. I've been working and coding stuff in the HA environment for years already and it never occurred to me that something wicked happened to the HBs. MII beat information is not needed for the HBs but is a nice feature to have. Make it configurable. This makes the state machine a little bit more complex. I can try to draw it for you if you want.
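
Roughly what "optional and configurable" could look like, as a small Python sketch. The IFF_UP and IFF_RUNNING values are the ones from the Linux headers; the function name, the use_mii switch and the fallback when the NIC gives no MII answer are assumptions for illustration only.

```python
IFF_UP = 0x1        # interface is administratively up
IFF_RUNNING = 0x40  # interface is operational (resources allocated)


def link_ok(if_flags, mii_link_up=None, use_mii=False):
    """Decide whether an interface passes the low-level checks.

    if_flags:    interface flags word as reported by the kernel
    mii_link_up: MII probe result, or None if the NIC cannot answer
    use_mii:     hypothetical config switch that enables the MII probe
    """
    wanted = IFF_UP | IFF_RUNNING
    flags_ok = (if_flags & wanted) == wanted
    if not use_mii or mii_link_up is None:
        # MII disabled or unsupported: the flags alone decide.
        return flags_ok
    return flags_ok and mii_link_up
```

With use_mii left off, an interface with flags 0x1043 passes regardless of what the MII probe says, which keeps the check usable on NICs without MII support.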

Yes, agreed. I will start with the 2 checks first and add the routing monitoring afterwards, since it will demand more work :)


A working keepalived with HB VRRP threads would be a very nice start.

Is HB a crosscable VRRP, protocol independent (serial, ethernet, ...)? I can understand this is good for hard monitoring because MII

Take it as a crosscable approach. Hardware vendors tend to use serial heartbeats too and while I agree this is a nice feature (protocol independence in the kernel) we don't want to complicate our lives with serial protocol implementations.

probe & the link is a deductive approach (if the link is down we cannot send adverts, but we can find states where the link is up and adverts cannot be sent....). But if the user uses a SWITCH/HUB... the probability of a crash is very low... For me, introducing an extra protocol part for monitoring LXX sounds a little like a workaround. A protocol like VRRP can handle this natively... This is my current point of view on HB :) (but still open to discussion).

No hubs/switches for HBs. Such devices try to be more intelligent than we need them to be for a simple thing like HB functionality. You make your software behave as follows (something like that, a lot is still missing but I need to go cooking):


if (HB) {
  pool_vs = create VRRP threads with L10/L20=M and L11/L21=B;
  if (poll(FS(M/pool_vs)) == bad || poll(FS(B/pool_vs)) == bad) {
    check FS(HB) and work according to the state table;
    send advert over HB;
  }
  if (more_than_1_HB) {
    pool_hb = create VRRP threads with HB11/HB21=M and HB12/HB22=B;
    send adverts over M;
    if (poll(FS(M/pool_hb)) == bad || poll(FS(B/pool_hb)) == bad) {
      handle state transition of pool_hb;
      // state table for pool_hb is a lot different than the one for
      // pool_vs, since you're allowed to have asymmetric routing.
    }
  } else {
    create VRRP thread with HB11/HB21=M;
    send adverts over M;
  }
} else {
  current VRRP/keepalived implementation;
}

Best regards,
Roberto Nibali, ratz
