Re: MAC address on dummy0

To: Roberto Nibali <ratz@xxxxxx>
Subject: Re: MAC address on dummy0
Cc: Julian Anastasov <ja@xxxxxx>, Joseph Mack <mack.joseph@xxxxxxx>, lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: Alexandre Cassen <Alexandre.Cassen@xxxxxxxxxx>
Date: Fri, 24 Aug 2001 13:16:35 +0200
Hi ratz,

Yes, I'm still alive and I still have to reply to hundreds of emails,
especially some important ones from Alexandre (I know I'm being a pain!)

Nice to hear from you, and no problem (you're not being a pain, I understand, it's the same for me, I'm overloaded).

> >         The missing piece (which may be needed for this VRRP spec)
> >is the feature to select the source MAC for outgoing traffic according
> >to the source IP. But I don't know whether this is possible to
> >add into the networking code. Then on some cards one can send traffic
> >with different MACs. But I have never played with VRRP.

1. You cannot select the source MAC with the current implementation
2. You might be able to add it to the network code but this will
   add a lot of ugliness to your code and you need another patch
3. You are right, there are some cards that can override the NIC node
   address but you might not get the specs and even if you do, only a limited
   number of cards would support it.
4. Cisco switches (>5500), in contrast to 3com switches, _really_ don't like
   virtual MACs. You might end up with a blocked port on the switch.
5. Tell me if you can set proxy_arp on the dummy0 interface: If I remember
   the code this would mean, that all other interfaces will respond to arp
   queries destined for addresses on this interface. We needed this once
   for bridging. I might be completely wrong though, haven't checked the
   code in a while.
6. Some pointers for code you might want to browse/check:

   http://www.ds9a.nl/2.4Routing/HOWTO/cvs/2.4routing/output/2.4routing-17.html
   http://bridge.sourceforge.net/
   http://vlan.sourceforge.net/
   The bonding driver in 2.4.x kernel source
   Ask Andi Kleen (I think kleen@xxxxxxx)

   They all did mess around with having some sort of virtual MAC.

> Currently I do this from user space with a simple SIOCSIFHWADDR ioctl call,
> but I need to shut down the interface before the ioctl call.

For obvious reasons :)

> So all outgoing packets carry the VMAC, which becomes the new physical
> interface MAC. This is the only way I have found to use a VMAC (it is not
> very flexible, because a netlink routing fetcher is needed to restore the
> routing entries after the VMAC is set).
>
> I have played with a 3COM card to use 3 different MACs, but it is
> extremely hardware dependent... :/

Yes, remember the email I sent you about MAC addresses?

Yes, exactly. I tried it on a vortex NIC.

||> > regarding changing the MAC addr. In the actual VRRP implementation (made by
||> > jerome etienne), to change the MAC address we need to shut the interface
||> > down and bring it back up to make the new MAC addr persistent in the
||> > kernel => this operation flushes the routing table, so we lose the default
||> > gw, for example, and all other rt entries referring to that interface. A
||> > solution can be to hardcode a function that fetches the routing table,
||> > called before shutting down the interface, and another that restores the
||> > rt entries, called after the ioctl SIOCSIFFLAGS.

||> This is the best way although I'm still thinking of a way of how to change
||> the MAC without reinitializing the EEPROM of the NIC. The funny thing is,
||> that every NIC has 3 MAC addresses which are of course all three the same
||> but 2 of them can be changed.

||> station address    : Can be changed with ``ifconfig ... hw ether MAC''
||> NIC node address : This is not changeable unless you use a PAL programmer
||> OEM station address: Use D. Becker's tools to change this. For 3com for
||>                      example it is ``vortex-diag -# 1 -D -f -H MAC -w''

> In your opinion :
>
> [Problem description] : When running VRRP on a director with multiple interfaces,
> what is the best way to handle the roaming IP synchronization? => If we

Depends on how you use those multiple Interfaces. If they serve different
zones, you must run a vrrpd for every interface. If they are concatenated,
like bonding, you don't, but then you have the second interface as IFF_SLAVE.

For my topology I have 2 NICs, one to serve the LAN and the other for the WAN => LVS-NAT in my test setup.

> have eth0 & eth1 for example, should we run one VRRP instance on each
> interface, or run only one VRRP instance on the LAN interface (eth1) and
> synchronize the VIP (eth0)?

There we might raise some security concerns. But this is IMHO the only way
to get a solution, although I don't actually understand how you mean it. Can
you give me an example setup where you have multiple NICs that need failover
where the NICs don't run in completely separated physical segments?

Consider the PDF file I sent. We have 2 LVS directors (LD1 & LD2) using LVS-NAT. On each, ETH0 is the WAN interface and ETH1 is the LAN interface, acting as default gw for the realserver pools.

We run 2 VRRP instances on each of LD1 & LD2, one on ETH0 and one on ETH1. If ETH0 on LD1 fails (hardware problem, cut cable, NIC on fire :)...), then ETH0 on LD2 takes over according to the VRRP protocol. During that takeover, to preserve the routing path, we need to synchronize the VRRP instance on ETH1. So the LD2 VRRP instance running on ETH0 sends a higher priority VRRP advert to the LD1 VRRP instance running on ETH0. When the LD1 VRRP instance on ETH0 receives that advert, it shuts down the VRRP VIP. Then the LD2 VRRP instance on ETH0 sets the VIP.

=> So we now have both VRRP instances fully in sync.

=> Will document this tonight :)
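
As a rough illustration of the synchronization idea above, here is a minimal sketch of two instances on one director linked as a sync pair, so that a state change on one is mirrored on the other. Everything in it (the `vrrp_instance` struct, the state names, `vrrp_transition`) is a hypothetical model made up for this example; it leaves out adverts, priorities and timers entirely and is not the daemon's actual logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states for one VRRP instance. */
enum vrrp_state { VRRP_BACKUP, VRRP_MASTER, VRRP_FAULT };

/* Hypothetical per-interface instance, linked to its sync peer
 * on the same director (e.g. VI_1 on eth0 <-> VI_2 on eth1). */
struct vrrp_instance {
    const char *name;
    enum vrrp_state state;
    int has_vip;                  /* 1 while this instance holds its VIP */
    struct vrrp_instance *sync;   /* peer instance, or NULL */
};

static void apply_state(struct vrrp_instance *vi, enum vrrp_state s)
{
    vi->state = s;
    vi->has_vip = (s == VRRP_MASTER);   /* only a master owns the VIP */
}

/* Move vi to state s and drag the sync peer along:
 * - becoming MASTER promotes the peer to MASTER too;
 * - leaving MASTER (BACKUP or FAULT) demotes a MASTER peer to BACKUP. */
void vrrp_transition(struct vrrp_instance *vi, enum vrrp_state s)
{
    assert(vi != NULL);
    apply_state(vi, s);
    if (vi->sync == NULL)
        return;
    if (s == VRRP_MASTER && vi->sync->state != VRRP_MASTER)
        apply_state(vi->sync, VRRP_MASTER);
    else if (s != VRRP_MASTER && vi->sync->state == VRRP_MASTER)
        apply_state(vi->sync, VRRP_BACKUP);
}
```

In this model, a fault on LD1's eth0 instance also releases the VIP on LD1's eth1 instance, while LD2's pair becomes master together, so both sides of the NAT path move to the same director.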

In fact, when setting ipvsadm rules, I use the VRRP VIP as the VIP. (Even better, by combining it with the LVS sync daemon, we can preserve the connection table on takeover.)


> => The VRRP VIP is a simple ip alias on the eth interface.

Ohhh, so it's just another address? Is it a secondary address? Just curious
because you might set it like:

ip a a VRRP_VIP brd + dev eth0/dummy0 tentative

and modify the MAC layer code for tentative devices for outgoing packets.
And again, I haven't checked the code here and I might babble useless
stuff here.

To set the VMAC, the daemon uses the ioctl system call. To set the VRRP VIPs, it uses a netlink RTM_NEWADDR call.

---[ snipped code : Begin ]---

int ipaddr_op(int ifindex, uint32_t addr, int addF)
....
  req.n.nlmsg_len    = NLMSG_LENGTH(sizeof(struct ifaddrmsg));
  req.n.nlmsg_flags  = NLM_F_REQUEST;
  req.n.nlmsg_type   = addF ? RTM_NEWADDR : RTM_DELADDR;
  req.ifa.ifa_family = AF_INET;
  req.ifa.ifa_index  = ifindex;
  req.ifa.ifa_prefixlen  = 32;

  addr = htonl(addr);
  addattr_l(&req.n, sizeof(req), IFA_LOCAL, &addr, sizeof(addr));

  if (rtnl_open(&rth, 0) < 0)
    return -1;
  if (rtnl_talk(&rth, &req.n, 0, 0, NULL, NULL, NULL) < 0)
    return -1;
...
---[ snipped code : End ]---
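
For readers without the libnetlink helpers (`addattr_l`, `rtnl_talk`) at hand, the request built in the snippet above can be sketched with the raw kernel headers only. This is an assumption-laden illustration: `struct addr_req` and `build_addr_req` are hypothetical names, the flags and /32 prefix simply mirror the snippet, and the message is only constructed here, not sent.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request layout: header, address message, attribute space. */
struct addr_req {
    struct nlmsghdr  n;
    struct ifaddrmsg ifa;
    char             attrs[64];
};

/* Fill req with an RTM_NEWADDR (add != 0) or RTM_DELADDR (add == 0)
 * message for a /32 IPv4 address given in host byte order. */
void build_addr_req(struct addr_req *req, int ifindex, uint32_t addr, int add)
{
    struct rtattr *rta;

    assert(req != NULL);
    memset(req, 0, sizeof(*req));

    req->n.nlmsg_len      = NLMSG_LENGTH(sizeof(struct ifaddrmsg));
    req->n.nlmsg_flags    = NLM_F_REQUEST;
    req->n.nlmsg_type     = add ? RTM_NEWADDR : RTM_DELADDR;
    req->ifa.ifa_family    = AF_INET;
    req->ifa.ifa_index     = ifindex;
    req->ifa.ifa_prefixlen = 32;

    /* The kernel expects the address in network byte order. */
    addr = htonl(addr);

    /* Append one IFA_LOCAL attribute right after the aligned message body;
     * this is the job addattr_l() does in the snippet above. */
    rta = (struct rtattr *)((char *)req + NLMSG_ALIGN(req->n.nlmsg_len));
    rta->rta_type = IFA_LOCAL;
    rta->rta_len  = RTA_LENGTH(sizeof(addr));
    memcpy(RTA_DATA(rta), &addr, sizeof(addr));
    req->n.nlmsg_len = NLMSG_ALIGN(req->n.nlmsg_len) + rta->rta_len;
}
```

To actually apply it, the first `nlmsg_len` bytes of the request would be sent on a `socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE)` socket, which requires root.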

=> So it is a secondary IP => is that compatible with LVS? Can we set an LVS VIP using a secondary interface IP?

> [My opinion] : I agree that running one VRRP instance per physical interface
> sounds nasty, but how can we handle synchronization on a lost link? For
> example, if we run one VRRP instance on the DIP (eth1): when the link on the
> DIP interface (eth1) is down and eth0 is still up, the takeover happens. So

This depends on how you see your director. Maybe eth0 was serving a bunch
of customers and eth1 is serving a completely different set of customers
in a completely different zone. You should then take over only eth0.
This implies you have to use a vrrpd per interface, as ugly as it gets :)

Agree

> the backup VRRP instance on the DIP takes over, but how can we sync the VIP
> (eth0)? At this point the VIP runs on the wrong eth0 MASTER interface... so
> how can the BACKUP director tell the MASTER director that its VIP running on
> eth0 should be removed? .... It introduces a security issue and real problems.

You have to add an additional layer of intelligence. The vrrpd has to be
able to interact with other vrrpd's and tell them the status, something
like RIP. If eth0 goes down on MASTER and eth1 is still ok on BLASTER, we
send a netlink message to vrrpd_eth1 and tell it that eth0 is now active
on BLASTER. This should not be too difficult a task (the last time I said
such a thing to my boss it cost me 3 months of development).

VRRP does it for us.


> The only way we can prevent against network failures in that scenario is to
> use VRRP.

Or you might put your vrrpd on top of a bonding interface.

Hmm, I will look into it :)

> My point of view is to run one VRRP instance per physical interface.

No question about that. Make sure the transport and notification
protocol you use is as secure as the least secure one deployed (TCP in
most cases)

I need to document that point.

> The current vrrp release I am working on is configured as follows:
> vrrp_instance VI_1 {
>    interface eth0
>    virtual_router_id 50
>    preempt
>    authentication {
>      auth_type AH
>      auth_pass k@!v361
>    }
>    priority 100
>    advert_int 1
>    virtual_ipaddress {
>      192.168.200.11
>      192.168.200.12
>      192.168.200.13
>    }
>    sync_instance VI_2
> }
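
For reference, the VRRP spec (RFC 2338) defines the virtual router MAC as 00-00-5E-00-01-{VRID}, so the `virtual_router_id 50` in the configuration above would map to 00:00:5e:00:01:32. A small sketch of that derivation follows; the function names `vrrp_vmac` and `format_mac` are made up for this example and are not part of any existing daemon.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fill mac with the RFC 2338 virtual router MAC,
 * 00-00-5E-00-01-{VRID}, for the given virtual router id. */
void vrrp_vmac(uint8_t vrid, unsigned char mac[6])
{
    static const unsigned char prefix[5] = { 0x00, 0x00, 0x5e, 0x00, 0x01 };

    assert(mac != NULL);
    memcpy(mac, prefix, sizeof(prefix));
    mac[5] = vrid;   /* last octet is the VRID itself */
}

/* Hypothetical helper: render a MAC as "aa:bb:cc:dd:ee:ff".
 * out must have room for 17 characters plus the terminating NUL. */
void format_mac(const unsigned char mac[6], char out[18])
{
    snprintf(out, 18, "%02x:%02x:%02x:%02x:%02x:%02x",
             mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

The resulting bytes are what would be handed to the SIOCSIFHWADDR ioctl when the instance becomes master.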

Wow! Reading this I get the impression of a big project going on there.

It could be a nice LVS add-on.

> Thanks for your time,
> Alexandre

I'm sorry, I don't reply that often, but I just can't handle all the
projects right now and plus it's summer here with excellent 30 degrees
celsius. Half of the day I'm out playing beach volleyball or climbing.

No problem :)

regards,
Alexandre





