Re: keepalived (was Re: News contrib to LVS)

To: Alexandre CASSEN <alexandre.cassen@xxxxxxxxxxxxxx>
Subject: Re: keepalived (was Re: News contrib to LVS)
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: ratz <ratz@xxxxxx>
Date: Wed, 03 Jan 2001 14:50:46 +0100
Hi Alexandre,

> :) I think MD5 encryption is enough, and in such a development, we need
> to integrate this problematic.
> >it. So now tell me how your half-open tcpcheck will be secured against
> >Seq-Number attacks, if it even doesn't check them? ;)
> 
> :) Ok, in the new release (0.2.3) I wanted to post yesterday (sourceforge
> was down :/) and will post today, the TCP check implements:

I'll have a look at it this weekend.
 
> 1. Create a random (simple, not hard random) TCP sequence number.

Why do you generate an extra random ISN?
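For context, here is a sketch (hypothetical code, not keepalived's) of what a probe-side random ISN buys you: the SYN carries seq = isn, and a genuine SYN|ACK must come back with ack = isn + 1, so a hard-to-guess ISN lets the checker reject stale or spoofed replies.

```c
#include <stdint.h>
#include <stdlib.h>

/* rand() is enough for a health check; the ISN only has to be
 * hard to guess, not cryptographically strong. */
static uint32_t random_isn(void)
{
    return ((uint32_t)rand() << 16) ^ (uint32_t)rand();
}

/* A genuine SYN|ACK acknowledges our SYN octet: ack = isn + 1
 * (uint32_t arithmetic handles sequence-number wraparound). */
static int synack_matches(uint32_t isn, uint32_t ack_seq)
{
    return ack_seq == isn + 1;
}
```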

> 2. Send to the remote host a TCP/IP packet based on that sequence, flagged
> SYN.
> 3. Using a two-level timeout-handled function: tcpcheck waits for the
> SYN|ACK reply.

hmm, so you set the timeout for the SYN|ACK return? You just have to pay
attention that if someone sets the SYN|ACK timeout higher than the value
in the proc-fs, your socket might be closed before you want it to be.
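One way to keep the SYN|ACK timeout under the caller's control is a non-blocking connect() with select(), sketched below (an illustration, not keepalived's actual implementation):

```c
#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* TCP check with a caller-chosen SYN|ACK timeout in milliseconds.
 * Returns 0 if the service answered, -1 on timeout or error. */
static int tcp_check(const char *ip, int port, int timeout_ms)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;
    fcntl(s, F_SETFL, O_NONBLOCK);   /* don't block in connect() */

    struct sockaddr_in sa;
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    inet_pton(AF_INET, ip, &sa.sin_addr);

    int rc = connect(s, (struct sockaddr *)&sa, sizeof(sa));
    if (rc < 0 && errno != EINPROGRESS) {
        close(s);
        return -1;
    }

    fd_set wset;
    FD_ZERO(&wset);
    FD_SET(s, &wset);
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };

    rc = select(s + 1, NULL, &wset, NULL, &tv);
    if (rc == 1) {                    /* socket became writable */
        int err = 0;
        socklen_t len = sizeof(err);
        getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &len);
        close(s);
        return err == 0 ? 0 : -1;     /* 0 means the handshake completed */
    }
    close(s);
    return -1;                        /* our timeout fired first */
}
```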

> 4. When a SYN|ACK packet is received, it looks for the ack seq number
> expected, for @IP src & port src.

Good, now tell me, how many parallel checks can you perform? I mean, let's
say in a production environment you have 50 VIPs and 30 of them need tcpcheck
for their realservers. How do you intend to handle the parallelism? As soon
as you get a non-sane result from the check you take the server/service out,
and if the check is back ok, you want to insert the server/service back into
the cluster configuration. You also have to handle the case where the last
server of a VIP is taken out. This is a special case. Tell me your ideas.
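One possible shape for that parallelism (a sketch under my own assumptions, not a proposal from the keepalived code): give every check its own non-blocking socket and deadline, drive them all from one select() loop, and track per-VIP health so the last-server case can be detected before emptying a service.

```c
#include <sys/time.h>

/* Per-check state: one entry per (VIP, realserver) pair, so 50 VIPs
 * with many realservers never block each other. */
struct check {
    int fd;                   /* non-blocking socket, -1 if idle */
    struct timeval deadline;  /* when this probe times out */
    int vip_id;               /* which virtual service it belongs to */
    int healthy;              /* last known result, 1 = ok */
};

/* Before removing a failed server, count what would remain for its
 * VIP; if this drops to zero you hit the special case from the text
 * and need a policy (keep the least-bad server, or a sorry server). */
static int healthy_count(const struct check *c, int n, int vip_id)
{
    int i, alive = 0;
    for (i = 0; i < n; i++)
        if (c[i].vip_id == vip_id && c[i].healthy)
            alive++;
    return alive;
}
```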

> 5. If the packet headers mismatch 4., tcpcheck waits 2 seconds to receive
> a good answer.

You shouldn't hardcode this. This must be settable. You may want to check a
service every 5 seconds and leave 2 secs of response delay, but you might
also check a not-so-important service every 30 seconds and set the timeout
to 5 seconds. BTW: I saw that you use somewhere in your code the following
structure:

int recvfrom_to(int s, char *buf, int len, struct sockaddr *saddr, int timo) {
    struct timeval to;
    to.tv_sec  = timo/1000;
    to.tv_usec = 0;
    [...]
    nfound = select(s+1, &readset, &writeset, NULL, &to);
    [...]

Just recall that to.tv_sec is a long! So your granularity should be split up
between to.tv_sec and to.tv_usec.
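Concretely, the split could look like this (a minimal sketch of the fix being suggested, with a hypothetical helper name):

```c
#include <sys/time.h>

/* Split a millisecond timeout across both timeval fields. With
 * tv_usec left at 0, as in the quoted code, any timo < 1000 would
 * collapse to a zero-second timeout. */
static struct timeval ms_to_timeval(int timo)
{
    struct timeval to;
    to.tv_sec  = timo / 1000;           /* whole seconds (a long) */
    to.tv_usec = (timo % 1000) * 1000;  /* remainder as microseconds */
    return to;
}
```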

> 6. If a good SYN|ACK is not received within 2 seconds, the packet was
> probably lost (or there is network congestion), so tcpcheck performs
> 3 retries.

Hmm, this must also be selectable. You should never hardcode this. If you
have access to the Alteon load balancer GUI or the ServerIron load balancer
GUI, you can see, if you dig deep into the configuration, that you can set
the number of times a healthcheck repeats its test before it returns
EFAILED.
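The configurable shape being argued for might look like this (a sketch; the struct, field names and EFAILED value are my own assumptions, and the two probes are trivial stand-ins for a real single-shot check):

```c
#define EFAILED (-1)

struct check_cfg {
    int interval_s;   /* seconds between check rounds, e.g. 5 or 30 */
    int timeout_ms;   /* per-probe answer delay, e.g. 2000 or 5000 */
    int retries;      /* probes before declaring the service down */
};

/* probe() is any single-shot check function; one good answer is
 * enough, otherwise retry up to cfg->retries times. */
static int run_check(const struct check_cfg *cfg, int (*probe)(int))
{
    int i;
    for (i = 0; i < cfg->retries; i++)
        if (probe(cfg->timeout_ms) == 0)
            return 0;
    return EFAILED;
}

/* Stand-in probes, for illustration only. */
static int probe_up(int timeout_ms)   { (void)timeout_ms; return 0; }
static int probe_down(int timeout_ms) { (void)timeout_ms; return -1; }
```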
 
> => Finally, if no answer is received, we assume that the packet was not
> received, so the check fails.

I agree :)
 
> If I had time I would probably add a MAC address check.

I wouldn't do that, since poking around with MAC addresses within an
LVS cluster is strong tobacco.
 
> >I could even imagine having a separate dedicated probe that does the
> >healthchecking
> >and reporting to the director which in turn does a setsockopt or a netlink
> >socket to inform the kernel to change the LVS-parameters.
> 
> In my mind this is a good design; we limit the communication exchange. The
> advantage is that the check implementations don't depend on the OS type.

That would be nice, of course. (Although we all do like Linux, don't we? :)
 
> >
> >>     2. An advanced keepalived daemon working with a listener on the
> >>     director. All the servers push information to this
> >>     listener. Finally the listener sends actions to LVS via setsockopt.
> >
> >For me this is just another set of healthchecks, but remote healthchecks.
> >You need them for example to monitor CPU and RAM, unless you take snmp.
> 
> Yes, we can have a design where we use 2 servers: one for LVS, the other
> for healthcheck. The second can run an snmp engine using SNMP TRAP. This

For heaven's sake, please don't use snmptrap!! That's like all the shit HP
OpenView or BMC Patrol or Winblows NetBIOS or TXE does. It fills up your
network with crap packets which 95% of the time get lost somewhere and waste
bandwidth. If you want information, you fetch it; if not, keep quiet. We like
to have control over the internal physical net of the LVS cluster. I tell
you, once you have had to tcpdump in a heterogeneous Win$loth environment to
find out why the cluster doesn't work, you know you have to feed very long
filter expressions to tcpdump to filter out all the mentioned waste traffic.

> server can be connected to the LVS server using a secure or dedicated
> connection.

Nope, the LVS server connects. We have to maintain a security hierarchy.
The most secure box should be the load balancer; if someone hacks it, bye
bye load balancing anyway. It's like designing a database server for your
firewall logs: you would never send the logs to that machine, the machine
would connect to the server and fetch the logs. So you need no listener on
the log server, and when you establish a connection you use an unprivileged
port.
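The pull model described above can be sketched like this (the address and port are made up for illustration): the sensitive box initiates the connection, so it runs no listener, and the kernel assigns an unprivileged ephemeral source port by default since no bind() is done.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/* Director-side pull: connect out to the log/health server and
 * fetch data over the socket. Returns 0 on success, -1 on failure. */
static int pull_logs(const char *server_ip, int server_port)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;

    struct sockaddr_in sa;
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(server_port);
    inet_pton(AF_INET, server_ip, &sa.sin_addr);

    /* No bind(): the kernel picks an unprivileged ephemeral port. */
    if (connect(s, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        close(s);
        return -1;
    }
    /* ... read the logs over s ... */
    close(s);
    return 0;
}
```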

Regards,
Roberto Nibali, ratz

-- 
mailto: `echo NrOatSz@xxxxxxxxx | sed 's/[NOSPAM]//g'`

