On Tue, Apr 24, 2007 at 09:07:08AM -0700, Roberto Nibali wrote:
[snip]
> > @@ -1177,7 +1192,9 @@ sub read_config
> > $1 eq "pops" ||
> > $1 eq "radius"||
> > $1 eq "pgsql" ||
> > + $1 eq "scftp" ||
>
> s/scftp/sftp/, unless I'm missing something:
Fixed
>
> ratz@t43:~/mpt-status> grep -w sftp /etc/services
> sftp 115/tcp # Simple File Transfer Protocol
> sftp 115/udp # Simple File Transfer Protocol
> ratz@t43:~/mpt-status> grep -w scftp /etc/services
>
> > $1 eq "sip" ||
> > + $1 eq "ssh" ||
> > $1 eq "smtp")
> > or &config_error($line,
> > "service must " .
> > @@ -1189,7 +1206,8 @@ sub read_config
> > "oracle, " .
> > "pop, pops, " .
> > "radius, " .
> > - "pgsql, sip " .
> > + "pgsql, scftp " .
>
> Same as above: s/scftp/sftp/
Ditto
[snip]
> > +sub check_ssh {
> > + require Net::SSH::Perl; # Maybe Net::SSH2 would be preferrable? --ratz
>
> Regarding the fact that I haven't been able to even run the script
> (neither Net::SSH::Perl nor the Net::SSH2 module from CPAN did compile
> on my box.), I think putting this check into ldirectord is a bit early.
> But if I recall correctly you mentioned this before already.
This is definately a problem. I am treating this patch as experimental,
not to be merged, primarily because of this. I gather that you
implemented the external check as means of getting around this.
[snip]
> > + # Why oh why are there never any return values
> > + # documented in perl???
>
> We should probably take this out for production. I remember that after
> looking for 2 hours through the web not finding anyone documenting the
> stuff, I got highly irritated.
Removed
> > + $ssh->login( $$v{login}, $$v{passwd} );
> > + ( $stdout, $stderr, $exit ) = $ssh->cmd( $$r{request} );
> > + alarm(0);
> > + };
> > +
> > + if ( @$ eq "timeout\n" ) {
> > + service_set( $v, $r, "down" );
> > + return 1;
> > + }
>
> So here a function return of 1 means SERVICE_DOWN? As I wrote in my last
> email regarding this, I'm a bit unsure if I understood the correct
> meaning of the return value of a health check function.
Confusingly yes, UP=0, DOWN=1. I'm happy to switch it around though I
see little gain.
> > + # Buggy as hell, but hey, I can't even test the damn thing,
> > + # because of the broken CPAN:
> > + # a) Not every correct command returns 0
> > + # b) Not every command writes to stdout
> > + #
> > + if ( $exit = 0 and $stdout =~ $$r{receive} ) {
> > + service_set( $v, $r, "up" );
> > + return 0;
> > + } else {
> > + service_set( $v, $r, "down" );
> > + return 1;
> > + }
> > +}
>
> The above needs broad testing and probably some more lines of code :).
Agreed.
> > +sub check_sftp {
> > + require Net::SFTP;
> > + my ( $v, $r ) = @_;
> > + my $sftp;
> > + my $port = ( defined $$v{checkport} ? $$v{checkport} : $$r{port} );
> > +
> > + &ld_debug( 2, "Checking sftp server=$$r{server} port=$port" );
> > +
> > + eval {
> > + sub get_callback {
> > + my ($sftp, $data, $offset, $size) = $@;
> > + &ld_debug(2, "Read $offset / $size bytes");
> > + }
> > +
> > + local $SIG{'__DIE__'} = "DEFAULT";
> > + local $SIG{'ALRM'} = sub { die "timeout\n"; };
> > + alarm( $$v{negotiatetimeout} );
> > + unless ( $sftp = Net::SFTP->new( $$r{server},
> > + login => $$v{login},
> > + password => $$v{passwd},
> > + ssh_args =>
> > + [port => $port]) ) {
> > + service_set( $v, $r, "down" );
> > + return 1;
> > + }
> > + alarm(0);
> > + };
>
> I hope having a callback here does not introduce some kind of weird
> behaviour with regard to internal siglongjmp() calls in Perl. I don't
> trust Perl a lot, especially when it comes to signalling.
I think that having the get_callback() and setup of the connection
inside the eval{} and the call to $sftp->get outside it is probably
asking for trouble. I think that moving $sftp->get inside the eval
would be ok, as the timeout should cover the total time for the test,
not just the connect time. As for if that would work. I say "probably".
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
|