RE: LVS vs Piranha

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: RE: LVS vs Piranha
From: Michael Loftis <zop12@xxxxxxxxxxxx>
Date: Thu, 14 Sep 2000 01:22:03 -0700 (PDT)
I've been watching this thread develop and decided to check out those four
Bugzillas.  Number 1 has a resolution of WORKSFORME (copied below), contrary
to what Mr. Barrett claims.  You'll have to read to the bottom of this note to
see that the first two are WORKSFORME, while 15911 is ASSIGNED and 15912 is
DEFERRED for addition in the next revision.

So will Mr. Downey receive any credit for the revision addition of 15912?
And did that little note make it into the Errata at release time?

Anyway, I'll drop it there...  I just couldn't stand back and watch this
one after seeing the Bugzilla entries for myself.  I don't think it was fair
to say Mr. Downey was bringing personal issues into this; throughout his
letters to the list he maintained a decent amount of professionalism until you
suggested that it was "personal".

Michael Loftis

ATTACHMENTS:


---- Bugzilla 15909 ----
Product: Red Hat High Availability Server    Version: 1.0    Component: ipvsadm
Status: RESOLVED    Short Summary: lvs component fails to create ipvsadm entries


Opened by david.downey@xxxxxxxxxxxxxx on 2000-08-10 04:21:17

Long Description


Case:             I define a virtual server www.xyz.com with an IP of
                  xxx.xxx.xxx (FQIP) and a real server entry of 192.168.1.11.
                  When I start the lvs services, lvs is supposed to create an
                  ipvsadm rule and enable that rule.

Problem:          The rule does not get created and therefore is never activated.

Current solution: Manually create the ipvsadm rule and enable it by hand.

Problem:          The hand-created ipvsadm rule will not be managed by lvs.
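
(For reference, the manual workaround described above amounts to commands along
these lines. The virtual address is elided in the report, so a placeholder is
used here, and port 80 with round-robin scheduling is an assumption:)

ipvsadm -A -t <virtual-ip>:80 -s rr               # define the virtual service by hand
ipvsadm -a -t <virtual-ip>:80 -r 192.168.1.11 -m  # attach the real server (masquerading)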
------- Additional comments from kbarrett@xxxxxxxxxx 2000-08-10 02:52
-------
As far as we can tell, this should work fine. Piranha certainly will create and
modify the ipvsadm rules as needed, so we need to know more in order to
investigate this. Certainly, as you point out, if lvs doesn't create the rule
it will have problems maintaining it. Could you update us on the following:

1. What are the ipchains and ipvsadm rules you are trying to use (or expecting
   to result)? Sample commands, ipvsadm list output, etc.

2. Can you include a copy of your lvs.cf config file?

3. A simple diagram of your network setup, with indications of the IP and
   virtual IP addresses and NIC interfaces?


Thanks


------- Additional comments from david.downey@xxxxxxxxxxxxxx 2000-08-13
03:09 -------
I shouldn't have to create any ipvsadm rules. Defining the real nodes involved
should create the rules. (Piranha should have the function to look and see what
the name of the virtual server is, like www.qixo.com, then look at the real
nodes that make up that virtual server, and create the rules and enable them.)

like 216.200.192.106 is virtual server www.qixo.com,
which is made up of real nodes 192.168.1.11 and 192.168.1.10.

ipchains -A forward -s 192.168.1.0/24 -d 0.0.0.0/0 -j MASQ should already have
been applied, since the admin should already know that he needs MASQ enabled
for his servers.

piranha should create these rules and put them in place:
ipvsadm -A -t 216.200.192.106:80 -s rr
ipvsadm -a -t 216.200.192.106:80 -r 192.168.1.11 -m
ipvsadm -a -t 216.200.192.106:80 -r 192.168.1.10 -m
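
(One general LVS-NAT prerequisite the report takes for granted: for the MASQ
rule above to forward anything, IP forwarding also has to be switched on in
the kernel on the director, for example:)

echo 1 > /proc/sys/net/ipv4/ip_forward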

Some part of the lvs clustering software should then be monitoring, via
something along the lines of the following, to add and remove servers as they
go up and down.

Grep lvs.cf for the real node IPs and pass that information to a script that
tests for a known response from whatever services are defined as being handled
by the real nodes (like www, for instance). If there is no response, it removes
the non-responding server's ipvsadm rule. If there IS a response, it runs
ipvsadm -L and greps for the name or IP of the real node; if the node is there
it doesn't re-add it, it just tests the next one, and if it's NOT there it adds
the rule. All of this gets checked every 10 to 15 seconds. This needs to be
started/stopped from the script that starts/stops the lvs daemon, and it
monitors continuously.
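
A rough sketch of the loop described above might look something like this
(purely illustrative shell, not Piranha's actual monitoring code; using nc as
the probe tool and the /etc/lvs.cf path are assumptions, while the addresses
and the "QIXO" expect string come from the lvs.cf posted below):

#!/bin/sh
# Illustrative health-check loop -- not part of the product.
VIP=216.200.192.106        # virtual server address (from the lvs.cf below)
PORT=80

while true; do
    # pull the real node addresses out of lvs.cf
    for REAL in `grep 'address = 192\.168\.1\.' /etc/lvs.cf | awk '{print $3}'`; do
        # probe the service the node is supposed to answer (www here)
        if printf 'GET / HTTP/1.0\r\n\r\n' | nc -w 5 $REAL $PORT | grep -q QIXO; then
            # node answered: add its rule only if it is not already listed
            ipvsadm -L -n | grep -q "$REAL:$PORT" || \
                ipvsadm -a -t $VIP:$PORT -r $REAL -m
        else
            # node did not answer: remove its rule so traffic stops going there
            ipvsadm -d -t $VIP:$PORT -r $REAL 2>/dev/null
        fi
    done
    sleep 15               # re-check every 10 to 15 seconds
done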

At this juncture the above actions are NOT done.
Here is the copy of my lvs.cf file as it stands now.



CURRENT LVS.CF FILE

#
# Set up timeout values for the LVS
# =================================
ipchains -M -S 7200 10 160
#
# Start setting up routing for LVS/HA
# ===================================
ipvsadm -A -t 216.200.192.111:80 -s rr
# RE-ENABLE .12 WHEN DEVEL IS DONE!
[root@vs-00 /root]# less /etc/lvs.cf
primary = 216.200.192.100
service = lvs
rsh_command = rsh
backup_active = 1
backup = 216.200.192.101
heartbeat = 1
heartbeat_port = 539
keepalive = 6
deadtime = 12
network = nat
nat_router = 192.168.1.22 eth1:0
virtual 216.200.192.106.qixo.com {
     active = 1
     address = 216.200.192.106 eth0:1
     port = 80
     send = "GET / HTTP/1.0\r\n\r\n"
     expect = "QIXO"
     load_monitor = ruptime
     scheduler = rr
     protocol = tcp
     timeout = 6
     reentry = 15
     server ws-01 {
         address = 192.168.1.11
         active = 1
         weight = 1
     }
     server ws-02 {
         address = 192.168.1.10
         active = 1
         weight = 1
     }
}



SYSTEMS LAYOUT

             VIRT_IP
                |
       ====================
       |                  |
       0                  0
      LVS Node1         LVS Node2
       |                  |
       ====================
        |                |
      Real Node1        Real Node2




------- Additional comments from kbarrett@xxxxxxxxxx 2000-08-14 11:05
-------
> I shouldn't have to create any ipvsadm rules. Defining the real nodes
> involved should create the rules.

I thought I said this.

This is why I asked for more information: to determine what's wrong in your
situation.


> some part of the lvs clustering software should then be monitoring via
> something along the lines of the following to add and remove servers as
> they go up and down..

This is what the product does.


> grep lvs.cf for the real node IPs, pass that information to a script
> that tests for known response from whatever services are defined as being
> handled by the real nodes (like www for instance). If no response, removes
> the non-responding server's ipvsadm rule; if there IS a response, runs
> ipvsadm -L and greps for the name or IP of the real node. if there it
> doesn't re-add it, it just tests the next one. if it's NOT there it adds
> the rule.

Again, this is what the product does.


> At this juncture the above actions are NOT done.

OK, this is why we need to look at your situation a bit.


>  here is the copy of my lvs.cf file as it stands now.


GREAT. We'll look at it.  The diagram helps a little too. Could you supply one
more piece of information? Your diagram does not indicate all the non-virtual
IP addresses being used, nor their interfaces. In order to recreate your
problem in our lab, we could use that information. If it helps, there are
simple block diagrams in the HA Server Installation Guide that you could clone.
lvs.cf does not show all the IP addresses involved in a setup (for example,
your ipvs rules reference an IP address not shown). Thanks.



------- Additional comments from david.downey@xxxxxxxxxxxxxx 2000-08-14
02:22 -------
Keith, the setup is very simple. It is exactly like what is in your manual.

The front end nodes have FQIPs on eth0, with eth1 carrying the 192.168.1.x IPs.
Piranha configures and enables eth0:0 as the floating FQIP that the world sees
as the cluster IP. Piranha configures and enables eth1:0 as the NAT device,
as shown in the lvs.cf file.
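
(Purely for illustration, those aliases would look something like the following
if brought up by hand; the addresses and interface names are taken from the
posted lvs.cf, and the netmasks are assumptions:)

ifconfig eth0:1 216.200.192.106 netmask 255.255.255.0 up    # floating virtual server IP (lvs.cf: eth0:1)
ifconfig eth1:0 192.168.1.22 netmask 255.255.255.0 up       # NAT router address (lvs.cf: eth1:0)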

Now, the situation has changed in regards to this product. There is no way you
can say the product does what it's supposed to. It DOES create the needed eth
aliases, DOES enable them, and DOES maintain them. We CAN send information back
and forth from the front end FQIP to the server farm in the back, and we DO get
responses, BUT this is ONLY after we HAND create the ipvsadm rules to do this
and put them in place. WE should not be doing this; PIRANHA is supposed to be
doing this. This is most DEFINITELY broken in the software! I've called and
spoken with Q about this issue; we went through it on the telephone and
configured this according to the manual. (He had a manual that he walked
through with me, to the point that we were reading off page numbers to make
sure we were in the same section of the manual!)

Simply put folks, this software is broken. Piranha does NOT create, maintain,
or modify the needed ipvsadm rules that will make this product work as
advertised. At this current point it does NOT work as advertised! To make
matters worse, when we called in to the Durham office to get this definite bug
fixed, we were told that we would have to pay for a development contract to
get this to work.

ERR? Why should WE have to pay for an additional contract to fix a problem
with code in the original product? A bug that should NOT have been in the code,
and something that should have been working in the original product. That makes
entirely NO sense! Why should WE be charged for fixing a bug that is core to
the product working correctly and as advertised?? You mean we are going to be
charged to fix something wrong with your product? When I asked what the basis
was for the charge, I was told that it was because the support contract that
comes with the product is for installation, configuration, and administration
only and NOT for fixing something at the code level. Keith, this is a problem
at the code level that should NOT have been there in the FIRST place! And it
definitely affects the configuration and administration points of the contract,
since neither Red Hat nor I can rectify the problem if the code is broken! The
code is most DEFINITELY broken.


Next, when the conversation moved to refund territory, we were told by Chris
that management was going to keep $500 of our money for technical support
already rendered! Why? The technical support was for nothing more than
reporting to you via telephone that there was a possible bug in the software
and having it verified that there WAS in fact a bug! ***This IS a bug!*** There
is no way we will allow a $500 charge!

I have left numerous messages with various folks involved with this, like
Nathan Thomas, Q, yourself, Kim Lynch, and others. This is rapidly starting to
feel like WE are being made to pay for the RIGHT to have bugs fixed that should
have been working in the first place, since the whole LVS structure hinges on
this code working correctly. Right now there is no controlling entity in any
way, shape, or form that handles nodes coming in or out of the server pool. All
additions, first time entries, and removals of dead machines are having to be
handled by a human. Right now the only thing the product DOES do correctly is
rotate the FQIP for the virtual server between the front end nodes.

Needless to say, neither my CTO, CEO, nor I are happy in th
------- Additional comments from kbarrett@xxxxxxxxxx 2000-08-14 05:22
-------
The problem you are reporting is unique to your situation. This is not a known
bug with the product. In fact, it is a fundamental part of the product to
perform ipvsadm calls. Bugs are always possible -- this could be a unique ipvs
situation, but it needs to be investigated, and that requires time and
cooperation.

After several communications, involving both support and myself, it has become
apparent that there are more issues being brought into this situation than just
a problem report, and that bugzilla is not the best forum to resolve them.
Certainly I am not in a position to respond to refund dissatisfaction.

This problem has been moved to Red Hat support.

------- Additional comments from kbarrett@xxxxxxxxxx 2000-08-14 05:24
-------
Additional information: Using the posted lvs.cf file, the problem was not
reproducible in the lab and the system responded correctly.

------- Additional comments from david.downey@xxxxxxxxxxxxxx 2000-08-15
12:37 -------
Keith, I do not see how we can be the only ones out here with this problem.

Marking the problem resolved does NOT make the problem go away, though it does
make it appear that a single customer is having a problem with this, and that
it is therefore not a bug, and is therefore a face-saving solution.

This problem is NOT resolved, whether it is marked as such or not. We ARE
working with Red Hat to resolve this issue once key members return from the
LWE. (We can discuss this at LinuxWorld if you will be attending.)

Also, I was not stating that you had anything to do with the refund stuff. I
brought that out into the open due to the lack of response we received from
Red Hat on these issues. Since playing phone tag was getting nowhere, a public
announcement in bugzilla regarding the problem was necessitated. HOWEVER, the
problem has been resolved to both parties' satisfaction at this point, even if
the underlying issue is not resolved as of yet.

I do however believe that one will be forthcoming, though we do take exception
to the early closing of this bug before we, as a team, have had a chance to
work on it. If the lvs.cf file given to you worked in your labs, then it should
have worked equally fine on our systems. I do however pose the possibility that
mayhaps there is something in the hardware of a Dell 2450 server that may or
may not cause this issue, since that has not been addressed, nor was the
question even posed as to what hardware we were using. In fact, no generic
troubleshooting questions were asked other than those that you posed to me in
this forum.

At this juncture I will leave off further comments regarding this issue until
such time as we can work on it after the LWE. A working relationship has been
established from which to solve this puzzle, due to an earlier discussion.

I will not argue the closing of this issue other than to publicly state that
the original problem has not thus far been solved, but steps have been taken
by BOTH sides to ensure this becomes the case.


------- Additional comments from kbarrett@xxxxxxxxxx 2000-08-15 01:41
-------
Again, this entry is closed because there are several non-technical support
issues involved with this customer. These will not be elaborated on here. Since
official phone support is involved and bugzilla is a casual support vehicle
(there is no obligation by Red Hat to respond to postings here), there will not
be further activity logged on this bugzilla entry. This is also not a proper
forum for debate.

Additional Information
Bug#: 15909   Product: Red Hat High Availability Server   Version: 1.0
Platform: i386   Reporter: david.downey@xxxxxxxxxxxxxx   Component: ipvsadm
Status: RESOLVED   Priority: high
******--->>>  Resolution: WORKSFORME
Assigned To: kbarrett@xxxxxxxxxx
Cc: david.downey@xxxxxxxxxxxxxx
QA Contact: copeland@xxxxxxxxxx
Summary: lvs component fails to create ipvsadm entries