Hello,
This Friday the director node went down and the backup director took over
the resources, including drbd.
Linux-HA is nice, isn't it?
The problem is that now, after restarting the primary directord, the shared
disk does not seem to be synchronized.
What do you mean by restarting the primary directord? Did you fail back to
the master, or did you start the directord resource on the master?
I get disklessclient....Inconsistent on the old primary director, and
ServerforDless....Consistent on the old backup director (now the primary
after failover). What is happening?
Please show the correct output of /proc/drbd (or whichever entry it is,
as I don't know it by heart) with the inconsistent behaviour. Also
reading this I get the impression that you might be better served at the
linux-ha-users mailing list.
I need to get back to the old
configuration. How can I synchronize both disks?
Please share your linux-ha configuration and explain what you mean by "old
configuration".
I've read the list and this is what I did:
I stopped drbd on the old director node and then started it again.
Why? Doing that you've probably disabled a crucial service during
runtime. Since you've not told us what exactly you share over your DRBD
it's difficult to tell.
Then, watching the
status, drbd noticed that there were some MB to resync and started to
synchronize, but suddenly the sync process stopped.
This does not make much sense.
What I get now is:
in old director: cs:WFConnection st:Primary/Unknown ld:Consistent
in new director: cs:NetworkFailure st:Secondary/Unknown ld:Inconsistent
Check your heartbeat and your interface configuration and your linux-ha
log files. It looks like your heartbeat network is broken; possibly the
reason for the failover.
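To compare the two nodes more easily, the status lines quoted above can be split into their fields. The following is only a minimal sketch: the function names are made up for illustration, it handles nothing beyond the `key:value` tokens seen in this thread, and treating a peer role of "Unknown" as "peer not reachable" is my assumption, not an official DRBD rule.

```python
def parse_drbd_status(line):
    """Split a status line like 'cs:WFConnection st:Primary/Unknown ld:Consistent'
    into a dict. Hypothetical helper, not part of DRBD."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition(":")
        if sep:  # keep only tokens of the form key:value
            fields[key] = value
    # 'st' carries two roles separated by a slash: local/peer
    if "/" in fields.get("st", ""):
        local, peer = fields["st"].split("/", 1)
        fields["local_role"] = local
        fields["peer_role"] = peer
    return fields


def peer_unknown(fields):
    # Assumption: a peer role of "Unknown" means this node cannot see its peer.
    return fields.get("peer_role") == "Unknown"


# The two lines reported in this thread
old_director = parse_drbd_status("cs:WFConnection st:Primary/Unknown ld:Consistent")
new_director = parse_drbd_status("cs:NetworkFailure st:Secondary/Unknown ld:Inconsistent")

for name, status in (("old director", old_director), ("new director", new_director)):
    print(name, status["cs"], status["ld"], "peer unknown:", peer_unknown(status))
```

Both nodes report an unknown peer here, which fits the suspicion that the link between them is down.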
So it seems to be a problem with the net between both nodes, doesn't it?
Yes.
I tried to change the network that drbd uses to synchronize the disks
(by editing /etc/drbd.conf), but if I change it on the new director, should I
restart drbd?
Yes.
How could this affect the data?
Depends how you restart drbd.
The cluster nodes mount
a directory that is on the shared disks and is heavily used. Could this be a
problem?
Local disks?
Please, I need some help with this problem (this cluster is in production).
Debug your network and check the log file entries. Heartbeat has
certainly logged interesting information regarding this incident.
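As one small part of debugging the network, you could test from each node whether a TCP connection to the peer's replication port can be opened at all. This is a rough sketch under assumptions: `can_connect` is a name I made up, and the address and port in the usage comment are placeholders that you must replace with the values from your own /etc/drbd.conf. It only probes a TCP port; it says nothing about the heartbeat link itself.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within
    'timeout' seconds, False otherwise. Hypothetical helper for illustration."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, unreachable hosts and timeouts
        return False


# Usage (placeholder values, take the real ones from your drbd.conf):
# print(can_connect("192.168.1.2", 7788))
```

A False result on both nodes would point at the interface, cabling or routing rather than at drbd.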
**CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.
Please drop such email statements, since this is legally difficult.
Best regards,
Roberto Nibali, ratz
--
echo
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc