[Ocfs2-users] Transport endpoint not connected after crash of one node

Sunil Mushran Sunil.Mushran at oracle.com
Fri Aug 24 14:01:45 PDT 2007


You could be encountering Novell bugzilla 296606. It is specific
to SLES10 (and SP1). Novell owns the bug.

Sebastian Reitenbach wrote:
> Hi,
>
> I am on SLES 10 SP1, x86_64, running the distribution RPMs of ocfs2:
> ocfs2console-1.2.3-0.7
> ocfs2-tools-1.2.3-0.7
>
> I have a two-node ocfs2 cluster configured. One node died (manual reset),
> and the second immediately started having problems accessing the file
> system. The logs give the reason: Transport endpoint not connected.
>
> Running mounted.ocfs2 on the surviving machine showed that both machines
> had the filesystems mounted. After a umount of all the filesystems, the
> surviving node still thought that it had some of the ocfs2 partitions
> mounted:
>
>
> ppsdb101:~ # mounted.ocfs2 -f
> Device                FS     Nodes
> /dev/sda1             ocfs2  ppsdb102
> /dev/sdb1             ocfs2  ppsdb102
> /dev/sdc1             ocfs2  ppsdb102
> /dev/sdd1             ocfs2  ppsdb102
> /dev/sde1             ocfs2  ppsdb102
> /dev/sdf1             ocfs2  ppsdb102
> /dev/sdg1             ocfs2  ppsdb102
> /dev/sdh1             ocfs2  ppsdb102
> /dev/sdi1             ocfs2  ppsdb102
> /dev/sdj1             ocfs2  ppsdb102
> /dev/sdk1             ocfs2  ppsdb102
> /dev/sdl1             ocfs2  ppsdb102, ppsdb101
> /dev/sdm1             ocfs2  ppsdb102
> /dev/sdn1             ocfs2  ppsdb102
> /dev/sdo1             ocfs2  ppsdb102
> /dev/sdp1             ocfs2  ppsdb102, ppsdb101
> /dev/sdq1             ocfs2  ppsdb102, ppsdb101
> /dev/sdr1             ocfs2  ppsdb102, ppsdb101
> /dev/sds1             ocfs2  ppsdb102, ppsdb101
> /dev/sdt1             ocfs2  ppsdb102
> /dev/sdu1             ocfs2  ppsdb102
>
> In the above output, ppsdb102 is the dead machine and ppsdb101 is the one
> that is still alive. An ordinary mount command shows that none of the
> partitions listed above are mounted, but mounted.ocfs2 still thinks that
> some of them are.
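
To see exactly which devices mounted.ocfs2 -f still attributes to the
surviving node, the listing can be filtered with a small script. This is
only a sketch: the function name devices_reported_for and the SAMPLE text
are made up for illustration (the sample rows are copied from the listing
above), and it assumes the three-column Device / FS / Nodes layout shown.

```python
def devices_reported_for(node, listing):
    """Return the devices whose Nodes column includes `node`.

    `listing` is the text printed by `mounted.ocfs2 -f`, assumed to have
    the three-column layout (Device, FS, Nodes) shown in this thread.
    """
    devices = []
    for line in listing.splitlines():
        # Split into at most three fields so "a, b" stays in one piece.
        fields = line.split(None, 2)
        if len(fields) < 3 or fields[0] == "Device":
            continue  # skip the header and any incomplete line
        device, fstype, nodes = fields
        names = [name.strip() for name in nodes.split(",")]
        if fstype == "ocfs2" and node in names:
            devices.append(device)
    return devices


# Hypothetical sample: three rows taken from the listing above.
SAMPLE = """\
Device                FS     Nodes
/dev/sdk1             ocfs2  ppsdb102
/dev/sdl1             ocfs2  ppsdb102, ppsdb101
/dev/sdp1             ocfs2  ppsdb102, ppsdb101
"""

print(devices_reported_for("ppsdb101", SAMPLE))
```

Comparing that list against the devices shown by mount -t ocfs2 would
give the entries that look stale on the surviving node.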
>
> o2cb was configured like this:
> Load O2CB driver on boot (y/n) [y]:
> Cluster to start on boot (Enter "none" to clear) [ppscluster]:
> Specify heartbeat dead threshold (>=7) [61]:
> Use user-space driven heartbeat? (y/n) [n]:
> Cluster keepalive delay (ms) [5000]:
> Cluster reconnect dealy (ms) [2000]:
> Cluster idle timeout (ms) [10000]:
> Writing O2CB configuration: OK
> O2CB cluster ppscluster already online
>
>
> Two questions:
> 1. Shouldn't the surviving machine recognize the death of the other node
> after 61 seconds?
> 2. Shouldn't mounted.ocfs2 show the same locally mounted ocfs2 partitions as
> mount -t ocfs2 does?
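
On question 1, a hedged note on the arithmetic: as far as the OCFS2
documentation describes it, the heartbeat dead threshold counts 2-second
heartbeat iterations rather than seconds, with
timeout = (threshold - 1) * 2. Under that assumption, a threshold of 61
corresponds to about 120 seconds, not 61. The sketch below only does that
conversion; the function and constant names are hypothetical, and the
2-second interval is an assumption taken from the documentation, not
something read from this cluster.

```python
# Assumption: the disk heartbeat runs every 2 seconds, as described in
# the OCFS2 documentation for the 1.2 series.
HEARTBEAT_INTERVAL_SECS = 2


def dead_threshold_to_seconds(threshold):
    """Convert an O2CB heartbeat dead threshold (iterations) to seconds."""
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECS


# The threshold from the configuration quoted above.
print(dead_threshold_to_seconds(61))  # 120
```

The network settings in the same dialog (keepalive delay, reconnect delay,
idle timeout) are given in milliseconds and are separate from this
threshold.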
>
> kind regards
> Sebastian
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>   



