[Ocfs2-users] Pb with ocfs2 & dlm on Fedora 13

Alain.Moulle Alain.Moulle at bull.net
Tue Nov 9 00:49:00 PST 2010


Hi,
The cluster.conf files are exactly the same on all three nodes.
The error messages are:

- node1:
	o2net: accepted connection from node selfxl-5 (num 1) at 10.197.189.218:7777
	o2net: no longer connected to node selfxl-5 (num 1) at 10.197.189.218:7777

- node2:
	(1457,1):o2net_connect_expired:1656 ERROR: no connection established with node 1 after 30.0 seconds, giving up and returning errors.

Note that once a mount has been refused, for example on node3, if
I unmount the FS on node1, I can then mount it on node3.
Note also that when the mount is refused, for example on node3,
I have checked that node3 successfully pings both other nodes
at the IP addresses given in cluster.conf.
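
A successful ping only shows ICMP reachability; it does not show that
the o2net TCP port (7777 in the logs above) can be reached. Below is a
minimal sketch of a TCP-level check, assuming Python is available on
the nodes. The names can_connect and check_peers are hypothetical
helpers written for this illustration, not part of ocfs2-tools.

```python
import socket


def can_connect(host: str, port: int = 7777, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs a full TCP handshake, unlike ping (ICMP).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_peers(peers: dict, port: int = 7777) -> dict:
    """Map each peer name to the result of a TCP connection attempt."""
    return {name: can_connect(addr, port) for name, addr in peers.items()}
```

Run on the node where the mount is refused, a call such as
check_peers({"selfxl-4": "10.197.189.204", "selfxl-5": "10.197.189.218"})
would report which peers accept a connection on port 7777. A False
result for a node that answers ping would point to a firewall rule or
a listener problem rather than to the dlm.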

Alain



Tao Ma wrote:
> Hi Alain,
>
> On 11/08/2010 11:08 PM, Alain.Moulle wrote:
>
>> Hi,
>>
>> I have a problem on Fedora 13 with these releases:
>> ocfs2  1.4.3-5.fc13.x86_64
>> dlm_tool 3.0.17
>>
>> With a 3-node ocfs2 cluster, I can't mount the FS on all three nodes at the same time,
>> only on two of them, whichever two they are.
>>
>> The errors are:
>> (1475,0):o2net_connect_expired:1656 ERROR: no connection established with node 2 after 30.0 seconds, giving up and returning errors.
>> (2175,0):dlm_request_join:1035 ERROR: status = -107
>> (2175,0):dlm_try_to_join_domain:1209 ERROR: status = -107
>> (2175,0):dlm_join_domain:1487 ERROR: status = -107
>> (2175,0):dlm_register_domain:1753 ERROR: status = -107
>> (2175,0):o2cb_cluster_connect:313 ERROR: status = -107
>> (2175,0):ocfs2_dlm_init:2995 ERROR: status = -107
>> (2175,0):ocfs2_mount_volume:1789 ERROR: status = -107
>> ocfs2: Unmounting device (8,16) on (node 0)
>> o2net: no longer connected to node selfxl-4 (num 0) at 10.197.189.204:7777
>> o2net: connected to node selfxl-4 (num 0) at 10.197.189.204:7777
>>
>> It seems to be a lock management problem.
>> Is this a known issue?
>> Is there a patch available?
>>     
> It doesn't look like a dlm problem, but a network problem. ;)
> Your first error is o2net_connect_expired,
> so it seems that the 3rd node can't connect to node 2.
> Could you please check the error messages on node 2?
>
> By the way, I assume that cluster.conf is the same on all 3 nodes
> and that you can connect to port 7777 (which is used by ocfs2) on node 2
> from node 3.
>
> Regards,
> Tao
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>
>
>   
