[Ocfs2-users] New node..new problems

Dante Garro dante at bigbyte.com.ar
Fri Oct 10 05:05:12 PDT 2008


Thanks Tao, I've set up the same value on both nodes and the cluster comes
online.
Now, when I try to mount, the following errors appear on node 1 (new
CentOS):
(2512,1):o2net_connect_expired:1585 ERROR: no connection established with
node 0 after 30.0 seconds, giving up and returning errors.
(3022,1):dlm_request_join:901 ERROR: status = -107
(3022,1):dlm_try_to_join_domain:1049 ERROR: status = -107
(3022,1):dlm_join_domain:1321 ERROR: status = -107
(3022,1):dlm_register_domain:1514 ERROR: status = -107
(3022,1):ocfs2_dlm_init:2024 ERROR: status = -107
(3022,1):ocfs2_mount_volume:1133 ERROR: status = -107
ocfs2: Unmounting device (147,0) on (node 1)

And the following on node 0 (old Debian):

 (2228,0):o2net_check_handshake:1093 node nodo2 (num 1) at
192.168.168.2:7777 advertised net protocol version 103 but 2 is required,
disconnecting

I believe the Debian message is clear: a protocol version incompatibility.

Is there a way to resolve it?
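
When checking the logs of several nodes, the handshake message quoted above can be picked apart mechanically. The following is only a minimal sketch: the function name and the regular expression are hypothetical, written against the wording of the single log line shown in this message, and are not part of any OCFS2 tool.

```python
import re

# Pattern based only on the wording of the log line quoted above (assumption).
HANDSHAKE_RE = re.compile(
    r"advertised net protocol version (\d+) but (\d+) is required"
)


def parse_handshake_mismatch(message):
    """Return (advertised, required) protocol versions, or None if absent."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(message.split())
    match = HANDSHAKE_RE.search(flat)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Two different numbers in the returned pair mean the peers speak different o2net protocol versions, which is what node 0 reports here.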

Thanks

Dante


-----Original Message-----
From: Tao Ma [mailto:tao.ma at oracle.com] 
Sent: Friday, October 10, 2008 1:25
To: Dante Garro
CC: 'Sunil Mushran'; 'ocfs2-users at oss.oracle.com'
Subject: Re: [Ocfs2-users] New node..new problems


Hi,
Dante Garro wrote:
> Sunil, I now realize the messages refer to node 0, but the new node is
> node 1, and no matter what value I set, it always says 14000 ms.
> Does this change your diagnosis?
Node1 initiates the connection to node0, so the messages you see on node1
refer to node0. It looks like the configuration on node1 is wrong.
Please make sure that the value of O2CB_HEARTBEAT_THRESHOLD in
/etc/sysconfig/o2cb on node1 is the same as on node0.
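
As a rough aid for comparing the two nodes, the sketch below converts between O2CB_HEARTBEAT_THRESHOLD and the dead count printed by o2hb_check_slot. It assumes the relation dead time = (threshold - 1) * 2000 ms, which is how I understand the OCFS2 documentation; the function names are hypothetical and the parser only handles a plain NAME=value line as found in /etc/sysconfig/o2cb.

```python
import re

# Assumed heartbeat interval of 2 seconds between o2hb iterations.
HEARTBEAT_INTERVAL_MS = 2000


def dead_time_ms(threshold):
    """Dead count in ms implied by a given O2CB_HEARTBEAT_THRESHOLD."""
    return (threshold - 1) * HEARTBEAT_INTERVAL_MS


def threshold_for(dead_ms):
    """Threshold that would produce the given dead count in ms."""
    return dead_ms // HEARTBEAT_INTERVAL_MS + 1


def read_threshold(sysconfig_text):
    """Extract O2CB_HEARTBEAT_THRESHOLD from the text of a sysconfig file.

    Returns None when the variable is missing or commented out.
    """
    match = re.search(
        r'^\s*O2CB_HEARTBEAT_THRESHOLD\s*=\s*"?(\d+)"?',
        sysconfig_text,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None
```

Under that assumption, the 14000 ms reported for node 0 corresponds to a threshold of 8, and the 130000000 ms reported as "our count" on node 1 corresponds to 65001, so the two nodes are clearly not running with the same value.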

Regards,
Tao

> 
> 
> -----Original Message-----
> From: Sunil Mushran [mailto:sunil.mushran at oracle.com]
> Sent: Thursday, October 09, 2008 06:02 p.m.
> To: Dante Garro
> CC: 'ocfs2-users at oss.oracle.com'
> Subject: Re: [Ocfs2-users] New node..new problems
> 
> Yeah the cluster timeouts are not consistent. Update and restart the 
> cluster on the new node (or all nodes as the case might be).
> 
> Hint: cat /sys/kernel/config/cluster/<clustername>/idle_timeout_ms
> to see the active heartbeat threshold.
> 
> Dante Garro wrote:
>> Hi all, because of problems with the ocfs2 release in the Debian
>> distribution, I decided to rebuild my cluster on a CentOS-based
>> installation. I started by replacing one of the nodes while keeping the
>> other working.
>>
>> On this newly created node the following errors appear:
>>
>> drbd0: Writing meta data super block now.
>> (2558,1):o2hb_check_slot:881 ERROR: Node 0 on device drbd0 has a dead 
>> count of 14000 ms, but our count is 130000000 ms.
>> Please double check your configuration values for
>> 'O2CB_HEARTBEAT_THRESHOLD'
>> OCFS2 1.2.9 Wed Sep 24 19:26:41 PDT 2008 (build
>> a693806cb619dd7f225004092b675ede)
>> (2520,1):o2net_connect_expired:1585 ERROR: no connection established 
>> with node 0 after 30.0 seconds, giving up and returning errors.
>> (2556,1):dlm_request_join:901 ERROR: status = -107
>> (2556,1):dlm_try_to_join_domain:1049 ERROR: status = -107
>> (2556,1):dlm_join_domain:1321 ERROR: status = -107
>> (2556,1):dlm_register_domain:1514 ERROR: status = -107
>> (2556,1):ocfs2_dlm_init:2024 ERROR: status = -107
>> (2556,1):ocfs2_mount_volume:1133 ERROR: status = -107
>> ocfs2: Unmounting device (147,0) on (node 1)
>> (2591,1):o2hb_check_slot:881 ERROR: Node 0 on device drbd0 has a dead 
>> count of 14000 ms, but our count is 130000000 ms.
>> Please double check your configuration values for
>> 'O2CB_HEARTBEAT_THRESHOLD'
>> (2520,1):o2net_connect_expired:1585 ERROR: no connection established 
>> with node 0 after 30.0 seconds, giving up and returning errors.
>> (2589,1):dlm_request_join:901 ERROR: status = -107
>> (2589,1):dlm_try_to_join_domain:1049 ERROR: status = -107
>> (2589,1):dlm_join_domain:1321 ERROR: status = -107
>> (2589,1):dlm_register_domain:1514 ERROR: status = -107
>> (2589,1):ocfs2_dlm_init:2024 ERROR: status = -107
>> (2589,1):ocfs2_mount_volume:1133 ERROR: status = -107
>> ocfs2: Unmounting device (147,0) on (node 1)
>>
>> I've changed the parameter O2CB_HEARTBEAT_THRESHOLD as O2CB
>> advised, but it doesn't resolve the issue.
>>
>> I hope someone can give me a clue.
>>
>> Thanks in advance.
>>
>> Dante
>>
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users at oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>   
> 


