[Ocfs2-users] New node..new problems

Tao Ma tao.ma at oracle.com
Thu Oct 9 21:24:47 PDT 2008


Hi,
Dante Garro wrote:
> Sunil, I now realize that the messages refer to node 0, but the new node is
> node 1, and no matter what value I set, it always says 14000 ms.
> Does this change your diagnosis?
Node1 initiates the connection to node0, so the messages about node0
appear on node1. It looks like the configuration on node1 is wrong.
Please make sure that the value of O2CB_HEARTBEAT_THRESHOLD in
/etc/sysconfig/o2cb on node1 is the same as on node0.
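
To make that comparison concrete, here is a minimal shell sketch that can be
run on each node. It prints the threshold configured in /etc/sysconfig/o2cb
and the dead count that value would be expected to produce in the
o2hb_check_slot message. Two things here are assumptions and not something
stated in this thread: the conversion "dead count in ms = threshold * 2000"
(based on a 2-second heartbeat interval), and the O2CB_CONF override
variable, which is a hypothetical convenience for pointing the script at a
different file. The function names are illustrative only.

```shell
#!/bin/sh
# Sketch only. ASSUMPTION: the dead count printed by o2hb_check_slot equals
# O2CB_HEARTBEAT_THRESHOLD * 2000 ms. Verify against your OCFS2 version.

# Print the last O2CB_HEARTBEAT_THRESHOLD assignment in a config file,
# with any surrounding quotes stripped. Prints nothing if it is unset.
configured_threshold() {
    sed -n 's/^O2CB_HEARTBEAT_THRESHOLD=//p' "$1" | tail -n 1 | tr -d "\"'"
}

# Threshold -> expected dead count in ms (under the assumption above).
threshold_to_dead_ms() {
    echo $(( $1 * 2000 ))
}

# Logged dead count in ms -> the threshold that would imply it.
dead_ms_to_threshold() {
    echo $(( $1 / 2000 ))
}

# O2CB_CONF is a hypothetical override; the default path is from this thread.
conf=${O2CB_CONF:-/etc/sysconfig/o2cb}
if [ -r "$conf" ]; then
    t=$(configured_threshold "$conf")
    if [ -n "$t" ]; then
        echo "configured threshold: $t"
        echo "expected dead count:  $(threshold_to_dead_ms "$t") ms"
    else
        echo "O2CB_HEARTBEAT_THRESHOLD is not set in $conf"
    fi
fi
```

If the assumption holds, the two dead counts in the log below (14000 ms and
130000000 ms) would correspond to thresholds of 7 and 65000, so it may be
worth checking whether a value in milliseconds was entered on node1 where a
small iteration count was expected.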

Regards,
Tao

> 
> 
> -----Original Message-----
> From: Sunil Mushran [mailto:sunil.mushran at oracle.com] 
> Sent: Thursday, October 09, 2008 06:02 p.m.
> To: Dante Garro
> CC: 'ocfs2-users at oss.oracle.com'
> Subject: Re: [Ocfs2-users] New node..new problems
> 
> Yeah the cluster timeouts are not consistent. Update and restart the cluster
> on the new node (or all nodes as the case might be).
> 
> Hint: cat /sys/kernel/config/cluster/<clustername>/idle_timeout_ms
> to see the active heartbeat threshold.
> 
> Dante Garro wrote:
>> Hi all, because of problems with the ocfs2 release in the Debian
>> distribution, I decided to rebuild my cluster, replacing it with a
>> CentOS-based installation. I started by replacing one of the nodes while
>> keeping the other running.
>>
>> On this newly created node the following errors appear:
>>
>> drbd0: Writing meta data super block now.
>> (2558,1):o2hb_check_slot:881 ERROR: Node 0 on device drbd0 has a dead 
>> count of 14000 ms, but our count is 130000000 ms.
>> Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
>> OCFS2 1.2.9 Wed Sep 24 19:26:41 PDT 2008 (build a693806cb619dd7f225004092b675ede)
>> (2520,1):o2net_connect_expired:1585 ERROR: no connection established 
>> with node 0 after 30.0 seconds, giving up and returning errors.
>> (2556,1):dlm_request_join:901 ERROR: status = -107
>> (2556,1):dlm_try_to_join_domain:1049 ERROR: status = -107
>> (2556,1):dlm_join_domain:1321 ERROR: status = -107
>> (2556,1):dlm_register_domain:1514 ERROR: status = -107
>> (2556,1):ocfs2_dlm_init:2024 ERROR: status = -107
>> (2556,1):ocfs2_mount_volume:1133 ERROR: status = -107
>> ocfs2: Unmounting device (147,0) on (node 1)
>> (2591,1):o2hb_check_slot:881 ERROR: Node 0 on device drbd0 has a dead 
>> count of 14000 ms, but our count is 130000000 ms.
>> Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
>> (2520,1):o2net_connect_expired:1585 ERROR: no connection established 
>> with node 0 after 30.0 seconds, giving up and returning errors.
>> (2589,1):dlm_request_join:901 ERROR: status = -107
>> (2589,1):dlm_try_to_join_domain:1049 ERROR: status = -107
>> (2589,1):dlm_join_domain:1321 ERROR: status = -107
>> (2589,1):dlm_register_domain:1514 ERROR: status = -107
>> (2589,1):ocfs2_dlm_init:2024 ERROR: status = -107
>> (2589,1):ocfs2_mount_volume:1133 ERROR: status = -107
>> ocfs2: Unmounting device (147,0) on (node 1)
>>
>> I've changed the O2CB_HEARTBEAT_THRESHOLD parameter as O2CB advised,
>> but it didn't resolve the issue.
>>
>> I hope someone could give me a clue.
>>
>> Thanks in advance.
>>
>> Dante
>>
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users at oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>   
> 
