[Ocfs2-users] New node..new problems

Dante Garro dante at bigbyte.com.ar
Thu Oct 9 17:11:23 PDT 2008


Sunil, I now realize that the messages refer to node 0, but the new node is
node 1, and no matter what value I set, it always reports 14000 ms.
Does this change your diagnosis?


-----Original Message-----
From: Sunil Mushran [mailto:sunil.mushran at oracle.com]
Sent: Thursday, October 09, 2008 06:02 p.m.
To: Dante Garro
CC: 'ocfs2-users at oss.oracle.com'
Subject: Re: [Ocfs2-users] New node..new problems

Yeah, the cluster timeouts are not consistent. Update them and restart the
cluster on the new node (or on all nodes, as the case may be).

Hint: cat /sys/kernel/config/cluster/<clustername>/idle_timeout_ms
to see the active heartbeat threshold.
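
To see at a glance which values disagree, the o2hb_check_slot errors in the
quoted log can be pulled out with a short script. This is only a sketch: the
function name find_mismatches and the regular expression are illustrative
and not part of any OCFS2 tooling, and the pattern assumes the message
wording is exactly as it appears in this thread.

```python
import re

# Assumed message format, copied from the o2hb_check_slot lines in this thread.
MISMATCH = re.compile(
    r"Node (\d+) on device (\S+) has a dead count of (\d+) ms, "
    r"but our count is (\d+) ms"
)

def find_mismatches(log_text):
    """Return sorted unique (node, device, remote_ms, local_ms) tuples."""
    # Strip mail quoting ("> ") and re-join wrapped lines before matching.
    unquoted = re.sub(r"^>\s?", "", log_text, flags=re.MULTILINE)
    flat = " ".join(unquoted.split())
    found = {
        (int(node), dev, int(remote), int(local))
        for node, dev, remote, local in MISMATCH.findall(flat)
    }
    return sorted(found)

sample = """\
> (2558,1):o2hb_check_slot:881 ERROR: Node 0 on device drbd0 has a dead 
> count of 14000 ms, but our count is 130000000 ms.
"""

for node, dev, remote, local in find_mismatches(sample):
    print(f"node {node} on {dev}: remote {remote} ms, local {local} ms")
```

Here "remote" is the value advertised by the other node and "local" is the
value in effect on the node that logged the error, so the two numbers show
which side still has to be brought in line.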

Dante Garro wrote:
> Hi all. Because of problems with the ocfs2 release in the Debian
> distribution, I decided to rebuild my cluster, replacing it with a
> CentOS-based installation.
> I started by replacing one of the nodes while keeping the other one working.
>
> On this newly created node the following errors appear:
>
> drbd0: Writing meta data super block now.
> (2558,1):o2hb_check_slot:881 ERROR: Node 0 on device drbd0 has a dead 
> count of 14000 ms, but our count is 130000000 ms.
> Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
> OCFS2 1.2.9 Wed Sep 24 19:26:41 PDT 2008 (build a693806cb619dd7f225004092b675ede)
> (2520,1):o2net_connect_expired:1585 ERROR: no connection established 
> with node 0 after 30.0 seconds, giving up and returning errors.
> (2556,1):dlm_request_join:901 ERROR: status = -107
> (2556,1):dlm_try_to_join_domain:1049 ERROR: status = -107
> (2556,1):dlm_join_domain:1321 ERROR: status = -107
> (2556,1):dlm_register_domain:1514 ERROR: status = -107
> (2556,1):ocfs2_dlm_init:2024 ERROR: status = -107
> (2556,1):ocfs2_mount_volume:1133 ERROR: status = -107
> ocfs2: Unmounting device (147,0) on (node 1)
> (2591,1):o2hb_check_slot:881 ERROR: Node 0 on device drbd0 has a dead 
> count of 14000 ms, but our count is 130000000 ms.
> Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
> (2520,1):o2net_connect_expired:1585 ERROR: no connection established 
> with node 0 after 30.0 seconds, giving up and returning errors.
> (2589,1):dlm_request_join:901 ERROR: status = -107
> (2589,1):dlm_try_to_join_domain:1049 ERROR: status = -107
> (2589,1):dlm_join_domain:1321 ERROR: status = -107
> (2589,1):dlm_register_domain:1514 ERROR: status = -107
> (2589,1):ocfs2_dlm_init:2024 ERROR: status = -107
> (2589,1):ocfs2_mount_volume:1133 ERROR: status = -107
> ocfs2: Unmounting device (147,0) on (node 1)
>
> I've changed the O2CB_HEARTBEAT_THRESHOLD parameter as the O2CB message
> advised, but it doesn't resolve the issue.
>
> I hope someone can give me a clue.
>
> Thanks in advance.
>
> Dante
>
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>   


