[Ocfs2-users] another fencing question

Mailing List SVR lists at svrinformatica.it
Thu Jan 14 04:01:00 PST 2010


Hi, 

Periodically, one node of my two-node cluster is fenced. Here are the logs:

Jan 14 07:01:44 nvr1-rc kernel: o2net: no longer connected to node nvr2-rc.minint.it (num 0) at 1.1.1.6:7777
Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_do_master_request:1334 ERROR: link to 0 went down!
Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR: status = -112
Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_flush_asts:600 ERROR: status = -112
Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_get_lock_resource:917 ERROR: status = -112
Jan 14 07:02:19 nvr1-rc kernel: (3950,5):o2net_connect_expired:1664 ERROR: no connection established with node 0 after 35.0 seconds, giving up and returning errors.
Jan 14 07:02:54 nvr1-rc kernel: (3950,5):o2net_connect_expired:1664 ERROR: no connection established with node 0 after 35.0 seconds, giving up and returning errors.
Jan 14 07:03:10 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR: status = -107
Jan 14 07:03:10 nvr1-rc kernel: (4007,4):dlm_flush_asts:600 ERROR: status = -107
Jan 14 07:03:29 nvr1-rc kernel: (3950,5):o2net_connect_expired:1664 ERROR: no connection established with node 0 after 35.0 seconds, giving up and returning errors.
Jan 14 07:03:50 nvr1-rc kernel: (31,5):o2quo_make_decision:146 ERROR: fencing this node because it is connected to a half-quorum of 1 out of 2 nodes which doesn't include the lowest active node 0
Jan 14 07:03:50 nvr1-rc kernel: (31,5):o2hb_stop_all_regions:1967 ERROR: stopping heartbeat on all active regions.
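
The negative status values in these messages look like negated Linux errno
codes. A minimal sketch for decoding them, assuming the standard Linux
numbering (107 = ENOTCONN, 112 = EHOSTDOWN); the table and the helper name
decode_status are illustrative only and are not part of OCFS2:

```python
# Linux errno values, hard-coded here as an assumption (standard Linux numbering).
# Only the two codes that appear in the log above are listed.
LINUX_ERRNO = {
    107: ("ENOTCONN", "Transport endpoint is not connected"),
    112: ("EHOSTDOWN", "Host is down"),
}


def decode_status(status):
    """Map a negated errno such as -112 to a (name, description) pair."""
    return LINUX_ERRNO.get(-status, ("UNKNOWN", "not in this table"))


if __name__ == "__main__":
    for status in (-112, -107):
        name, text = decode_status(status)
        print(f"status = {status}: {name} ({text})")
```

Read this way, both codes point at the network link to node 0 rather than at
the disk.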

I'm sure there are no network connectivity problems, but it is possible that 
the nodes are under heavy I/O load. Is this the intended behaviour? Why is the 
loaded node fenced under heavy load?
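
The o2quo_make_decision message in the log reads like a tie-break rule: with
two nodes heartbeating, a node that can reach only half of them survives only
if that half includes the lowest-numbered node. A rough sketch of that rule
follows. It is reconstructed from the log text, not copied from the kernel
source; the function name, the parameter names, and the handling of an odd
node count are assumptions.

```python
def should_fence(heartbeating, connected, lowest_node_reachable):
    """Return True if this node should fence itself.

    heartbeating: nodes seen heartbeating on disk, including this node
    connected: of those, nodes reachable over the network, including this node
    lowest_node_reachable: True if the lowest-numbered heartbeating node is
        this node itself or is reachable over the network
    """
    quorum = (heartbeating + 1) // 2
    if heartbeating % 2 == 1:
        # Odd node count (assumed behaviour): a strict majority is required.
        return connected < quorum
    if connected < quorum:
        return True
    # Even split: only the half containing the lowest node keeps running.
    return connected == quorum and not lowest_node_reachable


if __name__ == "__main__":
    # Situation from the log: 2 nodes heartbeating, this node sees only
    # itself, and the lowest active node (0) is unreachable.
    print(should_fence(2, 1, False))
    # The same split as seen from node 0.
    print(should_fence(2, 1, True))
```

Under this reading the fenced node is chosen by node number once the network
link is considered down, not by which node carries the load.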

I'm using ocfs2-1.4.4 on RHEL 5, kernel-2.6.18-164.6.1.el5.

thanks
Nicola


