[Ocfs2-users] another fencing question
Mailing List SVR
lists at svrinformatica.it
Thu Jan 14 04:01:00 PST 2010
Hi,
periodically one of the nodes in my two-node cluster is fenced. Here are the logs:
Jan 14 07:01:44 nvr1-rc kernel: o2net: no longer connected to node nvr2-rc.minint.it (num 0) at 1.1.1.6:7777
Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_do_master_request:1334 ERROR: link to 0 went down!
Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR: status = -112
Jan 14 07:01:44 nvr1-rc kernel: (4007,4):dlm_flush_asts:600 ERROR: status = -112
Jan 14 07:01:44 nvr1-rc kernel: (21534,1):dlm_get_lock_resource:917 ERROR: status = -112
Jan 14 07:02:19 nvr1-rc kernel: (3950,5):o2net_connect_expired:1664 ERROR: no connection established with node 0 after 35.0 seconds, giving up and returning errors.
Jan 14 07:02:54 nvr1-rc kernel: (3950,5):o2net_connect_expired:1664 ERROR: no connection established with node 0 after 35.0 seconds, giving up and returning errors.
Jan 14 07:03:10 nvr1-rc kernel: (4007,4):dlm_send_proxy_ast_msg:458 ERROR: status = -107
Jan 14 07:03:10 nvr1-rc kernel: (4007,4):dlm_flush_asts:600 ERROR: status = -107
Jan 14 07:03:29 nvr1-rc kernel: (3950,5):o2net_connect_expired:1664 ERROR: no connection established with node 0 after 35.0 seconds, giving up and returning errors.
Jan 14 07:03:50 nvr1-rc kernel: (31,5):o2quo_make_decision:146 ERROR: fencing this node because it is connected to a half-quorum of 1 out of 2 nodes which doesn't include the lowest active node 0
Jan 14 07:03:50 nvr1-rc kernel: (31,5):o2hb_stop_all_regions:1967 ERROR: stopping heartbeat on all active regions.
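For reference, the "status = -112" and "status = -107" values in the log look like negated Linux errno numbers. The small Python sketch below decodes just those two, assuming the standard Linux numbering; the table and the helper name decode_status are illustrative only and are not part of OCFS2.

```python
# Linux errno numbers are hard-coded here (assumption: standard Linux
# numbering) so the lookup does not depend on the OS running this script.
LINUX_ERRNO = {
    107: ("ENOTCONN", "Transport endpoint is not connected"),
    112: ("EHOSTDOWN", "Host is down"),
}


def decode_status(status):
    """Translate a negative kernel status such as -112 into a readable string."""
    name, text = LINUX_ERRNO.get(-status, ("UNKNOWN", "not in this table"))
    return f"{name}: {text}"


if __name__ == "__main__":
    for status in (-112, -107):
        print(status, decode_status(status))
```

Read this way, the DLM messages first report the peer as down (-112) and later report the socket as not connected (-107), which matches the o2net disconnect at 07:01:44.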
I'm sure there are no network connectivity problems, but it is possible that
there is heavy I/O load. Is this the intended behaviour? Why is the loaded
node fenced under heavy load?
I'm using ocfs2-1.4.4 on RHEL 5, kernel-2.6.18-164.6.1.el5.
thanks
Nicola
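
The o2quo_make_decision message in the log describes a tie-break rule: with an even number of heartbeating nodes, a node that can reach exactly half of them survives only if that half includes the lowest-numbered active node. The Python sketch below is a simplified model of that rule as the message words it, not the kernel implementation; the function name should_fence and the set-based inputs are assumptions made for illustration.

```python
def should_fence(heartbeating, connected):
    """Return True if the local node should fence itself.

    heartbeating: set of node numbers seen alive via disk heartbeat,
                  including the local node.
    connected:    set of node numbers reachable over the network,
                  including the local node.
    Both inputs are a simplified model (assumption), not kernel state.
    """
    if not heartbeating:
        return False
    live = connected & heartbeating
    total = len(heartbeating)
    if total % 2 == 1:
        # Odd-sized cluster: a strict majority is required.
        return len(live) < (total + 1) // 2
    half = total // 2
    if len(live) < half:
        return True
    if len(live) == half:
        # Even split: only the half holding the lowest active node survives.
        return min(heartbeating) not in live
    return False


if __name__ == "__main__":
    # The situation in the log: two nodes heartbeating, node 1 reaches only itself.
    print(should_fence({0, 1}, {1}))  # node 1's view
    print(should_fence({0, 1}, {0}))  # node 0's view
```

Under this model, in a two-node cluster that loses its network link while both nodes keep heartbeating to disk, the node with the higher number is always the one that fences, regardless of which node is under load.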