[Ocfs2-users] disable heartbeat nic caused ocfs2 errors

Hai Tao taoh666 at hotmail.com
Sat Sep 10 00:50:23 PDT 2011


I have a two-node ocfs2 cluster, and I disabled the heartbeat NIC with "ifdown eth1". I got the following weird logs on both nodes:
 
Sep  7 10:45:49 dbtest-01 kernel: o2net: connection to node dbtest-02 (num 1) at 10.194.59.65:7777 has been idle for 30.0 seconds, shutting it down.
Sep  7 10:45:49 dbtest-01 kernel: (swapper,0,3):o2net_idle_timer:1503 here are some times that might help debug the situation: (tmr 1315417519.185025 now 1315417549.183798 dr 1315417519.185016 adv 1315417519.185032:1315417519.185032 func (b9bb7168:504) 1315417518.872227:1315417518.872268)
Sep  7 10:45:49 dbtest-01 kernel: o2net: no longer connected to node dbtest-02 (num 1) at 10.194.59.65:7777
Sep  7 10:45:49 dbtest-01 kernel: (dlm_thread,3781,2):dlm_send_proxy_ast_msg:457 ERROR: status = -112
Sep  7 10:45:49 dbtest-01 kernel: (oracle,26129,1):dlm_do_master_request:1334 ERROR: link to 1 went down!
Sep  7 10:45:49 dbtest-01 kernel: (oracle,26129,1):dlm_get_lock_resource:917 ERROR: status = -112
Sep  7 10:45:49 dbtest-01 kernel: (dlm_thread,4256,1):dlm_send_proxy_ast_msg:457 ERROR: status = -112
Sep  7 10:45:49 dbtest-01 kernel: (dlm_thread,4256,1):dlm_flush_asts:604 ERROR: status = -112
Sep  7 10:45:49 dbtest-01 kernel: (dlm_thread,3781,2):dlm_flush_asts:604 ERROR: status = -112
Sep  7 10:46:19 dbtest-01 kernel: (o2net,3736,3):o2net_connect_expired:1664 ERROR: no connection established with node 1 after 30.0 seconds, giving up and returning errors.
Sep  7 10:46:19 dbtest-01 kernel: o2net: accepted connection from node dbtest-02 (num 1) at 10.194.59.65:7777
Sep  7 10:48:37 dbtest-01 kernel: INFO: task events/0:10 blocked for more than 120 seconds.
Sep  7 10:48:37 dbtest-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep  7 10:48:37 dbtest-01 kernel: events/0      D ffff810001004420     0    10      1            11     9 (L-TLB)
Sep  7 10:48:37 dbtest-01 kernel:  ffff81083ffedc80 0000000000000046 ffffffff80333680 0000000000000001
Sep  7 10:48:37 dbtest-01 kernel:  0000000000000400 000000000000000a ffff81083ffe1820 ffffffff80309b60
Sep  7 10:48:37 dbtest-01 kernel:  0030b62498ce7b3f 000000000000416b ffff81083ffe1a08 0000000000000000
Sep  7 10:48:37 dbtest-01 kernel: Call Trace:
Sep  7 10:48:37 dbtest-01 kernel: Call Trace:
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff80064167>] wait_for_completion+0x79/0xa2
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff8008e16d>] default_wake_function+0x0/0xe
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff884e64b7>] :ocfs2:ocfs2_wait_for_mask+0xd/0x19
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff884e78d8>] :ocfs2:ocfs2_cluster_lock+0x9ae/0x9d3
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff885013e5>] :ocfs2:ocfs2_orphan_scan_work+0x0/0x83
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff884ed1e4>] :ocfs2:ocfs2_orphan_scan_lock+0x55/0x84
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff884fc59b>] :ocfs2:ocfs2_queue_orphan_scan+0x32/0x147
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff885013ff>] :ocfs2:ocfs2_orphan_scan_work+0x1a/0x83
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff8004dc37>] run_workqueue+0x94/0xe4
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff8004a472>] worker_thread+0x0/0x122
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff8004a562>] worker_thread+0xf0/0x122
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff8008e16d>] default_wake_function+0x0/0xe
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff80032bdc>] kthread+0xfe/0x132
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff8005efb1>] child_rip+0xa/0x11
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff80032ade>] kthread+0x0/0x132
Sep  7 10:48:37 dbtest-01 kernel:  [<ffffffff8005efa7>] child_rip+0x0/0x11
Sep  7 10:48:37 dbtest-01 kernel:
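
As a sanity check on the o2net_idle_timer line above, the gap between its "tmr" and "now" values can be computed and compared with the reported 30.0 second idle time. This is only a minimal Python sketch. It assumes "tmr" is when the idle timer was last armed and "now" is when it fired, which is my reading of the field names and not something the log states. The helper name idle_seconds is made up for illustration.

```python
import re
from decimal import Decimal

# The o2net_idle_timer message from dbtest-01, copied from the log above.
LOG_LINE = (
    "(swapper,0,3):o2net_idle_timer:1503 here are some times that might help "
    "debug the situation: (tmr 1315417519.185025 now 1315417549.183798 "
    "dr 1315417519.185016 adv 1315417519.185032:1315417519.185032 "
    "func (b9bb7168:504) 1315417518.872227:1315417518.872268)"
)


def idle_seconds(line):
    """Return now - tmr in seconds (hypothetical helper).

    Assumption: tmr = time the idle timer was armed, now = time it fired.
    Decimal is used so the microsecond digits are not lost to float rounding.
    """
    tmr = re.search(r"\btmr (\d+\.\d+)", line)
    now = re.search(r"\bnow (\d+\.\d+)", line)
    if tmr is None or now is None:
        raise ValueError("line has no tmr/now fields")
    return Decimal(now.group(1)) - Decimal(tmr.group(1))


print(idle_seconds(LOG_LINE))  # 29.998773
```

Under that assumption the gap is about 30 seconds, which matches the "has been idle for 30.0 seconds" message, so the timer itself seems to have behaved as configured.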

Does anyone know why this happened?
 
Thanks.