[Ocfs2-users] OCFS2 Node restart

Raheel Akhtar rakhtar at ryerson.ca
Wed Jul 22 11:29:43 PDT 2009


Hi,

 

I have a 6-node cluster running OCFS2 1.4.2 on VMware virtual machines with
Red Hat 5.2 (2.6.18-128.1.16.el5), 64-bit.

 

Two of the six nodes, alf0 and alf3, reboot automatically. I enabled remote
kernel logging, and the log is included below.

I noticed that the VM becomes unresponsive and then suddenly reboots. I am
running Alfresco (a document-sharing application), and all nodes access a
common share on OCFS2.

 

---------------------------------------------------------

Jul 22 09:01:25 172.25.29.10 kernel: o2net: connection to node alf3 (num 3) at 172.25.29.13:7777 has been idle for 30.0 seconds, shutting it down.
Jul 22 09:01:25 172.25.29.10 kernel: (0,1):o2net_idle_timer:1506 here are some times that might help debug the situation: (tmr 1248267655.660420 now 1248267685.655778 dr 1248267655.660405 adv 1248267655.660422:1248267655.660423 func (0ffa2aed:505) 1248267647.662032:1248267647.662034)
Jul 22 09:01:25 172.25.29.10 kernel: o2net: no longer connected to node alf3 (num 3) at 172.25.29.13:7777
Jul 22 09:01:25 172.25.29.15 kernel: o2net: connection to node alf3 (num 3) at 172.25.29.13:7777 has been idle for 30.0 seconds, shutting it down.
Jul 22 09:01:25 172.25.29.15 kernel: (0,0):o2net_idle_timer:1506 here are some times that might help debug the situation: (tmr 1248267655.816401 now 1248267685.812715 dr 1248267655.816401 adv 1248267655.816401:1248267655.816401 func (0ffa2aed:502) 1248267507.842160:1248267507.842160)
Jul 22 09:01:25 172.25.29.15 kernel: o2net: no longer connected to node alf3 (num 3) at 172.25.29.13:7777
Jul 22 09:01:55 172.25.29.10 kernel: (2733,1):o2net_connect_expired:1667 ERROR: no connection established with node 3 after 30.0 seconds, giving up and returning errors.
Jul 22 09:01:55 172.25.29.15 kernel: (2541,0):o2net_connect_expired:1667 ERROR: no connection established with node 3 after 30.0 seconds, giving up and returning errors.
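
The timestamps in the o2net_idle_timer lines can be checked against the reported 30.0-second idle period. The following is a minimal Python sketch, not part of the original log or of any OCFS2 tool. It assumes that "tmr" is the time the idle timer was last armed and "now" is the time it fired; the function name idle_seconds is hypothetical.

```python
import re
from decimal import Decimal

# First o2net_idle_timer entry from the log above, with the wrapped
# fragments rejoined into a single line.
LINE = (
    "(0,1):o2net_idle_timer:1506 here are some times that might help debug "
    "the situation: (tmr 1248267655.660420 now 1248267685.655778 "
    "dr 1248267655.660405 adv 1248267655.660422:1248267655.660423 "
    "func (0ffa2aed:505) 1248267647.662032:1248267647.662034)"
)


def idle_seconds(line):
    """Return now - tmr in seconds for one o2net_idle_timer log line.

    Assumption: 'tmr' is when the idle timer was last armed and 'now'
    is when it fired. Decimal avoids float rounding on epoch values.
    """
    match = re.search(r"tmr (\d+\.\d+) now (\d+\.\d+)", line)
    if match is None:
        raise ValueError("no 'tmr ... now ...' fields found in line")
    tmr, now = (Decimal(value) for value in match.groups())
    return now - tmr


if __name__ == "__main__":
    print(idle_seconds(LINE))
```

Under that assumption, the gap for this entry is just under 30 seconds, which matches the "idle for 30.0 seconds" message from 172.25.29.10.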

 

 

How can I find out which node holds quorum? And can I move it to a less busy
node?

 

Thanks

Raheel

 

 

 
