[Ocfs2-users] OCFS2 Node restart

Raheel Akhtar rakhtar at ryerson.ca
Wed Jul 22 11:49:29 PDT 2009


Hi,

1. I tried to set up netconsole, but I am getting no logs on the logging system.
2. How can I check which node has quorum, and how can I move it to a different
node?
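
To check whether netconsole packets reach the logging system at all, a bare UDP listener can help, since netconsole sends each kernel message as a UDP datagram to the configured target address and port. The following is a minimal sketch, not a replacement for syslog. The port 6666 (netconsole's default target port) and the helper names `open_listener`, `receive_messages`, and `run_forever` are assumptions for illustration; use whatever port the netconsole module was actually loaded with.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Open a UDP socket on the port netconsole is assumed to target."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_messages(sock, count, timeout=5.0):
    """Collect up to `count` datagrams as (sender_ip, text) pairs.

    Returns early with whatever was collected if nothing arrives
    within `timeout` seconds.
    """
    sock.settimeout(timeout)
    messages = []
    for _ in range(count):
        try:
            data, (sender_ip, _sender_port) = sock.recvfrom(65535)
        except socket.timeout:
            break
        text = data.decode("utf-8", errors="replace").rstrip("\n")
        messages.append((sender_ip, text))
    return messages


def run_forever(port=6666):
    """Print every received kernel message, prefixed by the sending node."""
    listener = open_listener(port=port)
    while True:
        for sender_ip, text in receive_messages(listener, 1, timeout=60.0):
            print(f"{sender_ip}: {text}")
```

Calling `run_forever()` on the logging system and then triggering a kernel message on one of the nodes shows whether anything arrives. If nothing does, the problem is more likely in the netconsole parameters or the network path than in the syslog configuration.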

Thanks


-----Original Message-----
From: Sunil Mushran [mailto:sunil.mushran at oracle.com] 
Sent: Wednesday, July 22, 2009 2:45 PM
To: Raheel Akhtar
Cc: ocfs2-users at oss.oracle.com
Subject: Re: [Ocfs2-users] OCFS2 Node restart

Please file a bugzilla and attach the netconsole logs of all six nodes.

The messages provided indicate that that node saw the two nodes
become unresponsive. Why they became unresponsive will be
known only after we see the netconsole logs of the two nodes.
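
As a rough illustration of what the quoted o2net_idle_timer lines show, the sketch below subtracts the "tmr" timestamp from the "now" timestamp in each line. Reading "tmr" as the time the idle timer was last armed is an assumption about the o2net message format, and the helper name `idle_seconds` is made up for this example. The two sample strings are the (tmr ... now ...) fragments from the log in the original post, with the line wrapping undone.

```python
import re
from decimal import Decimal

# Matches the start of the timing block printed by o2net_idle_timer.
TIMES_RE = re.compile(r"\(tmr (\d+\.\d+) now (\d+\.\d+)")


def idle_seconds(line):
    """Return now - tmr in seconds, or None if the line has no timing block."""
    match = TIMES_RE.search(line)
    if match is None:
        return None
    tmr, now = (Decimal(group) for group in match.groups())
    return now - tmr


# Fragments taken from the log lines of 172.25.29.10 and 172.25.29.15.
samples = [
    "(tmr 1248267655.660420 now 1248267685.655778 dr 1248267655.660405",
    "(tmr 1248267655.816401 now 1248267685.812715 dr 1248267655.816401",
]

for sample in samples:
    print(idle_seconds(sample))
```

Both differences come out just under 30 seconds, which agrees with the "has been idle for 30.0 seconds" message: both reporting nodes stopped hearing from alf3 at about the same moment.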

Raheel Akhtar wrote:
>
> Hi,
>
>  
>
> I have a 6-node cluster with OCFS2 1.4.2 running on VMware virtual 
> systems with RedHat 5.2 (2.6.18-128.1.16.el5), 64-bit.
>
>  
>
> Two of the 6 nodes, alf0 and alf3, reboot automatically. I enabled 
> remote logging for the kernel, and here is the log.
>
> I noticed that the VM becomes unresponsive and suddenly reboots. I am running 
> the Alfresco (document sharing) application; all nodes access a 
> common share on OCFS2.
>
>  
>
> ---------------------------------------------------------
>
> -Jul 22 09:01:25 172.25.29.10 kernel: o2net: connection to node alf3 (num 3) at 172.25.29.13:7777 has been idle for 30.0 seconds, shutting it down.
> -Jul 22 09:01:25 172.25.29.10 kernel: (0,1):o2net_idle_timer:1506 here are some times that might help debug the situation: (tmr 1248267655.660420 now 1248267685.655778 dr 1248267655.660405 adv 1248267655.660422:1248267655.660423 func (0ffa2aed:505) 1248267647.662032:1248267647.662034)
> -Jul 22 09:01:25 172.25.29.10 kernel: o2net: no longer connected to node alf3 (num 3) at 172.25.29.13:7777
> -Jul 22 09:01:25 172.25.29.15 kernel: o2net: connection to node alf3 (num 3) at 172.25.29.13:7777 has been idle for 30.0 seconds, shutting it down.
> -Jul 22 09:01:25 172.25.29.15 kernel: (0,0):o2net_idle_timer:1506 here are some times that might help debug the situation: (tmr 1248267655.816401 now 1248267685.812715 dr 1248267655.816401 adv 1248267655.816401:1248267655.816401 func (0ffa2aed:502) 1248267507.842160:1248267507.842160)
> -Jul 22 09:01:25 172.25.29.15 kernel: o2net: no longer connected to node alf3 (num 3) at 172.25.29.13:7777
> -Jul 22 09:01:55 172.25.29.10 kernel: (2733,1):o2net_connect_expired:1667 ERROR: no connection established with node 3 after 30.0 seconds, giving up and returning errors.
> -Jul 22 09:01:55 172.25.29.15 kernel: (2541,0):o2net_connect_expired:1667 ERROR: no connection established with node 3 after 30.0 seconds, giving up and returning errors.
>
>  
>
>  
>
> How can I find out which node has quorum? And can I move it to a less busy node?
>
>  
>
> Thanks
>
> Raheel
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users



