[Ocfs2-users] forcing ocfs2 NOT to reboot the server

Karim Alkhayer kkhayer at gmail.com
Fri Feb 13 05:00:43 PST 2009


How long does the switchover take?
I believe that if the timeout set by O2CB_HEARTBEAT_THRESHOLD is longer than
the time taken to complete the switchover, then you'd avoid the reboot issue.
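
As a rough sketch of the arithmetic: as far as I recall from the OCFS2
documentation, the heartbeat thread writes to disk every 2 seconds and the
node fences after (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds without a
completed heartbeat I/O. The 2-second interval and the 90-second switchover
time below are my assumptions, not values from this thread; only the default
of 31 comes from the earlier message.

```python
import math

# Assumption: o2cb issues one disk heartbeat every 2 seconds.
HEARTBEAT_INTERVAL_SECS = 2


def timeout_secs(threshold: int) -> int:
    """Seconds without a completed heartbeat I/O before the node fences."""
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECS


def threshold_for(outage_secs: float) -> int:
    """Smallest threshold whose timeout is strictly longer than outage_secs."""
    return math.floor(outage_secs / HEARTBEAT_INTERVAL_SECS) + 2


if __name__ == "__main__":
    # Default value reported in this thread.
    print("threshold 31 ->", timeout_secs(31), "seconds")

    # Hypothetical switchover time; measure your own before picking a value.
    switchover_secs = 90
    needed = threshold_for(switchover_secs)
    print("switchover of", switchover_secs, "s needs threshold >=", needed)
```

The resulting number would go into O2CB_HEARTBEAT_THRESHOLD in
/etc/sysconfig/o2cb. This only lengthens the timeout; it does not turn
fencing off.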

Let me know your thoughts.

Best regards,
Karim

-----Original Message-----
From: ocfs2-users-bounces at oss.oracle.com
[mailto:ocfs2-users-bounces at oss.oracle.com] On Behalf Of Mehmet Can ÖNAL
Sent: Friday, February 13, 2009 10:26 AM
To: Sunil Mushran
Cc: ocfs2-users at oss.oracle.com
Subject: Re: [Ocfs2-users] forcing ocfs2 NOT to reboot the server

OK, that was a little confusing. Let me start again.

We have a 6-node RAC on ES7000 servers, and we bought new HP rx6600 servers
that we want to move our system to. We have two storage units. The ES7000s,
our present production system, are connected to the DMX1000, and the HPs are
connected to the DMX2000. We configured Data Guard between these two RAC
systems, and the HP site mirrors its redo logs: one member is on the DMX1000
site and the other is on the DMX2000 site. Our problem is this: if we switch
over so that the HPs become production, and the DMX1000 site then has a
disaster, the HP systems are rebooted by ocfs2, although only the mirror redo
disk is lost. This reboot is unnecessary for our HP production system, and it
shuts down our database for no good reason.
This is what we are trying to solve.
Could you advise something for this situation?

-----Original Message-----
From: Sunil Mushran [mailto:sunil.mushran at oracle.com] 
Sent: Thursday, February 12, 2009 8:30 PM
To: Mehmet Can ÖNAL
Cc: ocfs2-users at oss.oracle.com
Subject: Re: [Ocfs2-users] forcing ocfs2 NOT to reboot the server

Sorry, there is no trick or workaround to change the fencing mechanism.
Also, I am still not clear as to what your architecture is. One would imagine
that the mirroring process would be transparent to the filesystem.

Mehmet Can ÖNAL wrote:
> We are using the defaults in /etc/sysconfig/o2cb, so as you would expect,
> O2CB_HEARTBEAT_THRESHOLD is 31.
>
> The error message was:
>
> ## Message from syslog at fa01 at Sun Feb 8 01:08:49 2009 ...
> ## Fa01 kernel : Heartbeat thread (41) printing last 24 blocking operations (cur=6)
> ## Message from syslog at fa01 at Sun Feb 8 01:08:49 2009 ...
> ## fa01 kernel : INdex 7: took 18 ms to do waiting for read completion 
> ## Message from syslog at fa01 at Sun Feb 8 01:08:49 2009 ...
> ## fa01 kernel : INdex 8: took 1959 ms to do msleep
> ## Message from syslog at fa01 at Sun Feb 8 01:08:49 2009 ...
> ## fa01 kernel : INdex 9: took 0 ms to do allocating bios for read
> ## Message from syslog at fa01 at Sun Feb 8 01:08:49 2009 ...
> ## fa01 kernel : INdex 10: took 0 ms to do bio alloc read
> ## Message from syslog at fa01 at Sun Feb 8 01:08:49 2009 ...
> ## fa01 kernel : INdex 11: took 23 ms to do waiting for read completion
>
> In our tests we had to overwrite one of the two redo disks at the storage
> level, so we overwrote it with a clone of the old disks using EMC software.
> As a result, our server saw the disk as write-disabled, because the EMC tool
> sends that signal to the other ends of the disk while it is performing a
> write operation. Most probably our server reboots after this write-disable
> event. However, this is the second redo disk, and losing access to it is not
> important enough to justify rebooting the server. That is why I asked this
> question. Are there any tips or tricks that you could give?
>
> -----Original Message-----
> From: Sunil Mushran [mailto:sunil.mushran at oracle.com] 
> Sent: Wednesday, February 11, 2009 7:36 PM
> To: Mehmet Can ÖNAL
> Cc: ocfs2-users at oss.oracle.com
> Subject: Re: [Ocfs2-users] forcing ocfs2 NOT to reboot the server
>
> No, this is not configurable. We have to fence, or else the processes
> will hang.
>
> From your description, it appears it is rebooting because the heartbeat
> I/Os are not completing within the timeout. What is your current setting
> for O2CB_HEARTBEAT_THRESHOLD in /etc/sysconfig/o2cb?
>
> Mehmet Can ÖNAL wrote:
>   
>> Hi everyone,
>>
>> I want to ask whether we can make the ocfs2 services not reboot a
>> server when that server cannot access a disk. Can I set an importance
>> level for a disk, so that when one of the servers cannot access a
>> low-importance disk, the ocfs2 service only raises an alert instead of
>> restarting the server? Could this be a mount option?
>>
>> PS: The reason for doing this is a disaster scenario in which our
>> temporary system must keep working. Two redo disks are written at the
>> same time by a server, but one of them is a mirror. Loss of access to
>> the mirror could therefore be ignored; a reboot is too costly given
>> the importance of that disk.
>>
>> Thanks for your time
>>
>>     
>
>   


_______________________________________________
Ocfs2-users mailing list
Ocfs2-users at oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users