[Ocfs2-users] Catatonic nodes under SLES10

Sunil Mushran Sunil.Mushran at oracle.com
Mon Apr 9 16:26:50 PDT 2007


Fencing is not a fs operation but a cluster operation. The fs is only a
client of the cluster stack.

Alexei_Roudnev wrote:
> It all depends on the usage scenario.
>
> Typical usage is, for example:
>
> (1) Shared application home. Writes happen once a week during
> maintenance; the rest of the time files are opened for reading only. The
> few logfiles can be redirected if required.
>
> So when the server sees a problem, it has had no pending IO for 3 days -
> so what is the purpose of a reboot? It knows with 100% certainty that NO
> IO is pending (see the quick check below), and the other nodes have no
> pending IO either.
>
> (2) Backup storage for the RAC. The FS is not open 90% of the time. At
> night, one node opens it and creates a few files. The other node has no
> pending IO on this FS. Fencing the passive node (which doesn't run any
> backup) is not useful because it has had NO PENDING IO for a few hours.
>
> (3) Web server. 10 nodes, only 1 makes updates. The same - most nodes
> have no pending IO.
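>
> (A rough way to verify that on Linux - it is system-wide rather than
> per-FS, so treat it as a sanity check only: the Dirty and Writeback
> counters in /proc/meminfo show how much data is still waiting to reach
> disk, and both near zero after a sync means nothing is pending. E.g. in
> Python:)
>
>     # print how much data is still waiting to be written out
>     for line in open('/proc/meminfo'):
>         if line.startswith(('Dirty:', 'Writeback:')):
>             print(line, end='')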
>
> Of course there is always a risk of FS corruption in clusters. Any layer
> can keep IO pending almost forever (I have seen the Linux kernel hold it
> for 10 minutes). The problem is that in such cases software fencing can't
> help either, because the node is half-dead and can't assess its own
> status.
>
> So the key point here is not _fence at every sneeze_ but _keep the
> system without pending writes as long as possible and make clean
> transitions between active-write / active-read / passive states_. Then
> you can avoid self-fencing in 90% of cases (because the server will be
> in the passive or active-read state). I mount the FS but don't cd into
> it, or cd but don't read - passive state. I read a file - active read
> for 1 minute, then flush buffers so that it is in passive mode again. I
> begin to write - switch the system to write mode. I haven't written any
> blocks for 1 minute - flush everything, wait 1 more minute, and switch
> to passive mode. A rough sketch of the idea is below.
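>
> Here it is in Python - all the names and the 1-minute timeout are
> invented for illustration: a small watchdog that notes writes to the
> shared mount and drops the node back to passive by flushing once it has
> been write-idle long enough.
>
>     import os
>     import time
>
>     PASSIVE, ACTIVE_READ, ACTIVE_WRITE = range(3)
>     IDLE_SECONDS = 60  # write-idle period before dropping to passive
>
>     state = PASSIVE
>     last_write = 0.0
>
>     def note_write():
>         """Call this whenever the application writes to the shared FS."""
>         global state, last_write
>         state = ACTIVE_WRITE
>         last_write = time.time()
>
>     def tick():
>         """Run this periodically from a timer loop."""
>         global state
>         if state == ACTIVE_WRITE and time.time() - last_write > IDLE_SECONDS:
>             os.sync()        # flush all dirty buffers to the shared disk
>             state = PASSIVE  # nothing in flight left to lose if fenced
>
> A fencing decision could then look at the state first: a node sitting in
> PASSIVE has no pending writes, so rebooting it buys nothing.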
>
> ----- Original Message ----- 
> From: "Sunil Mushran" <Sunil.Mushran at oracle.com>
> To: "David Miller" <syslog at d.sparks.net>
> Cc: <ocfs2-users at oss.oracle.com>
> Sent: Monday, April 09, 2007 3:18 PM
> Subject: Re: [Ocfs2-users] Catatonic nodes under SLES10
>
>
>
>> For io fencing to be graceful, one requires better hardware. Read:
>> expensive. As in, switches where one can choke off all the ios to the
>> storage from a specific node.
>>
>> Read the following for a discussion on force umounts. In short, not
>> possible as yet.
>> http://lwn.net/Articles/192632/
>>
>> Readonly does not work wrt io fencing. As in, ro only stops new
>> userspace writes but cannot stop pending writes. And writes could be
>> lodged in any io layer.
>> A reboot is the cheapest way to avoid corruption. (While a reboot is
>> painful, it is
>> much less painful than a corrupted fs.)
>>
>> With 1.2.5 you should be able to increase the network timeouts and
>> hopefully avoid
>> the problem.
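>>
>> For example (the variable names are from the 1.2.x o2cb configuration -
>> check your version's docs; the values here are only illustrative), in
>> /etc/sysconfig/o2cb:
>>
>>     # disk heartbeat: ~60 seconds before a node is declared dead
>>     O2CB_HEARTBEAT_THRESHOLD=31
>>     # network idle timeout: 30s instead of the old 10s default
>>     O2CB_IDLE_TIMEOUT_MS=30000
>>
>> All nodes must agree on these values, so change them everywhere and
>> restart the cluster stack.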
>>
>> David Miller wrote:
>>     
>>> Alexei_Roudnev wrote:
>>>       
>>>> Did you check the
>>>>
>>>>  /proc/sys/kernel/panic  /proc/sys/kernel/panic_on_oops
>>>>
>>>> system variables?
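>>>>
>>>> (For example, via /etc/sysctl.conf - the values are only
>>>> illustrative:
>>>>
>>>>     # an oops escalates to a full panic
>>>>     kernel.panic_on_oops = 1
>>>>     # reboot 30 seconds after a panic; 0 means hang forever
>>>>     kernel.panic = 30
>>>>
>>>> With panic_on_oops=0 or panic=0, an oopsed node can sit wedged at
>>>> the console instead of rebooting.)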
>>>>
>>>>         
>>> No.  Maybe I'm missing something here.
>>>
>>> Are you saying that a panic/freeze/reboot is the expected/desirable
>>> behavior?  That nothing more graceful could be done, like just
>>> dismounting the ocfs2 file systems, or forcing them to a read-only
>>> mount or something like that?  We have to reload the kernel?
>>>
>>> Thanks,
>>>
>>> --- David
>>>
>>>       
>>>> ----- Original Message -----
>>>> From: "David Miller" <syslog at d.sparks.net>
>>>> To: <ocfs2-users at oss.oracle.com>
>>>> Sent: Monday, April 02, 2007 9:01 AM
>>>> Subject: [Ocfs2-users] Catatonic nodes under SLES10
>>>>
>>>>         
>>> [snip]
>>>
>>>       
>>>> Both servers will be connected to a dual-host external RAID system.
>>>> I've set up ocfs2 on a couple of test systems and everything appears
>>>> to work fine.
>>>>
>>>> Until, that is, one of the systems loses network connectivity.
>>>>
>>>> When the systems can't talk to each other anymore, but the disk
>>>> heartbeat is still alive, the high-numbered node goes catatonic.
>>>> Under SLES 9 it fenced itself off with a kernel panic; under SLES 10
>>>> it simply stops responding to network or console. A power cycle is
>>>> required to bring it back up.
>>>>
>>>> The desired behavior would be for the higher-numbered node to lose
>>>> access to the ocfs2 file system(s). I don't really care whether it
>>>> would simply time out a la stale NFS mounts, or error immediately
>>>> like access to non-existent files.
>>>>
>>>>
>>>>         


