[Ocfs2-users] CRS/CSS and OCFS2

Luis Freitas lfreitas34 at yahoo.com
Tue May 27 14:31:56 PDT 2008


Hmm,

  There is a "lazy" umount:


       -l     Lazy unmount. Detach the filesystem from the filesystem
              hierarchy now, and cleanup all references to the filesystem
              as soon as it is not busy anymore. This option allows a
              "busy" filesystem to be unmounted. (Requires kernel 2.4.11
              or later.)

  I am not sure whether this prevents writes to the filesystem after the umount completes, or whether there is some way to fence the device; both points would need to be verified.
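
  To verify on the local node that the filesystem is really gone from the mount table after the umount, something along these lines might do. This is only a rough sketch: the helper names, the sample mount table and the /u02/oradata path are made up for illustration. It looks at the local /proc/mounts only, so it says nothing about other nodes, and it cannot tell whether a process still holds files open after a lazy umount.

```python
def is_mounted_locally(mount_point, mounts_text):
    """Return True if mount_point is a mount target in mounts_text.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    target = mount_point.rstrip("/") or "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # /proc/mounts escapes spaces in paths as \040
        mounted_on = fields[1].replace("\\040", " ")
        if (mounted_on.rstrip("/") or "/") == target:
            return True
    return False


def read_local_mounts(path="/proc/mounts"):
    """Read the mount table of the local node (Linux only)."""
    with open(path) as f:
        return f.read()


# Hypothetical mount table, used only to show the parsing.
SAMPLE = """\
/dev/mapper/vgroot-root / ext3 rw 0 0
/dev/sdb1 /u02/oradata ext3 rw 0 0
"""

if __name__ == "__main__":
    print(is_mounted_locally("/u02/oradata", SAMPLE))
    print(is_mounted_locally("/u03", SAMPLE))
```

  On a real node you would pass read_local_mounts() instead of SAMPLE.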

  This is how any other cold-failover HA solution works; they don't use cluster filesystems. If there is no way to force the filesystem to be unmounted or to fence the device, it should be possible to configure CRS to evict the primary node before the filesystem is mounted on the secondary node (or to use an external fence device, such as a SAN switch or a power appliance).
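
  The ordering I have in mind (fence first, mount only afterwards) could be sketched roughly as below. All three callables are hypothetical placeholders: how the primary is actually fenced (CRS eviction, SAN switch, power appliance) is site-specific and not shown, and this is not how CRS itself implements anything.

```python
class FencingFailed(Exception):
    """Raised when the primary node could not be confirmed as fenced."""


def cold_failover(fence_primary, mount_on_secondary, start_database):
    """Run the failover steps in order and return the steps completed.

    fence_primary must return True only when the primary is known to be
    unable to write to the shared device any more. If it returns False,
    nothing is mounted and FencingFailed is raised.
    """
    steps_done = []
    if not fence_primary():
        raise FencingFailed("primary not confirmed fenced; refusing to mount")
    steps_done.append("fence")
    mount_on_secondary()
    steps_done.append("mount")
    start_database()
    steps_done.append("start")
    return steps_done


if __name__ == "__main__":
    # Stub callables standing in for the real, site-specific actions.
    print(cold_failover(lambda: True, lambda: None, lambda: None))
```

  The point of raising before the mount step is that a local filesystem like ext3 must never be mounted on two nodes at once, and the secondary has no way to see the primary's mount table.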

  Btw, is OCFS2 officially supported with Oracle 9i single instance databases?

Regards,
Luis
--- On Tue, 5/27/08, Sunil Mushran <Sunil.Mushran at oracle.com> wrote:
From: Sunil Mushran <Sunil.Mushran at oracle.com>
Subject: Re: [Ocfs2-users] CRS/CSS and OCFS2
To: lfreitas34 at yahoo.com
Cc: ocfs2-users at oss.oracle.com, alexandra.strauss at bayerbbs.com
Date: Tuesday, May 27, 2008, 5:07 PM

AFAIK:
a. There is no force umount in Linux.
b. There is no way to know whether a local fs is mounted on another node.

Luis Freitas wrote:
> Alexandra,
>
>    You could use only CRS and ext3 instead of ocfs2 for this kind of 
> use. You would need to register a script to force umount the 
> filesystem on the primary node and mount it on the node you are 
> failing over to. (It would be nice to be able to check if the 
> filesystem is mounted before attempting to mount it, but I am not sure 
> how to do this.)
>
>    Are you using a cross-over cable for the private interconnect?
>
> Regards,
> Luis
>
> --- On Fri, 6/27/08, alexandra.strauss at bayerbbs.com 
> <alexandra.strauss at bayerbbs.com> wrote:
>
>     From: alexandra.strauss at bayerbbs.com <alexandra.strauss at bayerbbs.com>
>     Subject: [Ocfs2-users] CRS/CSS and OCFS2
>     To: ocfs2-users at oss.oracle.com
>     Date: Friday, June 27, 2008, 10:41 AM
>
>
>     Hello,
>
>     I am writing in the hope that you can help me with my problem. We
>     have an issue here and opened an SR at Metalink, but so far we
>     have received no useful information for solving it. The SR number
>     is 6855815.994...
>
>     We wanted to protect 9i single-instance databases with 10g
>     Clusterware, following the third-party-tool approach. There are no
>     RAC databases involved, but we want to achieve high availability
>     as the databases are business critical systems. We want the
>     systems to be able to relocate to another machine in case of
>     failure to keep downtimes low... To achieve this we want to use
>     OCFS2 for the filesystem. The relocation is done by a script with
>     the help of CRS.
>
>     So we took two systems (byaz05 and byaz10) and installed the
>     following software: 10g CRS (10.2.0.3) and Oracle Software 9.2.0.8
>     and OCFS2 1.2.8
>
>     We found the following Metalink notes and adjusted the heartbeat
>     and timeouts for OCFS2:
>     Metalink Note 395878.1: Heartbeat/Voting/Quorum Related Timeout
>     Configuration for Linux, OCFS2, RAC Stack to avoid unnecessary
>     node fencing, panic and reboot
>     Metalink Note 391771.1: OCFS2 - FREQUENTLY ASKED QUESTIONS (here,
>     in particular the section on fencing and quorum)
>     Metalink Note 434255.1: Common reasons for OCFS2 Kernel Panic or
>     Reboot Issues
>     Metalink Note 457423.1: OCFS2 Fencing, Network, and Disk Heartbeat
>     Timeout Configuration
>
>     We have made no changes to the CRS/CSS default settings so far.
>
>     During HA testing we observed unexpected behaviour of the system.
>     We deactivated the bond for the private interconnect and expected
>     only one node to go down, but both nodes went down. It seems to me
>     that one node was rebooted by OCFS2 and the other one by CRS/CSS.
>
>     Timestamp               Event
>     ------------------------------------------------------------
>
>     10:21:06                bond1 disabled (eth1)                        
>     */var/log/messages byaz05*
>     Apr 25 10:21:06 byaz05 kernel: bonding: bond1: link status
>     definitely down for interface eth1, disabling it
>     Apr 25 10:21:06 byaz05 kernel: bonding: bond1: making interface
>     eth5 the new active one.
>
>     10:21:09                bond1 disabled (eth5)        
>     */var/log/messages byaz05*
>     Apr 25 10:21:09 byaz05 kernel: bonding: bond1: link status
>     definitely down for interface eth5, disabling it
>     Apr 25 10:21:09 byaz05 kernel: bonding: bond1: now running without
>     any active interface !
>
>     10:21:23                o2net – no longer connected                
>     */var/log/messages byaz05*
>     Apr 25 10:21:23 byaz05 kernel: o2net: no longer connected to node
>     byaz10.bayer-ag.com (num 1) at 10.190.59.6:7777
>     */var/log/messages byaz10*
>     Apr 25 10:21:23 byaz10 kernel: o2net: no longer connected to node
>     byaz05.bayer-ag.com (num 0) at 10.190.59.5:7777
>
>     10:21:27                CSSD failure 134
>     10:21:29                Reboot initiated by CRS
>     */var/log/messages byaz05*
>     Apr 25 10:21:27 byaz05 logger: Oracle clsomon failed with fatal
>     status 12.
>     Apr 25 10:21:27 byaz05 logger: Oracle CSSD failure 134.
>     Apr 25 10:21:27 byaz05 su(pam_unix)[25839]: session closed for
>     user oracle
>     Apr 25 10:21:27 byaz05 logger: Oracle CRS failure.  Rebooting for
>     cluster integrity.
>     Apr 25 10:21:27 byaz05 kernel: md: stopping all md devices.
>     Apr 25 10:21:27 byaz05 kernel: md: md0 switched to read-only mode.
>     Apr 25 10:21:29 byaz05 logger: Oracle CRS failure.  Rebooting for
>     cluster integrity.
>     Apr 25 10:21:29 byaz05 kernel: e1000: eth2: e1000_watchdog_task:
>     NIC Link is Up 1000 Mbps Full Duplex
>     Apr 25 10:21:29 byaz05 logger: Oracle init script ceding reboot to
>     sibling 27383.
>
>     10:21:58                Reboot initiated by OCFS2(?)
>     */var/log/messages byaz10*
>     Apr 25 10:21:58 byaz10 su(pam_unix)[4595]: session opened for user
>     oracle by (uid=0)
>     Apr 25 10:21:58 byaz10 su(pam_unix)[4595]: session closed for user
>     oracle
>     Apr 25 10:25:58 byaz10 syslogd 1.4.1: restart.
>     Apr 25 10:25:58 byaz10 syslog: syslogd startup succeeded
>     Apr 25 10:25:58 byaz10 kernel: klogd 1.4.1, log source =
>     /proc/kmsg started.
>     Apr 25 10:25:58 byaz10 kernel: Bootdata ok (command line is ro
>     root=/dev/vgroot/_)
>
>
>     We have assumed all along that this is a timing problem, but we
>     don't know which settings cause it or which steps to take to
>     correct them. Otherwise we'll have to rework the complete
>     concept for the business critical systems.
>     Can anyone help me?
>
>     Regards,
>     Alexandra
>
>     Freundliche Grüße / Best Regards
>
>     Alexandra Strauss
>     _________________________________________
>
>     Fa. Opitz Consulting
>     Phone:
>     Fax:
>     E-mail:
>     Web: http://www.BayerBBS.com
>
>     Geschäftsführung: Vorsitzender Andreas Resch   |   Arbeitsdirektor
>     Norbert Fieseler
>     Vorsitzender des Aufsichtsrats: Klaus Kühn
>     Sitz der Gesellschaft: Leverkusen   |   Amtsgericht Köln, HRB 49895
>
>     _______________________________________________
>     Ocfs2-users mailing list
>     Ocfs2-users at oss.oracle.com
>     http://oss.oracle.com/mailman/listinfo/ocfs2-users
>
>


      