[Ocfs2-users] CRS/CSS and OCFS2

Sunil Mushran Sunil.Mushran at oracle.com
Tue May 27 13:07:25 PDT 2008


AFAIK:
a. There is no force umount in Linux.
b. There is no way to know whether a local fs is mounted on another node.
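
To illustrate point b, here is a minimal sketch of the check a failover script can make. It assumes the Linux /proc/mounts layout (device, mount point, filesystem type as the first three whitespace-separated fields). The device /dev/sdb1 and the mount point /u02/oradata are made-up examples, not taken from this thread. The check only sees the mount table of the node it runs on, which is why it cannot protect a local filesystem such as ext3 from being mounted on two nodes at once.

```python
# Sketch only: answers "is this mount point in use on THIS node?".
# It says nothing about any other node in the cluster.

# Hypothetical sample in /proc/mounts format (names are illustrative).
SAMPLE_MOUNTS = """\
/dev/vgroot/root / ext3 rw 0 0
proc /proc proc rw 0 0
/dev/sdb1 /u02/oradata ext3 rw 0 0
"""


def parse_mounts(text):
    """Map mount point -> (device, fstype) from /proc/mounts-style text."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mount_point, fstype = fields[0], fields[1], fields[2]
        mounts[mount_point] = (device, fstype)
    return mounts


def is_mounted_locally(mount_point, mounts_text=None):
    """Return True if mount_point appears in the local mount table."""
    if mounts_text is None:
        # On a real Linux node, read the kernel's view of local mounts.
        with open("/proc/mounts") as handle:
            mounts_text = handle.read()
    normalized = mount_point.rstrip("/") or "/"
    return normalized in parse_mounts(mounts_text)


if __name__ == "__main__":
    for path in ("/u02/oradata", "/u03/backup"):
        print(path, is_mounted_locally(path, SAMPLE_MOUNTS))
```

Mount points containing spaces are escaped in /proc/mounts and are not handled by this sketch.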

Luis Freitas wrote:
> Alexandra,
>
>    You could use only CRS and ext3 instead of ocfs2 for this kind of
> use. You would need to register a script to force umount the
> filesystem on the primary node and mount it on the node you are
> failing over to. It would be nice to be able to check whether the
> filesystem is mounted before attempting to mount it, but I am not
> sure how to do this.
>
>    Are you using a cross-over cable for the private interconnect?
>
> Regards,
> Luis
>
> --- On Fri, 6/27/08, alexandra.strauss at bayerbbs.com wrote:
>
>     From: alexandra.strauss at bayerbbs.com <alexandra.strauss at bayerbbs.com>
>     Subject: [Ocfs2-users] CRS/CSS and OCFS2
>     To: ocfs2-users at oss.oracle.com
>     Date: Friday, June 27, 2008, 10:41 AM
>
>
>     Hello,
>
>     I am writing in the hope that you can help with our problem. We
>     have an issue here and opened an SR at Metalink, but so far we
>     have received no useful information towards solving it. The SR
>     number is 6855815.994...
>
>     We wanted to protect 9i single-instance databases with 10g
>     Clusterware, following the third-party-tool approach. No RAC
>     databases are involved, but we want high availability because
>     the databases are business-critical systems. The systems should
>     be able to relocate to another machine in case of failure to
>     keep downtime low. To achieve this we want to use OCFS2 for the
>     filesystem. Relocation is done by a script with the help of CRS.
>
>     So we took two systems (byaz05 and byaz10) and installed the
>     following software: 10g CRS (10.2.0.3) and Oracle Software 9.2.0.8
>     and OCFS2 1.2.8
>
>     We found the following Metalink notes and adjusted the heartbeat
>     and timeouts for OCFS2:
>     Metalink Note 395878.1: Heartbeat/Voting/Quorum Related Timeout
>     Configuration for Linux, OCFS2, RAC Stack to avoid unnecessary
>     node fencing, panic and reboot
>     Metalink Note 391771.1: OCFS2 - FREQUENTLY ASKED QUESTIONS (in
>     particular the section on fencing and quorum)
>     Metalink Note 434255.1: Common reasons for OCFS2 Kernel Panic or
>     Reboot Issues
>     Metalink Note 457423.1: OCFS2 Fencing, Network, and Disk Heartbeat
>     Timeout Configuration
>
>     We have made no changes to the CRS/CSS default settings so far.
>
>     During HA testing we observed unexpected behaviour of the system.
>     We deactivated the bond for the private interconnect and expected
>     only one node to go down. Instead, both nodes went down. It seems
>     to me that one node was rebooted by OCFS2 and the other one by
>     CRS/CSS.
>
>     Timestamp
>     ------------------------------------------------------------------
>
>     10:21:06                bond1 disabled (eth1)                        
>     */var/log/messages byaz05*
>     Apr 25 10:21:06 byaz05 kernel: bonding: bond1: link status
>     definitely down for interface eth1, disabling it
>     Apr 25 10:21:06 byaz05 kernel: bonding: bond1: making interface
>     eth5 the new active one.
>
>     10:21:09                bond1 disabled (eth5)        
>     */var/log/messages byaz05*
>     Apr 25 10:21:09 byaz05 kernel: bonding: bond1: link status
>     definitely down for interface eth5, disabling it
>     Apr 25 10:21:09 byaz05 kernel: bonding: bond1: now running without
>     any active interface !
>
>     10:21:23                o2net – no longer connected                
>     */var/log/messages byaz05*
>     Apr 25 10:21:23 byaz05 kernel: o2net: no longer connected to node
>     byaz10.bayer-ag.com (num 1) at 10.190.59.6:7777
>     */var/log/messages byaz10*
>     Apr 25 10:21:23 byaz10 kernel: o2net: no longer connected to node
>     byaz05.bayer-ag.com (num 0) at 10.190.59.5:7777
>
>     10:21:27                CSSD failure 134
>     10:21:29                Reboot initiated by CRS
>     */var/log/messages byaz05*
>     Apr 25 10:21:27 byaz05 logger: Oracle clsomon failed with fatal
>     status 12.
>     Apr 25 10:21:27 byaz05 logger: Oracle CSSD failure 134.
>     Apr 25 10:21:27 byaz05 su(pam_unix)[25839]: session closed for
>     user oracle
>     Apr 25 10:21:27 byaz05 logger: Oracle CRS failure.  Rebooting for
>     cluster integrity.
>     Apr 25 10:21:27 byaz05 kernel: md: stopping all md devices.
>     Apr 25 10:21:27 byaz05 kernel: md: md0 switched to read-only mode.
>     Apr 25 10:21:29 byaz05 logger: Oracle CRS failure.  Rebooting for
>     cluster integrity.
>     Apr 25 10:21:29 byaz05 kernel: e1000: eth2: e1000_watchdog_task:
>     NIC Link is Up 1000 Mbps Full Duplex
>     Apr 25 10:21:29 byaz05 logger: Oracle init script ceding reboot to
>     sibling 27383.
>
>     10:21:58                Reboot initiated by OCFS2(?)
>     */var/log/messages byaz10*
>     Apr 25 10:21:58 byaz10 su(pam_unix)[4595]: session opened for user
>     oracle by (uid=0)
>     Apr 25 10:21:58 byaz10 su(pam_unix)[4595]: session closed for user
>     oracle
>     Apr 25 10:25:58 byaz10 syslogd 1.4.1: restart.
>     Apr 25 10:25:58 byaz10 syslog: syslogd startup succeeded
>     Apr 25 10:25:58 byaz10 kernel: klogd 1.4.1, log source =
>     /proc/kmsg started.
>     Apr 25 10:25:58 byaz10 kernel: Bootdata ok (command line is ro
>     root=/dev/vgroot/_)
>
>
>     We have assumed all along that this is a timing problem, but we
>     do not know which settings cause it or which steps to take to
>     correct them. Otherwise we will have to rework the complete
>     concept for the business-critical systems.
>     Can anyone help me?
>
>     Regards,
>     Alexandra
>
>     Best Regards
>
>     Alexandra Strauss
>     _________________________________________
>
>     Fa. Opitz Consulting
>     Web: http://www.BayerBBS.com
>
>     Management: Chairman Andreas Resch   |   Labor Director
>     Norbert Fieseler
>     Chairman of the Supervisory Board: Klaus Kühn
>     Registered office: Leverkusen   |   Amtsgericht Köln, HRB 49895
>
>     _______________________________________________
>     Ocfs2-users mailing list
>     Ocfs2-users at oss.oracle.com
>     http://oss.oracle.com/mailman/listinfo/ocfs2-users
>
>
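
Since the quoted report suspects a timing problem, it may help to put the logged events on one clock. The sketch below uses only the timestamps from the quoted /var/log/messages excerpts and prints each event's offset in seconds from the moment bond1 lost its last interface (10:21:09). The year 2008 is an assumption, because syslog lines carry no year. The event descriptions are paraphrased, and the 10:21:58 entry is only the last thing byaz10 logged before its restart, not proof of when or why it was fenced.

```python
from datetime import datetime

# (timestamp, host, paraphrased event) taken from the quoted log excerpts.
EVENTS = [
    ("Apr 25 10:21:06", "byaz05", "bond1: eth1 down, eth5 becomes active"),
    ("Apr 25 10:21:09", "byaz05", "bond1: eth5 down, no active interface"),
    ("Apr 25 10:21:23", "both", "o2net: no longer connected to peer node"),
    ("Apr 25 10:21:27", "byaz05", "CSSD failure 134, CRS reboots the node"),
    ("Apr 25 10:21:58", "byaz10", "last log entries before the restart"),
    ("Apr 25 10:25:58", "byaz10", "syslogd restart after reboot"),
]


def parse_stamp(stamp, year=2008):
    """Parse a syslog-style timestamp; the year is assumed, not logged."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def offsets(events, reference_index=1):
    """Return (seconds since the reference event, host, description)."""
    reference = parse_stamp(events[reference_index][0])
    result = []
    for stamp, host, description in events:
        delta = parse_stamp(stamp) - reference
        result.append((int(delta.total_seconds()), host, description))
    return result


if __name__ == "__main__":
    for seconds, host, description in offsets(EVENTS):
        print(f"{seconds:+5d}s  {host:<7} {description}")
```

Comparing these offsets with the configured OCFS2 network and disk heartbeat timeouts and with the CSS settings is one way to see which timer could have expired first. The sketch itself does not determine that.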
