[Ocfs2-users] Re: FW: Use of OCFS2 file systems.
Galan Merchan, Martin
martin.galan at t-systems.es
Wed Oct 4 00:37:10 PDT 2006
Hello,
I'm working with OCFS2 on Red Hat Advanced Server 4 Patch 3 and I have had kernel panics too. I use OCFS2 only for RAC archive logs and RMAN backups.
I'm testing one solution, and it seems to work fine:
In /etc/ocfs2/cluster.conf I have replaced the public IPs with the heartbeat IPs (parameter ip_address), while keeping the node names.
Has anyone tried this solution and tested it under failure conditions?
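As a rough illustration of that change, a node stanza in /etc/ocfs2/cluster.conf might look like the sketch below. The node name, cluster name, port, and addresses are hypothetical placeholders, not values from this setup. The only edit is to ip_address, which here moves from an assumed public address (192.0.2.11) to an assumed address on the heartbeat network (10.0.0.11); name is left as it was. Parameter lines are indented with a tab.

```
node:
	ip_port = 7777
	ip_address = 10.0.0.11
	number = 0
	name = node1
	cluster = ocfs2
```

The file would presumably need the same edit for every node stanza and the same contents on every node, with the cluster stack restarted afterwards.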
Regards from Spain,
MARTÍN
-----Original Message-----
From: ocfs2-users-bounces at oss.oracle.com [mailto:ocfs2-users-bounces at oss.oracle.com] On Behalf Of Alexei_Roudnev
Sent: Wednesday, October 04, 2006 0:49
To: Sunil Mushran; ocfs2-users
Subject: Re: [Ocfs2-users] Re: FW: Use of OCFS2 file systems.
Unfortunately, it MAKES THE CLUSTER LESS STABLE. It works as long as the
network and SAN systems are fine, but it is not so good in failure
situations.
Even if we use OCFSv2 for idle file systems (which do nothing 90% of the
time), o2cb reboots nodes when it loses the heartbeat or (worse) the
network or (even worse) both, instead of trying to recover without a
reboot (as I said, the FS is in a consistent state, with no activity at
all).
It is not just an OCFSv2 problem: Oracle CSS behaves similarly (but is much
more stable in reality), and so does a Linux HA cluster (but it can use
several different heartbeat connections, so it can be configured to be
very reliable).
You are right in saying that _cluster software always has a tendency to
fence or kill neighbours to keep internal consistency_. But OCFSv2 is one
of the worst examples of such software.
What can be done _relatively easily_:
(1) As we have said many times: redundancy and better timeout control in
the heartbeat. (Of course, long timeouts mean _long recovery_, but that is
OK for 90% of installations.) Typical network recovery takes 1 minute, not
10 seconds.
(2) The system should not do bad things IF it is in a consistent state. In
many cases, if the system has no outstanding IO requests, it can recover
without a server reboot (or at least try to) even if it has lost heartbeats
and suspects that other systems could have taken control away from it.
How to do this safely is a serious theoretical challenge, but it is very
desirable for such systems.
(3) In some configurations, the FS can be treated as _not so important_.
This means that it is safer to switch to read-only and try to recover
online rather than panic. A good example: you have a production Oracle
database which uses ASM, and you use OCFSv2 for backup storage. It is safer
to fail IO operations on this storage than to reboot the system without
reason.
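The timeout trade-off in point (1) can be made concrete with a little arithmetic. The sketch below assumes the rule given in the OCFS2 FAQ for the disk heartbeat, namely timeout in seconds = (O2CB_HEARTBEAT_THRESHOLD - 1) * 2; the function names are made up for illustration. It covers only the disk heartbeat, not the separate network timeout.

```python
import math


def heartbeat_timeout_seconds(threshold: int) -> int:
    """Disk heartbeat timeout implied by a threshold value.

    Assumes the FAQ rule: timeout = (threshold - 1) * 2 seconds.
    """
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * 2


def threshold_for_timeout(timeout_seconds: float) -> int:
    """Smallest threshold whose timeout is at least timeout_seconds."""
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    return math.ceil(timeout_seconds / 2) + 1


if __name__ == "__main__":
    # Outage lengths mentioned in this thread: 10 s, 30 s, and 1 minute.
    for outage in (10, 30, 60):
        threshold = threshold_for_timeout(outage)
        timeout = heartbeat_timeout_seconds(threshold)
        print(f"{outage:>3}s outage: threshold >= {threshold} "
              f"(timeout {timeout}s)")
```

Under that assumption, riding out a one-minute outage needs a threshold of at least 31, at the cost of waiting up to a minute before a genuinely dead node is detected.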
PS. I had 2 network outages in the lab today because of a bad UPS, and in
all cases ALL OCFSv2 servers (in 2 different clusters) rebooted. None
survived a short (30 seconds) loss of the Ethernet connection (including
iSCSI). In some cases, one server was rebooted by OCFS and the other by
another part of the cluster (HA or RAC), but the result is exactly this:
_all_ OCFSv2 servers panic on a short network/SAN outage, in all cases.
----- Original Message -----
From: "Sunil Mushran" <Sunil.Mushran at oracle.com>
To: "ocfs2-users" <ocfs2-users at oss.oracle.com>
Sent: Tuesday, October 03, 2006 1:51 PM
Subject: [Ocfs2-users] Re: FW: Use of OCFS2 file systems.
> I try to avoid responding to such emails because I am not sure how
> much credibility a partisan has in such debates. After all I have been
> working on OCFS/OCFS2 the last 4/5 years.
>
> Having said that, I have some issues with the statements. While it is true
> that we can improve on the disk/net heartbeat, it is wrong to say that it
> does not work or makes the cluster unstable.
>
> We have OCFS2 running on lots of clusters in Oracle that are testing each
> new revision of the database. While these machines are test boxes, they
> are all running loads designed to break Oracle. I am rarely pinged about
> them hitting an OCFS2 issue.
>
> We also have internal production databases as well as Oracle customers who
> are using OCFS2 with much success.
>
> However, we do have room for improvement and we are working on it.
>
> For the list of ongoing projects, you can peruse the OCFS2 Development
> Wiki at http://oss.oracle.com/osswiki/OCFS2.
>
> If you wish to contribute code, as this is an open source project, feel
> free to ping me or the ocfs2-devel at oss.oracle.com mailing list.
>
> Thanks
> Sunil Mushran
>
> >
> > Hi Sunil,
> >
> > What are your thoughts about this message on the mailing lists?
> >
> > Thanks!
> > Sanjeet
> >
> >
> > ------------------------------------------------------------------------
> >
> > *From:* ocfs2-users-bounces at oss.oracle.com
> > [mailto:ocfs2-users-bounces at oss.oracle.com] *On Behalf Of*
> > Alexei_Roudnev
> > *Sent:* Friday, September 29, 2006 11:50 PM
> > *To:* Bill Wells; Sunil Mushran
> > *Cc:* ocfs2-users at oss.oracle.com
> > *Subject:* Re: [Ocfs2-users] Use of OCFS2 file systems.
> >
> >
> >
> > If you can avoid OCFSv2 on a RAC server, better do it. Any cluster
> > (RAC and OCFS) has its own instability elements (OCFSv2 has a poor
> > heartbeat algorithm and so tends to self-fence without a real failure,
> > and, in addition, is relatively new). It works well enough to be used
> > when you really need file sharing (such as database files or backups
> > or even archive logs), but the less you use it, the better. Oracle
> > home files do fine without sharing.
> >
> >
> >
> > // I don't see problems with OCFSv2 on an updated SLES9 SP3, but I avoid
> > using it for mission-critical or heavy-duty file systems,
> >
> > // and I still have failure scenarios where the RAC cluster could keep
> > working but OCFS causes a full-cluster failure.
> >
> > // If you have a network problem, a SAN system restart, a disk IO error,
> > etc., you can end up with a system panic or reboot caused by OCFS,
> >
> > // so the less OCFS you have, the better your system stability is.
> >
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>