[Ocfs2-users] Catatonic nodes under SLES10

Mark Fasheh mark.fasheh at oracle.com
Tue Apr 10 09:28:32 PDT 2007


On Mon, Apr 09, 2007 at 08:47:42PM -0700, Alexei_Roudnev wrote:
> It is all true when the system _READS_, but not when it is just 'get all
> buffers and sit quiet'.

Ocfs2 (and other cluster file systems) cache locks, as well as other cluster
data. So there's really never a time when it's "silent" with respect to the
cluster.


> I mean that it is possible to find quiet states, when the cluster can be
> remounted without any harm. Even with the reads, but more likely when there
> is not any activity on the file system for a while.

It's not at all about what your past activity was like. We fence to prevent
future activity.


> Btw, why did OCFSv1 not have such problems? It sacrificed functionality
> (worked with Oracle only), but it bought so much stability that such a mode
> can be extremely useful for OCFSv2 too.

Ocfs (version 1) did not run as a tightly coupled cluster. It ensured
cluster integrity by communicating via disk I/O (network messaging was added
later for a subset of messages) and by ordered writeout of object metadata.
The way it made that work while avoiding extremely complicated dependencies
between objects was by making said writeout synchronous.
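
To make "ordered, synchronous writeout" concrete, here is a rough user-space
sketch of the idea. It is not Ocfs code (Ocfs v1 lived in the kernel); the
names write_sync and ordered_writeout are hypothetical, made up purely for
illustration. Each update is forced to stable storage before the next one
starts, so no dependency tracking between objects is needed.

```python
import os


def write_sync(path, data):
    """Write data to path and block until it has been flushed to disk."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the kernel
        os.fsync(f.fileno())   # ask the kernel to push it to stable storage


def ordered_writeout(updates):
    """Apply (path, data) updates strictly in the order given.

    Every write completes before the next begins, so a later update can
    never reach disk ahead of an earlier one it depends on.
    Returns the paths in the order they were made durable.
    """
    completed = []
    for path, data in updates:
        write_sync(path, data)
        completed.append(path)
    return completed
```

The cost is visible in the loop: every update waits for a full disk flush,
which is one reason this kind of design is simple but slow.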

If all you had was some Oracle data files, this worked well. Feature-wise
though, it can barely be called a file system. Data coherency, for example,
is nonexistent. Hard links don't work. Performance in particular is
extremely poor. I could go on for hours. The bottom line is that what you
misleadingly call a "mode" is actually an intentionally crippled,
micro-focused file system design that permeates from the user interface down
to the disk layout. *


Now, Alexei. You've managed to hijack another thread and switch our focus
from helping users to refuting another of your incessant fencing flames.
Well, you've got my attention. But there's a price. I want to know when this
will end. When can someone on this list mention "fencing" without provoking
a long ranty e-mail from you about how we should have done things in the
first place?

Despite the tone of my last paragraph, I don't want to drive you from this
list, because you certainly provide good input from time to time. But you
should understand that the goal of this list is to help our users, and every
time you start one of these threads you distract us from that goal. So we
take it very seriously.

I propose that you file a bugzilla, which is the formal way of requesting
changes in the file system. The default owner is Sunil, but please feel free
to add my e-mail address as a CC. We will both automatically get an e-mail
when that bug is updated. You can and should put your thoughts in there.
Though I cannot promise any timetables, I can certainly say that we'll
update that bug as we proceed on fencing issues. Trust me, we want to make a
better cluster stack too and fencing is high on our list of things to make
better.

These threads need to stop, now.
	--Mark


* Despite how I may sound right now, we're actually very proud of Ocfs v1.
  It served a stated purpose quite well, which IMHO is more than most
  software can claim.

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh at oracle.com


