[Ocfs2-users] Support and Stability

Brad Plant bplant at iinet.net.au
Mon May 24 15:02:39 PDT 2010


Hi All,

Firstly, I want to say that I very much appreciate all the time that
the Oracle devs have spent on ocfs2. They are always quite prompt in
responding to issues on the mailing list, which I'm personally
grateful for.

On Mon, 24 May 2010 13:32:47 -0400
Michael Austin <onedbguru at gmail.com> wrote:

> I would like to get some feedback on the overall perception on the
> support and stability of OCFS2 (latest).  This tool looks like a
> perfect fit for a production system I am planning, but, due to its
> open source roots, there are some concerns about s&s.  The app will
> be deemed mission critical with very little tolerance for any
> downtime (24x365).

As Brian Kroth mentioned, the disk free fragmentation issue can really
bite you. My understanding is that if the FS is created with
ocfs2-tools 1.4.4 (the latest release), then you shouldn't hit the
issue. I've created a new FS on our dev platform to test it, but I'm
currently adding the "time" ingredient.

I don't like saying it, but I've had some pretty major issues with
stability when the nodes are under heavy load (both disk and cpu). I
had a web app that was reading and writing cache data to/from the ocfs2
FS, and that seemed to trigger trace-less node crashes/reboots after an
hour or so. I was able to reproduce this with any vanilla kernel in the
2.6.27-2.6.32 range. I haven't tried 2.6.33 or 2.6.34, but I'm not
aware of any fixes.

I also found that with 2.6.32, killing one node would result in a BUG() on
the other node during the recovery. This didn't happen with 2.6.30
and before (not sure about 2.6.31). See my bz here:
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1221

On the positive side though, I have found centos 5 (with the rpms
from oss.oracle.com) to be rock solid. The only time that
I've had a centos 5 kernel crash is when I've also been mounting the FS
with a mainline (2.6.27-2.6.32) kernel. If the mainline kernel crashed,
the centos kernel would *sometimes* also go down. The setup hasn't
faulted since I switched to using only centos 5 kernels (maybe because
they don't crash in the first place!).

I think ocfs2 is a great product that is very easy to use (compared to
GFS for example which requires you to configure fencing). My
recommendation would be to stick with centos 5 and do some heavy (for
24+ hours) testing before deployment.
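
A soak test of this kind could be sketched roughly as follows. This is a
hypothetical illustration only, not the workload described above: the
function name soak_test, its parameters, and the file naming scheme are
all assumptions. Each worker thread repeatedly writes a file, syncs it,
reads it back, and compares checksums, which loosely mimics an app
reading and writing cache data.

```python
import hashlib
import os
import socket
import threading
import time


def soak_test(directory, duration_s, workers=4, file_size=64 * 1024):
    """Hypothetical write/read/verify loop; returns (iterations, errors)."""
    deadline = time.monotonic() + duration_s
    counts = [0] * workers   # iterations completed per worker
    errors = []              # (worker index, iteration) of any mismatch
    # Host name and pid keep file names unique when several nodes
    # run the test against the same shared directory.
    prefix = f"soak-{socket.gethostname()}-{os.getpid()}"

    def worker(idx):
        path = os.path.join(directory, f"{prefix}-{idx}.bin")
        while True:
            payload = os.urandom(file_size)
            with open(path, "wb") as fh:
                fh.write(payload)
                fh.flush()
                os.fsync(fh.fileno())  # push the data out to the disk
            with open(path, "rb") as fh:
                data = fh.read()
            if hashlib.sha256(data).digest() != hashlib.sha256(payload).digest():
                errors.append((idx, counts[idx]))
            counts[idx] += 1
            if time.monotonic() >= deadline:
                break
        os.remove(path)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts), errors
```

For example, soak_test("/mnt/ocfs2/soak", 24 * 3600) could be started on
every node at the same time, where the path is just a placeholder for a
directory on the shared FS. A script like this only generates disk load
and catches data mismatches; a node that crashes or reboots will simply
stop running it, so the nodes still need to be watched from outside.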

Cheers,

Brad



