[Ocfs2-users] Filesystem corruption and OCFS2 errors

Sunil Mushran sunil.mushran at oracle.com
Wed May 20 15:36:52 PDT 2009


Christian van Barneveld wrote:
> No, I don't have the full output, but I still have the snapshots that I made before the fsck. I mounted one on a different server and ran a (read-only) fsck. See the attached output.

The output shows i/o errors. It is unable to read blocks beyond
a certain offset. Consistent with the LVM problem.
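One way to confirm this outside of fsck (a sketch; the device path is an assumption, not taken from this thread) is to probe reads past the suspect offset with dd:

```shell
#!/bin/sh
# Hedged sketch: check whether the device is readable past a given offset,
# the way fsck is failing here. The device path is hypothetical.
DEV=/dev/mapper/vg-ocfs2lv   # hypothetical device path

# Try to read 1 MB starting at the given MB offset and report the result.
probe() {
  dev=$1
  offset_mb=$2
  bytes=$(dd if="$dev" bs=1M count=1 skip="$offset_mb" 2>/dev/null | wc -c | tr -d ' ')
  if [ "$bytes" -gt 0 ]; then
    echo "readable at ${offset_mb} MB"
  else
    echo "unreadable at ${offset_mb} MB"
  fi
}

# e.g. probe "$DEV" 512000   # just past the 500 GB boundary
```

If reads succeed below the boundary and return nothing above it, the block device itself, not the filesystem, is truncated.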

> I think LVM has messed up the filesystem. I also mapped the snapshot of this corrupted filesystem to a different server, and there too fdisk -l reports 500 GB instead of 2.5 TB...

Yes, it appears to be messed up.
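A quick sanity check for this kind of shrinkage (a sketch; the device path and the expected size are illustrative assumptions) is to compare the size a node reports against the size the volume was created with, before mounting:

```shell
#!/bin/sh
# Hedged sketch: verify the size a node sees for the shared volume before
# mounting it. Device path and expected size are illustrative assumptions.
DEV=/dev/mapper/vg-ocfs2lv      # hypothetical device path
EXPECTED=2748779069440          # 2.5 TiB in bytes (illustrative)

# Compare the reported byte count against the expected one.
check_size() {
  if [ "$1" -eq "$2" ]; then
    echo "OK: size matches"
  else
    echo "MISMATCH: got $1 bytes, expected $2"
    return 1
  fi
}

# On a real node:  check_size "$(blockdev --getsize64 "$DEV")" "$EXPECTED"
```

Running this from a mount script would have refused the 500 GB view of the 2.5 TB volume before ocfs2 ever touched it.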

> Ok, this was not clear to the person who installed this cluster. How can I avoid corruption until the migration to a non-LVM or CLVM setup is finished?
> Should I only mount the OCFS2 filesystems on one node?

Well, as long as the LVM mappings remain consistent on all nodes,
it will work. The problem is that if someone changes the setup on one
node, you will hit the problem you just did. The only safe way is to
cluster the LVM layer too. While clvm is clustered, we would prefer
to support it once the filesystem and clvm can run on a single cluster
stack. SLES11 HAE will have support for this. We hope to have the same
by (RH)EL6.
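Until clustered LVM is in place, the mappings can at least be compared by hand (a sketch; the node names, LV name, and passwordless ssh access are assumptions): if the `dmsetup table` output differs between nodes, the volume should not be mounted there.

```shell
#!/bin/sh
# Hedged sketch: compare the device-mapper table for the shared LV on
# every node. Node names and LV name are hypothetical.
LV=vg-ocfs2lv
NODES="node1 node2 node3"

# Two nodes agree only when their dmsetup tables are byte-identical.
tables_match() {
  [ "$1" = "$2" ]
}

check_cluster() {
  ref=$(ssh "${NODES%% *}" dmsetup table "$LV") || return 1
  for n in $NODES; do
    cur=$(ssh "$n" dmsetup table "$LV") || return 1
    if tables_match "$ref" "$cur"; then
      echo "$n: OK"
    else
      echo "$n: MISMATCH - do not mount OCFS2 here"
    fi
  done
}

# Guarded so the helpers can be sourced without ssh access:
if [ "${1:-}" = "run" ]; then check_cluster; fi
```

This does not replace clvm, since it only catches drift at the moment you run it, but it makes the "someone changed one node" failure visible before mount time.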

> Weird...also LVM related?!
> This is a 4 TB filesystem built from 2 x 2 TB LUNs.

Appears that way.

> Maybe 2.6.25 ignores something and 2.6.29 has better error handling?
> I'll set up a test environment as soon as possible with the same configuration but without LVM: Debian Lenny, a 2.6.29 kernel, and OCFS2 with NFS
> exports. Hopefully that's a stable setup for the new production environment.
>   


