[Ocfs2-users] Filesystem corruption and OCFS2 errors

Christian van Barneveld c.van.barneveld at zx.nl
Wed May 20 14:16:23 PDT 2009


Hi Sunil,

Thank you for your reply.

> When one issues a rm, we first remove the directory entry and add a
> corresponding entry in the orphan dir. Then we delete... free all the extents,
> the inode bit, etc.
> In your case, the dir entry is there but the inode bit is free. Also,
> that inode is still in the orphan dir. Did you by chance save the fsck output?
> Wanted to see if all the inodes were co-located... meaning the error stemmed
> from a corrupt inode alloc bitmap.

No, I don't have the full output, but I still have the snapshots I made before the fsck. I mounted one on a different server and ran a (read-only) fsck. See the attached output.
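
For reference, the read-only check I ran was along these lines (the device path
is just a placeholder for the snapshot mapping on that server):

  # -f forces a full check, -n answers "no" to every question, so nothing is written
  fsck.ocfs2 -f -n /dev/mapper/snap-ocfs2-data > OCFS2-fsck.txt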

> No, you did not give the wrong answer. More specifically, your answer
> did not cause the problem. That yes only set the size in the superblock to the
> same value as in the global bitmap. That's harmless. The question is why that
> value in the global bitmap was so wrong. And this is one value we don't
> touch.... other than during resize. And we don't allow shrinking. So the size
> in the global bitmap should never be smaller than the one in the superblock.

I think LVM has messed up the filesystem. I also mapped the snapshot of this corrupted filesystem to a different server, and there too fdisk -l reports 500 GB instead of 2.5 TB...
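
To rule out a reporting quirk I'll also compare what the block layer and the LVM
metadata say on both servers, roughly like this (volume names are placeholders
for our setup):

  # device size in bytes as seen by the block layer
  blockdev --getsize64 /dev/mapper/vg_data-lv_ocfs2
  # size according to the LVM metadata
  lvdisplay --units g /dev/vg_data/lv_ocfs2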

> If fdisk is saying the device is 485G, then that's what the other tools
> will see.
> And this appears to be the root cause of your problem. LVM. There is a
> reason why we don't support LVM.

Ok, this was not clear to the person who installed this cluster. How can I avoid corruption until the migration to a non-LVM or CLVM setup is finished?
Should I only mount the OCFS2 filesystems on one node?

> Best solution is to salvage your data using debugfs.ocfs2. It has commands
> like dump/rdump that read the files directly off the disk.

I'll do that, thanks.
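
For the archive, what I plan to run is roughly the following (device and
directory names are only examples for our setup):

  # copy a whole directory tree straight off the unmounted volume
  debugfs.ocfs2 -R "rdump /exports/projects /mnt/salvage" /dev/mapper/snap-ocfs2-data

  # or dump a single file
  debugfs.ocfs2 -R "dump /exports/projects/report.txt /mnt/salvage/report.txt" /dev/mapper/snap-ocfs2-data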

> > May 15 23:47:31 fileserver-1 kernel: (14610,1):ocfs2_read_locked_inode:466 ERROR: status = -22
> > May 15 23:47:31 fileserver-1 kernel: dm-17: rw=0, want=6635799728, limit=419422208

> What this is saying is that the disk size has shrunk. It is trying to read 6G into
> the volume but the block layer is saying that the device is 500M only. You have to
> look into your block device setup.

Weird... is that also LVM-related?!
This is a 4 TB filesystem built from 2 x 2 TB LUNs.
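
I'll also check how the logical volume is stacked on the two LUNs, to see whether
the LVM metadata still matches the LUN sizes (VG and LV names are placeholders):

  # physical volume sizes vs. the sizes of the underlying devices
  pvs -o pv_name,pv_size,dev_size
  # which physical volumes each logical volume spans
  lvs -o lv_name,lv_size,devices vg_data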

> I would avoid LVM. We are working towards supporting CLVM but we don't do it
> today. I mean using CLVM would be better. Actually, if you have-to-have-to use
> LVM, use sles11 ha ext. It will have proper ocfs2/clvm support.

Last year our storage only supported LUNs of max 2 TB. We needed a 12 TB filesystem, so we built it with LVM.
I will investigate whether it's possible to create one 12 TB LUN with the current storage. Our mainstream Linux flavor is Debian; we don't use SLES at the moment.
Lenny also supports CLVM, but is that as well supported as the SLES HA extension? Can you tell me more about the state of full CLVM support with OCFS2?

I will try to avoid (C)LVM if possible, but the customer needs the flexibility to add storage to the filesystem.

> I am not sure as to why 2.6.29.3 crashed and 2.6.25 worked. The error reported
> by 2.6.29.3 should have shown up with 2.6.25 too. Just for the record -
> we did run the full fs regression with 2.6.29-stock kernel.

Maybe 2.6.25 ignored something and 2.6.29 has better error handling?
I'll set up a test environment as soon as possible with the same configuration but without LVM: Debian Lenny, a 2.6.29 kernel, and OCFS2 with NFS exports. Hopefully that will be a stable setup for the new production environment.

Regards,
Christian
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: OCFS2-fsck.txt
Url: http://oss.oracle.com/pipermail/ocfs2-users/attachments/20090520/989e95ce/attachment-0001.txt 

