[Ocfs2-users] ocfs dmesg and fsck errors

Bob Ziuchkovski bziuchkovski at subscribermail.com
Mon Jan 5 16:24:45 PST 2009


Hi All,

I'm trying to move my company from our current frenzy of rsyncing 
towards ocfs2.  I've deployed ocfs2 on a few test servers and in general 
things seem to be working.  However, I've run into a couple of problems 
and wanted to run them by this mailing list.

Earlier today I encountered errors that at first appeared to be 
permission errors.  However, when I checked dmesg output, I saw the 
following entries repeated over and over:

(20830,0):ocfs2_mknod:351 ERROR: status = -2
(20830,0):ocfs2_check_dir_entry:1727 ERROR: bad entry in directory
#12862125: rec_len is smaller than minimal - offset=2588672,
inode=1099511657728, rec_len=0, name_len=0

I'm not exactly sure what this means.  Is there a way for me to 
determine the path to the directory and/or file referenced above?
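
For reference, one generic, filesystem-agnostic way to map an inode number
back to a path is to walk the mounted tree and compare inode numbers, which
is essentially what `find -inum` does.  A minimal sketch follows.  It
assumes the number after `#` in the dmesg entry is the same inode number
that stat reports on the mounted volume (not confirmed here for ocfs2), and
the function name is made up for illustration:

```python
import os

def find_paths_by_inode(root, inode):
    """Yield every path under root whose inode number equals inode.

    Stays on the filesystem that root lives on, like `find -xdev -inum`.
    """
    root_stat = os.lstat(root)
    root_dev = root_stat.st_dev
    if root_stat.st_ino == inode:
        yield root
    # Top-down walk so that other mount points can be pruned from dirnames.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in list(dirnames) + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # unreadable or vanished entry; skip it
            if st.st_dev != root_dev:
                if name in dirnames:
                    dirnames.remove(name)  # do not descend into other mounts
                continue
            if st.st_ino == inode:
                yield path
```

With a hypothetical mount point, `list(find_paths_by_inode("/mnt/ocfs2",
12862125))` would list the matching paths.  On a volume with millions of
files this walk is slow, and entries inside a damaged directory block may
simply not be returned.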

Since this seemed like it might be fs corruption, I ran the fsck.ocfs2 
utility, but in read-only mode.  I ended up with output that looks like 
the following:

Pass 0a: Checking cluster allocation chains
Pass 0b: Checking inode allocation chains
Pass 0c: Checking extent block allocation chains
Pass 1: Checking inodes and blocks.
o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate
cluster 4387633
o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate
cluster 4387634
o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate
cluster 4387635
o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate
cluster 4387636
<-----------------SNIP Similar-------------------->
Pass 2: Checking directory entries.
[DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 632 
physical block 35102672 offset 0. Attempt to repair this block's 
directory entries? n
[DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 633 
physical block 35102673 offset 0. Attempt to repair this block's 
directory entries? n
[DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 634 
physical block 35102674 offset 0. Attempt to repair this block's 
directory entries? n
[DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 635 
physical block 35102675 offset 0. Attempt to repair this block's 
directory entries? n
[DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 636 
physical block 35102676 offset 0. Attempt to repair this block's 
directory entries? n
[DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 637 
physical block 35102677 offset 0. Attempt to repair this block's 
directory entries? n
[DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 638 
physical block 35102678 offset 0. Attempt to repair this block's 
directory entries? n
[DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 639 
physical block 35102679 offset 0. Attempt to repair this block's 
directory entries? n
Pass 3: Checking directory connectivity.
Pass 4a: checking for orphaned inodes
Pass 4b: Checking inodes link counts.
[INODE_NOT_CONNECTED] Inode 0 isn't referenced by any directory entries. 
  Move it to lost+found? n
[INODE_COUNT] Inode 31231457 has a link count of 1 on disk but directory 
entry references come to 0. Update the count on disk
  to match? n
[INODE_NOT_CONNECTED] Inode 31231457 isn't referenced by any directory 
entries.  Move it to lost+found? n
[INODE_COUNT] Inode 31231458 has a link count of 1 on disk but directory 
entry references come to 0. Update the count on disk
  to match? n
[INODE_NOT_CONNECTED] Inode 31231458 isn't referenced by any directory 
entries.  Move it to lost+found? n
[INODE_COUNT] Inode 31231459 has a link count of 1 on disk but directory 
entry references come to 0. Update the count on disk
  to match? n
[INODE_NOT_CONNECTED] Inode 31231459 isn't referenced by any directory 
entries.  Move it to lost+found? n
[INODE_COUNT] Inode 31231460 has a link count of 1 on disk but directory 
entry references come to 0. Update the count on disk
  to match? n
[INODE_NOT_CONNECTED] Inode 31231460 isn't referenced by any directory 
entries.  Move it to lost+found? n
[INODE_COUNT] Inode 31231461 has a link count of 1 on disk but directory 
entry references come to 0. Update the count on disk
  to match? n
<-----------------SNIP Similar----------------------->
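
The dmesg entry and the first DIRENT_LENGTH report appear to point at the
same spot in directory 12862125.  A quick arithmetic check, as a sketch
only: it assumes a 4096-byte filesystem block size (the block size is not
stated anywhere above) and that the dmesg offset= value is a byte offset
into the directory.

```python
BLOCK_SIZE = 4096  # assumed block size, not confirmed from the volume

def logical_block(byte_offset, block_size=BLOCK_SIZE):
    """Return (logical block index, offset within that block)."""
    return divmod(byte_offset, block_size)

dmesg_offset = 2588672  # offset= value from the ocfs2_check_dir_entry message
block, within = logical_block(dmesg_offset)
print(block, within)  # 632 0
```

Under those assumptions the kernel message lands on logical block 632 at
offset 0, which is the first block fsck.ocfs2 flags in Pass 2.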

As far as I know, none of the nodes that are running ocfs2 have actually 
crashed and I created the filesystems just last week.  One thing I 
should mention, though, is that the filesystem in question has about 3.3 
million small files, 800k of which are contained within a single flat 
directory -- I know, it's terrible... I've inherited this mess from
previous admins.  Additionally, I rsync'ed the files to the ocfs2
volume from one of our existing servers.  I have never been able to fsck
the filesystem of the existing server without errors, and fixing the
errors generally leads to a bunch of the small files being unlinked and
moved to lost+found.  My impression is that rsync reads things at a
high-enough level that this shouldn't duplicate filesystem errors on a 
target volume, but maybe I'm wrong.

Anyway, any help that could be offered would be greatly appreciated. 
I'm really trying to fix the filesystem mess I've inherited, but I get 
the impression it will be an arduous task.  :)  In terms of package 
information, we're running RHEL 4u7 on x86_64 with the following packages
installed: ocfs2-2.6.9-55.0.9.ELsmp-1.2.9-1.el4 and 
ocfs2-tools-1.2.7-1.el4.  Thanks!

Bob Ziuchkovski


