[Ocfs2-users] ocfs dmesg and fsck errors
Bob Ziuchkovski
bziuchkovski at subscribermail.com
Mon Jan 5 21:41:51 PST 2009
Thank you for the response, that was very helpful! It appears most of
the errors were easily fixed by the fsck, but one thing worries me --
I'm still getting all of the 'Internal logic failure !! duplicate
cluster' messages. Is this a problem?
Bob
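
For anyone hitting the same dmesg errors, here is a minimal Python sketch of the lookup step from Sunil's reply quoted below. It pulls the directory inode number out of the ocfs2_check_dir_entry messages and prints the matching debugfs.ocfs2 findpath command. The regex is an assumption based only on the dmesg lines in this thread, the function names are made up for illustration, and /dev/sdX is the same placeholder device used in the reply. The script only builds the command string; it does not run anything against the volume.

```python
import re

# Pattern assumed from the dmesg lines quoted in this thread; other ocfs2
# versions may word the message differently.
DIR_ERR = re.compile(
    r"ocfs2_check_dir_entry:\d+ ERROR: bad entry in directory\s+#(\d+)"
)


def corrupted_dir_inodes(dmesg_text):
    """Return the distinct directory inode numbers, in first-seen order."""
    # Collapse whitespace so entries wrapped across lines still match.
    flat = " ".join(dmesg_text.split())
    seen = []
    for match in DIR_ERR.finditer(flat):
        inode = int(match.group(1))
        if inode not in seen:
            seen.append(inode)
    return seen


def findpath_command(inode, device="/dev/sdX"):
    """Build (but do not run) the debugfs.ocfs2 command from the reply."""
    return 'debugfs.ocfs2 -R "findpath <%d>" %s' % (inode, device)


if __name__ == "__main__":
    sample = (
        "(20830,0):ocfs2_mknod:351 ERROR: status = -2\n"
        "(20830,0):ocfs2_check_dir_entry:1727 ERROR: bad entry in directory\n"
        "#12862125: rec_len is smaller than minimal\n"
    )
    for inode in corrupted_dir_inodes(sample):
        print(findpath_command(inode))
```

In practice the input would be saved dmesg output rather than the inline sample, and the device argument would be the real ocfs2 volume.
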
Sunil Mushran wrote:
> A directory is corrupted. To get the name of the dir, do:
>
> $ debugfs.ocfs2 -R "findpath <12862125>" /dev/sdX
>
> It may take time as it will have to traverse the dirs.
>
> To fix it, you will have to run fsck in read-write mode:
>
> $ fsck.ocfs2 -fy /dev/sdX
>
> That the nodes do not crash is as expected. Dir corruption is always
> localized and only rears up as an error.
>
> 800K files in one dir is not efficient since the current version does not
> support indexed dirs. We hope to add support for the same in the near term.
>
> Sunil
>
>
> Bob Ziuchkovski wrote:
>> Hi All,
>>
>> I'm trying to move my company from our current frenzy of rsyncing
>> towards ocfs2. I've deployed ocfs2 on a few test servers and in
>> general things seem to be working. However, I've run into a couple of
>> problems and wanted to run them by this mailing list.
>>
>> Earlier today I encountered errors that at first appeared to be
>> permission errors. However, when I checked dmesg output, I saw the
>> following entries repeated over and over:
>>
>> (20830,0):ocfs2_mknod:351 ERROR: status = -2
>> (20830,0):ocfs2_check_dir_entry:1727 ERROR: bad entry in directory
>> #12862125: rec_len is smaller than minimal - offset=2588672,
>> inode=1099511657728, rec_len=0, name_len=0
>>
>> I'm not exactly sure what this means. Is there a way for me to
>> determine the path to the directory and/or file referenced above?
>>
>> Since this seemed like it might be fs corruption, I ran the fsck.ocfs2
>> utility, but in read-only mode. I ended up with output that looks
>> like the following:
>>
>> Pass 0a: Checking cluster allocation chains
>> Pass 0b: Checking inode allocation chains
>> Pass 0c: Checking extent block allocation chains
>> Pass 1: Checking inodes and blocks.
>> o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate
>> cluster 4387633
>> o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate
>> cluster 4387634
>> o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate
>> cluster 4387635
>> o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate
>> cluster 4387636
>> <-----------------SNIP Similar-------------------->
>> Pass 2: Checking directory entries.
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 632 physical block 35102672 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 633 physical block 35102673 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 634 physical block 35102674 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 635 physical block 35102675 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 636 physical block 35102676 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 637 physical block 35102677 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 638 physical block 35102678 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 639 physical block 35102679 offset 0. Attempt to repair this block's
>> directory entries? n
>> Pass 3: Checking directory connectivity.
>> Pass 4a: checking for orphaned inodes
>> Pass 4b: Checking inodes link counts.
>> [INODE_NOT_CONNECTED] Inode 0 isn't referenced by any directory
>> entries. Move it to lost+found? n
>> [INODE_COUNT] Inode 31231457 has a link count of 1 on disk but
>> directory entry references come to 0. Update the count on disk
>> to match? n
>> [INODE_NOT_CONNECTED] Inode 31231457 isn't referenced by any directory
>> entries. Move it to lost+found? n
>> [INODE_COUNT] Inode 31231458 has a link count of 1 on disk but
>> directory entry references come to 0. Update the count on disk
>> to match? n
>> [INODE_NOT_CONNECTED] Inode 31231458 isn't referenced by any directory
>> entries. Move it to lost+found? n
>> [INODE_COUNT] Inode 31231459 has a link count of 1 on disk but
>> directory entry references come to 0. Update the count on disk
>> to match? n
>> [INODE_NOT_CONNECTED] Inode 31231459 isn't referenced by any directory
>> entries. Move it to lost+found? n
>> [INODE_COUNT] Inode 31231460 has a link count of 1 on disk but
>> directory entry references come to 0. Update the count on disk
>> to match? n
>> [INODE_NOT_CONNECTED] Inode 31231460 isn't referenced by any directory
>> entries. Move it to lost+found? n
>> [INODE_COUNT] Inode 31231461 has a link count of 1 on disk but
>> directory entry references come to 0. Update the count on disk
>> to match? n
>> <-----------------SNIP Similar----------------------->
>>
>> As far as I know, none of the nodes that are running ocfs2 have
>> actually crashed and I created the filesystems just last week. One
>> thing I should mention, though, is that the filesystem in question has
>> about 3.3 million small files, 800k of which are contained within a
>> single flat directory -- I know, it's terrible...I've inherited this
>> mess from previous admins. Additionally, I rsync'ed the files to the
>> ocfs2 volume from one of our existing servers. I have never been able
>> to fsck the filesystem of the existing server without errors, but
>> fixing the errors generally leads to a bunch of the small files being
>> unlinked and moved to lost+found. My impression is that rsync reads
>> things at a high-enough level that this shouldn't duplicate filesystem
>> errors on a target volume, but maybe I'm wrong.
>>
>> Anyway, any help that could be offered would be greatly appreciated.
>> I'm really trying to fix the filesystem mess I've inherited, but I get
>> the impression it will be an arduous task. :) In terms of package
>> information, we're running RHEL 4u7 on x86_64 with the following
>> packages installed: ocfs2-2.6.9-55.0.9.ELsmp-1.2.9-1.el4 and
>> ocfs2-tools-1.2.7-1.el4. Thanks!
>>
>> Bob Ziuchkovski
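
To check whether a repair run actually changed anything, it can help to reduce the fsck.ocfs2 output above to a few numbers and compare them between runs. The following is a rough Python sketch, not an official tool. It assumes the message formats are exactly as pasted in this thread (including the tool's own "faliure" spelling, which the pattern sidesteps by matching only "duplicate cluster N"), and every function name in it is hypothetical. It says nothing about what the duplicate cluster messages mean; it only counts them.

```python
import re
from collections import Counter

# Both patterns are assumptions taken from the output pasted in this thread.
DUP = re.compile(r"duplicate cluster (\d+)")
CODE = re.compile(r"\[([A-Z_]+)\]")


def normalize(text):
    """Strip mail quote markers and ^M debris, then undo line wrapping."""
    lines = []
    for line in text.splitlines():
        line = re.sub(r"^(?:>\s?)+", "", line)  # leading '>' or '>> '
        line = line.replace("^M", "").replace("\r", "")
        lines.append(line)
    return " ".join(" ".join(lines).split())


def cluster_ranges(numbers):
    """Collapse cluster numbers into sorted inclusive (start, end) runs."""
    ranges = []
    for n in sorted(set(numbers)):
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], n)
        else:
            ranges.append((n, n))
    return ranges


def summarize(fsck_text):
    """Return (duplicate cluster ranges, Counter of bracketed problem codes)."""
    flat = normalize(fsck_text)
    clusters = [int(num) for num in DUP.findall(flat)]
    codes = Counter(CODE.findall(flat))
    return cluster_ranges(clusters), codes


if __name__ == "__main__":
    import sys

    ranges, codes = summarize(sys.stdin.read())
    print("duplicate cluster ranges:", ranges)
    for code, count in sorted(codes.items()):
        print(code, count)
```

One way to use it is to save the output of a read-only run to a file, pipe it into the script on stdin, and repeat after the repair. If the duplicate cluster ranges and the per-code counts are unchanged, the repair did not touch those items.
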