[Ocfs2-users] ocfs dmesg and fsck errors

Bob Ziuchkovski bziuchkovski at subscribermail.com
Mon Jan 5 21:41:51 PST 2009


Thank you for the response, that was very helpful!  It appears most of 
the errors were easily fixed by the fsck, but one thing worries me -- 
I'm still getting all of the 'Internal logic failure !! duplicate 
cluster' messages.  Is this a problem?

Bob

Sunil Mushran wrote:
> A directory is corrupted. To get the name of the dir, do:
> 
> $ debugfs.ocfs2 -R "findpath <12862125>" /dev/sdX
> 
> It may take time as it will have to traverse the dirs.
> 
> To fix, you will have to run fsck in rw mode: fsck.ocfs2 -fy /dev/sdX
> 
> That the nodes do not crash is as expected. Dir corruption is always 
> localized and only rears up as an error.
> 
> 800K files in one dir is not efficient since the current version does not
> support indexed dirs. We hope to add support for the same in the near term.
> 
> Sunil
> 
> 
> Bob Ziuchkovski wrote:
>> Hi All,
>>
>> I'm trying to move my company from our current frenzy of rsyncing 
>> towards ocfs2.  I've deployed ocfs2 on a few test servers and in 
>> general things seem to be working.  However, I've run into a couple of 
>> problems and wanted to run them by this mailing list.
>>
>> Earlier today I encountered errors that at first appeared to be 
>> permission errors.  However, when I checked dmesg output, I saw the 
>> following entries repeated over and over:
>>
>> (20830,0):ocfs2_mknod:351 ERROR: status = -2
>> (20830,0):ocfs2_check_dir_entry:1727 ERROR: bad entry in directory 
>> #12862125: rec_len is smaller than minimal - offset=2588672, 
>> inode=1099511657728, rec_len=0, name_len=0
>>
>> I'm not exactly sure what this means.  Is there a way for me to 
>> determine the path to the directory and/or file referenced above?
>>
>> Since this seemed like it might be fs corruption, I ran the fsck.ocfs2 
>> utility, but in read-only mode.  I ended up with output that looks 
>> like the following:
>>
>> Pass 0a: Checking cluster allocation chains
>> Pass 0b: Checking inode allocation chains
>> Pass 0c: Checking extent block allocation chains
>> Pass 1: Checking inodes and blocks.
>> o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate 
>> cluster 4387633
>> o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate 
>> cluster 4387634
>> o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate 
>> cluster 4387635
>> o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate 
>> cluster 4387636
>> <-----------------SNIP Similar-------------------->
>> Pass 2: Checking directory entries.
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 
>> 632 physical block 35102672 offset 0. Attempt to repair this block's 
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 
>> 633 physical block 35102673 offset 0. Attempt to repair this block's 
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 
>> 634 physical block 35102674 offset 0. Attempt to repair this block's 
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 
>> 635 physical block 35102675 offset 0. Attempt to repair this block's 
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 
>> 636 physical block 35102676 offset 0. Attempt to repair this block's 
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 
>> 637 physical block 35102677 offset 0. Attempt to repair this block's 
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 
>> 638 physical block 35102678 offset 0. Attempt to repair this block's 
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block 
>> 639 physical block 35102679 offset 0. Attempt to repair this block's 
>> directory entries? n
>> Pass 3: Checking directory connectivity.
>> Pass 4a: checking for orphaned inodes
>> Pass 4b: Checking inodes link counts.
>> [INODE_NOT_CONNECTED] Inode 0 isn't referenced by any directory 
>> entries.   Move it to lost+found? n
>> [INODE_COUNT] Inode 31231457 has a link count of 1 on disk but 
>> directory entry references come to 0. Update the count on disk
>>   to match? n
>> [INODE_NOT_CONNECTED] Inode 31231457 isn't referenced by any directory 
>> entries.  Move it to lost+found? n
>> [INODE_COUNT] Inode 31231458 has a link count of 1 on disk but 
>> directory entry references come to 0. Update the count on disk
>>   to match? n
>> [INODE_NOT_CONNECTED] Inode 31231458 isn't referenced by any directory 
>> entries.  Move it to lost+found? n
>> [INODE_COUNT] Inode 31231459 has a link count of 1 on disk but 
>> directory entry references come to 0. Update the count on disk
>>   to match? n
>> [INODE_NOT_CONNECTED] Inode 31231459 isn't referenced by any directory 
>> entries.  Move it to lost+found? n
>> [INODE_COUNT] Inode 31231460 has a link count of 1 on disk but 
>> directory entry references come to 0. Update the count on disk
>>   to match? n
>> [INODE_NOT_CONNECTED] Inode 31231460 isn't referenced by any directory 
>> entries.  Move it to lost+found? n
>> [INODE_COUNT] Inode 31231461 has a link count of 1 on disk but 
>> directory entry references come to 0. Update the count on disk
>>   to match? n
>> <-----------------SNIP Similar----------------------->
>>
>> As far as I know, none of the nodes that are running ocfs2 have 
>> actually crashed and I created the filesystems just last week.  One 
>> thing I should mention, though, is that the filesystem in question has 
>> about 3.3 million small files, 800k of which are contained within a 
>> single flat directory -- I know, it's terrible...I've inherited this 
>> mess from previous admins.   Additionally, I rsync'ed the files to the 
>> ocfs2 volume from one of our existing servers.  I have never been able 
>> to fsck the filesystem of the existing server without errors, but 
>> fixing the errors generally leads to a bunch of the small files being 
>> unlinked and moved to lost+found.  My impression is that rsync reads 
>> things at a high-enough level that this shouldn't duplicate filesystem 
>> errors on a target volume, but maybe I'm wrong.
>>
>> Anyway, any help that could be offered would be greatly appreciated. 
>> I'm really trying to fix the filesystem mess I've inherited, but I get 
>> the impression it will be an arduous task.  :)  In terms of package 
>> information, we're running RHEL 4u7 on x86_64 with the following 
>> packages installed: ocfs2-2.6.9-55.0.9.ELsmp-1.2.9-1.el4 and 
>> ocfs2-tools-1.2.7-1.el4.  Thanks!
>>
>> Bob Ziuchkovski
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users at oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>
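
For anyone comparing a read-only fsck.ocfs2 run against a later run, the
output quoted above can be condensed with a short script. This is only a
minimal sketch: the function name summarize_fsck and the regular
expressions are assumptions written against the messages quoted in this
thread, and other ocfs2-tools versions may word their output differently.
It counts the bracketed prompt codes, collects the "duplicate cluster"
numbers, and groups the corrupted logical blocks by directory inode. It
does not say whether the duplicate-cluster messages are harmful.

```python
import re
from collections import Counter, defaultdict

# Patterns match the wording quoted in this thread (assumption: other
# fsck.ocfs2 versions may phrase these messages differently).
DUP_CLUSTER = re.compile(r"duplicate cluster (\d+)")
PROMPT_CODE = re.compile(r"\[([A-Z_]+)\]")
DIRENT_BLOCK = re.compile(
    r"\[DIRENT_LENGTH\] Directory inode (\d+) corrupted in logical block (\d+)"
)


def summarize_fsck(text):
    """Summarize fsck.ocfs2 output that may be mail-quoted and hard-wrapped."""
    # Remove carriage returns, whether real or rendered as a literal "^M".
    cleaned = text.replace("\r", "").replace("^M", "")
    # Drop leading mail quote markers ("> ", ">> ") and indentation.
    cleaned = re.sub(r"^[> ]+", "", cleaned, flags=re.MULTILINE)
    # Undo the hard wrapping so each message can be matched as one string.
    flat = " ".join(cleaned.split())

    bad_blocks = defaultdict(list)
    for inode, block in DIRENT_BLOCK.findall(flat):
        bad_blocks[int(inode)].append(int(block))

    return {
        "duplicate_clusters": sorted({int(n) for n in DUP_CLUSTER.findall(flat)}),
        "prompt_counts": Counter(PROMPT_CODE.findall(flat)),
        "corrupt_dir_blocks": dict(bad_blocks),
    }


if __name__ == "__main__":
    import sys

    # Usage: fsck.ocfs2 -fn /dev/sdX 2>&1 | python this_script.py
    summary = summarize_fsck(sys.stdin.read())
    print("duplicate clusters reported:", len(summary["duplicate_clusters"]))
    for code, count in sorted(summary["prompt_counts"].items()):
        print(code, count)
    for inode, blocks in summary["corrupt_dir_blocks"].items():
        print("directory inode", inode, "bad logical blocks", blocks)
```

Running the same summary before and after the read-write fsck makes it
easy to see which classes of message went away and which, such as the
duplicate-cluster lines, are still being printed.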


