[Ocfs2-tools-devel] [PATCH 0/6] Fix extent records for holes/overlaps

Tiger Yang tiger.yang at oracle.com
Wed Aug 24 00:21:26 PDT 2011


Hi, Goldwyn,

I have tested the new patches with test cases that punch a hole at the 
beginning and in the middle of a directory; they cannot fix either problem.
I studied this and found that the reason is that the new size fsck sets on 
the dir inode is wrong. In my test case, the block size is 1024 and the 
cluster size is 4096. The dir size is 23552 (23 blocks), but it takes 6 
clusters (24 blocks). When fsck finds a hole and drops that cluster, the 
new size becomes 20480, which includes the uninitialized block at the end 
of the last cluster. So in pass 2, ocfs2_read_dir_block -> 
ocfs2_validate_meta_ecc -> ocfs2_block_check_validate returns 
OCFS2_ET_IO. For a hole at the beginning, the fix should initialize "." 
and ".." in the directory, as Joel commented on bug 1324.
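
The size arithmetic above can be checked with a short sketch. The numbers 
come from the test case; the variable names are illustrative only, and 
expected_size is an inference from the description (the size without the 
uninitialized tail block), not a value stated by fsck or by the patches.

```python
# Values taken from the test case described above.
BLOCK_SIZE = 1024
CLUSTER_SIZE = 4096
DIR_SIZE = 23552        # directory size before the hole is punched
CLUSTERS_BEFORE = 6
CLUSTERS_AFTER = 5      # one cluster is lost to the punched hole

blocks_per_cluster = CLUSTER_SIZE // BLOCK_SIZE           # 4
used_blocks = DIR_SIZE // BLOCK_SIZE                      # 23
allocated_blocks = CLUSTERS_BEFORE * blocks_per_cluster   # 24
tail_blocks = allocated_blocks - used_blocks              # 1 uninitialized block

# Size written by fsck: every remaining cluster is counted as data.
fsck_size = CLUSTERS_AFTER * CLUSTER_SIZE                 # 20480

# Assumption: a size that leaves out the uninitialized tail block.
expected_size = fsck_size - tail_blocks * BLOCK_SIZE      # 19456

print(fsck_size // BLOCK_SIZE, expected_size // BLOCK_SIZE)  # 20 19
```

With 20 blocks in the new size, pass 2 reads one block more than the 19 
blocks that were ever initialized.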

1. Punch a hole at the beginning of a directory inode.
fsck cannot fix the problem, and the new patches do not create "." and 
".." in the directory.

before punching the hole:
 debugfs: stat d1
        Inode: 37414   Mode: 0755   Generation: 294183235 (0x1188e143)
        FS Generation: 116237964 (0x6eda68c)
        CRC32: 1dd7b434   ECC: 018c
        Type: Directory   Attr: 0x0   Flags: Valid
        Dynamic Features: (0x8) IndexedDir
        User: 0 (root)   Group: 0 (root)   Size: 23552
        Links: 2   Clusters: 6
        ctime: 0x4e53ca8b -- Tue Aug 23 23:43:07 2011
        atime: 0x4e53ca85 -- Tue Aug 23 23:43:01 2011
        mtime: 0x4e53ca8b -- Tue Aug 23 23:43:07 2011
        dtime: 0x0 -- Thu Jan  1 08:00:00 1970
        ctime_nsec: 0x1b6e214b -- 460202315
        atime_nsec: 0x07c2b896 -- 130201750
        mtime_nsec: 0x1b6e214b -- 460202315
        Refcount Block: 0
        Last Extblk: 0   Orphan Slot: 0
        Sub Alloc Slot: 0   Sub Alloc Bit: 2
        Indexed Tree Root: 33313
        Tree Depth: 0   Count: 51   Next Free Rec: 1
        ## Offset        Clusters       Block#          Flags
        0  0             6              61444           0x0
after punching a hole at the beginning:
debugfs: stat d1
        Inode: 37414   Mode: 0755   Generation: 294183235 (0x1188e143)
        FS Generation: 116237964 (0x6eda68c)
        CRC32: a3c5dad8   ECC: 07fd
        Type: Directory   Attr: 0x0   Flags: Valid
        Dynamic Features: (0x8) IndexedDir
        User: 0 (root)   Group: 0 (root)   Size: 23552
        Links: 2   Clusters: 5
        ctime: 0x4e53caaf -- Tue Aug 23 23:43:43 2011
        atime: 0x4e53ca85 -- Tue Aug 23 23:43:01 2011
        mtime: 0x4e53caaf -- Tue Aug 23 23:43:43 2011
        dtime: 0x0 -- Thu Jan  1 08:00:00 1970
        ctime_nsec: 0x10b36bc1 -- 280193985
        atime_nsec: 0x07c2b896 -- 130201750
        mtime_nsec: 0x10b36bc1 -- 280193985
        Refcount Block: 0
        Last Extblk: 0   Orphan Slot: 0
        Sub Alloc Slot: 0   Sub Alloc Bit: 2
        Indexed Tree Root: 33313
        Tree Depth: 0   Count: 51   Next Free Rec: 1
        ## Offset        Clusters       Block#          Flags
        0  1             5              61448           0x0

fsck output:
[root@node1 ~]# fsck.ocfs2 -f /dev/ubdb
fsck.ocfs2 1.8.0
Checking OCFS2 filesystem in /dev/ubdb:
  Label:              <NONE>
  UUID:               D22445F802EA4F6E8EAD2CD87308FA2F
  Number of blocks:   409600
  Block size:         1024
  Number of clusters: 102400
  Cluster size:       4096
  Number of slots:    2

/dev/ubdb was run with -f, check forced.
Pass 0a: Checking cluster allocation chains
Pass 0b: Checking inode allocation chains
Pass 0c: Checking extent block allocation chains
Pass 1: Checking inodes and blocks.
[NO_HOLES] Extent record of owner 37414 is incorrectly set to 1 instead 
of 0. Fix? <y> y
[INODE_SIZE] Inode 37414 has a size of 23552 but has 20480 bytes of 
actual data. Correct the file size? <y> y
[CLUSTER_ALLOC_BIT] Cluster 15377 is marked in the global cluster bitmap 
but it isn't in use.  Clear its bit in the bitmap? <y> y
[CLUSTER_ALLOC_BIT] Cluster 15378 is marked in the global cluster bitmap 
but it isn't in use.  Clear its bit in the bitmap? <y> y
Pass 2: Checking directory entries.
[DIRENT_NOT_DOTTY] The first directory entry in directory inode 37414 is 
'abcdefghijklmnopqrstuvwxyz123456789072' instead of '.'.  Clobber the 
current name with the expected dot name? <y> y
[DIRENT_DOT_INODE] The '.' entry in directory inode 37414 points to 
inode 37489 instead of itself.  Fix the '.' entry? <y> y
[DIRENT_LENGTH] Directory inode 37414 corrupted in logical block 0 
physical block 61448 offset 16. Attempt to repair this block's directory 
entries? <y> y
[DIRENT_NOT_DOTTY] The second directory entry in directory inode 37414 
is '' instead of '..'.  Clobber the current name with the expected dot 
name? <y> y
pass2: I/O error on channel while reading dir block 61467
pass2: OCFS2 directory corrupted while rebuild indexed dirs.
fsck.ocfs2: OCFS2 directory corrupted while performing pass 2
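
The failing block number in the pass 2 error can be tied back to the 
extent record with a small sketch. It assumes that the Block# column of 
debugfs is the first physical block of the extent and that logical blocks 
map linearly inside it; logical_to_physical is a hypothetical helper, not 
a libocfs2 function.

```python
BLOCK_SIZE = 1024
CLUSTER_SIZE = 4096
BLOCKS_PER_CLUSTER = CLUSTER_SIZE // BLOCK_SIZE

def logical_to_physical(extents, logical_block):
    """Map a logical block to a physical block.

    extents is a list of (cpos, clusters, first_block) tuples.
    Returns None when the logical block falls in a hole.
    """
    for cpos, clusters, first_block in extents:
        start = cpos * BLOCKS_PER_CLUSTER
        end = start + clusters * BLOCKS_PER_CLUSTER
        if start <= logical_block < end:
            return first_block + (logical_block - start)
    return None

# Extent record before the hole was punched: offset 0, 6 clusters.
original = [(0, 6, 61444)]
# 23552 bytes use logical blocks 0..22, so logical block 23 is the
# allocated but never initialized tail block.
tail_block = logical_to_physical(original, 23552 // BLOCK_SIZE)

# Extent record after fsck resets the offset from 1 to 0.
repaired = [(0, 5, 61448)]
# With the new size of 20480, the last logical block is 19.
last_read = logical_to_physical(repaired, 20480 // BLOCK_SIZE - 1)

print(tail_block, last_read)  # 61467 61467
```

Both lookups give 61467, the block named in the I/O error, which is 
consistent with the uninitialized tail block being read in pass 2.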

2. Punch a hole in the middle of a directory inode.
fsck cannot fix the problem.

before punching the hole in the middle:
debugfs: stat d1
        Inode: 37414   Mode: 0755   Generation: 3769035313 (0xe0a6ea31)
        FS Generation: 1652593245 (0x6280925d)
        CRC32: 6e6aea83   ECC: 02a8
        Type: Directory   Attr: 0x0   Flags: Valid
        Dynamic Features: (0x8) IndexedDir
        User: 0 (root)   Group: 0 (root)   Size: 23552
        Links: 2   Clusters: 6
        ctime: 0x4e4e060f -- Fri Aug 19 14:43:27 2011
        atime: 0x4e4e061a -- Fri Aug 19 14:43:38 2011
        mtime: 0x4e4e060f -- Fri Aug 19 14:43:27 2011
        dtime: 0x0 -- Thu Jan  1 08:00:00 1970
        ctime_nsec: 0x38850838 -- 948242488
        atime_nsec: 0x12f80255 -- 318243413
        mtime_nsec: 0x38850838 -- 948242488
        Refcount Block: 0
        Last Extblk: 0   Orphan Slot: 0
        Sub Alloc Slot: 0   Sub Alloc Bit: 2
        Indexed Tree Root: 33313
        Tree Depth: 0   Count: 51   Next Free Rec: 1
        ## Offset        Clusters       Block#          Flags
        0  0             6              61444           0x0
after punching a hole in the middle:
debugfs: stat d1
        Inode: 37414   Mode: 0755   Generation: 3769035313 (0xe0a6ea31)
        FS Generation: 1652593245 (0x6280925d)
        CRC32: a890f020   ECC: 008d
        Type: Directory   Attr: 0x0   Flags: Valid
        Dynamic Features: (0x8) IndexedDir
        User: 0 (root)   Group: 0 (root)   Size: 23552
        Links: 2   Clusters: 5
        ctime: 0x4e53c9ae -- Tue Aug 23 23:39:26 2011
        atime: 0x4e4e061a -- Fri Aug 19 14:43:38 2011
        mtime: 0x4e53c9ae -- Tue Aug 23 23:39:26 2011
        dtime: 0x0 -- Thu Jan  1 08:00:00 1970
        ctime_nsec: 0x0d2000e2 -- 220201186
        atime_nsec: 0x12f80255 -- 318243413
        mtime_nsec: 0x0d2000e2 -- 220201186
        Refcount Block: 0
        Last Extblk: 0   Orphan Slot: 0
        Sub Alloc Slot: 0   Sub Alloc Bit: 2
        Indexed Tree Root: 33313
        Tree Depth: 0   Count: 51   Next Free Rec: 2
        ## Offset        Clusters       Block#          Flags
        0  0             2              61444           0x0
        1  3             3              61456           0x0

fsck output:
[root@node1 ~]# fsck.ocfs2 -f /dev/ubdb
fsck.ocfs2 1.8.0
Checking OCFS2 filesystem in /dev/ubdb:
  Label:              <NONE>
  UUID:               A4F34BE7EE4B4E8A8F10EDDB82E134CD
  Number of blocks:   409600
  Block size:         1024
  Number of clusters: 102400
  Cluster size:       4096
  Number of slots:    2

/dev/ubdb was run with -f, check forced.
Pass 0a: Checking cluster allocation chains
Pass 0b: Checking inode allocation chains
Pass 0c: Checking extent block allocation chains
Pass 1: Checking inodes and blocks.
[NO_HOLES] Extent record of owner 37414 is incorrectly set to 3 instead 
of 2. Fix? <y> y
[INODE_SIZE] Inode 37414 has a size of 23552 but has 20480 bytes of 
actual data. Correct the file size? <y> y
[CLUSTER_ALLOC_BIT] Cluster 15377 is marked in the global cluster bitmap 
but it isn't in use.  Clear its bit in the bitmap? <y> y
[CLUSTER_ALLOC_BIT] Cluster 15378 is marked in the global cluster bitmap 
but it isn't in use.  Clear its bit in the bitmap? <y> y
[CLUSTER_ALLOC_BIT] Cluster 15379 is marked in the global cluster bitmap 
but it isn't in use.  Clear its bit in the bitmap? <y> y
Pass 2: Checking directory entries.
pass2: I/O error on channel while reading dir block 61467
pass2: OCFS2 directory corrupted while rebuild indexed dirs.
fsck.ocfs2: OCFS2 directory corrupted while performing pass 2
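
The [NO_HOLES] messages in the two runs follow from comparing each 
record's offset with the end of the previous record. The sketch below 
only illustrates that comparison and is not the fsck.ocfs2 code; 
check_extents and its return format are made up for the example. It also 
flags the overlap case, where the offset is smaller than the previous 
offset plus its number of clusters.

```python
def check_extents(records):
    """Find offset problems in a list of (cpos, clusters) records.

    Returns a list of (index, found_cpos, expected_cpos, kind) tuples,
    where kind is "hole" or "overlap".
    """
    problems = []
    expected = 0
    for index, (cpos, clusters) in enumerate(records):
        if cpos > expected:
            problems.append((index, cpos, expected, "hole"))
        elif cpos < expected:
            problems.append((index, cpos, expected, "overlap"))
        # Continue as if the offset had been corrected to `expected`.
        expected += clusters
    return problems

# Hole at the beginning: one record with offset 1 and 5 clusters.
print(check_extents([(1, 5)]))          # [(0, 1, 0, 'hole')]
# Hole in the middle: second record has offset 3 instead of 2.
print(check_extents([(0, 2), (3, 3)]))  # [(1, 3, 2, 'hole')]
```

The reported pairs (1 instead of 0, 3 instead of 2) match the offsets in 
the two [NO_HOLES] prompts.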

Thanks,
Tiger
 
Goldwyn Rodrigues wrote:
> Hi,
>
> I discovered a bug in my previous post. A variable swap while calling
> the extent_eb was giving unexpected results. So I am posting this
> again with the bug fix in "Fix holes in directories".
>
> The patchset fixes corruption caused by bad offsets in the extent
> records of extent lists. The problem can be either of two kinds:
>
> 1. Holes: Some data structures, such as directories, must not have
> holes; a hole there can lead to errors as described in bug#1324
> http://oss.oracle.com/bugzilla/show_bug.cgi?id=1324
> This also creates indexes for directories which were disabled by the
> kernel in the patch posted on Aug 3 - Avoid EROFS in case of dx dir
> errors v2
> http://oss.oracle.com/pipermail/ocfs2-devel/2011-August/008312.html
>
> 2. Extent overlap: The extent records do not go in serial order with
> respect to the offset. IOW, the cpos is smaller than the previous cpos +
> number of clusters.
>
>
> Changes:
>  - Incorporated Tiger's review comment on indexed dirs, hole placements
>    and coding styles
>  - Fixed a bug in dir holes correction.
>
>   



