[Ocfs2-users] ocfs2_delete_inode kernel bug

Andre Nathan andre at digirati.com.br
Wed Oct 27 21:16:03 PDT 2010


Hello Sunil,

The errors happened again, but I think the problem may now be completely
fixed. This time I only got the -17 error for a single inode:

# grep -E "Oct 2[78]" /var/log/kern.log | grep -oE "ERROR: Inode [0-9]+" |
sort | uniq -c
35 ERROR: Inode 16671031

I ran fsck.ocfs2 -y -f on all my volumes. I got lots of messages like
this:

    Cluster 6627661 is marked in the global cluster bitmap but it
    isn't in use.  Clear its bit in the bitmap? y

which I'm guessing is OK.

One of the volumes had these 3 messages:

    [CHAIN_BITS] Chain 235 in allocator inode 11 has 1463093 bits
    marked free out of 1483776 total bits but the block groups in the
    chain have 1464665 free out of 1483776 total.  Fix this by updating
    the chain record? y

    [CHAIN_BITS] Chain 234 in allocator inode 11 has 1454182 bits 
    marked free out of 1483776 total bits but the block groups in the 
    chain have 1454542 free out of 1483776 total.  Fix this by updating 
    the chain record? y

    [CHAIN_GROUP_BITS] Allocator inode 11 has 11623906 bits marked used 
    out of 365955414 total bits but the chains have 11621974 used out 
    of 365955414 total.  Fix this by updating the inode counts? y

I'm not sure what this means, but it seems to be fixed.
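
The numbers in these three messages do at least appear to be consistent
with each other. The short Python sketch below only redoes the arithmetic
on the figures quoted above; the variable names are made up for
illustration, and the reading that the inode-level mismatch is just the
sum of the two chain-level mismatches is my assumption, not something
fsck states.

```python
# Figures copied from the three fsck.ocfs2 messages quoted above.
# Free bits according to each chain record in allocator inode 11.
chain_records_free = {235: 1463093, 234: 1454182}
# Free bits according to the block groups in each chain.
block_groups_free = {235: 1464665, 234: 1454542}

# Free bits that each chain record was not accounting for.
chain_deltas = {
    chain: block_groups_free[chain] - chain_records_free[chain]
    for chain in chain_records_free
}

inode_used = 11623906   # used bits according to allocator inode 11
chains_used = 11621974  # used bits according to the chains

# Extra bits the inode counted as used compared with the chains.
inode_delta = inode_used - chains_used

print(chain_deltas)                             # {235: 1572, 234: 360}
print(sum(chain_deltas.values()), inode_delta)  # 1932 1932
```

The two chain records were short by 1572 and 360 free bits, which adds up
to the 1932 extra used bits reported for the allocator inode, so the third
message may simply be a consequence of the first two.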

Finally, the last volume's fsck resulted in this:

    [INODE_ORPHANED] Inode 16671031 was found in the orphan directory. 
    Delete its contents and unlink it? y

    o2fsck_icount_delta: Internal logic faliure while droping icount 
    from 0 bt -1 for inode 16671031

    [INODE_ORPHANED] Inode 16665729 was found in the orphan directory.  
    Delete its contents and unlink it? y

    [INODE_ORPHANED] Inode 16671031 was found in the orphan directory. 
    Delete its contents and unlink it? y

    pass4: Attempting to free unallocated region while deleting orphan 
    inode 16671031after truncating it

    pass4: Attempting to free unallocated region while trying to replay 
    the orphan directory

    fsck.ocfs2: Attempting to free unallocated region while performing 
    pass 4

I ran fsck on that volume one more time, which resulted in:

    [DIRENT_INODE_FREE] Directory entry '0000000000fe6137' refers to 
    inode number 16671031 which isn't allocated, clear the entry? y

    [INODE_COUNT] Inode 12 has a link count of 3 on disk but directory 
    entry references come to 2. Update the count on disk to match? y

After that, a new fsck ran without any messages requiring fixes.
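
Since different problem codes showed up from one run to the next, a saved
fsck log can be summarized by counting the bracketed codes. This is a
minimal sketch, assuming the saved output keeps each code in square
brackets at the start of a line as in the excerpts above; `tally_codes`
and `sample` are hypothetical names, and the sample text is just a few
lines copied from this message.

```python
import re
from collections import Counter

# Matches a bracketed problem code at the start of a line,
# e.g. "[INODE_ORPHANED]", allowing leading indentation.
CODE_RE = re.compile(r"^\s*\[([A-Z0-9_]+)\]")


def tally_codes(text):
    """Count how often each bracketed problem code starts a line."""
    counts = Counter()
    for line in text.splitlines():
        match = CODE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines taken from the fsck excerpts in this message.
sample = """\
    [INODE_ORPHANED] Inode 16671031 was found in the orphan directory.
    Delete its contents and unlink it? y
    [INODE_ORPHANED] Inode 16665729 was found in the orphan directory.
    Delete its contents and unlink it? y
    [DIRENT_INODE_FREE] Directory entry '0000000000fe6137' refers to
    inode number 16671031 which isn't allocated, clear the entry? y
"""

for code, count in tally_codes(sample).most_common():
    print(count, code)
```

Running the same function over the output of successive fsck passes would
show whether the set of codes is shrinking from one pass to the next.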

My guess is that the last time the error occurred and I ran fsck on all
volumes, this last filesystem wasn't entirely clean when I put it back in
production, so the error happened again on it.

I'll keep watching the cluster's behavior for a few more days, but I'm
hoping things will be more stable now.

If you're interested, I have saved the full fsck outputs, and also made
a copy of the raw block for inode 16671031 after the first fsck as you
indicated.

Thanks,
Andre



