[Ocfs2-users] Issue with OCFS2 mount

Sunil Mushran sunil.mushran at gmail.com
Wed Aug 29 11:13:15 PDT 2012


Forgot to add that this issue is limited to metaecc. So you could avoid the
issue in your same setup by not enabling metaecc on the volume. And last I
checked, mkfs did not enable it by default.

On Mon, Aug 27, 2012 at 10:35 AM, Sunil Mushran <sunil.mushran at gmail.com> wrote:

> So you are running into a bug that has been fixed in 2.6.36. Upgrade to
> that version, if not something more current.
>
> $ git describe --tags 13ceef09
> v2.6.35-rc3-14-g13ceef0
>
> commit 13ceef099edd2b70c5a6f3a9ef5d6d97cda2e096
> Author: Jan Kara <jack at suse.cz>
> Date:   Wed Jul 14 07:56:33 2010 +0200
>
>     jbd2/ocfs2: Fix block checksumming when a buffer is used in several transactions
>
>     OCFS2 uses t_commit trigger to compute and store checksum of the just
>     committed blocks. When a buffer has b_frozen_data, checksum is computed
>     for it instead of b_data but this can result in an old checksum being
>     written to the filesystem in the following scenario:
>
>     1) transaction1 is opened
>     2) handle1 is opened
>     3) journal_access(handle1, bh)
>         - This sets jh->b_transaction to transaction1
>     4) modify(bh)
>     5) journal_dirty(handle1, bh)
>     6) handle1 is closed
>     7) start committing transaction1, opening transaction2
>     8) handle2 is opened
>     9) journal_access(handle2, bh)
>         - This copies off b_frozen_data to make it safe for transaction1
>           to commit.
>           jh->b_next_transaction is set to transaction2.
>     10) jbd2_journal_write_metadata() checksums b_frozen_data
>     11) the journal correctly writes b_frozen_data to the disk journal
>     12) handle2 is closed
>         - There was no dirty call for the bh on handle2, so it is never
>           queued for any more journal operation
>     13) Checkpointing finally happens, and it just spools the bh via
>     normal buffer writeback.  This will write b_data, which was never
>     triggered on and thus contains a wrong (old) checksum.
>
>     This patch fixes the problem by calling the trigger at the moment data
>     is frozen for journal commit - i.e., either when b_frozen_data is
>     created by do_get_write_access or just before we write a buffer to the
>     log if b_frozen_data does not exist. We also rename the trigger to
>     t_frozen as that better describes when it is called.
>
>     Signed-off-by: Jan Kara <jack at suse.cz>
>     Signed-off-by: Mark Fasheh <mfasheh at suse.com>
>     Signed-off-by: Joel Becker <joel.becker at oracle.com>
>
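
For anyone following along, the scenario in the commit message can be
reproduced with a toy model. This is only a sketch in Python, not the jbd2
code: Block, stamp, is_valid and run_scenario are made-up names, zlib.crc32
stands in for the real block check, and stamping the live data just before
the copy is an assumption of the sketch (the commit message only says the
trigger runs when the data is frozen).

```python
import zlib
from dataclasses import dataclass, replace


@dataclass
class Block:
    """Toy metadata block: a payload plus the checksum stored with it."""
    payload: bytes
    checksum: int = 0


def stamp(block: Block) -> None:
    """Stand-in for the checksum trigger: recompute and store the checksum."""
    block.checksum = zlib.crc32(block.payload)


def is_valid(block: Block) -> bool:
    """True if the stored checksum matches the payload."""
    return block.checksum == zlib.crc32(block.payload)


def run_scenario(trigger_at_freeze: bool) -> Block:
    """Walk the numbered steps and return what checkpointing writes out."""
    b_data = Block(b"old extent block")
    stamp(b_data)  # consistent state before transaction1

    # Steps 3-5: handle1 modifies the buffer; the checksum is now stale.
    b_data.payload = b"new extent block"

    # Step 9: handle2 asks for write access while transaction1 commits,
    # so a frozen copy is made for the commit.
    if trigger_at_freeze:
        stamp(b_data)  # assumed fixed behaviour: trigger when data is frozen
    b_frozen_data = replace(b_data)  # independent copy of the block

    # Step 10: the old commit-time trigger checksums only the frozen copy.
    if not trigger_at_freeze:
        stamp(b_frozen_data)

    # Step 11: the journal gets a correct block either way.
    assert is_valid(b_frozen_data)

    # Step 13: checkpointing writes b_data, not the frozen copy.
    return b_data


if __name__ == "__main__":
    print("old trigger, checkpointed block valid:", is_valid(run_scenario(False)))
    print("freeze-time trigger, checkpointed block valid:", is_valid(run_scenario(True)))
```

With the commit-time trigger the block that reaches the filesystem carries
the old checksum even though the journal copy was fine, which matches the
failure described above.
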
>
> On Mon, Aug 27, 2012 at 5:10 AM, Rory Kilkenny <Rory.Kilkenny at ticoon.com> wrote:
>
>> # uname -a
>> Linux FILEt1 2.6.34.7-0.7-desktop #1 SMP PREEMPT 2010-12-13 11:13:53 +0100 x86_64 x86_64 x86_64 GNU/Linux
>>
>> # modinfo ocfs2
>> filename:       /lib/modules/2.6.34.7-0.7-desktop/kernel/fs/ocfs2/ocfs2.ko
>> license:        GPL
>> author:         Oracle
>> version:        1.5.0
>> description:    OCFS2 1.5.0
>> srcversion:     B13569B35F99D43FA80D129
>> depends:        jbd2,ocfs2_stackglue,quota_tree,ocfs2_nodemanager
>> vermagic:       2.6.34.7-0.7-desktop SMP preempt mod_unload modversions
>>
>> # mkfs.ocfs2 --version
>> mkfs.ocfs2 1.4.3
>>
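
A quick way to compare a reported kernel release against 2.6.36, the
release named in this thread as having the fix, is sketched below. The
helper names (kernel_tuple, predates_fix) are made up for illustration, and
the check looks only at the upstream version number: a distribution kernel
may carry backported fixes, so treat the result as a hint, not proof.

```python
import re

# Release named in this thread as containing the fix.
FIXED_IN = (2, 6, 36)


def kernel_tuple(release: str) -> tuple:
    """Extract up to three leading numeric components from a `uname -r` string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing third component (e.g. "3.2") is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def predates_fix(release: str) -> bool:
    """True if the upstream version number is older than FIXED_IN."""
    return kernel_tuple(release) < FIXED_IN


if __name__ == "__main__":
    release = "2.6.34.7-0.7-desktop"  # value from the uname output above
    print(release, "predates fix:", predates_fix(release))
```
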
>>
>>
>>
>> On 12-08-24 5:44 PM, "Sunil Mushran" <sunil.mushran at gmail.com> wrote:
>>
>> What is the version of the kernel, ocfs2 and ocfs2 tools?
>>
>> uname -a
>> modinfo ocfs2
>> mkfs.ocfs2 --version
>>
>> On Fri, Aug 24, 2012 at 1:09 PM, Rory Kilkenny <Rory.Kilkenny at ticoon.com>
>> wrote:
>>
>> We have an HP P2000 G3 Storage array, fiber connected.  The storage array
>> has a RAID5 array broken into 2 physical OCFS2 volumes (A & B).
>>
>> A & B are both mounted and formatted as NTFS.
>>
>> One of the volumes is NFS mounted.
>>
>> Every couple of months or so we start getting tons of errors on the NFS
>> mounted volume:
>>
>>
>> Aug 24 09:48:13 FILEt2 kernel: [2234285.848940]
>> (ocfs2_wq,13844,7):ocfs2_block_check_validate:443 ERROR: CRC32 failed:
>> stored: 0, computed 1467126086.  Applying ECC.
>> Aug 24 09:48:13 FILEt2 kernel: [2234285.849252]
>> (ocfs2_wq,13844,7):ocfs2_block_check_validate:457 ERROR: Fixed CRC32
>> failed: stored: 0, computed 3828104806
>> Aug 24 09:48:13 FILEt2 kernel: [2234285.849256]
>> (ocfs2_wq,13844,7):ocfs2_validate_extent_block:903 ERROR: Checksum failed
>> for extent block 1169089
>> Aug 24 09:48:13 FILEt2 kernel: [2234285.849261]
>> (ocfs2_wq,13844,7):__ocfs2_find_path:1861 ERROR: status = -5
>> Aug 24 09:48:13 FILEt2 kernel: [2234285.849264]
>> (ocfs2_wq,13844,7):ocfs2_find_leaf:1958 ERROR: status = -5
>> Aug 24 09:48:13 FILEt2 kernel: [2234285.849267]
>> (ocfs2_wq,13844,7):ocfs2_find_new_last_ext_blk:6655 ERROR: status = -5
>> Aug 24 09:48:13 FILEt2 kernel: [2234285.849270]
>> (ocfs2_wq,13844,7):ocfs2_do_truncate:6900 ERROR: status = -5
>> Aug 24 09:48:13 FILEt2 kernel: [2234285.849274]
>> (ocfs2_wq,13844,7):ocfs2_commit_truncate:7556 ERROR: status = -5
>> Aug 24 09:48:13 FILEt2 kernel: [2234285.849280]
>> (ocfs2_wq,13844,7):ocfs2_truncate_for_delete:593 ERROR: status = -5
>> Aug 24 09:48:13 FILEt2 kernel: [2234285.849284]
>> (ocfs2_wq,13844,7):ocfs2_wipe_inode:769 ERROR: status = -5
>> Aug 24 09:48:13 FILEt2 kernel: [2234285.849287]
>> (ocfs2_wq,13844,7):ocfs2_delete_inode:1067 ERROR: status = -5
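
A note on reading the log: the "status = -5" lines are a negated errno
value, and 5 is EIO (I/O error) on Linux. A small Python sketch for looking
up such codes, where describe_status is a made-up helper name:

```python
import errno
import os


def describe_status(status: int) -> str:
    """Map a negative kernel-style status code to its symbolic errno name."""
    return errno.errorcode.get(-status, "unknown")


if __name__ == "__main__":
    status = -5  # value seen in the log lines above
    # os.strerror gives the platform's human-readable message for the code.
    print(status, describe_status(status), os.strerror(-status))
```

So the chain of -5 errors is the failed checksum on the extent block being
passed back up through the truncate and delete path as an I/O error.
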
>>
>>
>> If we pull all the data off, destroy the volume, rebuild it, and copy our
>> data back, all works fine, for a while.
>>
>> This issue does not happen on the non-NFS-mounted volume. I am currently
>> assuming the issue is with NFS and how we have it configured (which to the
>> best of my knowledge is default).
>>
>> Has anyone had a similar experience and is able to share some insight and
>> knowledge on any tricks with NFS and OCFS2 volumes?
>>
>> Thanks in advance.
>>
>>
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users at oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-users
>>
>>
>>
>>
>