<HTML>

<HEAD>

<TITLE>Re: [Ocfs2-users] Issue with OCFS2 mount</TITLE>

</HEAD>

<BODY>

<FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>Sunil;<BR>

<BR>

Just wanted to say thanks. We disabled the metaecc feature on one of the volumes (a backup volume) as a test, and the issue went away.<BR>

<BR>

What we had:<BR>

<BR>

</SPAN></FONT><BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'># tunefs.ocfs2 -q -Q &quot;All Features: %M %H %O\n&quot; &nbsp;/dev/mapper/backup-part1<BR>

All Features: backup-super strict-journal-super sparse extended-slotmap inline-data metaecc xattr indexed-dirs refcount unwritten usrquota grpquota<BR>

</SPAN></FONT></BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'><BR>

<BR>

What we did:<BR>

<BR>

</SPAN></FONT><BLOCKQUOTE><FONT SIZE="2"><FONT FACE="Consolas, Courier New, Courier"><SPAN STYLE='font-size:10pt'># tunefs.ocfs2 --fs-features=nometaecc /dev/mapper/data2-part1<BR>

<BR>

# tunefs.ocfs2 -q -Q &quot;All Features: %M %H %O\n&quot; /dev/mapper/data2-part1<BR>

All Features: backup-super strict-journal-super sparse inline-data xattr<BR>

indexed-dirs unwritten<BR>

</SPAN></FONT></FONT></BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'><BR>
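
For anyone checking a volume the same way, here is a minimal sketch (not from the original exchange) that tests whether metaecc appears in the output of the tunefs.ocfs2 -Q query shown above. The helper names parse_features and has_feature are hypothetical, and the sample strings are the two outputs quoted in this message. Note that they come from different devices, so each is checked on its own.<BR>

<PRE>
```python
def parse_features(query_output):
    """Return the feature names listed after the 'All Features:' label."""
    label = "All Features:"
    # The mail client wrapped one of the outputs, so rejoin the lines first.
    text = " ".join(query_output.split())
    if not text.startswith(label):
        raise ValueError("unexpected query output: " + text)
    return text[len(label):].split()


def has_feature(query_output, feature):
    """True if the named feature appears in the query output."""
    return feature in parse_features(query_output)


# Sample inputs copied from the two query outputs quoted in this message.
had = ("All Features: backup-super strict-journal-super sparse "
       "extended-slotmap inline-data metaecc xattr indexed-dirs "
       "refcount unwritten usrquota grpquota")
now = ("All Features: backup-super strict-journal-super sparse "
       "inline-data xattr\nindexed-dirs unwritten")

print("metaecc before:", has_feature(had, "metaecc"))
print("metaecc after: ", has_feature(now, "metaecc"))
```
</PRE>

The sketch only parses text and does not invoke tunefs.ocfs2 itself; in practice the string would come from running the query against the device in question.<BR>

<BR>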

After several days, the logs and mount look good.<BR>

<BR>

Thanks again,<BR>

-Rory<BR>

<BR>

<BR>

On 2012-08-29 2:13 PM, &quot;Sunil Mushran&quot; &lt;<a href="mailto:sunil.mushran@gmail.com">sunil.mushran@gmail.com</a>&gt; wrote:<BR>

<BR>

</SPAN></FONT><BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>Forgot to add that this issue is limited to metaecc. So you could avoid the issue in your<BR>

same setup by not enabling metaecc on the volume. And last I checked mkfs did not<BR>

enable it by default.<BR>

<BR>

On Mon, Aug 27, 2012 at 10:35 AM, Sunil Mushran &lt;<a href="mailto:sunil.mushran@gmail.com">sunil.mushran@gmail.com</a>&gt; wrote:<BR>

</SPAN></FONT><BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>So you are running into a bug that has been fixed in 2.6.36. Upgrade to that version,<BR>

if not something more current.<BR>

<BR>

$ git describe --tags 13ceef09<BR>

v2.6.35-rc3-14-g13ceef0<BR>

<BR>

commit 13ceef099edd2b70c5a6f3a9ef5d6d97cda2e096<BR>

Author: Jan Kara &lt;<a href="mailto:jack@suse.cz">jack@suse.cz</a>&gt;<BR>

Date:   Wed Jul 14 07:56:33 2010 +0200<BR>

<BR>

    jbd2/ocfs2: Fix block checksumming when a buffer is used in several transactions<BR>

    <BR>

    OCFS2 uses t_commit trigger to compute and store checksum of the just<BR>

    committed blocks. When a buffer has b_frozen_data, checksum is computed<BR>

    for it instead of b_data but this can result in an old checksum being<BR>

    written to the filesystem in the following scenario:<BR>

    <BR>

    1) transaction1 is opened<BR>

    2) handle1 is opened<BR>

    3) journal_access(handle1, bh)<BR>

        - This sets jh-&gt;b_transaction to transaction1<BR>

    4) modify(bh)<BR>

    5) journal_dirty(handle1, bh)<BR>

    6) handle1 is closed<BR>

    7) start committing transaction1, opening transaction2<BR>

    8) handle2 is opened<BR>

    9) journal_access(handle2, bh)<BR>

        - This copies off b_frozen_data to make it safe for transaction1 to commit.<BR>

          jh-&gt;b_next_transaction is set to transaction2.<BR>

    10) jbd2_journal_write_metadata() checksums b_frozen_data<BR>

    11) the journal correctly writes b_frozen_data to the disk journal<BR>

    12) handle2 is closed<BR>

        - There was no dirty call for the bh on handle2, so it is never queued for<BR>

          any more journal operation<BR>

    13) Checkpointing finally happens, and it just spools the bh via normal buffer<BR>

    writeback.  This will write b_data, which was never triggered on and thus<BR>

    contains a wrong (old) checksum.<BR>

    <BR>

    This patch fixes the problem by calling the trigger at the moment data is<BR>

    frozen for journal commit - i.e., either when b_frozen_data is created by<BR>

    do_get_write_access or just before we write a buffer to the log if<BR>

    b_frozen_data does not exist. We also rename the trigger to t_frozen as<BR>

    that better describes when it is called.<BR>

    <BR>

    Signed-off-by: Jan Kara &lt;<a href="mailto:jack@suse.cz">jack@suse.cz</a>&gt;<BR>

    Signed-off-by: Mark Fasheh &lt;<a href="mailto:mfasheh@suse.com">mfasheh@suse.com</a>&gt;<BR>

    Signed-off-by: Joel Becker &lt;<a href="mailto:joel.becker@oracle.com">joel.becker@oracle.com</a>&gt;<BR>

<BR>
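
The scenario in the commit message can be replayed with a toy model. This is a sketch under simplifying assumptions, not the jbd2 code: the Block class and the run_scenario function are hypothetical, zlib.crc32 stands in for the OCFS2 block check, and the fixed trigger is modelled as a checksum update on the live buffer just before it is copied.<BR>

<PRE>
```python
import zlib


class Block:
    """Toy metadata block: a payload plus a checksum stored alongside it."""

    def __init__(self, payload):
        self.payload = payload
        self.stored_crc = 0

    def copy(self):
        clone = Block(self.payload)
        clone.stored_crc = self.stored_crc
        return clone

    def update_crc(self):
        self.stored_crc = zlib.crc32(self.payload)

    def crc_ok(self):
        return self.stored_crc == zlib.crc32(self.payload)


def run_scenario(trigger_when_frozen):
    """Replay the 13 steps; return (journal_copy_ok, checkpointed_block_ok)."""
    b_data = Block(b"old")          # live buffer, checksum not yet computed
    b_data.payload = b"new"         # steps 3 to 5: handle1 modifies and dirties bh
    # Step 9: handle2 asks for write access while transaction1 commits,
    # so the data is frozen into a private copy.
    if trigger_when_frozen:
        b_data.update_crc()         # t_frozen-style: checksum before the copy
    b_frozen_data = b_data.copy()
    # Step 10: the frozen copy is written to the journal.
    if not trigger_when_frozen:
        b_frozen_data.update_crc()  # t_commit-style: only the copy is updated
    # Step 13: checkpointing writes b_data, not the frozen copy.
    return b_frozen_data.crc_ok(), b_data.crc_ok()


print("t_commit-style:", run_scenario(trigger_when_frozen=False))
print("t_frozen-style:", run_scenario(trigger_when_frozen=True))
```
</PRE>

In this model the old ordering leaves the journal copy valid but the checkpointed block with a stale checksum, while the new ordering leaves both valid.<BR>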

<BR>

On Mon, Aug 27, 2012 at 5:10 AM, Rory Kilkenny &lt;<a href="mailto:Rory.Kilkenny@ticoon.com">Rory.Kilkenny@ticoon.com</a>&gt; wrote:<BR>

</SPAN></FONT><BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'># uname -a<BR>

Linux FILEt1 2.6.34.7-0.7-desktop #1 SMP PREEMPT 2010-12-13 11:13:53 +0100 x86_64 x86_64 x86_64 GNU/Linux<BR>

<BR>

# modinfo ocfs2<BR>

filename:       /lib/modules/2.6.34.7-0.7-desktop/kernel/fs/ocfs2/ocfs2.ko<BR>

license:        GPL<BR>

author:         Oracle<BR>

version:        1.5.0<BR>

description:    OCFS2 1.5.0<BR>

srcversion:     B13569B35F99D43FA80D129<BR>

depends:        jbd2,ocfs2_stackglue,quota_tree,ocfs2_nodemanager<BR>

vermagic:       2.6.34.7-0.7-desktop SMP preempt mod_unload modversions <BR>

<BR>

# mkfs.ocfs2 --version<BR>

mkfs.ocfs2 1.4.3<BR>

<BR>

<BR>

<BR>

<BR>

On 12-08-24 5:44 PM, &quot;Sunil Mushran&quot; &lt;<a href="mailto:sunil.mushran@gmail.com">sunil.mushran@gmail.com</a>&gt; wrote:<BR>

<BR>

</SPAN></FONT><BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>What is the version of the kernel, ocfs2 and ocfs2 tools?<BR>

<BR>

uname -a<BR>

modinfo ocfs2<BR>

mkfs.ocfs2 --version<BR>

<BR>

On Fri, Aug 24, 2012 at 1:09 PM, Rory Kilkenny &lt;<a href="mailto:Rory.Kilkenny@ticoon.com">Rory.Kilkenny@ticoon.com</a>&gt; wrote:<BR>

</SPAN></FONT><BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>We have an HP P2000 G3 Storage array, fiber connected.  The storage array has a RAID5 array broken into 2 physical OCFS2 volumes (A &amp; B). <BR>

<BR>

A &amp; B are both mounted and formatted as NTFS.<BR>

<BR>

One of the volumes is NFS mounted.  <BR>

<BR>

Every couple of months or so we start getting tons of errors on the NFS mounted volume:<BR>

<BR>

<BR>

</SPAN></FONT><BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>Aug 24 09:48:13 FILEt2 kernel: [2234285.848940] (ocfs2_wq,13844,7):ocfs2_block_check_validate:443 ERROR: CRC32 failed: stored: 0, computed 1467126086.  Applying ECC.<BR>

Aug 24 09:48:13 FILEt2 kernel: [2234285.849252] (ocfs2_wq,13844,7):ocfs2_block_check_validate:457 ERROR: Fixed CRC32 failed: stored: 0, computed 3828104806<BR>

Aug 24 09:48:13 FILEt2 kernel: [2234285.849256] (ocfs2_wq,13844,7):ocfs2_validate_extent_block:903 ERROR: Checksum failed for extent block 1169089<BR>

Aug 24 09:48:13 FILEt2 kernel: [2234285.849261] (ocfs2_wq,13844,7):__ocfs2_find_path:1861 ERROR: status = -5<BR>

Aug 24 09:48:13 FILEt2 kernel: [2234285.849264] (ocfs2_wq,13844,7):ocfs2_find_leaf:1958 ERROR: status = -5<BR>

Aug 24 09:48:13 FILEt2 kernel: [2234285.849267] (ocfs2_wq,13844,7):ocfs2_find_new_last_ext_blk:6655 ERROR: status = -5<BR>

Aug 24 09:48:13 FILEt2 kernel: [2234285.849270] (ocfs2_wq,13844,7):ocfs2_do_truncate:6900 ERROR: status = -5<BR>

Aug 24 09:48:13 FILEt2 kernel: [2234285.849274] (ocfs2_wq,13844,7):ocfs2_commit_truncate:7556 ERROR: status = -5<BR>

Aug 24 09:48:13 FILEt2 kernel: [2234285.849280] (ocfs2_wq,13844,7):ocfs2_truncate_for_delete:593 ERROR: status = -5<BR>

Aug 24 09:48:13 FILEt2 kernel: [2234285.849284] (ocfs2_wq,13844,7):ocfs2_wipe_inode:769 ERROR: status = -5<BR>

Aug 24 09:48:13 FILEt2 kernel: [2234285.849287] (ocfs2_wq,13844,7):ocfs2_delete_inode:1067 ERROR: status = -5<BR>

<BR>

</SPAN></FONT></BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'><BR>
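
To see which blocks are affected, the checksum errors can be pulled out of the syslog with a short script. A minimal sketch, assuming the message formats match the lines quoted above exactly; the function name summarize is hypothetical, and the sample lines are shortened copies of the first three log entries.<BR>

<PRE>
```python
import re

# Patterns copied from the quoted messages; other kernels may word them differently.
CRC_RE = re.compile(r"CRC32 failed: stored: (\d+), computed (\d+)")
BLOCK_RE = re.compile(r"Checksum failed for extent block (\d+)")


def summarize(lines):
    """Return ([(stored, computed), ...], [extent_block, ...]) found in lines."""
    crc_failures = []
    bad_blocks = []
    for line in lines:
        crc = CRC_RE.search(line)
        if crc:
            crc_failures.append((int(crc.group(1)), int(crc.group(2))))
        block = BLOCK_RE.search(line)
        if block:
            bad_blocks.append(int(block.group(1)))
    return crc_failures, bad_blocks


sample = [
    "(ocfs2_wq,13844,7):ocfs2_block_check_validate:443 ERROR: CRC32 failed: "
    "stored: 0, computed 1467126086.  Applying ECC.",
    "(ocfs2_wq,13844,7):ocfs2_block_check_validate:457 ERROR: Fixed CRC32 failed: "
    "stored: 0, computed 3828104806",
    "(ocfs2_wq,13844,7):ocfs2_validate_extent_block:903 ERROR: "
    "Checksum failed for extent block 1169089",
]

crc_failures, bad_blocks = summarize(sample)
print(crc_failures)
print(bad_blocks)
```
</PRE>

In the excerpt, both CRC32 lines report a stored value of 0.<BR>

<BR>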

If we pull all the data off, destroy the volume, rebuild it, and copy our data back, everything works fine, for a while.<BR>

<BR>

This issue does not happen on the non-NFS-mounted volume. I am currently assuming the issue is with NFS and how we have it configured (which, to the best of my knowledge, is the default).<BR>

<BR>

Has anyone had a similar experience, and could you share any insight or tricks for using NFS with OCFS2 volumes?<BR>

<BR>

Thanks in advance.<BR>

<BR>

<BR>

<BR>

_______________________________________________<BR>

Ocfs2-users mailing list<BR>

<a href="mailto:Ocfs2-users@oss.oracle.com">Ocfs2-users@oss.oracle.com</a><BR>

<a href="https://oss.oracle.com/mailman/listinfo/ocfs2-users">https://oss.oracle.com/mailman/listinfo/ocfs2-users</a><BR>

</SPAN></FONT></BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'><BR>

<BR>

</SPAN></FONT></BLOCKQUOTE></BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'><BR>

</SPAN></FONT></BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'><BR>

<BR>

</SPAN></FONT></BLOCKQUOTE>

</BODY>

</HTML>