[Ocfs2-devel] [PATCH v3] ocfs2: return error when we attempt to access a dirty bh in jbd2
Changwei Ge
ge.changwei at h3c.com
Sun Jan 28 18:32:50 PST 2018
It looks good to me.
Reviewed-by: Changwei Ge <ge.changwei at h3c.com>
On 2018/1/29 10:02, piaojun wrote:
> We should not reuse the dirty bh in jbd2 directly due to the following
> situation:
>
> 1. When removing extent rec, we will dirty the bhs of extent rec and
> truncate log at the same time, and hand them over to jbd2.
> 2. The bhs are submitted to jbd2 area successfully.
> 3. The write-back thread of device help flush the bhs to disk but
> encounter write error due to abnormal storage link.
> 4. After a while the storage link become normal. Truncate log flush
> worker triggered by the next space reclaiming found the dirty bh of
> truncate log and clear its 'BH_Write_EIO' and then set it uptodate in
> __ocfs2_journal_access():
>
> ocfs2_truncate_log_worker
> ocfs2_flush_truncate_log
> __ocfs2_flush_truncate_log
> ocfs2_replay_truncate_records
> ocfs2_journal_access_di
> __ocfs2_journal_access // here we clear io_error and set 'tl_bh' uptodata.
>
> 5. Then jbd2 will flush the bh of truncate log to disk, but the bh of
> extent rec is still in error state, and unfortunately nobody will
> take care of it.
> 6. At last the space of extent rec was not reduced, but truncate log
> flush worker have given it back to globalalloc. That will cause
> duplicate cluster problem which could be identified by fsck.ocfs2.
>
> Sadlly we can hardly revert this but set fs read-only in case of
> ruining atomicity and consistency of space reclaim.
>
> Fixes: acf8fdbe6afb ("ocfs2: do not BUG if buffer not uptodate in __ocfs2_journal_access")
>
> Signed-off-by: Jun Piao <piaojun at huawei.com>
> Reviewed-by: Yiwen Jiang <jiangyiwen at huawei.com>
> ---
> fs/ocfs2/journal.c | 23 ++++++++++++-----------
> 1 file changed, 12 insertions(+), 11 deletions(-)
>
> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
> index 3630443..e5dcea6 100644
> --- a/fs/ocfs2/journal.c
> +++ b/fs/ocfs2/journal.c
> @@ -666,23 +666,24 @@ static int __ocfs2_journal_access(handle_t *handle,
> /* we can safely remove this assertion after testing. */
> if (!buffer_uptodate(bh)) {
> mlog(ML_ERROR, "giving me a buffer that's not uptodate!\n");
> - mlog(ML_ERROR, "b_blocknr=%llu\n",
> - (unsigned long long)bh->b_blocknr);
> + mlog(ML_ERROR, "b_blocknr=%llu, b_state=0x%lx\n",
> + (unsigned long long)bh->b_blocknr, bh->b_state);
>
> lock_buffer(bh);
> /*
> - * A previous attempt to write this buffer head failed.
> - * Nothing we can do but to retry the write and hope for
> - * the best.
> + * A previous transaction with a couple of buffer heads fail
> + * to checkpoint, so all the bhs are marked as BH_Write_EIO.
> + * For current transaction, the bh is just among those error
> + * bhs which previous transaction handle. We can't just clear
> + * its BH_Write_EIO and reuse directly, since other bhs are
> + * not written to disk yet and that will cause metadata
> + * inconsistency. So we should set fs read-only to avoid
> + * further damage.
> */
> if (buffer_write_io_error(bh) && !buffer_uptodate(bh)) {
> - clear_buffer_write_io_error(bh);
> - set_buffer_uptodate(bh);
> - }
> -
> - if (!buffer_uptodate(bh)) {
> unlock_buffer(bh);
> - return -EIO;
> + return ocfs2_error(osb->sb, "A previous attempt to "
> + "write this buffer head failed\n");
> }
> unlock_buffer(bh);
> }
>
More information about the Ocfs2-devel
mailing list