[Ocfs2-tools-devel] [PATCH RESEND 3/3] fsck.ocfs2: support append direct io incompat feature
Joseph Qi
joseph.qi at huawei.com
Tue Mar 8 01:02:09 PST 2016
Hi Eric,
Thanks very much for reviewing. Please see my comments below.
On 2016/3/4 21:51, Eric Ren wrote:
> Hello Joseph,
>
> I don't go through all relative code pathes, but has some questions
> that confuse me. Hope you could give some help.
>
> On Thu, Mar 03, 2016 at 07:34:46PM +0800, Joseph Qi wrote:
>> Support to truncate direct io orphan entry in fsck.ocfs2.
>>
>> Signed-off-by: Joseph Qi <joseph.qi at huawei.com>
>> ---
>> fsck.ocfs2/pass4.c | 9 +++++++++
>> libocfs2/truncate.c | 12 ++++++++++++
>> 2 files changed, 21 insertions(+)
>>
>> diff --git a/fsck.ocfs2/pass4.c b/fsck.ocfs2/pass4.c
>> index 7737cc9..de3f672 100644
>> --- a/fsck.ocfs2/pass4.c
>> +++ b/fsck.ocfs2/pass4.c
>> @@ -100,6 +100,9 @@ out:
>> return;
>> }
>>
>> +#define OCFS2_DIO_ORPHAN_PREFIX "dio-"
>> +#define OCFS2_DIO_ORPHAN_PREFIX_LEN 4
>> +
>> static int replay_orphan_iterate(struct ocfs2_dir_entry *dirent,
>> uint64_t blocknr,
>> int offset,
>> @@ -138,6 +141,11 @@ static int replay_orphan_iterate(struct ocfs2_dir_entry *dirent,
>> goto out;
>> }
>>
> Code in here is:
> 133 ret = ocfs2_truncate(ost->ost_fs, dirent->inode, 0);
> 134 if (ret) {
> 135 com_err(whoami, ret, "while truncating orphan inode %"PRIu64,
> 136 (uint64_t)dirent->inode);
> 137 ret_flags |= OCFS2_DIRENT_ABORT;
> 138 goto out;
> 139 }
>
> Do we really want to truncate the data of crashed inode which was doing DIO
> instead of recovery? I read aboud your append O_DIRECT patch set. The idea looks
> like to add file to orphan dir when doing direc io, and delete from orphan dir after
> direct io is done.
>
Yes, we have to. This is to keep the same behavior that kernel behaves.
In kernel module, once dio crashes, we will do truncate in orphan
recovery to reclaim allocated clusters.
>> + /* do not delete inode in case of dio orphan entry */
>> + if (!strncmp(dirent->name, OCFS2_DIO_ORPHAN_PREFIX,
>> + OCFS2_DIO_ORPHAN_PREFIX_LEN))
>> + goto out_check;
>> +
>
> It make sense to skip _delte_inode() stuff. But I cannot understand very well what the code
> after out_check: is supposed to do? Looks it collects info about link count and inode refs,
> and then do check_link_counts() where inode would be moved to lost+found dir if no dirent
> refer to it. So is it possible to lose the inode unexpectedly?
>
Normally it won't. Otherwise I think it is a bug for inode link count.
>> ret = ocfs2_delete_inode(ost->ost_fs, dirent->inode);
>> if (ret) {
>> com_err(whoami, ret, "while deleting orphan inode %"PRIu64
>> @@ -148,6 +156,7 @@ static int replay_orphan_iterate(struct ocfs2_dir_entry *dirent,
>>
>> ost->ost_orphan_deleted_count++;
>>
>> +out_check:
>> /* Only calculate icount in force check. */
>> if (ost->ost_force) {
>> /*
>> diff --git a/libocfs2/truncate.c b/libocfs2/truncate.c
>> index 7327253..451db28 100644
>> --- a/libocfs2/truncate.c
>> +++ b/libocfs2/truncate.c
>> @@ -33,6 +33,7 @@
>> #include <assert.h>
>> #include <errno.h>
>> #include "ocfs2/ocfs2.h"
>> +#include "ocfs2/byteorder.h"
>>
>> struct truncate_ctxt {
>> uint64_t ino;
>> @@ -356,6 +357,16 @@ errcode_t ocfs2_truncate_full(ocfs2_filesys *fs, uint64_t ino,
>> if (ret)
>> goto out;
>>
>> + /* in case of dio crashed, force do trucate since blocks may already
>> + * be allocated
>> + */
>> + if (ci->ci_inode->i_flags & cpu_to_le32(OCFS2_DIO_ORPHANED_FL)) {
>> + ci->ci_inode->i_flags &= ~cpu_to_le32(OCFS2_DIO_ORPHANED_FL);
>> + ci->ci_inode->i_dio_orphaned_slot = 0;
>> + new_i_size = ci->ci_inode->i_size;
>> + goto truncate;
>> + }
>> +
>
> Could you explain a bit more the reason why we should add this lines of code here?
> Because fix_dups_func()
> ->clone_one_inode() or delete_one_inode()
> ->1355 ret = ocfs2_truncate_full(ost->ost_fs, ino, 0,
> 1356 pass1d_free_clusters, fd);
>
> As you can see the 3th parameter(new_i_size) is 0, so it won't hit go-out. Why do we
> need this line:
> new_i_size = ci->ci_inode->i_size;
> ?
>
Though replay_orphan_iterate->ocfs2_truncate passes new_i_size with 0,
but for dio entry, the new_i_size will be re-assigned to the current
inode i_size. As I explained before, this is to reclaim allocated
clusters.
In other words, clusters are allocated, but inode i_size has
not been updated and written to disk. So we have to make the inode go
back to the former status after fsck.
Thanks,
Joseph
> Thanks!
> Eric
>> if (ci->ci_inode->i_size == new_i_size)
>> goto out;
>>
>> @@ -364,6 +375,7 @@ errcode_t ocfs2_truncate_full(ocfs2_filesys *fs, uint64_t ino,
>> goto out;
>> }
>>
>> +truncate:
>> if ((S_ISLNK(ci->ci_inode->i_mode) && !ci->ci_inode->i_clusters) ||
>> (ci->ci_inode->i_dyn_features & OCFS2_INLINE_DATA_FL))
>> ret = ocfs2_truncate_inline(fs, ino, new_i_size);
>> --
>> 1.8.4.3
>>
>>
>>
>> _______________________________________________
>> Ocfs2-tools-devel mailing list
>> Ocfs2-tools-devel at oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-tools-devel
>>
>
> .
>
More information about the Ocfs2-tools-devel
mailing list