[Ocfs2-devel] [PATCH 5/5] Fix files when an inode is written in file_fix

Thu Jun 2 21:45:18 PDT 2016

>>> 

> 
> On 06/01/2016 11:26 PM, Gang He wrote:
>>
>>
>>
>>>>>
>>
>>>
>>> On 06/01/2016 10:06 PM, Gang He wrote:
>>>>
>>>>
>>>>
>>>>>>>
>>>>
>>>>>
>>>>> On 05/31/2016 09:05 PM, Gang He wrote:
>>>>>> Hello Goldwyn,
>>>>>>
>>>>>>
>>>>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On 05/30/2016 04:45 AM, Gang He wrote:
>>>>>>>> Hello Goldwyn,
>>>>>>>>
>>>>>>>> Please see my comments inline.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Gang
>>>>>>>>
>>>>>>>>
>>>>>>>>>>>
>>>>>>>>> From: Goldwyn Rodrigues <rgoldwyn at suse.com>
>>>>>>>>>
>>>>>>>>> Check that the entriy exists and has been filed for check.
>>>>>>>>> Also perform some code cleanup
>>>>>>>>>
>>>>>>>>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn at suse.com>
>>>>>>>>> ---
>>>>>>>>>      fs/ocfs2/filecheck.c | 41 +++++++++++++++++++++++++----------------
>>>>>>>>>      fs/ocfs2/filecheck.h |  1 +
>>>>>>>>>      fs/ocfs2/sysfs.c     |  2 +-
>>>>>>>>>      3 files changed, 27 insertions(+), 17 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
>>>>>>>>> index 006d521..fc6e183 100644
>>>>>>>>> --- a/fs/ocfs2/filecheck.c
>>>>>>>>> +++ b/fs/ocfs2/filecheck.c
>>>>>>>>> @@ -198,22 +198,6 @@ ocfs2_filecheck_handle(struct ocfs2_super *osb,
>>>>>>>>>      	return ret;
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> -static void
>>>>>>>>> -ocfs2_filecheck_handle_entry(struct ocfs2_super *osb,
>>>>>>>>> -			     struct ocfs2_filecheck_entry *entry)
>>>>>>>>> -{
>>>>>>>>> -	if (entry->fe_type == OCFS2_FILECHECK_TYPE_CHK)
>>>>>>>>> -		entry->fe_status = ocfs2_filecheck_handle(osb,
>>>>>>>>> -				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_CHK);
>>>>>>>>> -	else if (entry->fe_type == OCFS2_FILECHECK_TYPE_FIX)
>>>>>>>>> -		entry->fe_status = ocfs2_filecheck_handle(osb,
>>>>>>>>> -				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_FIX);
>>>>>>>>> -	else
>>>>>>>>> -		entry->fe_status = OCFS2_FILECHECK_ERR_UNSUPPORTED;
>>>>>>>>> -
>>>>>>>>> -	ocfs2_filecheck_done_entry(osb, entry);
>>>>>>>>> -}
>>>>>>>>> -
>>>>>>>>>      int ocfs2_filecheck_add_inode(struct ocfs2_super *osb,
>>>>>>>>>      				     unsigned long ino)
>>>>>>>>>      {
>>>>>>>>> @@ -268,3 +252,28 @@ unlock:
>>>>>>>>>      	ocfs2_filecheck_done_entry(osb, entry);
>>>>>>>>>      	return 0;
>>>>>>>>>      }
>>>>>>>>> +
>>>>>>>>> +int ocfs2_filefix_add_inode(struct ocfs2_super *osb,
>>>>>>>>> +				     unsigned long ino)
>>>>>>>>> +{
>>>>>>>>> +	struct ocfs2_filecheck_entry *entry;
>>>>>>>>> +	int ret = -ENOENT;
>>>>>>>>> +
>>>>>>>>> +	spin_lock(&osb->fc_lock);
>>>>>>>>> +	list_for_each_entry(entry, &osb->file_check_entries, fe_list)
>>>>>>>>> +		if (entry->fe_ino == ino) {
>>>>>>>>> +			entry->fe_type = OCFS2_FILECHECK_TYPE_FIX;
>>>>>>>> Gang: It looks that we can not do it directly, why? because the entry
>>>>>>> pointer can be freed by the function ocfs2_filecheck_erase_entries().
>>>>>>>> We can not use the same entry pointer within two user processes.
>>>>>>>> The simple solution is to return -EBUSY error in case there is the same ino
>>>>>>> number entry in the list, then the user can try again after the previous
>>>>> user
>>>>>>> process is returned.
>>>>>>>
>>>>>>> How? is it not protected under spinlock?
>>>>>> Gang: please aware that this spinlock fc_lock is used to protect entry list
>>>>> (adding entry/deleting entry), not protect entry itself.
>>>>>> The first user process will mark the entry's status to done after the file
>>>>> check operation is finished, then the function
>>>>> ocfs2_filecheck_erase_entries() will possibly delete this entry from the
>>>>> list, to free the entry memory during the second user process is referencing
>>>>> on this entry.
>>>>>> You know, the spinlock can not protect the file check operation (too long
>>>>> time IO operation).
>>>>>
>>>>> Yes, a possible reason for separating the lists too. The entries will be
>>>>> deleted after a check, and the user will not be able to perform a fix on it.
>>>> Gang: we can not delete the file check entry after it's operation is done,
>>> since we need to keep
>>>> them to provide the result history records.
>>>>
>>>>>
>>>>> In any case, if we check on status, we can do away with it. We may
>>>>> require filecheck_entry spinlocks later on, but for now it is not required.
>>>> Gang: Why we need a spinlock  for each filecheck entry? the entry is only
>>> occupied by one user process.
>>>> we only need to get the lock when adding/deleting the entry from the list,
>>> this operation is very short.
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>> Anyways, I plan to make a separate list for this so we can do away with
>>>>>>> more macros.
>>>>>> Gang: you can use two lists, but I think that one list is also OK, keep the
>>>>> thing simple.
>>>>>> Just return a -EBUSY error when the same inode is on the progress, then the
>>>>> user can try again after that file check process returns.
>>>>>
>>>>> Thanks for the suggestions, I have incorporated that, with the two
>>>>> lists, of course.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Besides, any I/O and check operation should be done in a separate
>>>>>>> thread/work queue.
>>>>>> Gang: current design is no a separated thread is used to run file check
>>>>> operation, each user trigger process is used to execute its file check
>>>>> operation.
>>>>>> Why? first, keep the thing simple, avoid to involve more logic/thread.
>>>>> second, if we involved a thread/work queue to run the file check operations,
>>>>>> that means the user trigger process will return directly and need not to
>>>>> wait for the actual file check operation, there will be a risk that the user
>>>>> can
>>>>>> not see the result from the result history record (via cat check/fix), since
>>>>> the new submits result will make the oldest result records to be released,
>>>>>> we have a fixed length list to keep these status records. If we use the user
>>>>> trigger process to do the file check operation, the user can surely see the
>>>>> result after this user process returns (since this file check operation is
>>>>> done in the kernel space).
>>>>>>
>>>>>
>>>>> In the future, how do you plan to extend this? How would you check for
>>>>> extent_blocks? Or worse, system files? Would you keep the system waiting
>>>>> until all I/O has completed?
>>>> Gang: Yes, the trigger user process can wait for this I/O operation, just
>>> like a system call.
>>>> If we want to check multiple files (inode), we can use a batch to call this
>>> interface one by one
>>>> (of course,  you can use multiple threads/processes), this is why I did not
>>> use a work queue/thread
>>>> in the kernel module to do this asynchronously, because in the user space,
>>> we can do the same thing.
>>>> let's keep the kernel module logic simple.
>>>
>>>
>>> If you were planning to keep the process synchronous, why keep multiple
>>> entries. Just one would have been enough. This way the check/fix cycle
>>> would have a better probability of finding stuff in the cache as well.
>> The list be used for file check entry history status, and for supporting to 
> multiple user threads/processes to trigger a check/fix request parallelly.
>>
> 
> Why do you need history in a synchronous process. The error is received 
> as soon as the process ends. 
How do you return a file check specific error number to the trigger user process? then I use a list to save these file check entries (running/done).
The user maybe want to see the file check result again after the file check was done, this just give an chance for user to inquire the result again, but will not occupy too much memory.

Besides, in multiple threads each thread 
> will have his context so you don't need to store in a global location.
Gang: I prefer to do the thing with independent design, do not depend some other data structures/mechanism, since these data structures/mechanism will be changed according kernel versions possibly. we should try to implement our functions in our module code, then export some interfaces(extern functions) to be invoked by some kernel sensitive  data structures/mechanism (e.g.
Kobject, ioctl, etc). Why I user a global location, at that moment, I did not aware that we can embed kobject in super block structure like ext4, now we can embed kobject in the super, the idea
is better.

> Why did you need to use sysfs for this? Why couldn't you use something 
> else for synchronous processing, say an ioctl()?
Gang: Yes, this is a way to reach user process, as you said in a RFC one year ago.
Of course, we also can use ioctl to commutation, but this will means that we have to add some code to implement a user tool.

Anyway, I think it is not too important about if we use synchronous/asynchronous to handle a file check operation, if you like, please create a dedicated kernel thread to execute these file check entries in the list. It is OK for both ways.

Thanks
Gang  

> 
> 
> -- 
> Goldwyn