[Ocfs2-devel] [RFC] Online File(system) check

Gang He ghe at suse.com
Tue Apr 28 19:37:59 PDT 2015


Hi Joseph,

Thanks for your detailed description.
See my question inline.


>>> 
> Hi Goldwyn,
> 
> Thanks for the good proposal.
> 
> On 2015/4/28 20:21, Goldwyn Rodrigues wrote:
>> Hi Gang,
>> 
>> On 04/27/2015 10:00 PM, Gang He wrote:
>>> Hi Goldwyn,
>>>
>>> Very nice proposal.
>>> So far, there are some comments from me.
>>> 1) which task will we do in check/fix a file, we need to define the detailed 
> requirements further, since we just do a light-level file check/fix according 
> to inode number, we need to know which items can be done by online check, 
> which items can be done by offline fsck.
>> 
>> For the first phase (regular files), these are all the reasons the disk 
> validate function would fail. Some examples are ocfs2_validate_inode_block, 
> ocfs2_validate_extent_block etc.
>> As we take up system inodes (phase 2), we will add more functionality.
>> 
> Can we classify all corruption cases and their corresponding fixes? Maybe 
> we can get some hints from fsck.
> And I don't think errors=continue fits all cases.
> In some cases we shouldn't let it continue with errors, to prevent further 
> damage.
> 
>>> 2) can we keep check and fix as two options? The check option checks whether a file 
> is good or bad without modifying anything; the fix option checks a file and 
> fixes it if it is corrupted.
>> 
>> Yes, there are two options: CHECK only checks, whereas FIX fixes the errors. 
> As a precautionary measure, a CHECK command should be provided before a FIX 
> is issued. IOW, a file should be checked for errors before actually fixing 
> it.
>> 
> A convenient way to know which files need to be checked should also be taken into 
> consideration.
> 
>>> 3) when users execute the command "echo CHECK <inode> > 
> /sys/fs/ocfs2/filecheck" to check a file, how to give the feedback 
> information besides printing the messages to syslog?
>> 
>> The output should be when you cat /sys/fs/ocfs2/filecheck. It would provide 
> the results of the last (N) files checked. I don't want to flood the kernel 
> log with this. Thanks for bringing this up, I will put it on the doc. 
> Something like:
>> 
>> Inode Status Description
>> 1234   ERROR Metadata incorrect
>> 2352   FIXED Valid flag not set
>> 9382   CHECKING -
>> 8926   GOOD -
>> 7230   CANT-FIX Please execute fsck.ocfs2 after taking filesystem offline.
>> 
>> So, for the current scenario, only 1234 can be fixed. An echo should err 
> with EINVAL if any other inode number is provided with FIX.
>> 
>> 
>>> 4) we should support a list to accept the "check/fix" requests from 
> user-space and queue them, then handle them one by one, right? What is the 
> behavior for a user who executes "echo check ..." from user 
> space? After the user posts a request to kernel space, does the command return 
> immediately or wait for the file check to finish?
>>>
>> 
>> I would not suggest that, at least for now. This is to improve availability. 
> However, if the filesystem is very bad, we should suggest an offline check. 
> However, the user can provide multiple CHECK requests.
My question is: can users execute "echo check > .." to check/fix files simultaneously? Users can trigger this command from different terminals.
Second, when users send a command to kernel space, the kernel has to cache these commands in a list/array, since it cannot finish a check request immediately. Otherwise, how does the kernel accept a new request while it is still handling the current one?
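
To make the second point concrete, here is a minimal user-space sketch in Python (not kernel code) of what I mean by caching requests. It only models the behavior: a write returns as soon as the request is queued, a lock lets several terminals submit at once, and a separate step handles requests one by one and keeps the last N results for reading back. All names here (FileCheckQueue, submit, process_one, report, the checker callback) are hypothetical and are not part of any existing ocfs2 interface; the EINVAL rule for FIX and the status words follow the description quoted above.

```python
import errno
import threading
from collections import OrderedDict, deque


class FileCheckQueue:
    """Toy model of a queued filecheck interface (hypothetical, user-space only)."""

    def __init__(self, checker, max_results=10):
        # checker is an assumed callback: (command, inode) -> (status, description)
        self._checker = checker
        self._lock = threading.Lock()   # serializes writers from different terminals
        self._pending = deque()         # accepted requests not yet handled
        self._results = OrderedDict()   # inode -> (status, description), last N only
        self._max_results = max_results

    def _record(self, inode, status, description):
        # Caller must hold self._lock. Re-inserting moves the inode to the end.
        self._results.pop(inode, None)
        self._results[inode] = (status, description)
        while len(self._results) > self._max_results:
            self._results.popitem(last=False)  # drop the oldest entry

    def submit(self, command, inode):
        """Model of 'echo CHECK <inode> > filecheck': queue and return at once."""
        with self._lock:
            if command not in ("CHECK", "FIX"):
                raise OSError(errno.EINVAL, "unknown command")
            status = self._results.get(inode, (None, None))[0]
            if command == "FIX" and status != "ERROR":
                # FIX is only accepted for an inode a previous CHECK reported as ERROR.
                raise OSError(errno.EINVAL, "inode not reported as ERROR")
            if command == "CHECK":
                self._record(inode, "CHECKING", "-")
            self._pending.append((command, inode))

    def process_one(self):
        """Handle the oldest pending request; return False if the queue is empty."""
        with self._lock:
            if not self._pending:
                return False
            command, inode = self._pending.popleft()
        # Run the (possibly slow) check without holding the lock,
        # so new requests can still be accepted in the meantime.
        status, description = self._checker(command, inode)
        with self._lock:
            self._record(inode, status, description)
        return True

    def report(self):
        """Model of 'cat filecheck': results of the last N inodes."""
        with self._lock:
            rows = [f"{inode} {status} {desc}"
                    for inode, (status, desc) in self._results.items()]
        return "\n".join(["Inode Status Description"] + rows)
```

In this model the writer never waits for the check to finish; it reads the result later from report(), which corresponds to the cat output proposed above. Whether the real interface should block instead is exactly the open question.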

Thanks
Gang

>> 



