[Ocfs2-devel] [RFC] Online File(system) check

Joseph Qi joseph.qi at huawei.com
Tue Apr 28 06:20:26 PDT 2015


Hi Goldwyn,

Thanks for the good proposal.

On 2015/4/28 20:21, Goldwyn Rodrigues wrote:
> Hi Gang,
> 
> On 04/27/2015 10:00 PM, Gang He wrote:
>> Hi Goldwyn,
>>
>> Very nice proposal.
>> So far, there are some comments from me.
>> 1) Which tasks will we perform when checking/fixing a file? We need to define the detailed requirements further. Since we only do a light-weight file check/fix by inode number, we need to know which items can be handled by the online check and which must be left to offline fsck.
> 
> For the first phase (regular files), these are all the cases in which the on-disk validation functions would fail. Examples are ocfs2_validate_inode_block, ocfs2_validate_extent_block, etc.
> As we take up system inodes (phase 2), we will add more functionality.
> 
Can we classify all the corruption cases and their corresponding fixes? Maybe we can get some hints from fsck.
Also, I don't think errors=continue fits all cases.
In some cases we shouldn't let it continue with errors, to prevent further damage.

>> 2) Can we keep check and fix as two separate options? The check option only checks whether a file is good or bad, without modifying anything; the fix option checks the file and fixes it if it is corrupted.
> 
> Yes, there are two options: CHECK only checks, whereas FIX fixes the errors. As a precautionary measure, a CHECK command should be issued before a FIX. IOW, a file should be checked for errors before actually fixing it.
> 
We should also consider a convenient way to find out which files need to be checked.

>> 3) When users execute the command "echo CHECK <inode> > /sys/fs/ocfs2/filecheck" to check a file, how do we give feedback besides printing messages to syslog?
> 
> The output should be shown when you cat /sys/fs/ocfs2/filecheck. It would provide the results of the last (N) files checked. I don't want to flood the kernel log with this. Thanks for bringing this up, I will put it in the doc. Something like:
> 
> Inode  Status    Description
> 1234   ERROR     Metadata incorrect
> 2352   FIXED     Valid flag not set
> 9382   CHECKING  -
> 8926   GOOD      -
> 7230   CANT-FIX  Please execute fsck.ocfs2 after taking filesystem offline.
> 
> So, for the current scenario, only 1234 can be fixed. An echo of FIX with any other inode number should fail with EINVAL.
> 
> 
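To make the proposed feedback format concrete, here is a rough user-space sketch in Python of how a tool might read such a listing and decide whether a FIX request would be accepted. It assumes the three-column layout and the five status strings from the example above, which are only a proposal at this point; parse_filecheck and fix_request_errno are hypothetical helper names, not part of any existing ocfs2 tool.

```python
import errno

# Status strings taken from the example listing in this thread (a proposal,
# not a finalized interface).
KNOWN_STATUSES = {"ERROR", "FIXED", "CHECKING", "GOOD", "CANT-FIX"}


def parse_filecheck(text):
    """Parse the proposed listing into {inode: (status, description)}."""
    results = {}
    for line in text.splitlines():
        # Columns are whitespace-separated; the description may contain spaces.
        fields = line.split(None, 2)
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # header line, blank line, or anything unexpected
        status = fields[1]
        if status not in KNOWN_STATUSES:
            continue
        description = fields[2].strip() if len(fields) > 2 else "-"
        results[int(fields[0])] = (status, description)
    return results


def fix_request_errno(results, inode):
    """Return 0 if a FIX for this inode would be accepted, else errno.EINVAL.

    Models the rule described above: FIX is only valid for an inode whose
    earlier CHECK reported ERROR.
    """
    status, _ = results.get(inode, (None, None))
    return 0 if status == "ERROR" else errno.EINVAL


SAMPLE = """\
Inode Status Description
1234   ERROR Metadata incorrect
2352   FIXED Valid flag not set
9382   CHECKING -
8926   GOOD -
7230   CANT-FIX Please execute fsck.ocfs2 after taking filesystem offline.
"""

if __name__ == "__main__":
    table = parse_filecheck(SAMPLE)
    for inode in (1234, 8926, 4242):
        print(inode, fix_request_errno(table, inode))
```

In this model an inode that was never checked (4242 here) is rejected the same way as one that checked GOOD, which matches the "CHECK before FIX" precaution.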
>> 4) We should support a list that accepts "check/fix" requests from user space and queues them, then handles them one by one, right? What is the behavior for a user who executes "echo check ..." from user space? After the user posts a request to kernel space, does the command return immediately or wait for the file check to finish?
>>
> 
> I would not suggest that, at least for now. This is to improve availability. If the filesystem is very bad, we should suggest an offline check. However, the user can provide multiple CHECK requests.
> 




