[Ocfs2-devel] [PATCH] ocfs2: Fix quota file corruption

Marty Sweet msweet.dev at gmail.com
Wed Feb 19 09:56:35 PST 2014


Hi Jan,

What errors do you see regarding corrupted quotas, something like below?
kernel: [119890.305935] __quota_error: 3 callbacks suppressed
kernel: [119890.305941] Quota error (device dm-8): find_free_dqentry:
Data block full but it shouldn't
kernel: [119890.306071] Quota error (device dm-8): qtree_write_dquot:
Error -5 occurred while creating quota
kernel: [119890.306203] (smbd,25964,2):ocfs2_acquire_dquot:817 ERROR:
status = -5

There seems to be a lack of documentation regarding this issue, let me
know if you would like me to test anything for you.

Marty

On Wed, Feb 19, 2014 at 2:48 PM, Jan Kara <jack at suse.cz> wrote:
> On Wed 19-02-14 14:11:02, Jan Kara wrote:
>> Global quota files are accessed from different nodes and reading and
>> writing of these files happens via block device page cache. Thus even
>> though the access between nodes is properly serialized by quota file's
>> inode lock we cannot rely on consistency of block device page cache
>> between nodes. Indeed currently it is possible to corrupt quota files by
>> creating and deleting quota structures from two nodes in parallel. Fix
>> the problem by using OCFS2_BH_IGNORE_CACHE mount option when reading
>> from quota file.
>>
>> CC: Goldwyn Rodrigues <rgoldwyn at suse.de>
>> CC: Mark Fasheh <mfasheh at suse.de>
>> Signed-off-by: Jan Kara <jack at suse.cz>
>> ---
>>  fs/ocfs2/quota_global.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>>   This is a quick fix for quota file corruption I have found during my testing
>> (to be used for 3.13 and -stable kernels). Longer term we likely want to do
>> something more clever. Ideally we would invalidate relevant blocks from buffer
>> cache when node releases quota file's inode lock (similarly as we do it for
>> page cache). I wanted to do it similarly as for e.g. extent tree blocks but
>> I failed to find how consistency of those between nodes is achieved. Can
>> anyone point me in the right direction? Thanks in advance.
>   Hum, so this likely isn't the whole story because I can still see the
> corruption. At least I've now found the metadata cache and basically
> understand how it works so I'm now reading code around that and trying to
> understand what screws up quota code...
>
>                                                                 Honza
>
>>
>> diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
>> index aaa50611ec66..f1f0cca15db6 100644
>> --- a/fs/ocfs2/quota_global.c
>> +++ b/fs/ocfs2/quota_global.c
>> @@ -151,7 +151,8 @@ int ocfs2_read_quota_phys_block(struct inode *inode, u64 p_block,
>>       int rc;
>>
>>       *bhp = NULL;
>> -     rc = ocfs2_read_blocks(INODE_CACHE(inode), p_block, 1, bhp, 0,
>> +     rc = ocfs2_read_blocks(INODE_CACHE(inode), p_block, 1, bhp,
>> +                            OCFS2_BH_IGNORE_CACHE,
>>                              ocfs2_validate_quota_block);
>>       if (rc)
>>               mlog_errno(rc);
>> --
>> 1.8.1.4
>>
> --
> Jan Kara <jack at suse.cz>
> SUSE Labs, CR
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel



-- 
Marty



More information about the Ocfs2-devel mailing list