[Ocfs2-devel] The last part of the file is zeroed out when write N random bytes
Gang He
ghe at suse.com
Tue Sep 28 17:57:07 PDT 2021
Hi Guys,
Just an update.
Based on our testing, the problem was caused by the following commit in fs/buffer.c.
commit 6dbf7bb555981fb5faf7b691e8f6169fc2b2e63b
Author: Jan Kara <jack at suse.cz>
Date: Fri Sep 4 10:58:51 2020 +0200
fs: Don't invalidate page buffers in block_write_full_page()
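
For anyone without access to the pastebin scripts, here is a minimal sketch of the check described in the original report quoted below: write N random bytes, then compare the md5sum on each node and look for a zeroed tail. This is not the actual test script. The function names and the example path in the note are assumptions made for illustration only.

```python
import hashlib
import os


def write_random_bytes(path, n):
    """Write n random bytes to path, flush them to disk, and return the data."""
    data = os.urandom(n)
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    return data


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def trailing_zero_bytes(path):
    """Return how many bytes at the end of the file are zero."""
    with open(path, "rb") as f:
        data = f.read()
    return len(data) - len(data.rstrip(b"\x00"))
```

On a test cluster one might call write_random_bytes("/mnt/shared/testfile", n) on node1 and then compare md5_of() for the same path on node1 and node2. Random data can legitimately end in a few zero bytes, so only a long zero run at the end of the file, together with an md5 mismatch, would match the symptom described.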
Thanks
Gang
On 2021/9/27 15:57, Gang He wrote:
>
>
> On 2021/9/27 15:49, Joseph Qi wrote:
>>
>> Last week, Andrey Markov reported a similar issue, but unfortunately not
>> on the mailing list.
>>
>> And Junxiao has resolved a similar issue recently. So can you reproduce
>> the bug in the latest kernel?
> Yes, I can reproduce this issue with the latest code.
> The cluster size must be greater than 4K (e.g. 8K, 1M); this is the key
> to the problem.
>
> Thanks
> Gang
>
>>
>> Thanks,
>> Joseph
>>
>> On 9/27/21 3:16 PM, Gang He wrote:
>>> Hi List,
>>>
>>> I'd like to report a data loss bug that occurs when writing N random bytes; I am raising it now since I saw some related commits in the past weeks.
>>> I can reproduce this bug reliably with the latest ocfs2 kernel module code, as follows:
>>> 1) Create a three-node (e.g. ghe-tw-nd1, ghe-tw-nd2, ghe-tw-nd3) ocfs2 cluster and attach a shared disk (e.g. /dev/vdb).
>>> 2) Format the disk with the command "mkfs.ocfs2 -N 4 -b 4096 -C 1048576 /dev/vdb", and mount the disk at /mnt/shared on each node. The cluster size must be greater than 4K; this is the key to the problem.
>>> 3) Copy the file write/test scripts to the /mnt/shared directory, then run the test script on node1 to reproduce the bug.
>>> file write script ocfs2_fallocate_bug_plain_write.py: https://pastebin.com/QsXcD8rq
>>> file test script ocfs2_loop.sh: https://pastebin.com/eTUe2hkW
>>> 4) You will then hit the bug: the md5sum of the file differs between node1 and node2.
>>> In fact, the last part of the file is zeroed out when read from node2.
>>> e.g.
>>> file dump from node1: https://pastebin.com/HB92TVS0
>>> file dump from node2: https://pastebin.com/jBG7HdSz
>>>
>>> More information:
>>> This bug does not exist on some old kernels (e.g. linux-4.12.14-120), but it happens on some newer kernels. I feel this bug is probably NOT caused by ocfs2 commits, since the problem also happened when I used old ocfs2 kernel module code on the new kernels.
>>> Anyway, if you have any comments, please reply to this mail.
>>>
>>> Thanks
>>> Gang
>>>
>>> _______________________________________________
>>> Ocfs2-devel mailing list
>>> Ocfs2-devel at oss.oracle.com
>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>
>>
>
>
>