<div dir="ltr">So it's a test issue. The utility assumes the fs allocates in 4K units. That's why it only works when clustersize is 4K.<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Aug 8, 2013 at 8:09 AM, David Weber <span dir="ltr"><<a href="mailto:wb@munzinger.de" target="_blank">wb@munzinger.de</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Thursday, 8 August 2013, 07:30:27, Sunil Mushran wrote:<br>
<div class="im">> Interesting. Could you please print the on-disk inode using the command below?<br>
> The file path is relative to the mount point.<br>
><br>
> debugfs.ocfs2 -R "stat /relative/path/to/file" /dev/DEVICE<br>
><br>
> It is saying that the fs has allocated a block when it did not need to. It<br>
> could be that the test utility does not handle blocks larger than 4K, or<br>
> the fiemap ioctl has a bug or the fs is indeed allocating a block when it<br>
> does not need to. The above command will show us the actual layout on disk.<br>
<br>
</div>Thank you for looking into this!<br>
<br>
# ./fiemap-tester /mnt/kvm-images/fiemap_new<br>
<div class="im">Starting infinite run, if you don't see any output then its working properly.<br>
HEY FS PERSON: your fs is weird. I specifically wanted a<br>
hole and you allocated a block anyway. FIBMAP confirms that<br>
you allocated a block, and the block is filled with 0's so<br>
everything is kosher, but you still allocated a block when<br>
didn't need to. This may or may not be what you wanted,<br>
which is why I'm only printing this message once, in case<br>
you didn't do it on purpose. This was at block 0.<br>
ERROR: preallocated extent is not marked with FIEMAP_EXTENT_UNWRITTEN: 0<br>
map is<br>
'HDHPHHDDHPHPHPHDDHHPPDDPPPHHHPDDDPDHHHHDDDPPHPPPDPHHPPDPPHHDDPDPPHDHPDDDDPDPPDPHDDPPDDPPHDDPDHHHDDPDHPHPDPPDDHPHPPHDPHPHDDHDPDPDHDHPDDPHPPPHDPPDPDDHPHDDPPHPDHPPHPPHPHHPHDHPPDDPHDHHPPHPPDHPHPHDHPPDDDDPHHHPPPHHHDDDDPDPDDPPPHPHDPPPHDPDPHDDHPPPDPDHPHHPHDHHDHPDPHDDPPHDPPDDPDDPPDHPPDPDHHPHDHPPHDDHDPHPPPDHPDDDHDDHDPPHHDDPPDPDDHDHHPHDPHHPPPDPPDHDHHPPHDPHDPPHDPHHPPP'<br>
</div>logical: [ 0.. 255] phys: 206615552..206615807 flags: 0x000 tot: 256<br>
Problem comparing fiemap and map<br>
<br>
# debugfs.ocfs2 -R "stat /fiemap_new" /dev/drbd0<br>
Inode: 92668161 Mode: 0644 Generation: 3713753505 (0xdd5b61a1)<br>
FS Generation: 2357962590 (0x8c8ba75e)<br>
CRC32: 00000000 ECC: 0000<br>
Type: Regular Attr: 0x0 Flags: Valid<br>
Dynamic Features: (0x0)<br>
User: 0 (root) Group: 0 (root) Size: 1470464<br>
Links: 1 Clusters: 2<br>
ctime: 0x5203b200 0x991cd -- Thu Aug 8 16:58:08.627149 2013<br>
atime: 0x5203b200 0xc0accc -- Thu Aug 8 16:58:08.12627148 2013<br>
mtime: 0x5203b200 0x991cd -- Thu Aug 8 16:58:08.627149 2013<br>
dtime: 0x0 -- Thu Jan 1 01:00:00 1970<br>
Refcount Block: 0<br>
Last Extblk: 0 Orphan Slot: 0<br>
Sub Alloc Slot: 0 Sub Alloc Bit: 1<br>
Tree Depth: 0 Count: 243 Next Free Rec: 2<br>
## Offset Clusters Block# Flags<br>
0 0 1 206615552 0x0<br>
1 1 1 206619648 0x0<br>
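A note on the numbers above: assuming the 4096-byte block size and the 1048576-byte cluster size that o2info reports for this volume, one cluster spans 256 blocks, which matches the single 256-block extent the tester prints, and a 1470464-byte file fits in 2 clusters, as debugfs shows. The C sketch below only redoes that arithmetic. The macro and function names are made up for illustration and are not taken from OCFS2 or fiemap-tester.<br>

```c
#include <stdint.h>

/* Assumed geometry, taken from the o2info output quoted in this thread. */
#define BLOCK_BITS   12u  /* 4096-byte blocks */
#define CLUSTER_BITS 20u  /* 1048576-byte clusters */

/* Number of 4K blocks covered by one cluster. */
static uint64_t blocks_per_cluster(void)
{
    return (uint64_t)1 << (CLUSTER_BITS - BLOCK_BITS);
}

/* Clusters needed to hold `size` bytes, rounding up (assumes no holes). */
static uint64_t clusters_for_size(uint64_t size)
{
    uint64_t cluster_size = (uint64_t)1 << CLUSTER_BITS;

    return (size + cluster_size - 1) >> CLUSTER_BITS;
}

/* First and last 4K block of the cluster that contains `block`. */
static void cluster_block_range(uint64_t block, uint64_t *first, uint64_t *last)
{
    uint64_t per = blocks_per_cluster();

    *first = (block / per) * per;
    *last = *first + per - 1;
}
```

With these values, blocks_per_cluster() is 256 and cluster_block_range(0, ...) yields 0..255. If the fs allocates whole clusters, data written anywhere in that range would presumably make block 0 look allocated to a tool that thinks in 4K units.<br>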
<div><div class="h5"><br>
<br>
> On Aug 8, 2013, at 2:16 AM, David Weber <<a href="mailto:wb@munzinger.de">wb@munzinger.de</a>> wrote:<br>
> > On Wednesday, 7 August 2013, 22:07:19, Jeff Liu wrote:<br>
> >> On 08/07/2013 05:17 PM, David Weber wrote:<br>
> >>> Hi,<br>
> >>><br>
> >>> We are trying to use OCFS2 as VM storage. After running into problems<br>
> >>> with qemu's disk_mirror feature, we now think there could be a problem<br>
> >>> with the FIEMAP ioctl in OCFS2.<br>
> >>><br>
> >>> As far as I understand, the situation looks like this:<br>
> >>> Qemu asks the FS, via the FIEMAP ioctl [1], whether the given section of<br>
> >>> the image is already allocated.<br>
> >>> Specifically, it checks whether fm_mapped_extents is greater than 0.<br>
> >>> For sections at offsets of 1048576 or more, OCFS2 reports 0 mapped<br>
> >>> extents, which is wrong.<br>
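The check described above can be sketched in C as follows. This is a minimal illustration, not qemu's actual code: the helper names fiemap_count_request() and range_has_extents() are hypothetical. It relies on the documented FIEMAP behaviour that, with fm_extent_count set to 0, the kernel only fills in fm_mapped_extents with the number of extents covering the range.<br>

```c
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Prepare a FIEMAP request that only counts extents in [start, start+length). */
static void fiemap_count_request(struct fiemap *fm, uint64_t start, uint64_t length)
{
    memset(fm, 0, sizeof(*fm));
    fm->fm_start = start;            /* byte offset where the query begins */
    fm->fm_length = length;          /* number of bytes to query */
    fm->fm_flags = FIEMAP_FLAG_SYNC; /* flush dirty data before mapping */
    fm->fm_extent_count = 0;         /* 0: return only the extent count */
}

/*
 * Returns 1 if the range has at least one mapped extent, 0 if it has none,
 * and -1 if the ioctl fails (errno is set by ioctl).
 */
static int range_has_extents(int fd, uint64_t start, uint64_t length)
{
    struct fiemap fm;

    fiemap_count_request(&fm, start, length);
    if (ioctl(fd, FS_IOC_FIEMAP, &fm) < 0)
        return -1;
    return fm.fm_mapped_extents > 0;
}
```

On a file without holes, a call such as range_has_extents(fd, 1048576, 10) would be expected to return 1; the report here is that OCFS2 returned a count of 0 for such ranges.<br>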
> >>><br>
> >>> I extended a userspace FIEMAP util [2] a bit to specify the start and<br>
> >>> length parameter [3] as an easier testcase.<br>
> >>><br>
> >>> When we create a big file that has no holes:<br>
> >>> dd if=/dev/urandom of=/mnt/kvm-images/urandom.img bs=1M count=1000<br>
> >>><br>
> >>> we get the expected output for lower offsets:<br>
> >>> ./a.out /mnt/kvm-images/urandom.img 10000 10<br>
> >>> start: 2710, length: a<br>
> >>> File /mnt/kvm-images/urandom.img has 1 extents:<br>
> >>> # Logical Physical Length Flags<br>
> >>> 0: 0000000000000000 0000004ca3f00000 000000000be00000 0000<br>
> >>><br>
> >>> But for sections at offsets >= 1048576 it reports no extents at all,<br>
> >>> which, as far as I understand, is wrong:<br>
> >>> ./a.out /mnt/kvm-images/urandom.img 1048576 10<br>
> >>> start: 100000, length: a<br>
> >>> File /mnt/kvm-images/urandom.img has 0 extents:<br>
> >>> # Logical Physical Length Flags<br>
> >><br>
> >> Thanks for your report. It looks like this problem has existed for years.<br>
> >> As a quick response, could you please try the below fix?<br>
> ><br>
> > Thank you very much! This solved the problems with qemu.<br>
> ><br>
> > I found a fiemap-tester util [1] in the xfstests project. It runs fine<br>
> > on OCFS2 with a 4K cluster size but fails with 1M. However, I have no<br>
> > idea whether this is a severe problem.<br>
> ><br>
> > # gcc -DHAVE_FALLOCATE=1 -o fiemap-tester fiemap-tester.c<br>
> > # ./fiemap-tester /mnt/kvm-images/fiemap_test<br>
> > Starting infinite run, if you don't see any output then its working<br>
> > properly. HEY FS PERSON: your fs is weird. I specifically wanted a<br>
> > hole and you allocated a block anyway. FIBMAP confirms that<br>
> > you allocated a block, and the block is filled with 0's so<br>
> > everything is kosher, but you still allocated a block when<br>
> > didn't need to. This may or may not be what you wanted,<br>
> > which is why I'm only printing this message once, in case<br>
> > you didn't do it on purpose. This was at block 0.<br>
> > ERROR: preallocated extent is not marked with FIEMAP_EXTENT_UNWRITTEN: 0<br>
> > map is<br>
> > 'HDHPHHDDHPHPHPHDDHHPPDDPPPHHHPDDDPDHHHHDDDPPHPPPDPHHPPDPPHHDDPDPPHDHPDDDD<br>
> > PDPPDPHDDPPDDPPHDDPDHHHDDPDHPHPDPPDDHPHPPHDPHPHDDHDPDPDHDHPDDPHPPPHDPPDPDD<br>
> > HPHDDPPHPDHPPHPPHPHHPHDHPPDDPHDHHPPHPPDHPHPHDHPPDDDDPHHHPPPHHHDDDDPDPDDPPP<br>
> > HPHDPPPHDPDPHDDHPPPDPDHPHHPHDHHDHPDPHDDPPHDPPDDPDDPPDHPPDPDHHPHDHPPHDDHDPH<br>
</div></div>> > PPPDHPDDDHDDHDPPHHDDPPDPDDHDHHPHDPHHPPPDPPDHDHHPPHDPHDPPHDPHHPPP' logical:<br>
<div class="HOEnZb"><div class="h5">> > [ 0.. 255] phys: 132160512..132160767 flags: 0x000 tot: 256<br>
> ><br>
> ><br>
> > [1]<br>
> > <a href="http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/cmds/xfstests.git;a=blob_plain;f=src/fiemap-tester.c;hb=HEAD" target="_blank">http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/cmds/xfstests.git;a=blob_plain;f=src/fiemap-tester.c;hb=HEAD</a><br>
> >> From: Jie Liu <<a href="mailto:jeff.liu@oracle.com">jeff.liu@oracle.com</a>><br>
> >><br>
> >> Calling the fiemap ioctl(2) with a given start offset and a desired<br>
> >> mapping range should show extents if possible. However, we calculate<br>
> >> the end offset of the mapping via 'mapping_end -= cpos' before iterating<br>
> >> over the extent records, which causes problems, e.g.:<br>
> >><br>
> >> Cluster size 4096:<br>
> >> debugfs.ocfs2 1.6.3<br>
> >><br>
> >> Block Size Bits: 12 Cluster Size Bits: 12<br>
> >><br>
> >> The extended fiemap test utility from David:<br>
> >> <a href="https://gist.github.com/anonymous/6172331" target="_blank">https://gist.github.com/anonymous/6172331</a><br>
> >><br>
> >> # dd if=/dev/urandom of=/ocfs2/test_file bs=1M count=1000<br>
> >> # ./fiemap /ocfs2/test_file 4096 10<br>
> >> start: 4096, length: 10<br>
> >> File /ocfs2/test_file has 0 extents:<br>
> >> # Logical Physical Length Flags<br>
> >><br>
> >> ^^^^^ <-- No extents<br>
> >><br>
> >> In this case, in ocfs2_fiemap(), cpos == mapping_end == 1. Hence the<br>
> >> loop that searches the extent records is not executed at all.<br>
> >><br>
> >> This patch removes the 'mapping_end -= cpos' in question, so the loop<br>
> >> runs until cpos reaches mapping_end instead.<br>
> >><br>
> >> # ./fiemap /ocfs2/test_file 4096 10<br>
> >> start: 4096, length: 10<br>
> >> File /ocfs2/test_file has 1 extents:<br>
> >> # Logical Physical Length Flags<br>
> >> 0: 0000000000000000 0000000056a01000 0000000006a00000 0000<br>
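To make the arithmetic in this commit message concrete, here is a small user-space C sketch of the loop bounds. It is not the kernel code: clusters_for_bytes() is a stand-in that is assumed to round up to whole clusters like ocfs2_clusters_for_bytes(), clusters_visited() is a hypothetical helper, and the loop advances one cluster at a time instead of walking extent records.<br>

```c
#include <stdint.h>

/* Stand-in for ocfs2_clusters_for_bytes(): round bytes up to whole clusters. */
static uint32_t clusters_for_bytes(uint64_t bytes, unsigned int cluster_bits)
{
    uint64_t cluster_size = (uint64_t)1 << cluster_bits;

    return (uint32_t)((bytes + cluster_size - 1) >> cluster_bits);
}

/*
 * Count how many clusters the search loop would visit.
 * subtract_cpos != 0 models the old code, 0 models the patched code.
 */
static uint32_t clusters_visited(uint64_t map_start, uint64_t map_len,
                                 unsigned int cluster_bits, int subtract_cpos)
{
    uint32_t cpos = (uint32_t)(map_start >> cluster_bits);
    uint32_t mapping_end = clusters_for_bytes(map_start + map_len, cluster_bits);
    uint32_t visited = 0;

    if (subtract_cpos)
        mapping_end -= cpos;  /* the line the patch deletes */

    while (cpos < mapping_end) {
        visited++;
        cpos++;  /* simplification: the real loop advances per extent record */
    }
    return visited;
}
```

For the 4096-byte cluster case above (start 4096, length 10), the old bounds give cpos == mapping_end == 1 and zero iterations, while the patched bounds give mapping_end == 2 and one iteration. The same happens with 1048576-byte clusters at start offset 1048576, whereas start offset 10000 is unaffected because cpos is 0 there.<br>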
> >><br>
> >> Reported-by: David Weber <<a href="mailto:wb@munzinger.de">wb@munzinger.de</a>><br>
> >> Cc: Mark Fasheh <<a href="mailto:mfasheh@suse.de">mfasheh@suse.de</a>><br>
> >> Cc: Joel Becker <<a href="mailto:jlbec@evilplan.org">jlbec@evilplan.org</a>><br>
> >> Signed-off-by: Jie Liu <<a href="mailto:jeff.liu@oracle.com">jeff.liu@oracle.com</a>><br>
> >> ---<br>
> >> fs/ocfs2/extent_map.c | 1 -<br>
> >> 1 file changed, 1 deletion(-)<br>
> >><br>
> >> diff --git a/fs/ocfs2/extent_map.c b/fs/ocfs2/extent_map.c<br>
> >> index 2487116..8460647 100644<br>
> >> --- a/fs/ocfs2/extent_map.c<br>
> >> +++ b/fs/ocfs2/extent_map.c<br>
> >> @@ -781,7 +781,6 @@ int ocfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,<br>
> >>  cpos = map_start >> osb->s_clustersize_bits;<br>
> >>  mapping_end = ocfs2_clusters_for_bytes(inode->i_sb,<br>
> >>  map_start + map_len);<br>
> >> - mapping_end -= cpos;<br>
> >>  is_last = 0;<br>
> >>  while (cpos < mapping_end && !is_last) {<br>
> >>  u32 fe_flags;<br>
> >>><br>
> >>> We're running linux-3.11-rc4 plus the following patches:<br>
> >>> [PATCH V2] ocfs2: update inode size after zeroed the hole<br>
> >>> [PATCH RESEND] ocfs2: fix NULL pointer dereference in<br>
> >>> ocfs2_duplicate_clusters_by_page<br>
> >>> NULL pointer dereference at ocfs2_dir_foreach_blk_id<br>
> >>> [patch v3] ocfs2: ocfs2: fix recent memory corruption bug<br>
> >>><br>
> >>> o2info --volinfo /dev/drbd0<br>
> >>><br>
> >>> Label: kvm-images<br>
> >>><br>
> >>> UUID: BE7C101466AD4F2196A849C7A6031263<br>
> >>><br>
> >>> Block Size: 4096<br>
> >>><br>
> >>> Cluster Size: 1048576<br>
> >>><br>
> >>> Node Slots: 8<br>
> >>><br>
> >>> Features: backup-super strict-journal-super sparse extended-slotmap<br>
> >>> Features: inline-data xattr indexed-dirs refcount discontig-bg<br>
> >>> unwritten<br>
> >>><br>
> >>> Thanks in advance!<br>
> >>><br>
> >>> Cheers,<br>
> >>> David<br>
> >>><br>
> >>><br>
> >>> [1]<br>
> >>> <a href="http://git.qemu.org/?p=qemu.git;a=blob;f=block/raw-posix.c;h=ba721d3f5bd98a6b62791c2e20dbf2894021ad76;hb=HEAD#l1087" target="_blank">http://git.qemu.org/?p=qemu.git;a=blob;f=block/raw-posix.c;h=ba721d3f5bd98a6b62791c2e20dbf2894021ad76;hb=HEAD#l1087</a><br>
> >>><br>
> >>> [2]<br>
> >>> <a href="http://smackerelofopinion.blogspot.de/2010/01/using-fiemap-ioctl-to-get-file-extents.html" target="_blank">http://smackerelofopinion.blogspot.de/2010/01/using-fiemap-ioctl-to-get-file-extents.html</a><br>
> >>><br>
> >>> [3] <a href="https://gist.github.com/anonymous/6172331" target="_blank">https://gist.github.com/anonymous/6172331</a><br>
> >>><br>
> >>><br>
> >>> _______________________________________________<br>
> >>> Ocfs2-devel mailing list<br>
> >>> <a href="mailto:Ocfs2-devel@oss.oracle.com">Ocfs2-devel@oss.oracle.com</a><br>
> >>> <a href="https://oss.oracle.com/mailman/listinfo/ocfs2-devel" target="_blank">https://oss.oracle.com/mailman/listinfo/ocfs2-devel</a><br>
> ><br>
</div></div></blockquote></div><br></div>