[Ocfs2-devel] Read IOPS storm in case of reflinking running VM disk

Goldwyn Rodrigues rgoldwyn at suse.de
Mon May 18 10:45:40 PDT 2015


Hi Eugene,

Sorry, I had been busy with other work and this slipped by on the list.

>
>  > Do you know something about such behavior?
>
>  > The question is why a reflink operation on VM disk leads to plenty of read
>
>  > ops? Is this related to CoW specific structures?
>

This is in fact related to CoW. An ocfs2 file is an extent tree, with
the extent headers marking whether an extent is reflinked or not, along
with the number of reflinks.

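If you perform a reflink on a file which is being changed constantly,
the filesystem must not only recreate the extent tree for each write
that hits a shared extent, but also decrease the refcount of the extents
already present. Add to that the extents which need to be read for
replication, i.e. the old data that has to be copied before it can be
overwritten.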
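
To make the above concrete, here is a toy model in Python. It is only a
sketch under my own assumptions, not the ocfs2 implementation: the names
Extent, ToyVolume and demo are made up for illustration, a "file" is just
a list of extents, and every copy-on-write is counted as a single read.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    """A run of data plus a count of how many files reference it."""
    data: bytes
    refcount: int = 1


class ToyVolume:
    """Counts the device reads and writes caused by file writes."""

    def __init__(self):
        self.reads = 0
        self.writes = 0

    def reflink(self, extents):
        # A snapshot copies no data: it shares every extent and bumps
        # its refcount.
        for ext in extents:
            ext.refcount += 1
        return list(extents)

    def write(self, extents, index, payload):
        ext = extents[index]
        if ext.refcount > 1:
            # Shared extent: read the old data, drop one reference, and
            # give this file a private copy in its own extent list.
            self.reads += 1
            ext.refcount -= 1
            ext = Extent(ext.data, refcount=1)
            extents[index] = ext
        # Overwrite the start of the (now private) extent.
        ext.data = payload + ext.data[len(payload):]
        self.writes += 1


def demo(n_extents=4):
    vol = ToyVolume()
    disk = [Extent(b"\0" * 8) for _ in range(n_extents)]
    for i in range(n_extents):
        vol.write(disk, i, b"a")      # no snapshot yet: writes only
    reads_before = vol.reads
    snapshot = vol.reflink(disk)      # every extent is now shared
    for i in range(n_extents):
        vol.write(disk, i, b"b")      # each write now triggers a CoW
    reads_after = vol.reads
    return reads_before, reads_after, snapshot
```

In this model the writes before the reflink cost no reads at all, while
the first write to each shared extent afterwards costs one read plus a
refcount update on top of the write itself.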


HTH,

>  >
>
>  > We can provide other details & ssh access to the testbed.
>
>  >
>
>  > > Hello,
>
>  > >
>
>  > > after deploying reflink-based VM snapshots to production servers we
>
>  > > discovered a performance degradation:
>
>  > >
>
>  > > OS: Opensuse 13.1, 13.2
>
>  > > Hypervisors: Xen 4.4, 4.5
>
>  > > Dom0 kernels: 3.12, 3.16, 3.18
>
>  > > DomU kernels: 3.12, 3.16, 3.18
>
>  > > Tested DomU disk backends: tapdisk2, qdisk
>
>  > >
>
>  > >
>
>  > > 1) on DomU (VM)
>
>  > > #dd if=/dev/zero of=test2 bs=1M count=6000
>
>  > >
>
>  > > 2) atop on Dom0:
>
>  > > sdb - busy:92% - read:375 - write:130902
>
>  > > Reads are from other VMs, seems OK
>
>  > >
>
>  > > 3) DomU dd finished:
>
>  > > 6291456000 bytes (6.3 GB) copied, 16.6265 s, 378 MB/s
>
>  > >
>
>  > > 4) Let's start dd again & do a snapshot:
>
>  > > #dd if=/dev/zero of=test2 bs=1M count=6000
>
>  > > #reflink test.raw ref/
>
>  > >
>
>  > > 5) atop on Dom0:
>
>  > > sdb - busy:97% - read:112740 - write:28037
>
>  > > So, Read IOPS = 112740, why?
>
>  > >
>
>  > > 6) DomU dd finished:
>
>  > > 6291456000 bytes (6.3 GB) copied, 175.45 s, 35.9 MB/s
>
>  > >
>
>  > > 7) Second & further reflinks do not change the atop stat & dd time
>
>  > > #dd if=/dev/zero of=test2 bs=1M count=6000
>
>  > > #reflink --backup=t test.raw ref/ \\ * n times
>
>  > > ~ 6291456000 bytes (6.3 GB) copied, 162.959 s, 38.6 MB/s
>
>  > >
>
>  > > The question is why reflinking a running VM disk leads to read IOPS storm?
>
>  > >
>
>  > >
>
>  > > Thanks!
>
>  >
>
>  > _______________________________________________
>
>  > Ocfs2-devel mailing list
>
>  > Ocfs2-devel at oss.oracle.com
>
>  > https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>

-- 
Goldwyn


