[Ocfs2-users] OCFS2 has a likely memory leak. Bug 864

Alexei_Roudnev Alexei_Roudnev at exigengroup.com
Tue Mar 27 17:44:53 PDT 2007


Did you run slabtop ?

It shows buffer usage in the kernel (so if the memory leak is in unclosed
objects, you will see it long before vmstat shows anything).
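
As a rough illustration of what to look for, here is a minimal Python sketch
that parses rows in the slabtop layout shown in the quoted output further down
and estimates the slab memory held per file. The function names
(parse_slabtop, bytes_per_file) are hypothetical helpers, not part of any
tool, and the sketch assumes the "K" printed by slabtop means 1024 bytes.
The sample rows are copied from the quoted output.

```python
import re

# Row layout as printed by slabtop in the quoted thread:
#   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)%\s+([\d.]+)K\s+(\d+)\s+(\d+)\s+(\d+)K\s+(\S+)\s*$"
)


def parse_slabtop(text):
    """Return one dict per slab cache row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            objs, active, _use, obj_k, _slabs, _per_slab, cache_k, name = m.groups()
            rows.append({
                "name": name,
                "objs": int(objs),
                "active": int(active),
                "obj_size_kib": float(obj_k),
                "cache_kib": int(cache_k),
            })
    return rows


def bytes_per_file(cache_kib, n_files):
    """Slab cache size divided by the number of files created (assumes K = 1024 bytes)."""
    return cache_kib * 1024 / n_files


# Three rows copied from the slabtop output quoted in this thread.
SAMPLE = """\
628096 628044  99%    0.03K   5608      112     22432K size-32
627352 627352 100%    0.50K  78419        8    313676K size-512
207600 151528  72%    0.09K   5190       40     20760K buffer_head
"""

if __name__ == "__main__":
    # Largest caches first, which is where a kernel object leak shows up.
    rows = sorted(parse_slabtop(SAMPLE), key=lambda r: r["cache_kib"], reverse=True)
    for r in rows:
        print(f'{r["name"]:<12} {r["active"]:>8} active {r["cache_kib"]:>8} KiB')
    top = rows[0]
    print(round(bytes_per_file(top["cache_kib"], 600_000)), "bytes per file (approx.)")
```

With the size-512 row and 600,000 files this comes out a little above 512
bytes per file, which fits the "about 512 bytes per new file" estimate quoted
below.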

----- Original Message ----- 
From: "John Lange" <john.lange at open-it.ca>
To: "Alexei_Roudnev" <Alexei_Roudnev at exigengroup.com>
Cc: "Sunil Mushran" <Sunil.Mushran at oracle.com>; "ocfs2-users"
<ocfs2-users at oss.oracle.com>
Sent: Tuesday, March 27, 2007 5:13 PM
Subject: Re: [Ocfs2-users] OCFS2 has a likely memory leak. Bug 864


> For those interested in this issue I have just uploaded 3 files to the
> bug tracker including a pretty (ugly) graph...
>
> http://oss.oracle.com/bugzilla/show_bug.cgi?id=864
>
> John
>
> On Tue, 2007-03-27 at 14:19 -0700, Alexei_Roudnev wrote:
> > I can't follow the test guideline 1:1, because I run out of disk space
> > in a very short time.
> >
> > I ran the delete at long intervals (the system created 3,000 directories ==
> > 300,000 files, then deleted half of them, then slept and repeated), and ran
> > the test from 2 hosts, so it put a much heavier load on the system:
> > - some directories had simultaneous access (broker must work and control
> > access)
> > - files are not only created but deleted (one more operation),
> > - I removed extra pauses.
> >
> > It shows that the system has an object leak in the kernel, allocating about
> > 512 bytes per new file (and never releasing them), so running it with
> > 600,000 files eats about 300 MB of system memory (for internal buffers).
> > That is a different problem, a well-known one as was said here, and I can't
> > see the +/- 56 bytes/file memory leak on the basis of this. There were no
> > memory leaks in user space.
> >
> > On the other hand, it was not visible in vmstat at all, only in slabtop.
> >
> > The recommendation is the same after all (I gave it already): OCFSv2 can be
> > used with a light file load, i.e. use cases where the # of file operations
> > such as create, remove, truncate etc. is not high (but the # of reads can be
> > any, just as the # of files in the file system). Not a surprise at all. I use
> > OCFSv2 for a shared application directory (configs, binaries, logs) and
> > cleared it for use for backups or even archive logs (@ Oracle) but not for
> > customer data (and here we see that my forecast was correct).
> >
> >
> >
> >
> > ----- Original Message ----- 
> > From: "John Lange" <john.lange at open-it.ca>
> > To: "Alexei_Roudnev" <Alexei_Roudnev at exigengroup.com>
> > Cc: "Sunil Mushran" <Sunil.Mushran at oracle.com>; "ocfs2-users"
> > <ocfs2-users at oss.oracle.com>
> > Sent: Tuesday, March 27, 2007 1:32 PM
> > Subject: Re: [Ocfs2-users] OCFS2 has a likely memory leak. Bug 864
> >
> >
> > > Just to be clear though, you need to follow the test outlined. Do _not_
> > > delete the files.
> > >
> > > You must create them and flush the caches and then examine the free
> > > memory. Graph it over a significant amount of time and then see if it's
> > > a downward trend.
> > >
> > > If you can, do a comparison to an ext3 partition as well.
> > >
> > > If this list supported attachments I would show you the two graphs, in
> > > one, the free memory slopes downward (ocfs2), in the other free memory
> > > is completely level.
> > >
> > > John
> > >
> > > On Tue, 2007-03-27 at 13:21 -0700, Alexei_Roudnev wrote:
> > > > Ran for 1 hour (even more), created 600,000 files (and removed them),
> > > > from 2 hosts.
> > > >
> > > > No memory leak problem, but the # of slab-512 (not 256) objects grows:
> > > > 77,000 used, 600,000 objects active in slab-32 and slab-512.
> > > >
> > > > So there is an object leak in OCFSv2 / SLES9 Sp3, less aggressive than
> > > > the one described. Anyway, creating 100,000,000 files (and deleting them)
> > > > will kill the system for sure (as I predicted before: OCFSv2 can be used if
> > > > you do not have intensive file creation or modification).
> > > >
> > > >
> > > >  Active / Total Objects (% used)    : 1586075 / 1736513 (91.3%)
> > > >  Active / Total Slabs (% used)      : 108815 / 108842 (100.0%)
> > > >  Active / Total Caches (% used)     : 96 / 133 (72.2%)
> > > >  Active / Total Size (% used)       : 398467.35K / 428911.02K (92.9%)
> > > >  Minimum / Average / Maximum Object : 0.02K / 0.25K / 128.00K
> > > >
> > > >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> > > > 628096 628044  99%    0.03K   5608      112     22432K size-32
> > > > 627352 627352 100%    0.50K  78419        8    313676K size-512
> > > > 207600 151528  72%    0.09K   5190       40     20760K buffer_head
> > > >  97320  83204  85%    0.12K   3244       30     12976K size-128
> > > >  63450  14540  22%    0.25K   4230       15     16920K dentry_cache
> > > >  31584  16754  53%    0.52K   4512        7     18048K radix_tree_node
> > > >  19228  18049  93%    0.17K    874       22      3496K vm_area_struct
> > > >  12384   9922  80%    0.02K     86      144       344K anon_vma
> > > >   6210   5979  96%    0.25K    414       15      1656K filp
> > > >   5730   2669  46%    0.25K    382       15      1528K size-256
> > > >   3744   2782  74%    0.88K    936        4      3744K ext3_inode_cache
> > > >   3390   3346  98%    0.62K    565        6      2260K inode_cache
> > > >   3132   1855  59%    0.06K     54       58       216K size-64
> > > >   2800    798  28%    0.02K     14      200        56K biovec-1
> > > >   2605   2438  93%    0.75K    521        5      2084K proc_inode_cache
> > > >   2378   1794  75%    0.06K     41       58       164K ocfs2_em_ent
> > > >   1590    866  54%    0.12K     53       30       212K bio
> > > >
> > > >
> > > >
> > > >
> > > > ----- Original Message ----- 
> > > > From: "Sunil Mushran" <Sunil.Mushran at oracle.com>
> > > > To: "Alexei_Roudnev" <Alexei_Roudnev at exigengroup.com>
> > > > Cc: "John Lange" <john.lange at open-it.ca>; "ocfs2-users"
> > > > <ocfs2-users at oss.oracle.com>
> > > > Sent: Tuesday, March 27, 2007 12:36 PM
> > > > Subject: Re: [Ocfs2-users] OCFS2 has a likely memory leak. Bug 864
> > > >
> > > >
> > > > > You'll run into the size-256 slab explosion on sles9 sp3.
> > > > > That issue was addressed in 1.2.4.  sp3 ships 1.2.3.
> > > > >
> > > > > Alexei_Roudnev wrote:
> > > > > > OCFSv2 @ SLES9 Sp3 build 283 is relatively stable. I am running your
> > > > > > test on 2 hosts now (create files from 2 hosts, and delete them with
> > > > > > some delay from host1 by rm -rf; without any sleeps; let's see how it
> > > > > > works).
> > > > > >
> > > > > > I'll post results here.
> > > > > >
> > > > > > (I have the impression that SLES10 OCFSv2 is not stable at all: the
> > > > > > many complaints make me think that it was not tested well when
> > > > > > they integrated OCFS into SLES10).
> > > > > >
> > > > >
> > > >
> > >
> > >
> >
>
>
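
For anyone who wants to repeat the free-memory test John describes in the
quoted thread (create files, flush the caches, record free memory, look for a
downward trend), here is a minimal Python sketch of the measuring side only.
It assumes the usual "Name:   value kB" layout of /proc/meminfo. The helper
names (parse_meminfo, read_memfree_kb, trend_kb_per_sample) are hypothetical,
and how you create the files and flush the caches is left out because the
thread does not specify it.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its numeric value (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def read_memfree_kb(path="/proc/meminfo"):
    """Read the current MemFree value in kB (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())["MemFree"]


def trend_kb_per_sample(values):
    """Least-squares slope of the samples against their index.

    A clearly negative slope over a long run means free memory is trending down.
    """
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


if __name__ == "__main__":
    import time

    samples = []
    for _ in range(5):
        # Between samples: create a batch of files and flush the caches
        # (both steps are up to you; they are not shown here).
        samples.append(read_memfree_kb())
        time.sleep(1)
    print("MemFree samples (kB):", samples)
    print("trend (kB per sample):", trend_kb_per_sample(samples))
```

Run the same loop against an ext3 partition for comparison, as suggested in
the thread, and compare the two slopes rather than any single reading.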



