[Ocfs2-users] OCFS2 has a likely memory leak. Bug 864

Alexei_Roudnev Alexei_Roudnev at exigengroup.com
Tue Mar 27 14:19:07 PDT 2007


I can't follow the test guideline 1:1, because I run out of disk space in a
very short time.

I ran the deletes at long intervals (the system created 3,000 directories ==
300,000 files, then deleted half of them, then slept and repeated, roughly as
sketched below), and I ran the test from 2 hosts, so it put a much heavier
load on the system:
- some directories were accessed simultaneously (the broker must work and
control access),
- files were not only created but also deleted (one more operation),
- I removed the extra pauses.
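
A minimal sketch of that workload, assuming a shared mount at
/mnt/ocfs2/leaktest; the paths, counts and variable names are illustrative,
not the exact script I used:

#!/bin/bash
# Rough sketch: both hosts run the create loop against the same shared
# directories (which gives the simultaneous directory access mentioned
# above); the delete pass is meant to be run from one host only.
# MOUNT, DIRS, FILES_PER_DIR and the pass count are illustrative values.
MOUNT=/mnt/ocfs2/leaktest
DIRS=3000
FILES_PER_DIR=100
DELETE=${DELETE:-no}   # set DELETE=yes on the host that does the rm -rf

for pass in $(seq 1 10); do
    for d in $(seq 1 $DIRS); do
        mkdir -p "$MOUNT/dir$d"
        for f in $(seq 1 $FILES_PER_DIR); do
            echo "test data $HOSTNAME $pass" > "$MOUNT/dir$d/$HOSTNAME.file$f"
        done
    done
    if [ "$DELETE" = yes ]; then
        # remove every second directory (half of the files just created)
        for d in $(seq 2 2 $DIRS); do
            rm -rf "$MOUNT/dir$d"
        done
    fi
    sleep 60
done

Run it on both hosts at roughly the same time, with DELETE=yes on only one of
them, and watch the slab counters while it runs.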

It shows that the system has an object leak in the kernel, allocating about
512 bytes per new file (and never releasing them), so running it with 600,000
files eats about 300 MB of system memory for internal buffers (600,000 x 512
bytes is roughly 300 MB, consistent with the size-512 cache size in the
slabtop output quoted below). This is a different problem from the well-known
one already mentioned here, and against it I can't see any +/- 56 bytes/file
memory leak. There were no memory leaks in user space.

On the other hand, it was not visible in vmstat at all, only in slabtop.
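
For reference, a minimal sketch of how to record the trend; the cache names
being grepped and the 60-second interval are assumptions, not commands taken
from this thread:

#!/bin/bash
# Log the slab caches that grow during the test plus overall free memory,
# so the trend can be graphed afterwards; slabtop shows the same counters
# interactively. (John's guideline also flushes the caches before sampling;
# /proc/sys/vm/drop_caches only exists on newer kernels, so it is skipped here.)
while true; do
    date
    grep -E '^(size-512|size-32|ocfs2)' /proc/slabinfo
    grep MemFree /proc/meminfo
    sleep 60
done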

The recommendation is the same after all (I already follow it) - OCFSv2 can be
used with a light file load, i.e. use cases where the # of file operations
such as create, remove, truncate etc. is not high (but the # of reads can be
anything, just like the # of files in the file system). Not a surprise at all.
I use OCFSv2 for a shared application directory (configs, binaries, logs) and
cleared it for use for backups or even archive logs (@ Oracle), but not for
customer data (and here we see that my forecast was correct).




----- Original Message ----- 
From: "John Lange" <john.lange at open-it.ca>
To: "Alexei_Roudnev" <Alexei_Roudnev at exigengroup.com>
Cc: "Sunil Mushran" <Sunil.Mushran at oracle.com>; "ocfs2-users"
<ocfs2-users at oss.oracle.com>
Sent: Tuesday, March 27, 2007 1:32 PM
Subject: Re: [Ocfs2-users] OCFS2 has a likely memory leak. Bug 864


> Just to be clear though, you need to follow the test outlined. Do _not_
> delete the files.
>
> You must create them and flush the caches and then examine the free
> memory. Graph it over a significant amount of time and then see if it's a
> downward trend.
>
> If you can, do a comparison to an ext3 partition as well.
>
> If this list supported attachments I would show you the two graphs: in
> one, the free memory slopes downward (ocfs2); in the other, free memory
> is completely level.
>
> John
>
> On Tue, 2007-03-27 at 13:21 -0700, Alexei_Roudnev wrote:
> > Ran for 1 hour (even more), created 600,000 files (and removed them),
> > from 2 hosts.
> >
> > No memory leak problem, but the # of slab-512 (not 256) grows - 77,000
> > used, 600,000 objects active in slab-32 and slab-512.
> >
> > So there is an object leak in OCFSv2 / SLES9 Sp3, less aggressive than
> > the described one. Anyway, creating 100,000,000 files (and deleting
> > them) will kill the system for sure (as I predicted before - OCFSv2 can
> > be used if you do not have intensive file creation or modification).
> >
> >
> >  Active / Total Objects (% used)    : 1586075 / 1736513 (91.3%)
> >  Active / Total Slabs (% used)      : 108815 / 108842 (100.0%)
> >  Active / Total Caches (% used)     : 96 / 133 (72.2%)
> >  Active / Total Size (% used)       : 398467.35K / 428911.02K (92.9%)
> >  Minimum / Average / Maximum Object : 0.02K / 0.25K / 128.00K
> >
> >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> > 628096 628044  99%    0.03K   5608      112     22432K size-32
> > 627352 627352 100%    0.50K  78419        8    313676K size-512
> > 207600 151528  72%    0.09K   5190       40     20760K buffer_head
> >  97320  83204  85%    0.12K   3244       30     12976K size-128
> >  63450  14540  22%    0.25K   4230       15     16920K dentry_cache
> >  31584  16754  53%    0.52K   4512        7     18048K radix_tree_node
> >  19228  18049  93%    0.17K    874       22      3496K vm_area_struct
> >  12384   9922  80%    0.02K     86      144       344K anon_vma
> >   6210   5979  96%    0.25K    414       15      1656K filp
> >   5730   2669  46%    0.25K    382       15      1528K size-256
> >   3744   2782  74%    0.88K    936        4      3744K ext3_inode_cache
> >   3390   3346  98%    0.62K    565        6      2260K inode_cache
> >   3132   1855  59%    0.06K     54       58       216K size-64
> >   2800    798  28%    0.02K     14      200        56K biovec-1
> >   2605   2438  93%    0.75K    521        5      2084K proc_inode_cache
> >   2378   1794  75%    0.06K     41       58       164K ocfs2_em_ent
> >   1590    866  54%    0.12K     53       30       212K bio
> >
> >
> >
> >
> > ----- Original Message ----- 
> > From: "Sunil Mushran" <Sunil.Mushran at oracle.com>
> > To: "Alexei_Roudnev" <Alexei_Roudnev at exigengroup.com>
> > Cc: "John Lange" <john.lange at open-it.ca>; "ocfs2-users"
> > <ocfs2-users at oss.oracle.com>
> > Sent: Tuesday, March 27, 2007 12:36 PM
> > Subject: Re: [Ocfs2-users] OCFS2 has a likely memory leak. Bug 864
> >
> >
> > > You'll run into the size-256 slab explosion on sles9 sp3.
> > > That issue was addressed in 1.2.4.  sp3 ships 1.2.3.
> > >
> > > Alexei_Roudnev wrote:
> > > > OCFSv2 @ SLES9 Sp3 build 283 is relatively stable. I am running your
> > > > test on 2 hosts now (create files from 2 hosts, and delete them with
> > > > some delay from host1 by rm -rf; without any sleeps; let's see how it
> > > > works).
> > > >
> > > > I'll post results here.
> > > >
> > > > (I have the impression that SLES10 OCFSv2 is not stable at all -
> > > > numerous complaints make me think it was not tested well when they
> > > > integrated OCFS into SLES10.)
> > > >
> > >
> >
>
>



