[Ocfs2-users] OCFS2 has a likely memory leak. Bug 864
Alexei_Roudnev
Alexei_Roudnev at exigengroup.com
Tue Mar 27 17:44:53 PDT 2007
Did you run slabtop?
It shows buffer usage in the kernel (so if the memory leak is in unclosed
objects, you will see it long before vmstat).
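
A minimal sketch of the same check without slabtop, assuming the kernel
exposes /proc/slabinfo in the 2.x column layout (name, active_objs,
num_objs, objsize, ...). The function names are illustrative and not part
of any existing tool. It compares two snapshots and lists the caches whose
active objects grew the most, in bytes:

```python
def parse_slabinfo(text):
    """Return {cache_name: (active_objs, num_objs, objsize_bytes)}.

    Assumes the /proc/slabinfo 2.x layout, where each data line starts with
    'name active_objs num_objs objsize ...'.
    """
    caches = {}
    for line in text.splitlines():
        # Skip blank lines, the version banner and the '# name ...' header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        active, total, objsize = (int(f) for f in fields[1:4])
        caches[fields[0]] = (active, total, objsize)
    return caches


def growth(before, after, top=5):
    """Return up to `top` (delta_bytes, cache_name) pairs, largest first.

    delta_bytes is the change in active objects times the object size.
    Caches that did not grow are left out.
    """
    deltas = []
    for name, (active, _total, objsize) in after.items():
        old_active = before.get(name, (0, 0, objsize))[0]
        delta = (active - old_active) * objsize
        if delta > 0:
            deltas.append((delta, name))
    return sorted(deltas, reverse=True)[:top]


def read_slabinfo(path="/proc/slabinfo"):
    """Read and parse one snapshot (usually requires root)."""
    with open(path) as f:
        return parse_slabinfo(f.read())
```

To use it, call read_slabinfo() before and after a test run and pass both
results to growth(). A cache that keeps climbing between runs and never
shrinks is the candidate for the leak.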
----- Original Message -----
From: "John Lange" <john.lange at open-it.ca>
To: "Alexei_Roudnev" <Alexei_Roudnev at exigengroup.com>
Cc: "Sunil Mushran" <Sunil.Mushran at oracle.com>; "ocfs2-users"
<ocfs2-users at oss.oracle.com>
Sent: Tuesday, March 27, 2007 5:13 PM
Subject: Re: [Ocfs2-users] OCFS2 has a likely memory leak. Bug 864
> For those interested in this issue I have just uploaded 3 files to the
> bug tracker including a pretty (ugly) graph...
>
> http://oss.oracle.com/bugzilla/show_bug.cgi?id=864
>
> John
>
> On Tue, 2007-03-27 at 14:19 -0700, Alexei_Roudnev wrote:
> > I can't follow the test guideline 1:1, because I run out of disk space in a
> > very short time.
> >
> > I ran the deletes at long intervals (the system created 3,000 directories ==
> > 300,000 files, then deleted half of them, then slept and repeated), and ran
> > the test from 2 hosts, so it put a much heavier load on the system:
> > - some directories had simultaneous access (the broker must work and control
> > access)
> > - files are not only created but also deleted (one more operation),
> > - I removed the extra pauses.
> >
> > It shows that the system has an object leak in the kernel, allocating about 512
> > bytes per new file (and never releasing them),
> > so running it with 600,000 files eats about 300 MB of system memory (for
> > internal buffers). That is a different problem, well known as was said here,
> > and I can't see a +/- 56 bytes/file memory leak on the basis of this. There were
> > no memory leaks in user space.
> >
> > On the other hand, it was not visible by vmstat at all, only by slabtop.
> >
> > The recommendation is the same after all (I gave it already): OCFSv2 can be used
> > with a light file load, that is, use cases where the # of file operations,
> > such as create, remove, truncate etc., is not high (but the # of reads can be any,
> > just as the # of files in the file system). Not a surprise at all. I use OCFSv2
> > for a shared application directory (configs, binaries, logs) and cleared it
> > for use for backups or even archive logs (@ Oracle), but not for
> > customer data (and here we see that my forecast was correct).
> >
> >
> >
> >
> > ----- Original Message -----
> > From: "John Lange" <john.lange at open-it.ca>
> > To: "Alexei_Roudnev" <Alexei_Roudnev at exigengroup.com>
> > Cc: "Sunil Mushran" <Sunil.Mushran at oracle.com>; "ocfs2-users"
> > <ocfs2-users at oss.oracle.com>
> > Sent: Tuesday, March 27, 2007 1:32 PM
> > Subject: Re: [Ocfs2-users] OCFS2 has a likely memory leak. Bug 864
> >
> >
> > > Just to be clear though, you need to follow the test outlined. Do _not_
> > > delete the files.
> > >
> > > You must create them and flush the caches and then examine the free
> > > memory. Graph it over a significant amount of time and then see if it's a
> > > downward trend.
> > >
> > > If you can, do a comparison to an ext3 partition as well.
> > >
> > > If this list supported attachments I would show you the two graphs, in
> > > one, the free memory slopes downward (ocfs2), in the other free memory
> > > is completely level.
> > >
> > > John
> > >
> > > On Tue, 2007-03-27 at 13:21 -0700, Alexei_Roudnev wrote:
> > > > Ran for 1 hour (even more), created 600,000 files (and removed them), from 2
> > > > hosts.
> > > >
> > > > No memory leak problem, but the # of slab-512 (not 256) is growing: 77,000
> > > > used, 600,000 objects active in slab-32 and slab-512.
> > > >
> > > > So there is an object leak in OCFSv2 / SLES9 Sp3, less aggressive than
> > > > the one described. Anyway, creating 100,000,000 files (and deleting them)
> > > > will kill the system for sure (as I predicted before: OCFSv2 can be used if
> > > > you do not have intensive file creation or modification).
> > > >
> > > >
> > > >  Active / Total Objects (% used)    : 1586075 / 1736513 (91.3%)
> > > >  Active / Total Slabs (% used)      : 108815 / 108842 (100.0%)
> > > >  Active / Total Caches (% used)     : 96 / 133 (72.2%)
> > > >  Active / Total Size (% used)       : 398467.35K / 428911.02K (92.9%)
> > > >  Minimum / Average / Maximum Object : 0.02K / 0.25K / 128.00K
> > > >
> > > >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> > > > 628096 628044  99%    0.03K   5608      112     22432K size-32
> > > > 627352 627352 100%    0.50K  78419        8    313676K size-512
> > > > 207600 151528  72%    0.09K   5190       40     20760K buffer_head
> > > >  97320  83204  85%    0.12K   3244       30     12976K size-128
> > > >  63450  14540  22%    0.25K   4230       15     16920K dentry_cache
> > > >  31584  16754  53%    0.52K   4512        7     18048K radix_tree_node
> > > >  19228  18049  93%    0.17K    874       22      3496K vm_area_struct
> > > >  12384   9922  80%    0.02K     86      144       344K anon_vma
> > > >   6210   5979  96%    0.25K    414       15      1656K filp
> > > >   5730   2669  46%    0.25K    382       15      1528K size-256
> > > >   3744   2782  74%    0.88K    936        4      3744K ext3_inode_cache
> > > >   3390   3346  98%    0.62K    565        6      2260K inode_cache
> > > >   3132   1855  59%    0.06K     54       58       216K size-64
> > > >   2800    798  28%    0.02K     14      200        56K biovec-1
> > > >   2605   2438  93%    0.75K    521        5      2084K proc_inode_cache
> > > >   2378   1794  75%    0.06K     41       58       164K ocfs2_em_ent
> > > >   1590    866  54%    0.12K     53       30       212K bio
> > > >
> > > >
> > > >
> > > >
> > > > ----- Original Message -----
> > > > From: "Sunil Mushran" <Sunil.Mushran at oracle.com>
> > > > To: "Alexei_Roudnev" <Alexei_Roudnev at exigengroup.com>
> > > > Cc: "John Lange" <john.lange at open-it.ca>; "ocfs2-users"
> > > > <ocfs2-users at oss.oracle.com>
> > > > Sent: Tuesday, March 27, 2007 12:36 PM
> > > > Subject: Re: [Ocfs2-users] OCFS2 has a likely memory leak. Bug 864
> > > >
> > > >
> > > > > You'll run into the size-256 slab explosion on sles9 sp3.
> > > > > That issue was addressed in 1.2.4. sp3 ships 1.2.3.
> > > > >
> > > > > Alexei_Roudnev wrote:
> > > > > > OCFSv2 @ SLES9 Sp3 build 283 is relatively stable. I am running your test on
> > > > > > 2 hosts now (creating files from 2 hosts, and deleting them with some delay from
> > > > > > host1 by rm -rf, without any sleeps; let's see how it works).
> > > > > >
> > > > > > I'll post results here.
> > > > > >
> > > > > > (I have the impression that SLES10 OCFSv2 is not stable at all: the
> > > > > > numerous complaints make me think that it was not tested well when
> > > > > > they integrated OCFS into SLES10).
> > > > > >
> > > > >
> > > >
> > >
> > >
> >
>
>
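
The free-memory test John describes in the quoted thread (create files,
flush the caches, record free memory over a long period, then look for a
downward trend) can be checked numerically instead of by eye. This is a
rough sketch, assuming the samples are (seconds, MemFree in kB) pairs taken
from /proc/meminfo after each cache flush. The helper names are made up for
illustration.

```python
def mem_free_kb(meminfo_text):
    """Extract the MemFree value (kB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemFree:"):
            return int(line.split()[1])
    raise ValueError("no MemFree line found")


def trend_kb_per_sec(samples):
    """Least-squares slope of (seconds, free_kb) samples.

    A clearly negative slope over a long run means free memory is trending
    downward; a slope near zero means it is level.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("all samples share the same timestamp")
    return num / den
```

Running the same sampling against an ext3 partition gives a baseline slope
to compare the ocfs2 result with.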