[Ocfs2-users] ocfs2 is still eating memory

Jeff Mahoney jeffm at suse.com
Thu Mar 15 19:09:48 PDT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Lange wrote:
> On Wed, 2007-03-14 at 17:42 -0400, Jeff Mahoney wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>>
>> Hi John -
>>
>> I'm taking a look at the memory consumption issue you reported, and I
>> just can't seem to reproduce it in the manner you've described. I'm
>> running our CVS kernel, which at this point is really the same thing as
>> the KOTD with two OCFS2 DLM fixes I added today that should be entirely
>> unrelated to this bug.
>>
>> I created a file system with about 890,000 files, rebooted with mem=512M
>> and did a find -exec stat {} \; on the file system. I can see it sucking
>> down all the memory as you described, but it's not OOM killing or even
>> going into swap.
> 
> Agreed, it does not go into swap. It would not make any sense for the
> kernel to swap cache pages since it can just flush them (if they are
> clean) or write them to disk (if they are dirty).
> 
> The question is, why is it not flushing the cache unless you force it
> to do so?
> 
> I only ran the test environment just long enough to see that it was
> exhibiting similar behavior. Even still I did see processes killed
> though I didn't see them in the logs. However, that might be because
> syslog was one of the processes killed.
> 
> My suspicion is disk activity alone might not cause oom-killer since the
> kernel doesn't consider cache pages as normal memory usage. Once it
> fills up the ram it might be forcing a flush on some pages just to keep
> its head above water.
> 
> However, if you start firing up other applications and consuming ram in
> the normal way that might trigger it. I really can't say because I've
> never done that much testing.
> 
> In any case, having the file system use up all the RAM that way is not
> normal.

If there is no other demand for memory, why shouldn't caches stay in
memory? That's the entire point of caches. If we start OOM killing
processes due to the caches taking all the memory, that's absolutely a bug.
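
One way to tell the two cases apart is to look at how much of the "used"
memory is really cache. A minimal sketch, assuming a Linux /proc/meminfo
that reports MemTotal, MemFree, Buffers, Cached and Slab in kB; the
function names are made up for illustration, and Slab is only an upper
bound on what the VM could reclaim, since not all slab memory is
reclaimable:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_summary(fields):
    """Split memory into free, cache-like, and everything else (kB)."""
    total = fields.get("MemTotal", 0)
    free = fields.get("MemFree", 0)
    # Assumption: Buffers + Cached + Slab approximates cache-like memory.
    # The inode and dentry caches are accounted under Slab.
    cache = sum(fields.get(key, 0) for key in ("Buffers", "Cached", "Slab"))
    return {"free": free, "cache": cache, "other": total - free - cache}


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(cache_summary(parse_meminfo(f.read())))
```

If "cache" accounts for nearly everything and "other" stays small while
processes are being killed, that would point at the caches not being
reclaimed rather than at ordinary memory pressure.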

> Here is what Sunil Mushran from Oracle had to say about the issue:
> 
>> Well, kswapd is supposed to flush the caches. As in, the vm
>> controls the lifetime of the inodes in the inode_cache, not ocfs2.
>>
>> All ocfs2 can do is free the memory associated with the inode when
>> asked to. And it does that when you manually flush the cache. The
>> question is why the vm is not doing it on its own.
> 
> So he is saying there is a problem with the way the kernel is handling
> virtual memory. Why it only happens with ocfs2 on SUSE is unknown.

Now that's the interesting part. Might you be willing to try a mainline
kernel with OCFS2 1.2? If this is a SUSE problem, I'd like to isolate it.
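
If you do run that comparison, it may help to flush the caches the same
way on both kernels. A rough sketch, assuming the kernel provides
/proc/sys/vm/drop_caches (2.6.16 or later) and that the script runs as
root. The thread does not say how the cache was flushed manually, so this
is only one possible mechanism, and the function name is made up:

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_caches(mode=2, path=DROP_CACHES):
    """Ask the kernel to drop clean caches.

    mode 1 = page cache, 2 = dentries and inodes, 3 = both.
    The path parameter exists only so the function can be tried
    against an ordinary file.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2 or 3")
    # Only clean objects can be dropped, so write out dirty data first.
    os.sync()
    with open(path, "w") as f:
        f.write("%d\n" % mode)


if __name__ == "__main__":
    drop_caches(2)  # dentries and inodes, i.e. where the inode cache lives
```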

- -Jeff

- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFF+fxrLPWxlyuTD7IRAjPOAJ0WtR+NvKDxzzaYPVrlSlVKx0kF4wCgmzHL
IQZGgPj1K9V+hGqoS2gx+ro=
=U1LW
-----END PGP SIGNATURE-----


