[Ocfs2-users] ocfs2 is still eating memory

Fri Mar 16 10:43:05 PDT 2007

Just for the record, my comments were based on the limited information
I had on this issue. It is hard to make any determination from two
vmstat outputs. I typically prefer that users dump /proc/meminfo and
/proc/slabinfo every few minutes.
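
For anyone who wants to collect that data, here is a minimal sketch of such
a periodic dump. It assumes Python 3 is available on the node. The function
names, the "memlog" directory, and the 300 second interval are illustrative
choices of mine, not part of any ocfs2 tooling. Note that /proc/slabinfo is
often readable only by root, so the sketch records the error instead of
stopping when a file cannot be read.

```python
import time
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def snapshot(out_dir, sources=("/proc/meminfo", "/proc/slabinfo")):
    """Copy each source file into out_dir under a timestamped name."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sources:
        try:
            data = Path(src).read_text()
        except OSError as exc:
            # e.g. permission denied on /proc/slabinfo for non-root users
            data = f"unreadable: {exc}\n"
        dest = out / f"{Path(src).name}.{stamp}"
        dest.write_text(data)
        written.append(dest)
    return written


def run(out_dir="memlog", interval_s=300, count=12):
    """Take `count` snapshots, sleeping `interval_s` seconds between them."""
    for i in range(count):
        snapshot(out_dir)
        if i < count - 1:
            time.sleep(interval_s)
```

Calling run("memlog", interval_s=300, count=12) would collect roughly an
hour of snapshots. parse_meminfo() can then be used on the saved meminfo
files to compare fields such as Cached or Slab between snapshots.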

Jeff Mahoney wrote:
>
> John Lange wrote:
>   
>> On Wed, 2007-03-14 at 17:42 -0400, Jeff Mahoney wrote:
>>     
>>>
>>> Hi John -
>>>
>>> I'm taking a look at the memory consumption issue you reported, and I
>>> just can't seem to reproduce it in the manner you've described. I'm
>>> running our CVS kernel, which at this point is really the same thing as
>>> the KOTD with two OCFS2 DLM fixes I added today that should be entirely
>>> unrelated to this bug.
>>>
>>> I created a file system with about 890,000 files, rebooted with mem=512M
>>> and did a find . -exec stat {} \; on the file system. I can see it sucking
>>> down all the memory as you described, but it's not OOM killing or even
>>> going into swap.
>>>       
>> Agreed, it does not go into swap. It would not make any sense for the
>> kernel to swap cache pages since it can just flush them (if they are
>> clean) or write them to disk (if they are dirty).
>>
>> The question is, why is it not flushing the cache unless you force it
>> to do so?
>>
>> I only ran the test environment just long enough to see that it was
>> exhibiting similar behavior. Even so, I did see processes killed,
>> though I didn't see them in the logs. That might be because syslog
>> was one of the processes killed.
>>
>> My suspicion is disk activity alone might not cause oom-killer since the
>> kernel doesn't consider cache pages as normal memory usage. Once it
>> fills up the ram it might be forcing a flush on some pages just to keep
>> its head above water.
>>
>> However, if you start firing up other applications and consuming ram in
>> the normal way that might trigger it. I really can't say because I've
>> never done that much extensive testing.
>>
>> In any case, having the file system use up all the RAM that way is not
>> normal.
>>     
>
> If there is no other demand for memory, why shouldn't caches stay in
> memory? That's the entire point of caches. If we start OOM killing
> processes due to the caches taking all the memory, that's absolutely a bug.
>
>   
>> Here is what Sunil Mushran from Oracle had to say about the issue:
>>
>>     
>>> Well, kswapd is supposed to flush the caches. As in, the vm
>>> controls the lifetime of the inodes in the inode_cache, not ocfs2.
>>>
>>> All ocfs2 can do is free the memory associated with the inode when
>>> asked to. And it does that when you manually flush the cache. The
>>> question is why the vm is not doing it on its own.
>>>       
>> So he is saying there is a problem with the way the kernel is handling
>> virtual memory. Why it only happens with ocfs2 on SUSE is unknown.
>>     
>
> Now that's the interesting part. Might you be willing to try a mainline
> kernel with OCFS2 1.2? If this is a SUSE problem, I'd like to isolate it.
>
> -Jeff
>
> --
> Jeff Mahoney
> SUSE Labs