[Ocfs2-users] out of memory problem concerning 64-bits?

Sunil Mushran Sunil.Mushran at oracle.com
Tue Nov 14 11:08:26 PST 2006


The following counts go up and down. That is, when lowmem gets low,
kswapd will start releasing the inodes. You can test this by running a
simple program that allocates 1G of memory (see the sketch after the
listing below).
ocfs2_lock           286    452     16  226    1 : tunables  120   60    8 : slabdata      2      2      0
ocfs2_inode_cache  27155  27432    896    4    1 : tunables   54   27    8 : slabdata   6858   6858      0
ocfs2_uptodate       388    476     32  119    1 : tunables  120   60    8 : slabdata      4      4      0
ocfs2_em_ent       25831  26169     64   61    1 : tunables  120   60    8 : slabdata    429    429      0
dlmfs_inode_cache      1      6    640    6    1 : tunables   54   27    8 : slabdata      1      1      0
dlm_mle_cache          0      0    384   10    1 : tunables   54   27    8 : slabdata      0      0      0
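
For example, a minimal sketch of such a test program (my sketch, not
something that ships with OCFS2): it simply malloc()s 1G and touches
every page, so the kernel has to back the allocation with real memory,
which should push kswapd into shrinking the slabs above.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
        size_t sz = 1024UL * 1024 * 1024;       /* ~1G */
        char *buf = malloc(sz);

        if (!buf) {
                perror("malloc");
                return 1;
        }
        memset(buf, 0xAA, sz);  /* touch every page so it is really allocated */
        printf("Allocated and touched %lu bytes; press Enter to free.\n",
               (unsigned long)sz);
        getchar();              /* hold the memory while you watch slabinfo */
        free(buf);
        return 0;
}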


The following two are generic kernel slabs. ocfs2_dlm currently
allocates some of its memory from them. If these are large
(102189 * 256 ~ 25M, 105223 * 32 ~ 3M; see the sketch after the
listing below), then look at the ocfs2_dlm stats.
size-256          102189 102195    256   15    1 : tunables  120   60    8 : slabdata   6813   6813      0
size-32           105223 105434     32  119    1 : tunables  120   60    8 : slabdata    886    886      0
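
The arithmetic is just active objects times object size. A quick sketch
(my own, with the counts hard-coded from the listing above) that prints
the rough footprint of the two slabs:

#include <stdio.h>

static double slab_mb(unsigned long objs, unsigned long objsize)
{
        return (double)objs * objsize / (1024.0 * 1024.0);
}

int main(void)
{
        printf("size-256: %.1f MB\n", slab_mb(102189, 256));    /* ~24.9 MB */
        printf("size-32 : %.1f MB\n", slab_mb(105223, 32));     /* ~3.2 MB */
        return 0;
}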


Add up all the numbers (26553 + 57791 + 17197 = 101541) to see
ocfs2_dlm's footprint on the two slabs. In terms of object count, it
has the same effect on both slabs; in terms of MB, the effect on
size-256 is larger. In this case, it is consuming most of both slabs
(a small sketch that sums the stat files follows the output below).
[root@NPP_apl_04 ~]# cat /proc/fs/ocfs2_dlm/*/stat
local=17197, remote=0, unknown=0
local=57791, remote=0, unknown=0
local=26553, remote=0, unknown=0
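
To add them up without doing it by hand, here is a small sketch (mine)
that walks every /proc/fs/ocfs2_dlm/*/stat file and sums the "local"
counts. It assumes the "local=N, remote=N, unknown=N" line format shown
above.

#include <glob.h>
#include <stdio.h>

int main(void)
{
        glob_t g;
        unsigned long total_local = 0;
        size_t i;

        if (glob("/proc/fs/ocfs2_dlm/*/stat", 0, NULL, &g) != 0) {
                fprintf(stderr, "no ocfs2_dlm stat files found\n");
                return 1;
        }
        for (i = 0; i < g.gl_pathc; i++) {
                unsigned long local, remote, unknown;
                FILE *f = fopen(g.gl_pathv[i], "r");

                if (!f)
                        continue;
                if (fscanf(f, "local=%lu, remote=%lu, unknown=%lu",
                           &local, &remote, &unknown) == 3)
                        total_local += local;
                fclose(f);
        }
        globfree(&g);
        printf("locally mastered lock resources: %lu\n", total_local);
        return 0;
}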

The "remote" locks release is tied to the inode free. "unknown"
is a temporary state and the count should never account for much.
"local" are the problematic ones. These represent the locally
mastered lock resources.

Bottom line: it is currently accounting for about 27M out of 900M of
lowmem. Check the numbers again after a few days; you'll then know
whether it is OCFS2 or not.
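
If you want to snapshot this over a few days, something like the sketch
below (again mine, not an OCFS2 tool) does the multiplication for you.
It assumes the 2.6 slabinfo column order (name, active_objs, num_objs,
objsize, ...) seen in the listing above and only prints the slabs of
interest.

#include <stdio.h>
#include <string.h>

int main(void)
{
        char line[512];
        FILE *f = fopen("/proc/slabinfo", "r");

        if (!f) {
                perror("/proc/slabinfo");
                return 1;
        }
        while (fgets(line, sizeof(line), f)) {
                char name[64];
                unsigned long active, total, objsize;

                if (sscanf(line, "%63s %lu %lu %lu",
                           name, &active, &total, &objsize) != 4)
                        continue;       /* header and version lines */
                if (strstr(name, "ocfs") || strstr(name, "dlm") ||
                    !strcmp(name, "size-256") || !strcmp(name, "size-32"))
                        printf("%-20s %8lu objs  %7.1f MB\n", name, active,
                               (double)active * objsize / (1024.0 * 1024.0));
        }
        fclose(f);
        return 0;
}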

If you are confused, don't worry; we are close to fixing this issue. ;)

Sunil

Michał Wilkowski wrote:
> Hello,
>
> [root@NPP_apl_04 ~]# egrep 'ocfs|dlm|size-256 |size-32 ' /proc/slabinfo
> ocfs2_lock           286    452     16  226    1 : tunables  120   60    8 : slabdata      2      2      0
> ocfs2_inode_cache  27155  27432    896    4    1 : tunables   54   27    8 : slabdata   6858   6858      0
> ocfs2_uptodate       388    476     32  119    1 : tunables  120   60    8 : slabdata      4      4      0
> ocfs2_em_ent       25831  26169     64   61    1 : tunables  120   60    8 : slabdata    429    429      0
> dlmfs_inode_cache      1      6    640    6    1 : tunables   54   27    8 : slabdata      1      1      0
> dlm_mle_cache          0      0    384   10    1 : tunables   54   27    8 : slabdata      0      0      0
> size-256          102189 102195    256   15    1 : tunables  120   60    8 : slabdata   6813   6813      0
> size-32           105223 105434     32  119    1 : tunables  120   60    8 : slabdata    886    886      0
> [root@NPP_apl_04 ~]# cat /proc/fs/ocfs2_dlm/*/stat
> local=17197, remote=0, unknown=0
> local=57791, remote=0, unknown=0
> local=26553, remote=0, unknown=0
>
> We are currently running only one node, and OOM-kills occur regularly
> (about once every few days). I suspect that OCFS2 is eating memory,
> because ever since the OCFS2 filesystem was mounted, LowMemory has
> been decreasing and the number of OCFS2 locks has been increasing
> (as you can see in the output).
>
> Can you give me a hint about what I can read from slabinfo?
>
> Regards
> Michal Wilkowski
>
> Sunil Mushran wrote:
>> Are you sure it is ocfs2 that is eating memory?
>>
>> # egrep 'ocfs|dlm|size-256 |size-32 ' /proc/slabinfo
>> # cat /proc/fs/ocfs2_dlm/*/stat
>>
>> Email the outputs.
>>
>> Michał Wilkowski wrote:
>>> Hello,
>>> We are currently running in production a system based on Red Hat
>>> Enterprise Linux 4 Update 3, 32-bit. We are periodically
>>> encountering an OOM-killer problem; the problem was discussed in
>>> previous posts.
>>>
>>> I have a question: does this affect only 32-bit systems, or 64-bit
>>> systems as well?
>>> On a 32-bit system the OCFS2 driver consumes Low Memory (only
>>> ~900 MB) and therefore causes OOM-kills. What about 64-bit systems?
>>> Does it consume all of the memory available to the system (in my
>>> case: 8 GB), so that it simply takes much longer to hit OOM-kills?
>>>
>>> It is a very important issue for me and a response would be greatly 
>>> appreciated.
>>>
>>> Regards
>>> Michal Wilkowski
>>>
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com
>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>
>>


