[Ocfs2-users] Re: [Linux-HA] OCFS2 - Memory hog?

Sunil Mushran Sunil.Mushran at oracle.com
Thu Feb 15 10:59:45 PST 2007


Fixed in 1.2.4. SUSE already carries the fix as a patch.
The patch has also been added to mainline.
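
To double-check which version a node is actually running (a quick sanity
check; this assumes the ocfs2 module exports a version string, as the
1.2.x releases do):

  # modinfo ocfs2 | grep -i version

The version is also printed to the kernel log when the module loads.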

John Lange wrote:
> Yes, the clients are doing lots of creates.
>
> But my question is: if this is a memory leak, why does ocfs2 eat up the
> memory as soon as the clients start accessing the filesystem? Within
> about 5-10 minutes all physical RAM is consumed, but then the memory
> consumption stops. It does not go into swap.
>
> Do you happen to know what version of ocfs2 has the fix?
>
> If it were a leak, wouldn't the process be more gradual and continuous?
> Wouldn't it continue to eat into swap? And if it were a leak, would the
> RAM be freed when ocfs2 was unmounted?
>
> Is there a command that shows what is using the kernel memory?
>
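
(On that point: /proc/meminfo gives the coarse breakdown; on a 2.6 kernel
the Slab line is kernel slab memory and Cached is page cache, neither of
which belongs to a process or gets swapped out. A minimal check:

  # grep -E 'MemTotal|MemFree|Cached|Slab' /proc/meminfo

Where procps is installed, slabtop(1) shows /proc/slabinfo sorted live.)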
> Here is what /proc/slabinfo shows (cut down for formatting). I don't
> understand how to read this, so maybe someone can indicate whether
> something looks wrong? (A rough way to rank the caches by size is
> sketched after the listing.)
>
> =======
> # cat /proc/slabinfo
>
> # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> 
> nfsd4_delegations      0      0    596   13    2 
> nfsd4_stateids         0      0     72   53    1 
> nfsd4_files            0      0     36  101    1 
> nfsd4_stateowners      0      0    344   11    1 
> rpc_buffers            8      8   2048    2    1 
> rpc_tasks              8     15    256   15    1 
> rpc_inode_cache        0      0    512    7    1 
> ocfs2_lock           152    203     16  203    1 
> ocfs2_inode_cache  12484  12536    896    4    1 
> ocfs2_uptodate      1381   1469     32  113    1 
> ocfs2_em_ent       37005  37406     64   59    1 
> dlmfs_inode_cache      1      6    640    6    1 
> dlm_mle_cache         10     10    384   10    1 
> configfs_dir_cache     33     78     48   78    1 
> fib6_nodes             7    113     32  113    1 
> ip6_dst_cache          7     15    256   15    1 
> ndisc_cache            1     15    256   15    1 
> RAWv6                  5      6    640    6    1 
> UDPv6                  3      6    640    6    1 
> tw_sock_TCPv6          0      0    128   30    1 
> request_sock_TCPv6      0      0    128   30    1 
> TCPv6                  8      9   1280    3    1 
> ip_fib_alias          16    113     32  113    1 
> ip_fib_hash           16    113     32  113    1 
> dm_events             16    169     20  169    1 
> dm_tio              4157   7308     16  203    1 
> dm_io               4155   6760     20  169    1 
> uhci_urb_priv          0      0     40   92    1 
> ext3_inode_cache    1062   2856    512    8    1 
> ext3_xattr             0      0     48   78    1 
> journal_handle        74    169     20  169    1 
> journal_head         583   1224     52   72    1 
> revoke_table           6    254     12  254    1 
> revoke_record          0      0     16  203    1 
> qla2xxx_srbs         244    360    128   30    1 
> scsi_cmd_cache       106    130    384   10    1 
> sgpool-256            32     32   4096    1    1 
> sgpool-128            42     42   2048    2    1 
> sgpool-64             44     44   1024    4    1 
> sgpool-32             48     48    512    8    1 
> sgpool-16             75     75    256   15    1 
> sgpool-8             153    210    128   30    1 
> scsi_io_context        0      0    104   37    1 
> UNIX                 377    399    512    7    1 
> ip_mrt_cache           0      0    128   30    1 
> tcp_bind_bucket       14    203     16  203    1 
> inet_peer_cache       81    118     64   59    1 
> secpath_cache          0      0    128   30    1 
> xfrm_dst_cache         0      0    384   10    1 
> ip_dst_cache         176    240    256   15    1 
> arp_cache              6     30    256   15    1 
> RAW                    3      7    512    7    1 
> UDP                   29     42    512    7    1 
> tw_sock_TCP            0      0    128   30    1 
> request_sock_TCP       0      0     64   59    1 
> TCP                   19     35   1152    7    2 
> flow_cache             0      0    128   30    1 
> cfq_ioc_pool         194    240     96   40    1 
> cfq_pool             185    240     96   40    1 
> crq_pool             312    468     48   78    1 
> deadline_drq           0      0     52   72    1 
> as_arq                 0      0     64   59    1 
> mqueue_inode_cache      1      6    640    6    1 
> isofs_inode_cache      0      0    384   10    1 
> minix_inode_cache      0      0    420    9    1 
> hugetlbfs_inode_cache      1     11    356   11    1 
> ext2_inode_cache       0      0    492    8    1 
> ext2_xattr             0      0     48   78    1 
> dnotify_cache          1    169     20  169    1 
> dquot                  0      0    128   30    1 
> eventpoll_pwq          1    101     36  101    1 
> eventpoll_epi          1     30    128   30    1 
> inotify_event_cache      0      0     28  127    1 
> inotify_watch_cache     40     92     40   92    1 
> kioctx                 0      0    256   15    1 
> kiocb                  0      0    128   30    1 
> fasync_cache           1    203     16  203    1 
> shmem_inode_cache    612    632    460    8    1 
> posix_timers_cache      0      0    100   39    1 
> uid_cache              7     59     64   59    1 
> blkdev_ioc           103    127     28  127    1 
> blkdev_queue          58     60    960    4    1 
> blkdev_requests      354    418    176   22    1 
> biovec-(256)         312    312   3072    2    2 
> biovec-128           368    370   1536    5    2 
> biovec-64            480    485    768    5    1 
> biovec-16            480    495    256   15    1 
> biovec-4             480    531     64   59    1 
> biovec-1            1104   5481     16  203    1 
> bio                 1140   2250    128   30    1 
> sock_inode_cache     456    483    512    7    1 
> skbuff_fclone_cache     36     40    384   10    1 
> skbuff_head_cache    655    825    256   15    1 
> file_lock_cache        5     42     92   42    1 
> acpi_operand         634    828     40   92    1 
> acpi_parse_ext         0      0     44   84    1 
> acpi_parse             0      0     28  127    1 
> acpi_state             0      0     48   78    1 
> delayacct_cache      183    390     48   78    1 
> taskstats_cache        9     32    236   16    1 
> proc_inode_cache      49    170    372   10    1 
> sigqueue              96    135    144   27    1 
> radix_tree_node    16046  16786    276   14    1 
> bdev_cache            56     56    512    7    1 
> sysfs_dir_cache     4831   4876     40   92    1 
> mnt_cache             30     60    128   30    1 
> inode_cache         1041   1276    356   11    1 
> dentry_cache       11588  13688    132   29    1 
> filp                2734   2820    192   20    1 
> names_cache           25     25   4096    1    1 
> idr_layer_cache      204    232    136   29    1 
> buffer_head       456669 459936     52   72    1 
> mm_struct            109    126    448    9    1 
> vm_area_struct      5010   5632     88   44    1 
> fs_cache             109    177     64   59    1 
> files_cache           94    135    448    9    1 
> signal_cache         159    160    384   10    1 
> sighand_cache        147    147   1344    3    1 
> task_struct          175    175   1376    5    2 
> anon_vma            2355   2540     12  254    1 
> pgd                   81     81   4096    1    1 
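
(As a rough guide, memory per cache is active_objs * objsize: here
ocfs2_inode_cache is about 12484 * 896 bytes ~= 11 MB and buffer_head is
about 456669 * 52 bytes ~= 23 MB, so these slab caches account for tens
of megabytes, not gigabytes. A sketch for ranking them, assuming the
stock 2.6 slabinfo layout with the cache name in column 1, active
objects in column 2, and object size in column 4:

  # awk 'NR > 2 { printf "%-24s %10.1f KB\n", $1, $2 * $4 / 1024 }' \
      /proc/slabinfo | sort -k2 -rn | head

The top few lines are the caches holding the most memory.)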
>
>
> On Thu, 2007-02-15 at 10:40 -0700, Robert Wipfel wrote:
>
>> On Thu, Feb 15, 2007 at 10:34 AM, in message
>> <1171560898.4589.12.camel at ibmlaptop.darkcore.net>, John Lange
>> <john.lange at open-it.ca> wrote:
>>
>>> System is SUSE SLES 10 running heartbeat, ocfs2, evms, and exporting the
>>> file system via nfs.
>>>
>>> The ocfs2 partition is 12 Terabytes and is being exported via nfs.
>>>
>>> What we see is that as soon as the NFS clients (80 NFSv2 clients) start
>>> connecting, memory usage goes up and up until all the physical
>>> RAM is consumed, but it levels off before hitting swap. With 1G of RAM,
>>> 1G is used. With 2G of RAM, 2G is used. It just seems to consume
>>> everything.
>>>
>>> The system seems to run happily for a while. Then something happens and
>>> there is a RAM spike. Next thing you know, we see the dreaded kernel
>>> oom-killer appear and start killing processes left and right, resulting
>>> in a complete crash.
>>>
>>> I can confirm it is NOT NFS using the RAM, because when NFS is stopped,
>>> no RAM is recovered. But when the ocfs2 partition is unmounted, the RAM
>>> is freed.
>>>
>>> Can someone shed some light on what is going on here? Any suggestions on
>>> how to resolve this problem?
>>>       
>> Are your clients doing lots of creates? There was an OCFS2 bug
>> that left DLM structures lying around for each file create, which,
>> IIRC, is now fixed.
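
(One way to confirm that is to watch the ocfs2 and dlm slab caches while
the clients run their create workload; a sketch, using the cache names
from the slabinfo listing above:

  # watch -n 5 "grep -E '^(ocfs2|dlm)' /proc/slabinfo"

If the bug is biting, the active object counts climb steadily during
creates and never come back down.)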
>>
>> Hth,
>> Robert
>> _______________________________________________
>> Linux-HA mailing list
>> Linux-HA at lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>>
>>     


