[Ocfs2-users] Re: [Linux-HA] OCFS2 - Memory hog?

Alexei_Roudnev Alexei_Roudnev at exigengroup.com
Thu Feb 15 17:30:05 PST 2007


I saw this problem on a few of the SLES9 SP3 updates, but it is not an issue
anymore.

----- Original Message ----- 
From: "John Lange" <j.lange at epic.ca>
To: <linux-ha at lists.linux-ha.org>; "ocfs2-users"
<ocfs2-users at oss.oracle.com>
Sent: Thursday, February 15, 2007 10:48 AM
Subject: [Ocfs2-users] Re: [Linux-HA] OCFS2 - Memory hog?


> Yes, the clients are doing lots of creates.
>
> But my question is: if this is a memory leak, why does ocfs2 eat up the
> memory as soon as the clients start accessing the filesystem? Within
> about 5-10 minutes all physical RAM is consumed, but then the memory
> consumption stops. It does not go into swap.
>
> Do you happen to know what version of ocfs2 has the fix?
>
> If it were a leak, would the process not be more gradual and continuous?
> Would it not keep eating into swap? And if it were a leak, would the
> RAM be freed when ocfs2 was unmounted?
>
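A quick way to tell reclaimable cache apart from a genuine leak is to watch
/proc/meminfo while dropping the caches. A rough sketch, assuming a 2.6.16 or
later kernel (SLES 10) where /proc/sys/vm/drop_caches exists; on older kernels
only the meminfo comparison applies:

  # Snapshot the reclaimable-memory counters.
  grep -E '^MemFree|^Buffers|^Cached|^Slab' /proc/meminfo

  # Ask the kernel to drop clean page cache plus dentry/inode slab.
  # Diagnostic only; the caches are rebuilt on demand.
  sync
  echo 3 > /proc/sys/vm/drop_caches

  # Compare: memory that reappears as MemFree was cache, not a leak.
  grep -E '^MemFree|^Buffers|^Cached|^Slab' /proc/meminfo

If MemFree jumps back up, the usage was cache pressure rather than a leak; if
it stays pinned, something really is holding the memory.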
> Is there a command that shows what is using the kernel memory?
>
> Here is what /proc/slabinfo shows (cut down for formatting). I don't
> understand how to read this, so maybe someone can indicate if something
> looks wrong?
>
> =======
> # cat /proc/slabinfo
>
> # name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
> nfsd4_delegations      0      0    596   13    2
> nfsd4_stateids         0      0     72   53    1
> nfsd4_files            0      0     36  101    1
> nfsd4_stateowners      0      0    344   11    1
> rpc_buffers            8      8   2048    2    1
> rpc_tasks              8     15    256   15    1
> rpc_inode_cache        0      0    512    7    1
> ocfs2_lock           152    203     16  203    1
> ocfs2_inode_cache  12484  12536    896    4    1
> ocfs2_uptodate      1381   1469     32  113    1
> ocfs2_em_ent       37005  37406     64   59    1
> dlmfs_inode_cache      1      6    640    6    1
> dlm_mle_cache         10     10    384   10    1
> configfs_dir_cache     33     78     48   78    1
> fib6_nodes             7    113     32  113    1
> ip6_dst_cache          7     15    256   15    1
> ndisc_cache            1     15    256   15    1
> RAWv6                  5      6    640    6    1
> UDPv6                  3      6    640    6    1
> tw_sock_TCPv6          0      0    128   30    1
> request_sock_TCPv6      0      0    128   30    1
> TCPv6                  8      9   1280    3    1
> ip_fib_alias          16    113     32  113    1
> ip_fib_hash           16    113     32  113    1
> dm_events             16    169     20  169    1
> dm_tio              4157   7308     16  203    1
> dm_io               4155   6760     20  169    1
> uhci_urb_priv          0      0     40   92    1
> ext3_inode_cache    1062   2856    512    8    1
> ext3_xattr             0      0     48   78    1
> journal_handle        74    169     20  169    1
> journal_head         583   1224     52   72    1
> revoke_table           6    254     12  254    1
> revoke_record          0      0     16  203    1
> qla2xxx_srbs         244    360    128   30    1
> scsi_cmd_cache       106    130    384   10    1
> sgpool-256            32     32   4096    1    1
> sgpool-128            42     42   2048    2    1
> sgpool-64             44     44   1024    4    1
> sgpool-32             48     48    512    8    1
> sgpool-16             75     75    256   15    1
> sgpool-8             153    210    128   30    1
> scsi_io_context        0      0    104   37    1
> UNIX                 377    399    512    7    1
> ip_mrt_cache           0      0    128   30    1
> tcp_bind_bucket       14    203     16  203    1
> inet_peer_cache       81    118     64   59    1
> secpath_cache          0      0    128   30    1
> xfrm_dst_cache         0      0    384   10    1
> ip_dst_cache         176    240    256   15    1
> arp_cache              6     30    256   15    1
> RAW                    3      7    512    7    1
> UDP                   29     42    512    7    1
> tw_sock_TCP            0      0    128   30    1
> request_sock_TCP       0      0     64   59    1
> TCP                   19     35   1152    7    2
> flow_cache             0      0    128   30    1
> cfq_ioc_pool         194    240     96   40    1
> cfq_pool             185    240     96   40    1
> crq_pool             312    468     48   78    1
> deadline_drq           0      0     52   72    1
> as_arq                 0      0     64   59    1
> mqueue_inode_cache      1      6    640    6    1
> isofs_inode_cache      0      0    384   10    1
> minix_inode_cache      0      0    420    9    1
> hugetlbfs_inode_cache      1     11    356   11    1
> ext2_inode_cache       0      0    492    8    1
> ext2_xattr             0      0     48   78    1
> dnotify_cache          1    169     20  169    1
> dquot                  0      0    128   30    1
> eventpoll_pwq          1    101     36  101    1
> eventpoll_epi          1     30    128   30    1
> inotify_event_cache      0      0     28  127    1
> inotify_watch_cache     40     92     40   92    1
> kioctx                 0      0    256   15    1
> kiocb                  0      0    128   30    1
> fasync_cache           1    203     16  203    1
> shmem_inode_cache    612    632    460    8    1
> posix_timers_cache      0      0    100   39    1
> uid_cache              7     59     64   59    1
> blkdev_ioc           103    127     28  127    1
> blkdev_queue          58     60    960    4    1
> blkdev_requests      354    418    176   22    1
> biovec-(256)         312    312   3072    2    2
> biovec-128           368    370   1536    5    2
> biovec-64            480    485    768    5    1
> biovec-16            480    495    256   15    1
> biovec-4             480    531     64   59    1
> biovec-1            1104   5481     16  203    1
> bio                 1140   2250    128   30    1
> sock_inode_cache     456    483    512    7    1
> skbuff_fclone_cache     36     40    384   10    1
> skbuff_head_cache    655    825    256   15    1
> file_lock_cache        5     42     92   42    1
> acpi_operand         634    828     40   92    1
> acpi_parse_ext         0      0     44   84    1
> acpi_parse             0      0     28  127    1
> acpi_state             0      0     48   78    1
> delayacct_cache      183    390     48   78    1
> taskstats_cache        9     32    236   16    1
> proc_inode_cache      49    170    372   10    1
> sigqueue              96    135    144   27    1
> radix_tree_node    16046  16786    276   14    1
> bdev_cache            56     56    512    7    1
> sysfs_dir_cache     4831   4876     40   92    1
> mnt_cache             30     60    128   30    1
> inode_cache         1041   1276    356   11    1
> dentry_cache       11588  13688    132   29    1
> filp                2734   2820    192   20    1
> names_cache           25     25   4096    1    1
> idr_layer_cache      204    232    136   29    1
> buffer_head       456669 459936     52   72    1
> mm_struct            109    126    448    9    1
> vm_area_struct      5010   5632     88   44    1
> fs_cache             109    177     64   59    1
> files_cache           94    135    448    9    1
> signal_cache         159    160    384   10    1
> sighand_cache        147    147   1344    3    1
> task_struct          175    175   1376    5    2
> anon_vma            2355   2540     12  254    1
> pgd                   81     81   4096    1    1
>
>
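For reference, the slabinfo columns are name, active_objs, num_objs, objsize
(bytes), objperslab and pagesperslab, so a cache's footprint is roughly
num_objs * objsize. A sketch for summarizing the listing, assuming the
2.6-style layout shown above (slabtop from procps gives much the same view
interactively):

  # Approximate per-cache memory, largest first; skip the header lines.
  awk '!/^#/ && !/^slabinfo/ {printf "%-28s %10.1f KB\n", $1, $3*$4/1024}' \
      /proc/slabinfo | sort -rn -k2 | head -15

By that arithmetic the largest caches in the dump above (buffer_head,
ocfs2_inode_cache, radix_tree_node, ocfs2_em_ent, dentry_cache) come to a few
tens of MB, so the slab listing by itself does not account for gigabytes of
usage.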
> On Thu, 2007-02-15 at 10:40 -0700, Robert Wipfel wrote:
> > >>> On Thu, Feb 15, 2007 at 10:34 AM, in message
> > <1171560898.4589.12.camel at ibmlaptop.darkcore.net>, John Lange
> > <john.lange at open-it.ca> wrote:
> > > System is SUSE SLES 10 running heartbeat, ocfs2, evms, and exporting the
> > > file system via nfs.
> > >
> > > The ocfs2 partition is 12 Terabytes and is being exported via nfs.
> > >
> > > What we see is that as soon as the NFS clients (80 NFS v2 clients) start
> > > connecting, memory usage goes up and up until all the physical
> > > RAM is consumed, but it levels off before hitting swap. With 1G of RAM,
> > > 1G is used; with 2G of RAM, 2G is used. It just seems to consume
> > > everything.
> > >
> > > The system seems to run happily for a while. Then something happens and
> > > there is a RAM spike. Next thing you know, we see the dreaded kernel
> > > oom-killer appear and start killing processes left and right, resulting
> > > in a complete crash.
> > >
> > > I can confirm it is NOT nfs using the RAM, because when nfs is stopped,
> > > no RAM is recovered. But when the ocfs2 partition is unmounted, the RAM
> > > is freed.
> > >
> > > Can someone shed some light on what is going on here? Any suggestions on
> > > how to resolve this problem?
> >
> > Are your clients doing lots of creates? There was an OCFS2 bug
> > that left DLM structures lying around for each file create, which IIRC is
> > now fixed.
> >
> > Hth,
> > Robert
> > _______________________________________________
> > Linux-HA mailing list
> > Linux-HA at lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
> >
> -- 
> John Lange
> Epic Information Solutions
> p: (204) 975 7113
>
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>



