[Ocfs2-users] umount hang + high CPU

sylarrrrrrr at aim.com sylarrrrrrr at aim.com
Tue Jul 7 03:19:19 PDT 2009


So this bug is not over yet :(

I have checked my kernel source, and it does indeed have this patch, but I still get the hang.

PS. my linux-2.6-2.6.30/fs/ocfs2/dcache.c kernel source has:

	else
		mlog_errno(ret);

	/*
	 * In case of error, manually free the allocation and do the iput().
	 * We need to do this because error here means no d_instantiate(),
	 * which means iput() will not be called during dput(dentry).
	 */
	if (ret < 0 && !alias) {
		ocfs2_lock_res_free(&dl->dl_lockres);
		BUG_ON(dl->dl_count != 1);
		spin_lock(&dentry_attach_lock);
		dentry->d_fsdata = NULL;
		spin_unlock(&dentry_attach_lock);
		kfree(dl);
		iput(inode);
	}

	dput(alias);

	return ret;
}
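
The comment in that excerpt describes an ownership rule: on the error path nothing else will release the lock structure or the inode reference, so the function has to do it itself. Below is a minimal user-space sketch of that rule in plain C. It is not kernel code. Every name in it (fake_inode, fake_lock, fake_dentry, fake_iput, fake_attach_lock and the should_fail flag) is hypothetical and only stands in for the kernel objects. It also assumes the attach path took its own inode reference earlier, which the excerpt implies but does not show.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the real structures. */
struct fake_inode  { int refcount; };
struct fake_lock   { int count; struct fake_inode *inode; };
struct fake_dentry { struct fake_lock *fsdata; };

/* Drop one reference, playing the role of iput() in this sketch. */
static void fake_iput(struct fake_inode *inode)
{
	inode->refcount--;
}

/*
 * Attach a lock to a dentry. should_fail simulates the cluster lock
 * request failing. Returns 0 on success or a negative value on error.
 */
int fake_attach_lock(struct fake_dentry *dentry, struct fake_inode *inode,
		     int should_fail)
{
	struct fake_lock *dl = malloc(sizeof(*dl));
	int ret;

	if (!dl)
		return -12;

	dl->count = 1;
	dl->inode = inode;
	inode->refcount++;	/* assumed: reference held on behalf of the lock */
	dentry->fsdata = dl;

	ret = should_fail ? -5 : 0;

	if (ret < 0) {
		/*
		 * Error: the dentry was never instantiated, so no later
		 * teardown will release these. Undo everything by hand.
		 */
		assert(dl->count == 1);
		dentry->fsdata = NULL;
		free(dl);
		fake_iput(inode);
	}

	return ret;
}
```

In this sketch, a failed call leaves the inode's reference count where it started and the dentry's fsdata pointer NULL, while a successful call keeps both the extra reference and the lock attached until normal teardown.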


-----Original Message-----
From: Tao Ma <tao.ma at oracle.com>
To: sylarrrrrrr at aim.com
Cc: sunil.mushran at oracle.com; ocfs2-users at oss.oracle.com
Sent: Tue, Jul 7, 2009 11:16 am
Subject: Re: [Ocfs2-users] umount hang + high CPU

sylarrrrrrr at aim.com wrote:
> That's a quick fix :D
>
> How do I put it in my system?
>
> I have only recently downloaded and upgraded both tools and kernel, so I
> gather that it is not on the recent version of either. That dlmmaster.c
> file is not in the tools package, does that mean that this file is in
> the kernel? Do I need to patch and compile the kernel from the
> development code? (which is where?)

Yes, 2.6.30 already has the fix for
http://oss.oracle.com/bugzilla/show_bug.cgi?id=914

And yes, it is in the kernel, so your tools aren't affected.

Regards,
Tao

> 
> 
> -----Original Message-----
> From: Sunil Mushran <sunil.mushran at oracle.com>
> To: sylarrrrrrr at aim.com
> Cc: ocfs2-users at oss.oracle.com
> Sent: Mon, Jul 6, 2009 11:03 pm
> Subject: Re: [Ocfs2-users] umount hang + high CPU
>
> Fixed. Details in http://oss.oracle.com/bugzilla/show_bug.cgi?id=914 
>  
> sylarrrrrrr at aim.com wrote:
>  >
>  > Hi,
>  >
>  > On kernel 2.6.30 (where I have also upgraded drbd to 8.3.2) I have
>  > nothing in the logs; the umount hangs, and after a few minutes the
>  > whole computer hangs and I have to hard reset it. On kernel 2.6.26 the
>  > umount also hung; the computer itself didn't hang, but it refused to
>  > reboot or power off, so I also had to hard reset it. On 2.6.26 I had
>  > this in syslog:
>  > 
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] (7254,1):dlm_empty_lockres:2709 ERROR: lockres O0000000000000003cb1e3000000000 still has local locks!
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] ------------[ cut here ]------------
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] kernel BUG at fs/ocfs2/dlm/dlmmaster.c:2710!
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] invalid opcode: 0000 [1] SMP
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] CPU 1
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] Modules linked in: ocfs2 ppdev lp parport drbd cn rfcomm l2cap bluetooth xt_tcpudp iptable_filter battery ip_tables x_tables ipv6 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs linear coretemp loop snd_hda_intel snd_pcsp snd_pcm snd_timer snd soundcore nvidiafb i2c_i801 psmouse snd_page_alloc i2c_core button vgastate serio_raw intel_agp evdev ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod raid456 async_xor async_memcpy async_tx xor raid1 md_mod sg sr_mod cdrom sd_mod ide_pci_generic ide_core ata_generic usbhid hid ff_memless usb_storage floppy ahci ohci1394 pata_marvell atl1e ieee1394 libata tulip scsi_mod dock ehci_hcd uhci_hcd thermal processor fan thermal_sys
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] Pid: 7254, comm: umount Not tainted 2.6.26-2-amd64 #1
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] RIP: 0010:[<ffffffffa035981f>] [<ffffffffa035981f>] :ocfs2_dlm:dlm_empty_lockres+0x13fb/0x14a0
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] RSP: 0018:ffff81023c971c18 EFLAGS: 00010292
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] RAX: 0000000000000079 RBX: ffff8101db4dae40 RCX: ffffffff804fe108
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] RDX: 0000000100000000 RSI: 0000000000000096 RDI: 0000000000000286
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] RBP: ffff8101db4dae40 R08: ffffffff804fe0f0 R09: ffff81000103b918
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] R10: ffff81000103b880 R11: 0000000000000046 R12: ffff8101cae4e800
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] R13: 000000000000001f R14: 00000000ffffffd9 R15: 00000000000000c5
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] FS: 0000000000000000(0000) GS:ffff81023f08e8c0(0063) knlGS:00000000f7deb6f0
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] CR2: 00000000f7e2e2a0 CR3: 00000001dc9ef000 CR4: 00000000000006e0
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] Process umount (pid: 7254, threadinfo ffff81023c970000, task ffff81019e998040)
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] Stack: ffff81020c580800 ffff810100000001 0000000000000001 0000000000000000
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] ffff8101cae4ea48 0000000000000000 ffff81020c580800 0000000000000000
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] 00008101cae4ea38 0000000000000003 0000000000000000 ffff81019e998040
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] Call Trace:
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffff802461b1>] ? autoremove_wake_function+0x0/0x2e
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffffa034d1af>] ? :ocfs2_dlm:__dlm_lockres_unused+0x33/0x50
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffffa0349819>] ? :ocfs2_dlm:dlm_unregister_domain+0x1c8/0x756
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffff8022898e>] ? enqueue_task+0x56/0x61
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffffa037819d>] ? :ocfs2_stack_o2cb:o2cb_cluster_disconnect+0x30/0x40
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffffa030f252>] ? :ocfs2_stackglue:ocfs2_cluster_disconnect+0x21/0x40
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffffa04760c3>] ? :ocfs2:ocfs2_dlm_shutdown+0xbd/0x12e
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffffa0499496>] ? :ocfs2:ocfs2_dismount_volume+0x1a1/0x34e
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffff80271aa6>] ? filemap_write_and_wait+0x26/0x31
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffffa0499995>] ? :ocfs2:ocfs2_put_super+0x67/0xb8
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffff8029c9a1>] ? generic_shutdown_super+0x60/0xee
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffff8029ca3c>] ? kill_block_super+0xd/0x1e
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffff8029caf8>] ? deactivate_super+0x5f/0x78
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffff802afdf2>] ? sys_umount+0x2f9/0x353
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffff80221fac>] ? do_page_fault+0x5d8/0x9c8
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320327] [<ffffffff8022562c>] ? sys32_stat64+0x11/0x29
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320337] [<ffffffff8031db03>] ? __up_write+0x21/0x10e
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320337] [<ffffffff80224c52>] ? sysenter_do_call+0x1b/0x66
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320337]
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320337]
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320337] Code: 00 00 8b b0 98 01 00 00 48 c7 c7 2f a9 36 a0 31 c0 65 8b 14 25 24 00 00 00 48 89 0c 24 89 d2 48 c7 c1 00 48 36 a0 e8 e8 bb ed df <0f> 0b eb fe 48 f7 05 32 dc fc ff 00 09 00 00 74 4d 48 f7 05 2d
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320337] RIP [<ffffffffa035981f>] :ocfs2_dlm:dlm_empty_lockres+0x13fb/0x14a0
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320337] RSP <ffff81023c971c18>
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320337] ---[ end trace 10e3d919ff4fa443 ]---
>  > Jul 5 21:10:34 ocfs2Server kernel: [249187.320337] ------------[ cut here ]------------
>  > 
>  > PS. I see that both kernels have the same 1.5.0 version, so upgrading 
>  > was pointless in this regard. 
>  > 
>  > 
>  > -----Original Message----- 
>  > From: Tao Ma <tao.ma at oracle.com>
>  > To: sylarrrrrrr at aim.com
>  > Cc: ocfs2-users at oss.oracle.com
>  > Sent: Sun, Jul 5, 2009 9:22 pm 
>  > Subject: Re: [Ocfs2-users] umount hang + high CPU 
>  > 
>  > Hi,
>  > Is there something in your system log?
>  > I would guess there should be some info there.
>  >
>  > Regards,
>  > Tao
>  >
>  > sylarrrrrrr at aim.com wrote:
>  > > Hi,
>  > >
>  > > I had a problem where I got a "kernel bug" in the logs in ocfs2. That
>  > > happened when I unmounted the volume after it had been mounted for a
>  > > day or two, so I thought I needed to upgrade the kernel (maybe the next
>  > > version will be bug free). I upgraded to 2.6.30, and now I tried
>  > > mounting and unmounting the volume right away... and it hung, and the
>  > > CPU usage got high with that umount process.
>  > >
>  > > Please advise
>  > >
>  > > PS. tools and console packages are version 1.4.2.
>  > >
>  > > _______________________________________________
>  > > Ocfs2-users mailing list
>  > > Ocfs2-users at oss.oracle.com
>  > > http://oss.oracle.com/mailman/listinfo/ocfs2-users