[Ocfs2-users] Kernel Panic ocfs2_inode_lock_full

Sunil Mushran sunil.mushran at oracle.com
Thu Feb 18 11:19:00 PST 2010


http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6ca497a83e592d64e050c4d04b6dedb8c915f39a

The race has a tiny window. The backdoor read (via an nfs file handle)
has to happen after a file has been deleted (and the delete committed
to the journal) but before the journal has been checkpointed (or
replayed). I am surprised you are hitting it so often.
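
For background, the check that fires in your log is a generation-number
comparison: an nfs file handle embeds the inode number plus the
generation the inode had when the handle was issued, and a reused inode
slot gets a bumped generation so stale handles can be detected. Below
is a simplified userspace sketch of that idea; the struct and function
names are made up for illustration (this is not the actual ocfs2 code),
though the numbers are taken from your log.

#include <stdint.h>
#include <stdio.h>

/* Illustrative on-disk inode: the slot can be reused after a delete,
 * but the generation is bumped on each reuse. */
struct disk_inode {
	uint64_t ino;
	uint32_t generation;
};

/* Illustrative nfs file handle: remembers the generation that was
 * current when the handle was handed out. */
struct nfs_handle {
	uint64_t ino;
	uint32_t generation;
};

/* Returns 0 if the handle still matches the on-disk inode, -1 if
 * the slot was reused since the handle was issued (stale handle). */
static int validate_handle(const struct nfs_handle *fh,
			   const struct disk_inode *di)
{
	if (fh->ino != di->ino || fh->generation != di->generation)
		return -1;
	return 0;
}

int main(void)
{
	struct disk_inode di = { 446146, 1276645926 };
	struct nfs_handle fh = { 446146, 1276645926 };

	printf("before delete: %d\n", validate_handle(&fh, &di));

	/* Delete and reuse the slot: the generation changes, so the
	 * old handle must now be rejected rather than trusted. */
	di.generation = 1276645928;
	printf("after reuse:   %d\n", validate_handle(&fh, &di));
	return 0;
}

The bug expression in your log is exactly this comparison failing,
except the kernel hits it in a context where it panics instead of
just failing the request.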

Are you doing a lot of deletes?
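
If you want to gauge that: pending deletes sit in the per-slot orphan
directories, which you can inspect with debugfs.ocfs2. The command
below is from memory, so double-check it against the man page and
substitute your own device and slot number:

	debugfs.ocfs2 -R "ls //orphan_dir:0000" /dev/sdX

The busier those directories are, the more chances there are to hit
the window described above.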

Sunil
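
P.S. For anyone reading this in the archives: the client-side mount
option mentioned below applies to nfsv3 mounts and would be passed
along these lines (the option list is illustrative; see nfs(5) on
your client):

	mount -t nfs -o vers=3,nordirplus server:/export /mnt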

michael.a.jaquays at verizon.com wrote:
> Are there any workarounds besides the nordirplus option on the nfs clients?
>
> -Mike Jaquays
> Office: 972-718-2982
> Cell: 214-587-3882 
>
> -----Original Message-----
> From: Sunil Mushran [mailto:sunil.mushran at oracle.com] 
> Sent: Thursday, February 18, 2010 11:47 AM
> To: Jaquays, Michael A.
> Cc: ocfs2-users at oss.oracle.com
> Subject: Re: [Ocfs2-users] Kernel Panic ocfs2_inode_lock_full
>
> Yes, this is a known issue. It only occurs when nfs is in the
> equation. The issue was fixed in mainline quite some time ago; we are
> in the process of backporting that fix to 1.4.
>
> michael.a.jaquays at verizon.com wrote:
>> All,
>>
>> I have a 3-node cluster that is experiencing kernel panics once every few days.  We are exporting some of the ocfs2 filesystems via nfs to some web app servers.  The app servers mount the filesystems with the nordirplus option.  Are there any known pitfalls with using an nfs4 server and ocfs2?  I haven't seen a case where all three nodes are down at the same time, but the issue seems to travel from node to node.  Here are the node details:
>>
>> OS:	RHEL5.4
>> Kernel:	2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
>>
>> OCFS2 Packages:	ocfs2console-1.4.3-1.el5
>> 			ocfs2-tools-1.4.3-1.el5
>> 			ocfs2-2.6.18-164.11.1.el5-1.4.4-1.el5
>>
>>
>> The following is always logged in /var/log/messages right before the node panics:
>>
>> kernel: (11915,0):ocfs2_inode_lock_update:1970 ERROR: bug expression: inode->i_generation != le32_to_cpu(fe->i_generation)
>>
>> kernel: (11915,0):ocfs2_inode_lock_update:1970 ERROR: Invalid dinode 446146 disk generation: 1276645928 inode->i_generation: 1276645926
>>
>> kernel: ----------- [cut here ] --------- [please bite here ] ---------
>>
>> The following is part of the kernel panic:
>>
>> Call Trace:
>>  [<ffffffff885a2940>]	:ocfs2:ocfs2_delete_inode+0x187/0x73f
>>  [<ffffffff885a27b9>]	:ocfs2:ocfs2_delete_inode+0x0/0x73f
>>  [<ffffffff8002f463>]	generic_delete_inode+0xc6/0x143
>>  [<ffffffff885a22e3>]	:ocfs2:ocfs2_drop_inode+0xca/0x12b
>>  [<ffffffff885a693f>]	:ocfs2:ocfs2_complete_recovery+0x77e/0x910
>>  [<ffffffff885a61c1>]	:ocfs2:ocfs2_complete_recovery+0x0/0x910
>>  [<ffffffff8004d8ed>]	run_workqueue+0x94/0xe4
>>  [<ffffffff8004a12f>]	worker_thread+0x0/0x122
>>  [<ffffffff8009fe9f>]	keventd_create_kthread+0x0/0xc4
>>  [<ffffffff8004a21f>]	worker_thread+0xf0/0x122
>>  [<ffffffff8008c86c>]	default_wake_function+0x0/0xe
>>  [<ffffffff8009fe9f>]	keventd_create_kthread+0x0/0xc4
>>  [<ffffffff80032950>]	kthread+0xfe/0x132
>>  [<ffffffff8005dfb1>]	child_rip+0xa/0x11
>>  [<ffffffff8009fe9f>]	keventd_create_kthread+0x0/0xc4
>>  [<ffffffff80032852>]	kthread+0x0/0x132
>>  [<ffffffff8005dfa6>]	child_rip+0x0/0x11
>>
>>
>>
>> Code: 0f 0b 68 1b 3d 5c 88 c2 b2 07 48 83 7b 48 00 75 0a f6 43 2c
>> RIP [<ffffffff885928f5>]  :ocfs2:ocfs2_inode_lock_full+0x99e/0xe3c
>>  RSP <ffff810c0af0fc70>
>>  <0>Kernel panic - not syncing: Fatal exception
>>
>> Any help anyone could provide would be appreciated.
>>
>>
>> -Mike 
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users at oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs2-users



