[Ocfs2-users] Panic
Laurence Mayer
laurence at istraresearch.com
Wed Oct 7 11:22:00 PDT 2009
Nope, the node that crashed is not the NFS server.
How should I proceed?
What do you suggest?
Could this happen again?
On Wed, Oct 7, 2009 at 8:16 PM, Sunil Mushran <sunil.mushran at oracle.com> wrote:
> And does the node exporting the volume encounter the oops?
>
> If so, the likeliest candidate would be:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=6ca497a83e592d64e050c4d04b6dedb8c915f39a
>
> If it is on another node, I am currently unsure whether an nfs
> export on one node could cause this to occur on another. Need more
> coffee.
>
> In short, the problem comes from how nfs bypasses the normal fs lookup
> to access files. It uses the file handle to access the inode directly,
> bypassing the locking. Normally that is not a problem. The race window
> opens when a file is deleted (on any node in the cluster) and nfs reads that
> inode without the lock. In the oops we see that the disk generation is greater
> than the in-memory inode generation. That means the inode was deleted and
> reused. The fix closes the race window.
>
> Sunil
>
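The race described above can be illustrated with a toy model. This is only a sketch and not OCFS2 code: the class and function names are hypothetical, and the starting on-disk generation is assumed to match the in-memory value reported in the oops. It shows a cached inode going stale once the on-disk slot is deleted and reused, and a revalidation step that reports the mismatch instead of hitting a BUG. How the actual fix closes the window is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class DiskInode:
    """On-disk inode slot (hypothetical model). The generation changes on reuse."""
    blkno: int
    generation: int


@dataclass
class CachedInode:
    """In-memory copy of the inode held by one node (hypothetical model)."""
    blkno: int
    generation: int


class StaleInode(Exception):
    """Raised when the cached generation no longer matches the disk."""


def reuse_slot(disk, new_generation):
    # Stands in for "file deleted on some node, slot given to a new file".
    disk.generation = new_generation


def revalidate(cached, disk):
    # Same comparison as the bug expression in the oops:
    # inode->i_generation != le32_to_cpu(fe->i_generation)
    if cached.generation != disk.generation:
        raise StaleInode(
            f"dinode {disk.blkno}: disk generation {disk.generation}, "
            f"in-memory generation {cached.generation}"
        )
    return True


def demo():
    # Numbers taken from the oops; the initial disk value is an assumption.
    disk = DiskInode(blkno=3064741, generation=1309441501)
    cached = CachedInode(blkno=disk.blkno, generation=disk.generation)
    assert revalidate(cached, disk)   # generations agree before the delete
    reuse_slot(disk, 1309441612)      # delete and reuse happen elsewhere
    try:
        revalidate(cached, disk)      # access via the old handle
    except StaleInode as exc:
        return str(exc)
    return "still valid"


print(demo())
```

In this model the stale access is turned into an ordinary error; the kernel in the report instead trips the BUG at dlmglue.c:1675.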
> Laurence Mayer wrote:
>
>> Yes.
>> We have set up a 10-node cluster, with one of the nodes exporting the
>> volume over NFS to the workstations.
>> Please expand your answer.
>> Thanks
>> Laurence
>>
>>
>> On Wed, Oct 7, 2009 at 7:12 PM, Sunil Mushran <sunil.mushran at oracle.com> wrote:
>>
>> Are you exporting this volume via nfs? We fixed a small race (in the nfs
>> access path) that could lead to this oops.
>>
>> Laurence Mayer wrote:
>>
>> Hi again,
>> OS: Ubuntu 8.04 x64
>> Kern: Linux n1 2.6.24-24-server #1 SMP Tue Jul 7 19:39:36 UTC
>> 2009 x86_64 GNU/Linux
>> 10 Node Cluster
>> OCFS2 Version: 1.3.9-0ubuntu1
>> I received this panic on 5 Oct and cannot work out why
>> it has started to happen.
>> Could you please provide directions?
>> Let me know if you require any further details or information.
>> Oct 5 10:21:22 n1 kernel: [1006473.993681]
>> (1387,3):ocfs2_meta_lock_update:1675 ERROR: bug expression:
>> inode->i_generation != le32_to_cpu(fe->i_generation)
>> Oct 5 10:21:22 n1 kernel: [1006473.993756]
>> (1387,3):ocfs2_meta_lock_update:1675 ERROR: Invalid dinode
>> 3064741 disk generation: 1309441612 inode->i_generation:
>> 1309441501
>> Oct 5 10:21:22 n1 kernel: [1006473.993865] ------------[ cut
>> here ]------------
>> Oct 5 10:21:22 n1 kernel: [1006473.993896] kernel BUG at
>> /build/buildd/linux-2.6.24/fs/ocfs2/dlmglue.c:1675!
>> Oct 5 10:21:22 n1 kernel: [1006473.993949] invalid opcode:
>> 0000 [3] SMP
>> Oct 5 10:21:22 n1 kernel: [1006473.993982] CPU 3
>> Oct 5 10:21:22 n1 kernel: [1006473.994008] Modules linked in:
>> ocfs2 crc32c libcrc32c nfsd auth_rpcgss exportfs ipmi_devintf
>> ipmi_si ipmi_msghandler ipv6 ocfs2_dlmfs ocfs2_dlm
>> ocfs2_nodemanager configfs iptable_filter ip_tables x_tables
>> xfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr
>> iscsi_tcp libiscsi scsi_transport_iscsi nfs lockd nfs_acl
>> sunrpc parport_pc lp parport loop serio_raw psmouse i2c_piix4
>> i2c_core dcdbas evdev button k8temp shpchp pci_hotplug pcspkr
>> ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi
>> usbhid hid ehci_hcd tg3 sata_svw pata_serverworks ohci_hcd
>> libata scsi_mod usbcore thermal processor fan fbcon tileblit
>> font bitblit softcursor fuse
>> Oct 5 10:21:22 n1 kernel: [1006473.994445] Pid: 1387, comm: R
>> Tainted: G D 2.6.24-24-server #1
>> Oct 5 10:21:22 n1 kernel: [1006473.994479] RIP:
>> 0010:[<ffffffff8856c404>] [<ffffffff8856c404>]
>> :ocfs2:ocfs2_meta_lock_full+0x6a4/0xec0
>> Oct 5 10:21:22 n1 kernel: [1006473.994558] RSP:
>> 0018:ffff8101238f9d58 EFLAGS: 00010296
>> Oct 5 10:21:22 n1 kernel: [1006473.994590] RAX:
>> 0000000000000093 RBX: ffff8102eaf03000 RCX: 00000000ffffffff
>> Oct 5 10:21:22 n1 kernel: [1006473.994642] RDX:
>> 00000000ffffffff RSI: 0000000000000000 RDI: ffffffff8058ffa4
>> Oct 5 10:21:22 n1 kernel: [1006473.994694] RBP:
>> 0000000100080000 R08: 0000000000000000 R09: 00000000ffffffff
>> Oct 5 10:21:22 n1 kernel: [1006473.994746] R10:
>> 0000000000000000 R11: 0000000000000000 R12: ffff81012599ee00
>> Oct 5 10:21:22 n1 kernel: [1006473.994799] R13:
>> ffff81012599ef08 R14: ffff81012599f2b8 R15: ffff81012599ef08
>> Oct 5 10:21:22 n1 kernel: [1006473.994851] FS:
>> 00002b3802fed670(0000) GS:ffff810418022c80(0000)
>> knlGS:00000000f546bb90
>> Oct 5 10:21:22 n1 kernel: [1006473.994906] CS: 0010 DS: 0000
>> ES: 0000 CR0: 000000008005003b
>> Oct 5 10:21:22 n1 kernel: [1006473.994938] CR2:
>> 00007f5db5542000 CR3: 0000000167ddf000 CR4: 00000000000006e0
>> Oct 5 10:21:22 n1 kernel: [1006473.994990] DR0:
>> 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> Oct 5 10:21:22 n1 kernel: [1006473.995042] DR3:
>> 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Oct 5 10:21:22 n1 kernel: [1006473.995095] Process R (pid:
>> 1387, threadinfo ffff8101238f8000, task ffff8104110cc000)
>> Oct 5 10:21:22 n1 kernel: [1006473.995148] Stack:
>> 000000004e0c7e4c ffff81044e0c7ddd ffff8101a3b4d2b8
>> 00000000802c34c0
>> Oct 5 10:21:22 n1 kernel: [1006473.995212] 0000000000000000
>> 0000000100000000 ffffffff80680c00 00000000804715e2
>> Oct 5 10:21:22 n1 kernel: [1006473.995272] 0000000100000000
>> ffff8101238f9e48 ffff810245558b80 ffff81031e358680
>> Oct 5 10:21:22 n1 kernel: [1006473.995313] Call Trace:
>> Oct 5 10:21:22 n1 kernel: [1006473.995380]
>> [<ffffffff8857d03f>] :ocfs2:ocfs2_inode_revalidate+0x5f/0x290
>> Oct 5 10:21:22 n1 kernel: [1006473.995427]
>> [<ffffffff88577fe6>] :ocfs2:ocfs2_getattr+0x56/0x1c0
>> Oct 5 10:21:22 n1 kernel: [1006473.995470]
>> [vfs_stat_fd+0x46/0x80] vfs_stat_fd+0x46/0x80
>> Oct 5 10:21:22 n1 kernel: [1006473.995514]
>> [<ffffffff88569634>] :ocfs2:ocfs2_meta_unlock+0x1b4/0x210
>> Oct 5 10:21:22 n1 kernel: [1006473.995553]
>> [filldir+0x0/0xf0] filldir+0x0/0xf0
>> Oct 5 10:21:22 n1 kernel: [1006473.995594]
>> [<ffffffff8856799e>] :ocfs2:ocfs2_readdir+0xce/0x230
>> Oct 5 10:21:22 n1 kernel: [1006473.995631]
>> [sys_newstat+0x27/0x50] sys_newstat+0x27/0x50
>> Oct 5 10:21:22 n1 kernel: [1006473.995664]
>> [vfs_readdir+0xa5/0xd0] vfs_readdir+0xa5/0xd0
>> Oct 5 10:21:22 n1 kernel: [1006473.995699]
>> [sys_getdents+0xcf/0xe0] sys_getdents+0xcf/0xe0
>> Oct 5 10:21:22 n1 kernel: [1006473.997568]
>> [system_call+0x7e/0x83] system_call+0x7e/0x83
>> Oct 5 10:21:22 n1 kernel: [1006473.997605]
>> Oct 5 10:21:22 n1 kernel: [1006473.997627]
>> Oct 5 10:21:22 n1 kernel: [1006473.997628] Code: 0f 0b eb fe
>> 83 fd fe 0f 84 73 fc ff ff 81 fd 00 fe ff ff 0f
>> Oct 5 10:21:22 n1 kernel: [1006473.997745] RIP
>> [<ffffffff8856c404>] :ocfs2:ocfs2_meta_lock_full+0x6a4/0xec0
>> Oct 5 10:21:22 n1 kernel: [1006473.997808] RSP
>> <ffff8101238f9d58>
>> Thanks
>> Laurence
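The two generation numbers in the "Invalid dinode" message can be pulled out and compared with a short script. This is a minimal sketch: the regular expression and the helper name `parse_generations` are my own, and the message is assumed to have had the mail client's line wrapping removed first.

```python
import re

# The "Invalid dinode" message from the log, with line wrapping removed.
LINE = (
    "(1387,3):ocfs2_meta_lock_update:1675 ERROR: Invalid dinode "
    "3064741 disk generation: 1309441612 inode->i_generation: 1309441501"
)

# Hypothetical pattern matching the message format seen in this report.
PATTERN = re.compile(
    r"Invalid dinode (?P<blkno>\d+) "
    r"disk generation: (?P<disk>\d+) "
    r"inode->i_generation: (?P<mem>\d+)"
)


def parse_generations(line):
    """Return (blkno, disk_generation, memory_generation), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return int(match["blkno"]), int(match["disk"]), int(match["mem"])


blkno, disk_gen, mem_gen = parse_generations(LINE)
print(f"dinode {blkno}: disk {disk_gen}, in-memory {mem_gen}")
print("disk generation is newer:", disk_gen > mem_gen)
```

For this report the disk generation is the larger of the two, which matches the reading that the inode was deleted and reused.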