[Ocfs2-users] Panic

Sunil Mushran sunil.mushran at oracle.com
Wed Oct 7 11:16:57 PDT 2009


And does the node exporting the volume encounter the oops?

If so, the likeliest candidate would be:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=6ca497a83e592d64e050c4d04b6dedb8c915f39a

If it is on another node, I am currently unsure whether an nfs
export on one node could cause this to occur on another. Need more
coffee.

The problem, in short, is due to how nfs bypasses the normal fs lookup
to access files. It uses the file handle to access the inode directly,
bypassing the cluster locking. Normally that is not a problem. The race
window opens if the file is deleted (on any node in the cluster) and nfs
reads that inode without the lock. In the oops we see that the disk
generation is greater than the in-memory inode generation, which means
the inode was deleted and reused. The fix closes the race window.
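
For illustration only, here is a small user-space sketch of the
generation check described above. This is not the ocfs2 kernel code and
not the actual fix (see the commit linked above for that); the struct
layouts and the lookup_inode()/decode_fh() helpers are made up, and the
generation numbers are simply the two values from the oops below. The
point is that an nfs file handle carries the inode number plus the
generation it was issued with, so a deleted-and-reused inode can be
detected and answered with ESTALE instead of tripping a BUG_ON.

#include <errno.h>
#include <stdint.h>
#include <stdio.h>

struct fake_inode {
	uint64_t ino;
	uint32_t generation;	/* bumped each time the inode number is reused */
};

struct fake_fh {
	uint64_t ino;
	uint32_t generation;	/* generation at the time the handle was issued */
};

/* Pretend on-disk state: inode 3064741 was deleted and reused, so its
 * generation moved on from 1309441501 to 1309441612 (the two values
 * reported in the oops). */
static struct fake_inode disk_inode = { 3064741, 1309441612u };

static struct fake_inode *lookup_inode(uint64_t ino)
{
	return ino == disk_inode.ino ? &disk_inode : NULL;
}

/* Decode a handle the way an exportfs-style hook would: refuse stale
 * handles instead of asserting like the old code path did. */
static int decode_fh(const struct fake_fh *fh)
{
	struct fake_inode *inode = lookup_inode(fh->ino);

	if (!inode)
		return -ESTALE;		/* inode is gone entirely */
	if (inode->generation != fh->generation)
		return -ESTALE;		/* inode was deleted and reused */
	return 0;			/* handle is still valid */
}

int main(void)
{
	struct fake_fh old_fh = { 3064741, 1309441501u };	/* pre-delete handle */

	if (decode_fh(&old_fh) == -ESTALE)
		printf("stale file handle: return ESTALE to the nfs client\n");
	return 0;
}

In the kernel the equivalent validation belongs in the ocfs2 nfs export
path (the access path mentioned above); the sketch only shows the
principle of the check, not where or how the fix performs it.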

Sunil

Laurence Mayer wrote:
> Yes.
> We have set up a 10-node cluster, with one of the nodes exporting the NFS 
> volume to the workstations.
>  
> Could you please expand on your answer?
>  
> Thanks
> Laurence
>
>
>  
> On Wed, Oct 7, 2009 at 7:12 PM, Sunil Mushran 
> <sunil.mushran at oracle.com> wrote:
>
>     Are you exporting this volume via nfs? We fixed a small race (in
>     the nfs
>     access path) that could lead to this oops.
>
>     Laurence Mayer wrote:
>
>         Hi again,
>          OS: Ubuntu 8.04 x64
>         Kern: Linux n1 2.6.24-24-server #1 SMP Tue Jul 7 19:39:36 UTC
>         2009 x86_64 GNU/Linux
>         10 Node Cluster
>         OCFS2 Version:  1.3.9-0ubuntu1
>          I received this panic on the 5th of Oct and cannot work out why
>         this has started to happen.
>         Please can you provide some direction.
>         Let me know if you require any further details or information.
>          Oct  5 10:21:22 n1 kernel: [1006473.993681]
>         (1387,3):ocfs2_meta_lock_update:1675 ERROR: bug expression:
>         inode->i_generation != le32_to_cpu(fe->i_generation)
>         Oct  5 10:21:22 n1 kernel: [1006473.993756]
>         (1387,3):ocfs2_meta_lock_update:1675 ERROR: Invalid dinode
>         3064741 disk generation: 1309441612 inode->i_generation: 1309441501
>         Oct  5 10:21:22 n1 kernel: [1006473.993865] ------------[ cut
>         here ]------------
>         Oct  5 10:21:22 n1 kernel: [1006473.993896] kernel BUG at
>         /build/buildd/linux-2.6.24/fs/ocfs2/dlmglue.c:1675!
>         Oct  5 10:21:22 n1 kernel: [1006473.993949] invalid opcode:
>         0000 [3] SMP
>         Oct  5 10:21:22 n1 kernel: [1006473.993982] CPU 3
>         Oct  5 10:21:22 n1 kernel: [1006473.994008] Modules linked in:
>         ocfs2 crc32c libcrc32c nfsd auth_rpcgss exportfs ipmi_devintf
>         ipmi_si ipmi_msghandler ipv6 ocfs2_dlmfs ocfs2_dlm
>         ocfs2_nodemanager configfs iptable_filter ip_tables x_tables
>         xfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr
>         iscsi_tcp libiscsi scsi_transport_iscsi nfs lockd nfs_acl
>         sunrpc parport_pc lp parport loop serio_raw psmouse i2c_piix4
>         i2c_core dcdbas evdev button k8temp shpchp pci_hotplug pcspkr
>         ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi
>         usbhid hid ehci_hcd tg3 sata_svw pata_serverworks ohci_hcd
>         libata scsi_mod usbcore thermal processor fan fbcon tileblit
>         font bitblit softcursor fuse
>         Oct  5 10:21:22 n1 kernel: [1006473.994445] Pid: 1387, comm: R
>         Tainted: G      D 2.6.24-24-server #1
>         Oct  5 10:21:22 n1 kernel: [1006473.994479] RIP:
>         0010:[<ffffffff8856c404>]  [<ffffffff8856c404>]
>         :ocfs2:ocfs2_meta_lock_full+0x6a4/0xec0
>         Oct  5 10:21:22 n1 kernel: [1006473.994558] RSP:
>         0018:ffff8101238f9d58  EFLAGS: 00010296
>         Oct  5 10:21:22 n1 kernel: [1006473.994590] RAX:
>         0000000000000093 RBX: ffff8102eaf03000 RCX: 00000000ffffffff
>         Oct  5 10:21:22 n1 kernel: [1006473.994642] RDX:
>         00000000ffffffff RSI: 0000000000000000 RDI: ffffffff8058ffa4
>         Oct  5 10:21:22 n1 kernel: [1006473.994694] RBP:
>         0000000100080000 R08: 0000000000000000 R09: 00000000ffffffff
>         Oct  5 10:21:22 n1 kernel: [1006473.994746] R10:
>         0000000000000000 R11: 0000000000000000 R12: ffff81012599ee00
>         Oct  5 10:21:22 n1 kernel: [1006473.994799] R13:
>         ffff81012599ef08 R14: ffff81012599f2b8 R15: ffff81012599ef08
>         Oct  5 10:21:22 n1 kernel: [1006473.994851] FS:
>          00002b3802fed670(0000) GS:ffff810418022c80(0000)
>         knlGS:00000000f546bb90
>         Oct  5 10:21:22 n1 kernel: [1006473.994906] CS:  0010 DS: 0000
>         ES: 0000 CR0: 000000008005003b
>         Oct  5 10:21:22 n1 kernel: [1006473.994938] CR2:
>         00007f5db5542000 CR3: 0000000167ddf000 CR4: 00000000000006e0
>         Oct  5 10:21:22 n1 kernel: [1006473.994990] DR0:
>         0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>         Oct  5 10:21:22 n1 kernel: [1006473.995042] DR3:
>         0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>         Oct  5 10:21:22 n1 kernel: [1006473.995095] Process R (pid:
>         1387, threadinfo ffff8101238f8000, task ffff8104110cc000)
>         Oct  5 10:21:22 n1 kernel: [1006473.995148] Stack:
>          000000004e0c7e4c ffff81044e0c7ddd ffff8101a3b4d2b8
>         00000000802c34c0
>         Oct  5 10:21:22 n1 kernel: [1006473.995212]  0000000000000000
>         0000000100000000 ffffffff80680c00 00000000804715e2
>         Oct  5 10:21:22 n1 kernel: [1006473.995272]  0000000100000000
>         ffff8101238f9e48 ffff810245558b80 ffff81031e358680
>         Oct  5 10:21:22 n1 kernel: [1006473.995313] Call Trace:
>         Oct  5 10:21:22 n1 kernel: [1006473.995380]
>          [<ffffffff8857d03f>] :ocfs2:ocfs2_inode_revalidate+0x5f/0x290
>         Oct  5 10:21:22 n1 kernel: [1006473.995427]
>          [<ffffffff88577fe6>] :ocfs2:ocfs2_getattr+0x56/0x1c0
>         Oct  5 10:21:22 n1 kernel: [1006473.995470]
>          [vfs_stat_fd+0x46/0x80] vfs_stat_fd+0x46/0x80
>         Oct  5 10:21:22 n1 kernel: [1006473.995514]
>          [<ffffffff88569634>] :ocfs2:ocfs2_meta_unlock+0x1b4/0x210
>         Oct  5 10:21:22 n1 kernel: [1006473.995553]
>          [filldir+0x0/0xf0] filldir+0x0/0xf0
>         Oct  5 10:21:22 n1 kernel: [1006473.995594]
>          [<ffffffff8856799e>] :ocfs2:ocfs2_readdir+0xce/0x230
>         Oct  5 10:21:22 n1 kernel: [1006473.995631]
>          [sys_newstat+0x27/0x50] sys_newstat+0x27/0x50
>         Oct  5 10:21:22 n1 kernel: [1006473.995664]
>          [vfs_readdir+0xa5/0xd0] vfs_readdir+0xa5/0xd0
>         Oct  5 10:21:22 n1 kernel: [1006473.995699]
>          [sys_getdents+0xcf/0xe0] sys_getdents+0xcf/0xe0
>         Oct  5 10:21:22 n1 kernel: [1006473.997568]
>          [system_call+0x7e/0x83] system_call+0x7e/0x83
>         Oct  5 10:21:22 n1 kernel: [1006473.997605]
>         Oct  5 10:21:22 n1 kernel: [1006473.997627]
>         Oct  5 10:21:22 n1 kernel: [1006473.997628] Code: 0f 0b eb fe
>         83 fd fe 0f 84 73 fc ff ff 81 fd 00 fe ff ff 0f
>         Oct  5 10:21:22 n1 kernel: [1006473.997745] RIP
>          [<ffffffff8856c404>] :ocfs2:ocfs2_meta_lock_full+0x6a4/0xec0
>         Oct  5 10:21:22 n1 kernel: [1006473.997808]  RSP
>         <ffff8101238f9d58>
>           Thanks
>         Laurence
>         ------------------------------------------------------------------------
>
>         _______________________________________________
>         Ocfs2-users mailing list
>         Ocfs2-users at oss.oracle.com
>         http://oss.oracle.com/mailman/listinfo/ocfs2-users
>
>
>



