[Ocfs2-users] kernel BUG at fs/dlm/lowcomms.c:647!
Joel Becker
Joel.Becker at oracle.com
Wed Oct 20 22:21:41 PDT 2010
On Wed, Oct 20, 2010 at 04:15:15PM +0200, Welterlen Benoit wrote:
> I'm running some tests on OCFS2 with a 2.6.32-100 kernel (Oracle) or
> RHEL6/Fedora, and I get a hang in lowcomms.c, as you can see below.
> I have a crash dump if you need more information. I'm lost and need
> help figuring out where to look to debug this problem.
Whee! Userspace stack on the 2.6.32-100 kernel ;-) We haven't
actually tested this configuration yet; it's not supported officially.
However, it "should" work, just as the userspace stack stuff has worked
for a while. I've forwarded this report on to the fs/dlm maintainer for
pointers to see if we can get you any help.
Joel
> Thanks
>
> Regards,
>
> Benoit
>
>
>
> Kernel 2.6.32-100.0.19.el5 on an x86_64
> chili0 login: ------------[ cut here ]------------
> kernel BUG at fs/dlm/lowcomms.c:647!
> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/kernel/dlm/14E8093BB71D447EBEE691622CF86B9C/control
> CPU 34
> Modules linked in: ocfs2(U) ocfs2_nodemanager(U) nfsd(U) exportfs(U)
> sctp(U) libcrc32c(U) ocfs2_stack_user(U) ocfs2_stackglue(U) dlm(U)
> configfs(U) acpi_cpufreq(U) freq_table(U) ipmi_devintf(U) ipmi_si(U)
> ipmi_msghandler(U) nfs(U) lockd(U) fscache(U) nfs_acl(U) auth_rpcgss(U)
> sunrpc(U) ipv6(U) scsi_dh_emc(U) dm_round_robin(U) dm_multipath(U)
> iTCO_wdt(U) iTCO_vendor_support(U) mlx4_core(U) i2c_i801(U) igb(U)
> pcspkr(U) i2c_core(U) ioatdma(U) dca(U) ahci(U) uhci_hcd(U) ehci_hcd(U)
> lpfc(U) scsi_transport_fc(U) scsi_tgt(U) [last unloaded: ocfs2_nodemanager]
> Pid: 27062, comm: dlm_recv/34 Not tainted 2.6.32-100.0.19.el5 #1 bullx super-node
> RIP: 0010:[<ffffffffa02406c3>] [<ffffffffa02406c3>] receive_from_sock+0x554/0x6ed [dlm]
> RSP: 0018:ffff880c77c6bc60 EFLAGS: 00010246
> RAX: 0000000000000030 RBX: ffff8810774b8d30 RCX: ffff88087c4548f8
> RDX: 0000000000000030 RSI: ffff880876dce000 RDI: ffffffff81398045
> RBP: ffff880c77c6be50 R08: ffff000000000000 R09: ffff880c77c6b900
> R10: ffff880c77c6b8f0 R11: 0000000000000030 R12: 0000000000000030
> R13: ffff8810774b8d20 R14: ffff880c7caa00c0 R15: ffffffffa023ecca
> FS: 0000000000000000(0000) GS:ffff88048e600000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000fcb078 CR3: 0000000001001000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process dlm_recv/34 (pid: 27062, threadinfo ffff880c77c6a000, task ffff880c7caa00c0)
> Stack:
> ffff880c77c6bc70 ffffffff8122fa24 ffff880c77c6bc90 ffffffff8122faca
> <0> ffff88048e414ec0 0000100000000002 0000000000000000 ffffffff00000000
> <0> 0000000000000000 0000000000000000 ffffffffa024bb20 0000000000000030
> Call Trace:
> [<ffffffff8122fa24>] ? cpumask_next+0x19/0x1b
> [<ffffffff8122faca>] ? cpumask_next_and+0x20/0x32
> [<ffffffffa023ecca>] ? process_recv_sockets+0x0/0x28 [dlm]
> [<ffffffffa023ecea>] process_recv_sockets+0x20/0x28 [dlm]
> [<ffffffff81071802>] worker_thread+0x14d/0x1ed
> [<ffffffff81075a7c>] ? autoremove_wake_function+0x0/0x3d
> [<ffffffff810716b5>] ? worker_thread+0x0/0x1ed
> [<ffffffff810756d3>] kthread+0x6e/0x76
> [<ffffffff81012dea>] child_rip+0xa/0x20
> [<ffffffff81075665>] ? kthread+0x0/0x76
> [<ffffffff81012de0>] ? child_rip+0x0/0x20
> Code: 29 e7 ff ff e9 2d 01 00 00 41 8b 74 24 10 0f b7 d0 48 c7 c7 d1 8c 24 a0 31 c0 e8 ab 71 e1 e0 e9 12 01 00 00 41 83 7d 08 00 75 04 <0f> 0b eb fe 4d 8d 7d 68 49 be 00 00 00 00 00 16 00 00 41 8b 55
> RIP [<ffffffffa02406c3>] receive_from_sock+0x554/0x6ed [dlm]
> RSP <ffff880c77c6bc60>
> Initializing cgroup subsys cpuset
> Initializing cgroup subsys cpu
> Linux version 2.6.32-100.0.19.el5 (mockbuild@ca-build9.us.oracle.com)
> (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Fri Sep 17
> 17:51:41 EDT 2010
> Command line: ro root=/dev/mapper/vg_chili0-lv_root
> rd_LVM_LV=vg_chili0/lv_root rd_LVM_LV=vg_chili0/lv_swap rd_NO_LUKS
> rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16
> KEYBOARDTYPE=pc KEYTABLE=fr-pc cgroup_disable=memory selinux=0
> pcie_aspm=off nmi_watchdog=0 console=ttyS1,115200 maxcpus=1
> reset_devices memmap=exactmap memmap=640K@0K memmap=195948K@33408K
> elfcorehdr=229356K memmap=308K#1993940K memmap=16K#2077704K
> memmap=4K#2077748K memmap=4K#2077764K memmap=44K#2077768K
> memmap=72K#2077812K memmap=4K#2077884K memmap=4K#2077888K
> memmap=4K#2077892K memmap=4K#2078024K memmap=2716K#2078052K
> memmap=1024K#69204860K memmap=128K#69205884K
> KERNEL supported cpus:
> Intel GenuineIntel
> AMD AuthenticAMD
> Centaur CentaurHauls
> BIOS-provided physical RAM map:
>
> From the dump:
> GNU gdb (GDB) 7.0
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
>
> KERNEL: /usr/lib/debug/lib/modules/2.6.32-100.0.19.el5/vmlinux
> DUMPFILE: /var/var/crash/127.0.0.1-2010-10-18-16:42:07/vmcore
> [PARTIAL DUMP]
> CPUS: 64
> DATE: Mon Oct 18 16:41:48 2010
> UPTIME: 00:15:00
> LOAD AVERAGE: 1.06, 1.22, 1.65
> TASKS: 1594
> NODENAME: chili0
> RELEASE: 2.6.32-100.0.19.el5
> VERSION: #1 SMP Fri Sep 17 17:51:41 EDT 2010
> MACHINE: x86_64 (1999 Mhz)
> MEMORY: 64 GB
> PANIC: "kernel BUG at fs/dlm/lowcomms.c:647!"
> PID: 27062
> COMMAND: "dlm_recv/34"
> TASK: ffff880c7caa00c0 [THREAD_INFO: ffff880c77c6a000]
> CPU: 34
> STATE: TASK_RUNNING (PANIC)
>
> crash> bt
> PID: 27062 TASK: ffff880c7caa00c0 CPU: 34 COMMAND: "dlm_recv/34"
> #0 [ffff880c77c6b910] machine_kexec at ffffffff8102cc9b
> #1 [ffff880c77c6b990] crash_kexec at ffffffff810964d4
> #2 [ffff880c77c6ba60] oops_end at ffffffff81439bd9
> #3 [ffff880c77c6ba90] die at ffffffff81015639
> #4 [ffff880c77c6bac0] do_trap at ffffffff8143952c
> #5 [ffff880c77c6bb10] do_invalid_op at ffffffff81013902
> #6 [ffff880c77c6bbb0] invalid_op at ffffffff81012b7b
> [exception RIP: receive_from_sock+1364]
> RIP: ffffffffa02406c3 RSP: ffff880c77c6bc60 RFLAGS: 00010246
> RAX: 0000000000000030 RBX: ffff8810774b8d30 RCX: ffff88087c4548f8
> RDX: 0000000000000030 RSI: ffff880876dce000 RDI: ffffffff81398045
> RBP: ffff880c77c6be50 R8: ffff000000000000 R9: ffff880c77c6b900
> R10: ffff880c77c6b8f0 R11: 0000000000000030 R12: 0000000000000030
> R13: ffff8810774b8d20 R14: ffff880c7caa00c0 R15: ffffffffa023ecca
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #7 [ffff880c77c6be58] process_recv_sockets at ffffffffa023ecea
> #8 [ffff880c77c6be78] worker_thread at ffffffff81071802
> #9 [ffff880c77c6bee8] kthread at ffffffff810756d3
> #10 [ffff880c77c6bf48] kernel_thread at ffffffff81012dea
--
"Every new beginning comes from some other beginning's end."
Joel Becker
Senior Development Manager
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127