[Ocfs2-users] kernel BUG at fs/dlm/lowcomms.c:647!
Welterlen Benoit
Benoit.Welterlen at bull.net
Wed Oct 20 07:15:15 PDT 2010
Hi all,
I'm running some tests on OCFS2 with a 2.6.32-100 kernel (Oracle) or
RHEL6/Fedora, and I get a hang with a kernel BUG in lowcomms.c, as
shown below.
I have a crash dump if you need more information. I don't know where to
start looking to debug this problem, so any pointers would help.
Thanks
Regards,
Benoit
Kernel 2.6.32-100.0.19.el5 on an x86_64
chili0 login: ------------[ cut here ]------------
kernel BUG at fs/dlm/lowcomms.c:647!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/dlm/14E8093BB71D447EBEE691622CF86B9C/control
CPU 34
Modules linked in: ocfs2(U) ocfs2_nodemanager(U) nfsd(U) exportfs(U)
sctp(U) libcrc32c(U) ocfs2_stack_user(U) ocfs2_stackglue(U) dlm(U)
configfs(U) acpi_cpufreq(U) freq_table(U) ipmi_devintf(U) ipmi_si(U)
ipmi_msghandler(U) nfs(U) lockd(U) fscache(U) nfs_acl(U) auth_rpcgss(U)
sunrpc(U) ipv6(U) scsi_dh_emc(U) dm_round_robin(U) dm_multipath(U)
iTCO_wdt(U) iTCO_vendor_support(U) mlx4_core(U) i2c_i801(U) igb(U)
pcspkr(U) i2c_core(U) ioatdma(U) dca(U) ahci(U) uhci_hcd(U) ehci_hcd(U)
lpfc(U) scsi_transport_fc(U) scsi_tgt(U) [last unloaded: ocfs2_nodemanager]
Pid: 27062, comm: dlm_recv/34 Not tainted 2.6.32-100.0.19.el5 #1 bullx super-node
RIP: 0010:[<ffffffffa02406c3>] [<ffffffffa02406c3>] receive_from_sock+0x554/0x6ed [dlm]
RSP: 0018:ffff880c77c6bc60 EFLAGS: 00010246
RAX: 0000000000000030 RBX: ffff8810774b8d30 RCX: ffff88087c4548f8
RDX: 0000000000000030 RSI: ffff880876dce000 RDI: ffffffff81398045
RBP: ffff880c77c6be50 R08: ffff000000000000 R09: ffff880c77c6b900
R10: ffff880c77c6b8f0 R11: 0000000000000030 R12: 0000000000000030
R13: ffff8810774b8d20 R14: ffff880c7caa00c0 R15: ffffffffa023ecca
FS: 0000000000000000(0000) GS:ffff88048e600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000fcb078 CR3: 0000000001001000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process dlm_recv/34 (pid: 27062, threadinfo ffff880c77c6a000, task ffff880c7caa00c0)
Stack:
ffff880c77c6bc70 ffffffff8122fa24 ffff880c77c6bc90 ffffffff8122faca
<0> ffff88048e414ec0 0000100000000002 0000000000000000 ffffffff00000000
<0> 0000000000000000 0000000000000000 ffffffffa024bb20 0000000000000030
Call Trace:
[<ffffffff8122fa24>] ? cpumask_next+0x19/0x1b
[<ffffffff8122faca>] ? cpumask_next_and+0x20/0x32
[<ffffffffa023ecca>] ? process_recv_sockets+0x0/0x28 [dlm]
[<ffffffffa023ecea>] process_recv_sockets+0x20/0x28 [dlm]
[<ffffffff81071802>] worker_thread+0x14d/0x1ed
[<ffffffff81075a7c>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff810716b5>] ? worker_thread+0x0/0x1ed
[<ffffffff810756d3>] kthread+0x6e/0x76
[<ffffffff81012dea>] child_rip+0xa/0x20
[<ffffffff81075665>] ? kthread+0x0/0x76
[<ffffffff81012de0>] ? child_rip+0x0/0x20
Code: 29 e7 ff ff e9 2d 01 00 00 41 8b 74 24 10 0f b7 d0 48 c7 c7 d1 8c 24 a0 31 c0 e8 ab 71 e1 e0 e9 12 01 00 00 41 83 7d 08 00 75 04 <0f> 0b eb fe 4d 8d 7d 68 49 be 00 00 00 00 00 16 00 00 41 8b 55
RIP [<ffffffffa02406c3>] receive_from_sock+0x554/0x6ed [dlm]
RSP <ffff880c77c6bc60>
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-100.0.19.el5 (mockbuild@ca-build9.us.oracle.com)
(gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Fri Sep 17
17:51:41 EDT 2010
Command line: ro root=/dev/mapper/vg_chili0-lv_root
rd_LVM_LV=vg_chili0/lv_root rd_LVM_LV=vg_chili0/lv_swap rd_NO_LUKS
rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16
KEYBOARDTYPE=pc KEYTABLE=fr-pc cgroup_disable=memory selinux=0
pcie_aspm=off nmi_watchdog=0 console=ttyS1,115200 maxcpus=1
reset_devices memmap=exactmap memmap=640K@0K memmap=195948K@33408K
elfcorehdr=229356K memmap=308K#1993940K memmap=16K#2077704K
memmap=4K#2077748K memmap=4K#2077764K memmap=44K#2077768K
memmap=72K#2077812K memmap=4K#2077884K memmap=4K#2077888K
memmap=4K#2077892K memmap=4K#2078024K memmap=2716K#2078052K
memmap=1024K#69204860K memmap=128K#69205884K
KERNEL supported cpus:
Intel GenuineIntel
AMD AuthenticAMD
Centaur CentaurHauls
BIOS-provided physical RAM map:
From the crash dump:
GNU gdb (GDB) 7.0
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
KERNEL: /usr/lib/debug/lib/modules/2.6.32-100.0.19.el5/vmlinux
DUMPFILE: /var/var/crash/127.0.0.1-2010-10-18-16:42:07/vmcore
[PARTIAL DUMP]
CPUS: 64
DATE: Mon Oct 18 16:41:48 2010
UPTIME: 00:15:00
LOAD AVERAGE: 1.06, 1.22, 1.65
TASKS: 1594
NODENAME: chili0
RELEASE: 2.6.32-100.0.19.el5
VERSION: #1 SMP Fri Sep 17 17:51:41 EDT 2010
MACHINE: x86_64 (1999 Mhz)
MEMORY: 64 GB
PANIC: "kernel BUG at fs/dlm/lowcomms.c:647!"
PID: 27062
COMMAND: "dlm_recv/34"
TASK: ffff880c7caa00c0 [THREAD_INFO: ffff880c77c6a000]
CPU: 34
STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 27062 TASK: ffff880c7caa00c0 CPU: 34 COMMAND: "dlm_recv/34"
#0 [ffff880c77c6b910] machine_kexec at ffffffff8102cc9b
#1 [ffff880c77c6b990] crash_kexec at ffffffff810964d4
#2 [ffff880c77c6ba60] oops_end at ffffffff81439bd9
#3 [ffff880c77c6ba90] die at ffffffff81015639
#4 [ffff880c77c6bac0] do_trap at ffffffff8143952c
#5 [ffff880c77c6bb10] do_invalid_op at ffffffff81013902
#6 [ffff880c77c6bbb0] invalid_op at ffffffff81012b7b
[exception RIP: receive_from_sock+1364]
RIP: ffffffffa02406c3 RSP: ffff880c77c6bc60 RFLAGS: 00010246
RAX: 0000000000000030 RBX: ffff8810774b8d30 RCX: ffff88087c4548f8
RDX: 0000000000000030 RSI: ffff880876dce000 RDI: ffffffff81398045
RBP: ffff880c77c6be50 R8: ffff000000000000 R9: ffff880c77c6b900
R10: ffff880c77c6b8f0 R11: 0000000000000030 R12: 0000000000000030
R13: ffff8810774b8d20 R14: ffff880c7caa00c0 R15: ffffffffa023ecca
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff880c77c6be58] process_recv_sockets at ffffffffa023ecea
#8 [ffff880c77c6be78] worker_thread at ffffffff81071802
#9 [ffff880c77c6bee8] kthread at ffffffff810756d3
#10 [ffff880c77c6bf48] kernel_thread at ffffffff81012dea
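
As a cross-check, the faulting location in the oops (receive_from_sock+0x554/0x6ed) and the one printed by crash (receive_from_sock+1364) are the same offset, once in hex and once in decimal. Below is a minimal Python sketch that splits such a frame into symbol, offset, function size and module. The helper name parse_frame and the regular expression are illustrative assumptions, not part of any kernel or crash tooling.

```python
import re

# Matches oops frames such as "receive_from_sock+0x554/0x6ed [dlm]".
# The trailing "[module]" part is optional (absent for built-in symbols).
FRAME_RE = re.compile(
    r"(?P<sym>\w+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)

def parse_frame(text):
    """Hypothetical helper: return the parts of the first frame in text, or None."""
    m = FRAME_RE.search(text)
    if m is None:
        return None
    return {
        "symbol": m.group("sym"),
        "offset": int(m.group("off"), 16),   # bytes from start of function
        "size": int(m.group("size"), 16),    # total size of the function
        "module": m.group("mod"),            # None for core kernel symbols
    }

frame = parse_frame("RIP [<ffffffffa02406c3>] receive_from_sock+0x554/0x6ed [dlm]")
print(frame["symbol"], frame["offset"], frame["module"])
```

The decimal offset printed here can be compared directly with the "[exception RIP: receive_from_sock+1364]" line in the bt output above.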