[Ocfs2-users] kernel BUG at fs/dlm/lowcomms.c:647!

Sunil Mushran sunil.mushran at oracle.com
Mon Nov 22 08:33:09 PST 2010


What does the stack of the [ocfs2rec] thread look like?
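
Something like this should grab it (assuming the recovery thread shows
up in ps as "ocfs2rec", as in your bracketed name above):

  for p in $(pgrep ocfs2rec); do echo "pid $p:"; cat /proc/$p/stack; done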

On 11/22/2010 07:31 AM, Welterlen Benoit wrote:
> Hi,
>
> I have a new question ;-)
> Our cluster is now running with SCTP and the pcmk stack.
> When a node is lost or its eth link goes down, pcmk fences the other
> node, a recovery is performed, and all pending operations complete
> correctly.
>
> Now we are trying to use two ethernet links.
> When one link is down, pacemaker doesn't fence the other node, which is
> expected. But all pending operations on OCFS2 directories are frozen:
>
> [root at chili0 ~]# cat /proc/25695/stack
> [<ffffffffa0425f65>] ocfs2_wait_for_recovery+0x85/0xd0 [ocfs2]
> [<ffffffffa041883c>] ocfs2_inode_lock_full_nested+0x19c/0x920 [ocfs2]
> [<ffffffffa0422566>] ocfs2_inode_revalidate+0x56/0x110 [ocfs2]
> [<ffffffffa041d70b>] ocfs2_getattr+0x3b/0x100 [ocfs2]
> [<ffffffff8115cac1>] vfs_getattr+0x51/0x80
> [<ffffffff8115cb50>] vfs_fstatat+0x60/0x80
> [<ffffffff8115cbde>] vfs_lstat+0x1e/0x20
> [<ffffffff8115cc04>] sys_newlstat+0x24/0x50
> [<ffffffff8100c172>] system_call_fastpath+0x16/0x1b
>
> It's the same whether we shut down eth0 or eth1, so I think OCFS2 is
> using both interfaces but can't handle the loss of one link.
> Maybe pcmk should proceed as it does just after a fence and ask OCFS2
> for a recovery.
> Should OCFS2 work with 2 links and tolerate the loss of one?
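>
> For reference, our totem section looks roughly like this (addresses
> and ports are placeholders, not our real values):
>
>   totem {
>     version: 2
>     rrp_mode: passive
>     interface {
>       ringnumber: 0
>       bindnetaddr: 192.168.1.0
>       mcastaddr: 239.255.1.1
>       mcastport: 5405
>     }
>     interface {
>       ringnumber: 1
>       bindnetaddr: 192.168.2.0
>       mcastaddr: 239.255.2.1
>       mcastport: 5405
>     }
>   }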
>
> Thanks
>
> Regards,
>
> Benoit
>
>
> On 08/11/2010 17:06, Welterlen Benoit wrote:
>> Hi,
>>
>> Thanks for your answer.
>> I tried to force TCP instead of SCTP.
>> There is no crash anymore, but OCFS2 is unavailable when I stop one
>> node; the OCFS2 directory is not accessible:
>>
>> ls /BCM/conf
>> [root at chili0 ~]# cat /proc/23921/stack
>> [<ffffffffa02c5d53>] ocfs2_wait_for_recovery+0x77/0x8f [ocfs2]
>> [<ffffffffa02b08a8>] ocfs2_inode_lock_full_nested+0x160/0xb8d [ocfs2]
>> [<ffffffffa02c35e2>] ocfs2_inode_revalidate+0x163/0x25c [ocfs2]
>> [<ffffffffa02bd9f4>] ocfs2_getattr+0x8b/0x19c [ocfs2]
>> [<ffffffff8111c30f>] vfs_getattr+0x4c/0x69
>> [<ffffffff8111c37c>] vfs_fstatat+0x50/0x67
>> [<ffffffff8111c479>] vfs_stat+0x1b/0x1d
>> [<ffffffff8111c49a>] sys_newstat+0x1f/0x39
>> [<ffffffff81011db2>] system_call_fastpath+0x16/0x1b
>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> Any access to the filesystem hangs.
>>
>> Regards,
>>
>> Benoit
>>
>> On 21/10/2010 19:13, David Teigland wrote:
>>
>>> On Thu, Oct 21, 2010 at 06:11:36PM +0200, Welterlen Benoit wrote:
>>>
>>>
>>>> Hi,
>>>>
>>>> Thanks for your answer.
>>>> SCTP is selected automatically ("dlm: Using SCTP for
>>>> communications") and I see no option to change that.
>>>> /sys/kernel/config/dlm/cluster/protocol is set to 1, and the file
>>>> only appears once the service is started; can I still modify it at
>>>> that point?
>>>>
>>>>
>>> Yes, there's a dlm_controld command line option (or cluster.conf, which
>>> I don't suppose you're using with pacemaker).  You can set that to get
>>> TCP, but that also defeats corosync redundant ring (which is what
>>> triggers the auto-selection of SCTP).
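>>>
>>> If you want to try it via configfs directly, something like this may
>>> work; untested, and it has to happen before the first lockspace is
>>> created (0 should mean TCP, 1 SCTP, matching the value you saw):
>>>
>>>   echo 0 > /sys/kernel/config/dlm/cluster/protocol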
>>>
>>>
>>>
>>>> dlm: closing connection to node 33554699
>>>>
>>>>
>>>>
>>>>> Here, OCFS2 is down on M2, then restarts:
>>>>>
>>>>>
>>>> dlm: 77410678764B4782BDAE3E888E0C8C4D: recover 3
>>>> dlm: 77410678764B4782BDAE3E888E0C8C4D: add member 33554699
>>>> cm->sockaddr_storage  ffff8804796268a0
>>>> dlm_nodeid_to_addr -EEXIST;
>>>> dlm: no address for nodeid 33554699
>>>>
>>>>
>>> I suspect the pacemaker version of dlm_controld is doing an unusual
>>> sequence of node addition/removal.  The stuff above eventually leads to
>>> the oops below:
>>>
>>>
>>>
>>>> sctp_packet_config: packet:ffff88107cf36598 vtag:0x2b186924
>>>> sctp_packet_config: packet:ffff880c8e6037d0 vtag:0x2b186924
>>>> sctp_packet_config: packet:ffff88107cf36598 vtag:0x2b186924
>>>> process_sctp_notification SCTP_RESTART
>>>> get_comm  nodeid : 0, sockaddr_storage : ffff88047a357c54
>>>> get_comm cm->addr_count : 0000000000000001, cm->addr[0] : ffff880479626740
>>>> addr : ffff88047a357c54
>>>> dlm: reject connect from unknown addr
>>>> 02 00 00 00 0b 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> in __nodeid2con , nodeid : 0
>>>> sctp_packet_config: packet:ffff88107cf36598 vtag:0x2b186924
>>>> ------------[ cut here ]------------
>>>> kernel BUG at fs/dlm/lowcomms.c:661!
>>>> invalid opcode: 0000 [#1] SMP
>>>> last sysfs file: /sys/kernel/dlm/77410678764B4782BDAE3E888E0C8C4D/control
>>>> CPU 35
>>>> Modules linked in: sctp(U) libcrc32c(U) ocfs2(U)
>>>> ocfs2_nodemanager(U) ocfs2_stack_user(U) ocfs2_stackglue(U) dlm(U)
>>>> configfs(U) acpi_cpufreq(U) freq_table(U) ipmi_devintf(U) ipmi_si(U)
>>>> ipmi_msghandler(U) nfs(U) lockd(U) fscache(U) nfs_acl(U)
>>>> auth_rpcgss(U) ext3(U) jbd(U) sunrpc(U) ipv6(U) dm_mirror(U)
>>>> dm_region_hash(U) dm_log(U) scsi_dh_emc(U) dm_round_robin(U)
>>>> dm_multipath(U) ioatdma(U) i7core_edac(U) igb(U) edac_core(U) sg(U)
>>>> i2c_i801(U) i2c_core(U) iTCO_wdt(U) dca(U) iTCO_vendor_support(U)
>>>> ext4(U) mbcache(U) jbd2(U) usbhid(U) hid(U) sd_mod(U) crc_t10dif(U)
>>>> lpfc(U) ahci(U) scsi_transport_fc(U) ehci_hcd(U) uhci_hcd(U)
>>>> scsi_tgt(U) dm_mod(U) [last unloaded: ipmi_msghandler]
>>>>
>>>> Modules linked in: sctp(U) libcrc32c(U) ocfs2(U)
>>>> ocfs2_nodemanager(U) ocfs2_stack_user(U) ocfs2_stackglue(U) dlm(U)
>>>> configfs(U) acpi_cpufreq(U) freq_table(U) ipmi_devintf(U) ipmi_si(U)
>>>> ipmi_msghandler(U) nfs(U) lockd(U) fscache(U) nfs_acl(U)
>>>> auth_rpcgss(U) ext3(U) jbd(U) sunrpc(U) ipv6(U) dm_mirror(U)
>>>> dm_region_hash(U) dm_log(U) scsi_dh_emc(U) dm_round_robin(U)
>>>> dm_multipath(U) ioatdma(U) i7core_edac(U) igb(U) edac_core(U) sg(U)
>>>> i2c_i801(U) i2c_core(U) iTCO_wdt(U) dca(U) iTCO_vendor_support(U)
>>>> ext4(U) mbcache(U) jbd2(U) usbhid(U) hid(U) sd_mod(U) crc_t10dif(U)
>>>> lpfc(U) ahci(U) scsi_transport_fc(U) ehci_hcd(U) uhci_hcd(U)
>>>> scsi_tgt(U) dm_mod(U) [last unloaded: ipmi_msghandler]
>>>> Pid: 4306, comm: dlm_recv/35 Not tainted
>>>> 2.6.32-30.el6.Bull.14.x86_64 #2 bullx super-node
>>>> RIP: 0010:[<ffffffffa039edeb>]  [<ffffffffa039edeb>]
>>>> receive_from_sock+0x38b/0x430 [dlm]
>>>> RSP: 0018:ffff88047a357d10  EFLAGS: 00010246
>>>> RAX: 0000000000000095 RBX: ffff88087c5aad20 RCX: 0000000000001b62
>>>> RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
>>>> RBP: ffff88047a357e10 R08: 00000000ffffffff R09: 0000000000000000
>>>> R10: ffff881079423480 R11: 0000000000000000 R12: ffff88047a357da0
>>>> R13: 0000000000000030 R14: ffff88087c5aad30 R15: ffff88047a357d40
>>>> FS:  0000000000000000(0000) GS:ffff880c8e600000(0000) knlGS:0000000000000000
>>>> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>>>> CR2: 000000359bea6df0 CR3: 0000000001001000 CR4: 00000000000006e0
>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>> Process dlm_recv/35 (pid: 4306, threadinfo ffff88047a356000, task
>>>> ffff88047aeaf2e0)
>>>> Stack:
>>>>     ffff88047a357db0 0000000000000246 ffff880c7da288c0 0000000000000246
>>>> <0>    ffff88047a357d70 0000100081049472 0000000000000000 0000000000000000
>>>> <0>    ffff88047a357d80 0000000000000002 ffff88047a357dd0 0000000000000000
>>>> Call Trace:
>>>>     [<ffffffffa039dd70>] ? process_recv_sockets+0x0/0x50 [dlm]
>>>>     [<ffffffffa039dda6>] process_recv_sockets+0x36/0x50 [dlm]
>>>>     [<ffffffffa039dd70>] ? process_recv_sockets+0x0/0x50 [dlm]
>>>>     [<ffffffff8107ce9d>] worker_thread+0x16d/0x290
>>>>     [<ffffffff81082430>] ? autoremove_wake_function+0x0/0x40
>>>>     [<ffffffff8107cd30>] ? worker_thread+0x0/0x290
>>>>     [<ffffffff810820d6>] kthread+0x96/0xa0
>>>>     [<ffffffff8100d1aa>] child_rip+0xa/0x20
>>>>     [<ffffffff81082040>] ? kthread+0x0/0xa0
>>>>     [<ffffffff8100d1a0>] ? child_rip+0x0/0x20
>>>> Code: f1 4c 8b 8d 70 ff ff ff 48 c1 e6 0c 4c 01 d6 e8 f1 33 0b e1 4c
>>>> 89 f7 e8 14 4b 0b e1 31 f6 48 89 df e8 6a f4 ff ff e9 cd fd ff ff
>>>> <0f>    0b 0f 1f 00 eb fb 48 c7 c7 e8 9a 3a a0 31 c0 e8 c5 33 0b e1
>>>> RIP  [<ffffffffa039edeb>] receive_from_sock+0x38b/0x430 [dlm]
>>>>     RSP<ffff88047a357d10>
>>>> crash>
>>>>
>>>>
>>>>
>>>> On 21/10/2010 17:13, David Teigland wrote:
>>>>
>>>>
>>>>>> kernel BUG at fs/dlm/lowcomms.c:647!
>>>>>>
>>>>>>
>>>>> That looks like an interesting one; I haven't seen it before.
>>>>> First ensure dlm is not configured to use SCTP (that code is
>>>>> not widely tested).  Other than that, if you'd like to start
>>>>> debugging this before I get around to it, replace the BUG_ON
>>>>> with some printk's and return an error.  The conn with nodeid 0 is
>>>>> the listening socket, for which tcp_accept_from_sock() should
>>>>> be called rather than receive_from_sock().
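>>>>>
>>>>> Untested sketch of what that replacement could look like; place it
>>>>> where the BUG_ON sits in your tree, and drop any locks the function
>>>>> has already taken before returning:
>>>>>
>>>>>   /* in receive_from_sock(), in place of the BUG_ON */
>>>>>   if (con->nodeid == 0) {
>>>>>           printk(KERN_ERR "dlm: receive_from_sock called for the "
>>>>>                  "listening connection\n");
>>>>>           return -EINVAL;  /* caller treats it as a failed read */
>>>>>   }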
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>           KERNEL: /usr/lib/debug/lib/modules/2.6.32-100.0.19.el5/vmlinux
>>>>>>         DUMPFILE: /var/var/crash/127.0.0.1-2010-10-18-16:42:07/vmcore
>>>>>> [PARTIAL DUMP]
>>>>>>             CPUS: 64
>>>>>>             DATE: Mon Oct 18 16:41:48 2010
>>>>>>           UPTIME: 00:15:00
>>>>>> LOAD AVERAGE: 1.06, 1.22, 1.65
>>>>>>            TASKS: 1594
>>>>>>         NODENAME: chili0
>>>>>>          RELEASE: 2.6.32-100.0.19.el5
>>>>>>          VERSION: #1 SMP Fri Sep 17 17:51:41 EDT 2010
>>>>>>          MACHINE: x86_64  (1999 Mhz)
>>>>>>           MEMORY: 64 GB
>>>>>>            PANIC: "kernel BUG at fs/dlm/lowcomms.c:647!"
>>>>>>              PID: 27062
>>>>>>          COMMAND: "dlm_recv/34"
>>>>>>             TASK: ffff880c7caa00c0  [THREAD_INFO: ffff880c77c6a000]
>>>>>>              CPU: 34
>>>>>>            STATE: TASK_RUNNING (PANIC)
>>>>>>
>>>>>> crash>     bt
>>>>>> PID: 27062  TASK: ffff880c7caa00c0  CPU: 34  COMMAND: "dlm_recv/34"
>>>>>>      #0 [ffff880c77c6b910] machine_kexec at ffffffff8102cc9b
>>>>>>      #1 [ffff880c77c6b990] crash_kexec at ffffffff810964d4
>>>>>>      #2 [ffff880c77c6ba60] oops_end at ffffffff81439bd9
>>>>>>      #3 [ffff880c77c6ba90] die at ffffffff81015639
>>>>>>      #4 [ffff880c77c6bac0] do_trap at ffffffff8143952c
>>>>>>      #5 [ffff880c77c6bb10] do_invalid_op at ffffffff81013902
>>>>>>      #6 [ffff880c77c6bbb0] invalid_op at ffffffff81012b7b
>>>>>>         [exception RIP: receive_from_sock+1364]
>>>>>>         RIP: ffffffffa02406c3  RSP: ffff880c77c6bc60  RFLAGS: 00010246
>>>>>>         RAX: 0000000000000030  RBX: ffff8810774b8d30  RCX: ffff88087c4548f8
>>>>>>         RDX: 0000000000000030  RSI: ffff880876dce000  RDI: ffffffff81398045
>>>>>>         RBP: ffff880c77c6be50   R8: ffff000000000000   R9: ffff880c77c6b900
>>>>>>         R10: ffff880c77c6b8f0  R11: 0000000000000030  R12: 0000000000000030
>>>>>>         R13: ffff8810774b8d20  R14: ffff880c7caa00c0  R15: ffffffffa023ecca
>>>>>>         ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>>>>>>      #7 [ffff880c77c6be58] process_recv_sockets at ffffffffa023ecea
>>>>>>      #8 [ffff880c77c6be78] worker_thread at ffffffff81071802
>>>>>>      #9 [ffff880c77c6bee8] kthread at ffffffff810756d3
>>>>>> #10 [ffff880c77c6bf48] kernel_thread at ffffffff81012dea
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>



