[Ocfs2-users] soft lockup of process that doing I/O on an OCFS2 filesystem
Robin Lee
robinlee.sysu at gmail.com
Tue Jun 4 03:01:46 PDT 2019
Hi,
In an OCFS2 cluster of XenServer 7.1.1 hosts, we met a soft lockup and
the kernel crashed.
The kernel is based on Linux 4.4.27.
The cluster has 9 hosts and 8 OCFS2 filesystems.
dmesg from kdump console log:
[ 713656.182631] EMERG: NMI watchdog: BUG: soft lockup - CPU#0 stuck
for 23s! [tapdisk:30306]
[ 713656.182681] WARN: Modules linked in: iptable_filter ip_tables
xfs usb_storage tun ocfs2 quota_tree ocfs2_dlmfs ocfs2_stack_o2cb
ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue dm_service_time nbd
vport_geneve(O) vport_vxlan(O) vport_gre(O) openvswitch(O)
nf_conntrack_ipv6 nf_nat_ipv6 nf_nat_ipv4 nf_defrag_ipv6 nf_nat gre
8021q garp mrp stp llc ipt_REJECT nf_reject_ipv4 xt_limit
nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_multiport xt_conntrack
dm_multipath nls_iso8859_1 nls_cp437 vfat fat x86_pkg_temp_thermal
coretemp crc32_pclmul aesni_intel aes_x86_64 ablk_helper cryptd lrw
gf128mul glue_helper lpc_ich igb(O) ptp sb_edac edac_core mfd_core
pps_core i2c_i801 ipmi_devintf dm_mod sg ipmi_si ipmi_msghandler
shpchp hed nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc
x_tables hid_generic usbhid hid
[ 713656.182777] WARN: sr_mod sd_mod cdrom megaraid_sas(O) ahci
libahci libata lpfc(O) ehci_pci ehci_hcd scsi_transport_fc
scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_mod ip_vs
libcrc32c nf_conntrack efivarfs ipv6 autofs4 [last unloaded:
ip_tables]
[ 713656.182804] WARN: CPU: 0 PID: 30306 Comm: tapdisk Tainted: G
O 4.4.0+2 #1
[ 713656.182806] WARN: Hardware name: Inspur NF8460M3/NF8460M3, BIOS
4.1.01 11/29/2016
[ 713656.182809] WARN: task: ffff88016cea9c00 ti: ffff880057190000
task.ti: ffff880057190000
[ 713656.182812] WARN: RIP: e030:[<ffffffff810013aa>]
[<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
[ 713656.182828] WARN: RSP: e02b:ffff8801c6803b80 EFLAGS: 00000202
[ 713656.182830] WARN: RAX: 0000000000000000 RBX: 0000000000000000
RCX: ffffffff810013aa
[ 713656.182831] WARN: RDX: 0000000000000000 RSI: ffff8801c6803ba0
RDI: 0000000000000003
[ 713656.182832] WARN: RBP: ffff8801c6803bc0 R08: 0000000000000000
R09: ffff8801c5800490
[ 713656.182834] WARN: R10: ffff8801c5800680 R11: 0000000000000202
R12: ffff880196d7c1ec
[ 713656.182836] WARN: R13: 0000000000000003 R14: ffff8801c707c500
R15: 0000000000000003
[ 713656.182849] WARN: FS: 00007fb8fefa8780(0000)
GS:ffff8801c6800000(0000) knlGS:0000000000000000
[ 713656.182851] WARN: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 713656.182853] WARN: CR2: 00007f916c44c000 CR3: 0000000059614000
CR4: 0000000000042660
[ 713656.182856] WARN: Stack:
[ 713656.182857] WARN: 0000000000000000 00000000fffffffa
ffffffff813c43fe 00000002810bf707
[ 713656.182860] WARN: ffff8801c6803b9c ffffffff00000001
0000000000000000 0000000000000079
[ 713656.182863] WARN: ffff8801c6803bd0 ffffffff813c5410
ffff8801c6803bf8 ffffffff81014bc3
[ 713656.182866] WARN: Call Trace:
[ 713656.182870] WARN: <IRQ>
[ 713656.182882] WARN: [<ffffffff813c43fe>] ? xen_poll_irq_timeout+0x3e/0x50
[ 713656.182897] WARN: [<ffffffff813c5410>] xen_poll_irq+0x10/0x20
[ 713656.182904] WARN: [<ffffffff81014bc3>] xen_qlock_wait+0x33/0x40
[ 713656.182911] WARN: [<ffffffff810b1abc>]
__pv_queued_spin_lock_slowpath+0x17c/0x210
[ 713656.182918] WARN: [<ffffffff81156cb7>] queued_spin_lock_slowpath+0xb/0xf
[ 713656.182927] WARN: [<ffffffff815a060f>] _raw_spin_lock+0x1f/0x30
[ 713656.182931] WARN: [<ffffffff8159e885>] __mutex_unlock_slowpath+0x25/0x50
[ 713656.182934] WARN: [<ffffffff8159e8cb>] mutex_unlock+0x1b/0x20
[ 713656.183008] WARN: [<ffffffffa07286f6>]
ocfs2_dio_end_io+0x46/0x80 [ocfs2]
[ 713656.183014] WARN: [<ffffffff811ed1e6>] dio_complete+0x96/0x160
[ 713656.183016] WARN: [<ffffffff811ed38f>] dio_bio_end_aio+0xdf/0x100
[ 713656.183037] WARN: [<ffffffff812ea1a6>] bio_endio+0x46/0x70
[ 713656.183041] WARN: [<ffffffff812f16c7>] blk_update_request+0x197/0x2c0
[ 713656.183053] WARN: [<ffffffffa02c323e>] end_clone_bio+0x5e/0x70 [dm_mod]
[ 713656.183056] WARN: [<ffffffff812ea1a6>] bio_endio+0x46/0x70
[ 713656.183058] WARN: [<ffffffff812f16c7>] blk_update_request+0x197/0x2c0
[ 713656.183103] WARN: [<ffffffffa00d08a4>]
scsi_end_request+0x34/0x1a0 [scsi_mod]
[ 713656.183112] WARN: [<ffffffffa00d37c9>]
scsi_io_completion+0x239/0x610 [scsi_mod]
[ 713656.183116] WARN: [<ffffffff810c0021>] ?
handle_irq_event_percpu+0x171/0x1e0
[ 713656.183123] WARN: [<ffffffffa00caf98>]
scsi_finish_command+0x108/0x120 [scsi_mod]
[ 713656.183132] WARN: [<ffffffffa00d2ed8>]
scsi_softirq_done+0x128/0x140 [scsi_mod]
[ 713656.183136] WARN: [<ffffffff812f8948>] blk_done_softirq+0x78/0x90
[ 713656.183143] WARN: [<ffffffff81076199>] __do_softirq+0x129/0x290
[ 713656.183146] WARN: [<ffffffff810764d2>] irq_exit+0x42/0x90
[ 713656.183148] WARN: [<ffffffff813c52b5>] xen_evtchn_do_upcall+0x35/0x50
[ 713656.183154] WARN: [<ffffffff815a26ae>]
xen_do_hypervisor_callback+0x1e/0x40
[ 713656.183156] WARN: <EOI>
[ 713656.183159] WARN: [<ffffffff810b16a2>] ?
__raw_callee_save___pv_queued_spin_unlock+0x2/0x20
[ 713656.183162] WARN: [<ffffffff8159e9ae>] ?
__mutex_lock_slowpath+0x9e/0x100
[ 713656.183165] WARN: [<ffffffff8159ea2f>] ? mutex_lock+0x1f/0x2f
[ 713656.183191] WARN: [<ffffffffa0746c3e>] ?
ocfs2_file_write_iter+0xa9e/0xd00 [ocfs2]
[ 713656.183195] WARN: [<ffffffff8100bb6e>] ? xen_pte_val+0xe/0x10
[ 713656.183218] WARN: [<ffffffffa07461a0>] ?
ocfs2_check_range_for_refcount+0x130/0x130 [ocfs2]
[ 713656.183238] WARN: [<ffffffff811fc099>] ? aio_run_iocb+0x229/0x270
[ 713656.183243] WARN: [<ffffffff81065f94>] ? get_user_pages_fast+0x104/0x160
[ 713656.183272] WARN: [<ffffffff813c1e45>] ? gnttab_batch_copy+0x25/0xe0
[ 713656.183281] WARN: [<ffffffff811a3813>] ? kmem_cache_alloc+0xb3/0x140
[ 713656.183289] WARN: [<ffffffff811cf7d3>] ? __fdget+0x13/0x20
[ 713656.183292] WARN: [<ffffffff811fd6b9>] ? do_io_submit+0x389/0x4a0
[ 713656.183296] WARN: [<ffffffff811fd7e0>] ? SyS_io_submit+0x10/0x20
[ 713656.183304] WARN: [<ffffffff815a096e>] ?
entry_SYSCALL_64_fastpath+0x12/0x71
[ 713656.183307] WARN: Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b
59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53
b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc
cc cc cc cc cc
[ 713656.183346] EMERG: Kernel panic - not syncing: softlockup: hung tasks
[ 713656.183356] WARN: CPU: 0 PID: 30306 Comm: tapdisk Tainted: G
O L 4.4.0+2 #1
[ 713656.183362] WARN: Hardware name: Inspur NF8460M3/NF8460M3, BIOS
4.1.01 11/29/2016
[ 713656.183368] WARN: 0000000000000000 ffff8801c68037c0
ffffffff8131ac13 ffffffff8180f35f
[ 713656.183376] WARN: 00000000000a242d ffff8801c6803840
ffffffff81156825 ffff8801c6803b80
[ 713656.183384] WARN: 0000000000000008 ffff8801c6803850
ffff8801c68037f0 00000000000000c5
[ 713656.183393] WARN: Call Trace:
[ 713656.183396] WARN: <IRQ> [<ffffffff8131ac13>] dump_stack+0x63/0x90
[ 713656.183410] WARN: [<ffffffff81156825>] panic+0xcb/0x241
[ 713656.183421] WARN: [<ffffffff8110bf94>] watchdog_timer_fn+0x1a4/0x1d0
[ 713656.183427] WARN: [<ffffffff8110bdf0>] ? watchdog+0x40/0x40
[ 713656.183440] WARN: [<ffffffff810d1904>] __hrtimer_run_queues+0x134/0x250
[ 713656.183447] WARN: [<ffffffff810d1d36>] hrtimer_interrupt+0xa6/0x180
[ 713656.183453] WARN: [<ffffffff8100c71e>] xen_timer_interrupt+0x2e/0x130
[ 713656.183459] WARN: [<ffffffff810bff2f>]
handle_irq_event_percpu+0x7f/0x1e0
[ 713656.183467] WARN: [<ffffffff810c347a>] handle_percpu_irq+0x3a/0x50
[ 713656.183491] WARN: [<ffffffff810bf732>] generic_handle_irq+0x22/0x30
[ 713656.183510] WARN: [<ffffffff813c648b>]
__evtchn_fifo_handle_events+0x14b/0x170
[ 713656.183516] WARN: [<ffffffff813c64c0>]
evtchn_fifo_handle_events+0x10/0x20
[ 713656.183523] WARN: [<ffffffff813c34da>] __xen_evtchn_do_upcall+0x4a/0x80
[ 713656.183529] WARN: [<ffffffff813c52b0>] xen_evtchn_do_upcall+0x30/0x50
[ 713656.183534] WARN: [<ffffffff815a26ae>]
xen_do_hypervisor_callback+0x1e/0x40
[ 713656.183540] WARN: [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[ 713656.183546] WARN: [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[ 713656.183551] WARN: [<ffffffff813c43fe>] ? xen_poll_irq_timeout+0x3e/0x50
[ 713656.183557] WARN: [<ffffffff813c5410>] ? xen_poll_irq+0x10/0x20
[ 713656.183562] WARN: [<ffffffff81014bc3>] ? xen_qlock_wait+0x33/0x40
[ 713656.183568] WARN: [<ffffffff810b1abc>] ?
__pv_queued_spin_lock_slowpath+0x17c/0x210
[ 713656.183575] WARN: [<ffffffff81156cb7>] ?
queued_spin_lock_slowpath+0xb/0xf
[ 713656.183581] WARN: [<ffffffff815a060f>] ? _raw_spin_lock+0x1f/0x30
[ 713656.183586] WARN: [<ffffffff8159e885>] ?
__mutex_unlock_slowpath+0x25/0x50
[ 713656.183593] WARN: [<ffffffff8159e8cb>] ? mutex_unlock+0x1b/0x20
[ 713656.183618] WARN: [<ffffffffa07286f6>] ?
ocfs2_dio_end_io+0x46/0x80 [ocfs2]
[ 713656.183624] WARN: [<ffffffff811ed1e6>] ? dio_complete+0x96/0x160
[ 713656.183630] WARN: [<ffffffff811ed38f>] ? dio_bio_end_aio+0xdf/0x100
[ 713656.183639] WARN: [<ffffffff812ea1a6>] ? bio_endio+0x46/0x70
[ 713656.183645] WARN: [<ffffffff812f16c7>] ? blk_update_request+0x197/0x2c0
[ 713656.183655] WARN: [<ffffffffa02c323e>] ?
end_clone_bio+0x5e/0x70 [dm_mod]
[ 713656.183660] WARN: [<ffffffff812ea1a6>] ? bio_endio+0x46/0x70
[ 713656.183666] WARN: [<ffffffff812f16c7>] ? blk_update_request+0x197/0x2c0
[ 713656.183679] WARN: [<ffffffffa00d08a4>] ?
scsi_end_request+0x34/0x1a0 [scsi_mod]
[ 713656.183691] WARN: [<ffffffffa00d37c9>] ?
scsi_io_completion+0x239/0x610 [scsi_mod]
[ 713656.183698] WARN: [<ffffffff810c0021>] ?
handle_irq_event_percpu+0x171/0x1e0
[ 713656.183709] WARN: [<ffffffffa00caf98>] ?
scsi_finish_command+0x108/0x120 [scsi_mod]
[ 713656.183721] WARN: [<ffffffffa00d2ed8>] ?
scsi_softirq_done+0x128/0x140 [scsi_mod]
[ 713656.183728] WARN: [<ffffffff812f8948>] ? blk_done_softirq+0x78/0x90
[ 713656.183741] WARN: [<ffffffff81076199>] ? __do_softirq+0x129/0x290
[ 713656.183753] WARN: [<ffffffff810764d2>] ? irq_exit+0x42/0x90
[ 713656.183764] WARN: [<ffffffff813c52b5>] ? xen_evtchn_do_upcall+0x35/0x50
[ 713656.183776] WARN: [<ffffffff815a26ae>] ?
xen_do_hypervisor_callback+0x1e/0x40
[ 713656.183786] WARN: <EOI> [<ffffffff810b16a2>] ?
__raw_callee_save___pv_queued_spin_unlock+0x2/0x20
[ 713656.183802] WARN: [<ffffffff8159e9ae>] ?
__mutex_lock_slowpath+0x9e/0x100
[ 713656.183814] WARN: [<ffffffff8159ea2f>] ? mutex_lock+0x1f/0x2f
[ 713656.183847] WARN: [<ffffffffa0746c3e>] ?
ocfs2_file_write_iter+0xa9e/0xd00 [ocfs2]
[ 713656.183860] WARN: [<ffffffff8100bb6e>] ? xen_pte_val+0xe/0x10
[ 713656.183892] WARN: [<ffffffffa07461a0>] ?
ocfs2_check_range_for_refcount+0x130/0x130 [ocfs2]
[ 713656.183905] WARN: [<ffffffff811fc099>] ? aio_run_iocb+0x229/0x270
[ 713656.183916] WARN: [<ffffffff81065f94>] ? get_user_pages_fast+0x104/0x160
[ 713656.183929] WARN: [<ffffffff813c1e45>] ? gnttab_batch_copy+0x25/0xe0
[ 713656.183942] WARN: [<ffffffff811a3813>] ? kmem_cache_alloc+0xb3/0x140
[ 713656.183953] WARN: [<ffffffff811cf7d3>] ? __fdget+0x13/0x20
[ 713656.183964] WARN: [<ffffffff811fd6b9>] ? do_io_submit+0x389/0x4a0
[ 713656.183976] WARN: [<ffffffff811fd7e0>] ? SyS_io_submit+0x10/0x20
[ 713656.183992] WARN: [<ffffffff815a096e>] ?
entry_SYSCALL_64_fastpath+0x12/0x71
-robin
More information about the Ocfs2-users
mailing list