[rds-devel] soft lockups with rds

Or Gerlitz ogerlitz at voltaire.com
Sun Oct 26 08:40:09 PDT 2008


Hi Andy,

While doing some rds-stress runs on a RH5 system that uses OFED 1.3.1, I got a
bunch of soft lockups, e.g. the ones below. Any idea? Does this ring a bell?

Or.

> BUG: soft lockup detected on CPU#0!
>
> Call Trace:
>  <IRQ>  [<ffffffff800b2c85>] softlockup_tick+0xdb/0xed
>  [<ffffffff800933d1>] update_process_times+0x42/0x68
>  [<ffffffff80073d97>] smp_local_timer_interrupt+0x23/0x47
>  [<ffffffff80074459>] smp_apic_timer_interrupt+0x41/0x47
>  [<ffffffff8005bcc2>] apic_timer_interrupt+0x66/0x6c
>  [<ffffffff88709bc2>] :rds:rds_ib_send_cq_comp_handler+0x0/0x2a6
>  [<ffffffff88709ca3>] :rds:rds_ib_send_cq_comp_handler+0xe1/0x2a6
>  [<ffffffff88709c88>] :rds:rds_ib_send_cq_comp_handler+0xc6/0x2a6
>  [<ffffffff88309a6c>] :mlx4_core:mlx4_eq_int+0x3b/0x26f
>  [<ffffffff88309caf>] :mlx4_core:mlx4_msi_x_interrupt+0xf/0x17
>  [<ffffffff80010705>] handle_IRQ_event+0x29/0x58
>  [<ffffffff800b2fc4>] __do_IRQ+0xa4/0x105
>  [<ffffffff8006a193>] do_IRQ+0xe7/0xf5
>  [<ffffffff8005b649>] ret_from_intr+0x0/0xa
>  [<ffffffff8006267e>] _read_lock+0x4/0xc
>  [<ffffffff800127d9>] sock_def_readable+0x10/0x5f
>  [<ffffffff8001b329>] tcp_rcv_established+0x62c/0x917
>  [<ffffffff8003aaf8>] tcp_v4_do_rcv+0x2a/0x300
>  [<ffffffff800370f8>] ip_route_input+0xc5d/0xc8a
>  [<ffffffff80026d1e>] tcp_v4_rcv+0x9f6/0xa5f
>  [<ffffffff80033f5b>] ip_local_deliver+0x19d/0x263
>  [<ffffffff80035033>] ip_rcv+0x49c/0x4df
>  [<ffffffff8001fd92>] netif_receive_skb+0x33c/0x3ba
>  [<ffffffff88584d9f>] :ib_ipoib:ipoib_ib_handle_rx_wc+0x407/0x43c
>  [<ffffffff88585d8d>] :ib_ipoib:ipoib_poll+0x9f/0x18b
>  [<ffffffff8000c39b>] net_rx_action+0xa4/0x1a5
>  [<ffffffff80011c19>] __do_softirq+0x5e/0xd5
>  [<ffffffff8005c330>] call_softirq+0x1c/0x28
>  <EOI>  [<ffffffff8006a310>] do_softirq+0x2c/0x85
>  [<ffffffff8008e619>] local_bh_enable_ip+0x48/0x59
>  [<ffffffff8003ccaa>] rt_run_flush+0x7f/0xb8
>  [<ffffffff8023c376>] ip_mc_dec_group+0x86/0xab
>  [<ffffffff8023d8ff>] ip_mc_drop_socket+0x4d/0x8b
>  [<ffffffff8023a447>] inet_release+0x1a/0x55
>  [<ffffffff80052c24>] sock_release+0x19/0x99
>  [<ffffffff80052e1e>] sock_close+0x2c/0x30
>  [<ffffffff80012281>] __fput+0xae/0x198
>  [<ffffffff8002362c>] filp_close+0x5c/0x64
>  [<ffffffff8001d5b2>] sys_close+0x88/0xa2
>  [<ffffffff8005f013>] sysenter_do_call+0x1b/0x67
>
>   
> BUG: soft lockup detected on CPU#1!
>
> Call Trace:
>  <IRQ>  [<ffffffff800b50fa>] softlockup_tick+0xd5/0xe7
>  [<ffffffff800930e2>] update_process_times+0x42/0x68
>  [<ffffffff800746e3>] smp_local_timer_interrupt+0x23/0x47
>  [<ffffffff80074da5>] smp_apic_timer_interrupt+0x41/0x47
>  [<ffffffff8005bc8e>] apic_timer_interrupt+0x66/0x6c
>  [<ffffffff80062b70>] _write_unlock_irqrestore+0x9/0xa
>  [<ffffffff88582b07>] :rds:rds_recv_incoming+0x1dc/0x1ed
>  [<ffffffff8858818f>] :rds:rds_ib_recv_cq_comp_handler+0x4b5/0x752
>  [<ffffffff88588e49>] :rds:rds_ib_send_cq_comp_handler+0x28f/0x2a6
>  [<ffffffff88188a7f>] :mlx4_core:mlx4_eq_int+0x3b/0x26f
>  [<ffffffff88188cc2>] :mlx4_core:mlx4_msi_x_interrupt+0xf/0x17
>  [<ffffffff800107a0>] handle_IRQ_event+0x29/0x58
>  [<ffffffff800b5482>] __do_IRQ+0xa4/0x105
>  [<ffffffff8006a3bd>] do_IRQ+0xe7/0xf5
>  [<ffffffff8005b615>] ret_from_intr+0x0/0xa
>  [<ffffffff880229e5>] :ehci_hcd:ehci_work+0x264/0x6e3
>  [<ffffffff880229d8>] :ehci_hcd:ehci_work+0x257/0x6e3
>  [<ffffffff8005b615>] ret_from_intr+0x0/0xa
>  [<ffffffff8802567c>] :ehci_hcd:ehci_irq+0x13d/0x156
>  [<ffffffff801dadd4>] usb_hcd_irq+0x27/0x55
>  [<ffffffff800107a0>] handle_IRQ_event+0x29/0x58
>  [<ffffffff800b5482>] __do_IRQ+0xa4/0x105
>  [<ffffffff880243f3>] :ehci_hcd:ehci_watchdog+0x0/0x61
>  [<ffffffff8006a3bd>] do_IRQ+0xe7/0xf5
>  [<ffffffff8005b615>] ret_from_intr+0x0/0xa
>  [<ffffffff88587cda>] :rds:rds_ib_recv_cq_comp_handler+0x0/0x752
>  [<ffffffff800928e1>] run_timer_softirq+0x12a/0x1b0
>  [<ffffffff80011cb4>] __do_softirq+0x5e/0xd5
>  [<ffffffff80151593>] end_msi_irq_w_maskbit+0xf/0x1c
>  [<ffffffff8005c2fc>] call_softirq+0x1c/0x28
>  [<ffffffff8006a53a>] do_softirq+0x2c/0x85
>  [<ffffffff8006a3c2>] do_IRQ+0xec/0xf5
>  [<ffffffff8005b615>] ret_from_intr+0x0/0xa
>  <EOI>  [<ffffffff800c2f8b>] __kzalloc+0x1a/0x21
>  [<ffffffff800c2f7a>] __kzalloc+0x9/0x21
>  [<ffffffff88581e0b>] :rds:rds_message_alloc+0x16/0x56
>  [<ffffffff8011c70b>] avc_has_perm+0x43/0x55
>  [<ffffffff885820cc>] :rds:rds_message_copy_from_user+0x29/0x135
>  [<ffffffff88583377>] :rds:rds_sendmsg+0xe6/0x4ca
>  [<ffffffff80052bd5>] sock_sendmsg+0xf3/0x110
>  [<ffffffff800884ac>] default_wake_function+0x0/0xe
>  [<ffffffff8009b446>] autoremove_wake_function+0x0/0x2e
>  [<ffffffff88587ba8>] :rds:rds_ib_inc_free+0x0/0x51
>  [<ffffffff80209069>] sys_sendmsg+0x217/0x28a
>  [<ffffffff80060fd6>] thread_return+0xad/0xeb
>  [<ffffffff8005b28d>] tracesys+0xd5/0xe0
>
>   




