[rds-devel] soft lockups with rds

Richard Frank richard.frank at oracle.com
Tue Oct 28 10:22:33 PDT 2008


Can you provide the rds-stress incantations that produced these errors,
along with the rds-info -c output?

Rick

Or Gerlitz wrote:
> Hi Andy,
>
> Doing some rds-stress runs on a RH5 system that uses OFED 1.3.1, I got a
> bunch of soft lockups, e.g. the ones below. Any idea? Does this ring a bell?
>
> Or.
>
>   
>> BUG: soft lockup detected on CPU#0!
>>
>> Call Trace:
>>  <IRQ>  [<ffffffff800b2c85>] softlockup_tick+0xdb/0xed
>>  [<ffffffff800933d1>] update_process_times+0x42/0x68
>>  [<ffffffff80073d97>] smp_local_timer_interrupt+0x23/0x47
>>  [<ffffffff80074459>] smp_apic_timer_interrupt+0x41/0x47
>>  [<ffffffff8005bcc2>] apic_timer_interrupt+0x66/0x6c
>>  [<ffffffff88709bc2>] :rds:rds_ib_send_cq_comp_handler+0x0/0x2a6
>>  [<ffffffff88709ca3>] :rds:rds_ib_send_cq_comp_handler+0xe1/0x2a6
>>  [<ffffffff88709c88>] :rds:rds_ib_send_cq_comp_handler+0xc6/0x2a6
>>  [<ffffffff88309a6c>] :mlx4_core:mlx4_eq_int+0x3b/0x26f
>>  [<ffffffff88309caf>] :mlx4_core:mlx4_msi_x_interrupt+0xf/0x17
>>  [<ffffffff80010705>] handle_IRQ_event+0x29/0x58
>>  [<ffffffff800b2fc4>] __do_IRQ+0xa4/0x105
>>  [<ffffffff8006a193>] do_IRQ+0xe7/0xf5
>>  [<ffffffff8005b649>] ret_from_intr+0x0/0xa
>>  [<ffffffff8006267e>] _read_lock+0x4/0xc
>>  [<ffffffff800127d9>] sock_def_readable+0x10/0x5f
>>  [<ffffffff8001b329>] tcp_rcv_established+0x62c/0x917
>>  [<ffffffff8003aaf8>] tcp_v4_do_rcv+0x2a/0x300
>>  [<ffffffff800370f8>] ip_route_input+0xc5d/0xc8a
>>  [<ffffffff80026d1e>] tcp_v4_rcv+0x9f6/0xa5f
>>  [<ffffffff80033f5b>] ip_local_deliver+0x19d/0x263
>>  [<ffffffff80035033>] ip_rcv+0x49c/0x4df
>>  [<ffffffff8001fd92>] netif_receive_skb+0x33c/0x3ba
>>  [<ffffffff88584d9f>] :ib_ipoib:ipoib_ib_handle_rx_wc+0x407/0x43c
>>  [<ffffffff88585d8d>] :ib_ipoib:ipoib_poll+0x9f/0x18b
>>  [<ffffffff8000c39b>] net_rx_action+0xa4/0x1a5
>>  [<ffffffff80011c19>] __do_softirq+0x5e/0xd5
>>  [<ffffffff8005c330>] call_softirq+0x1c/0x28
>>  <EOI>  [<ffffffff8006a310>] do_softirq+0x2c/0x85
>>  [<ffffffff8008e619>] local_bh_enable_ip+0x48/0x59
>>  [<ffffffff8003ccaa>] rt_run_flush+0x7f/0xb8
>>  [<ffffffff8023c376>] ip_mc_dec_group+0x86/0xab
>>  [<ffffffff8023d8ff>] ip_mc_drop_socket+0x4d/0x8b
>>  [<ffffffff8023a447>] inet_release+0x1a/0x55
>>  [<ffffffff80052c24>] sock_release+0x19/0x99
>>  [<ffffffff80052e1e>] sock_close+0x2c/0x30
>>  [<ffffffff80012281>] __fput+0xae/0x198
>>  [<ffffffff8002362c>] filp_close+0x5c/0x64
>>  [<ffffffff8001d5b2>] sys_close+0x88/0xa2
>>  [<ffffffff8005f013>] sysenter_do_call+0x1b/0x67
>>
>>   
>> BUG: soft lockup detected on CPU#1!
>>
>> Call Trace:
>>  <IRQ>  [<ffffffff800b50fa>] softlockup_tick+0xd5/0xe7
>>  [<ffffffff800930e2>] update_process_times+0x42/0x68
>>  [<ffffffff800746e3>] smp_local_timer_interrupt+0x23/0x47
>>  [<ffffffff80074da5>] smp_apic_timer_interrupt+0x41/0x47
>>  [<ffffffff8005bc8e>] apic_timer_interrupt+0x66/0x6c
>>  [<ffffffff80062b70>] _write_unlock_irqrestore+0x9/0xa
>>  [<ffffffff88582b07>] :rds:rds_recv_incoming+0x1dc/0x1ed
>>  [<ffffffff8858818f>] :rds:rds_ib_recv_cq_comp_handler+0x4b5/0x752
>>  [<ffffffff88588e49>] :rds:rds_ib_send_cq_comp_handler+0x28f/0x2a6
>>  [<ffffffff88188a7f>] :mlx4_core:mlx4_eq_int+0x3b/0x26f
>>  [<ffffffff88188cc2>] :mlx4_core:mlx4_msi_x_interrupt+0xf/0x17
>>  [<ffffffff800107a0>] handle_IRQ_event+0x29/0x58
>>  [<ffffffff800b5482>] __do_IRQ+0xa4/0x105
>>  [<ffffffff8006a3bd>] do_IRQ+0xe7/0xf5
>>  [<ffffffff8005b615>] ret_from_intr+0x0/0xa
>>  [<ffffffff880229e5>] :ehci_hcd:ehci_work+0x264/0x6e3
>>  [<ffffffff880229d8>] :ehci_hcd:ehci_work+0x257/0x6e3
>>  [<ffffffff8005b615>] ret_from_intr+0x0/0xa
>>  [<ffffffff8802567c>] :ehci_hcd:ehci_irq+0x13d/0x156
>>  [<ffffffff801dadd4>] usb_hcd_irq+0x27/0x55
>>  [<ffffffff800107a0>] handle_IRQ_event+0x29/0x58
>>  [<ffffffff800b5482>] __do_IRQ+0xa4/0x105
>>  [<ffffffff880243f3>] :ehci_hcd:ehci_watchdog+0x0/0x61
>>  [<ffffffff8006a3bd>] do_IRQ+0xe7/0xf5
>>  [<ffffffff8005b615>] ret_from_intr+0x0/0xa
>>  [<ffffffff88587cda>] :rds:rds_ib_recv_cq_comp_handler+0x0/0x752
>>  [<ffffffff800928e1>] run_timer_softirq+0x12a/0x1b0
>>  [<ffffffff80011cb4>] __do_softirq+0x5e/0xd5
>>  [<ffffffff80151593>] end_msi_irq_w_maskbit+0xf/0x1c
>>  [<ffffffff8005c2fc>] call_softirq+0x1c/0x28
>>  [<ffffffff8006a53a>] do_softirq+0x2c/0x85
>>  [<ffffffff8006a3c2>] do_IRQ+0xec/0xf5
>>  [<ffffffff8005b615>] ret_from_intr+0x0/0xa
>>  <EOI>  [<ffffffff800c2f8b>] __kzalloc+0x1a/0x21
>>  [<ffffffff800c2f7a>] __kzalloc+0x9/0x21
>>  [<ffffffff88581e0b>] :rds:rds_message_alloc+0x16/0x56
>>  [<ffffffff8011c70b>] avc_has_perm+0x43/0x55
>>  [<ffffffff885820cc>] :rds:rds_message_copy_from_user+0x29/0x135
>>  [<ffffffff88583377>] :rds:rds_sendmsg+0xe6/0x4ca
>>  [<ffffffff80052bd5>] sock_sendmsg+0xf3/0x110
>>  [<ffffffff800884ac>] default_wake_function+0x0/0xe
>>  [<ffffffff8009b446>] autoremove_wake_function+0x0/0x2e
>>  [<ffffffff88587ba8>] :rds:rds_ib_inc_free+0x0/0x51
>>  [<ffffffff80209069>] sys_sendmsg+0x217/0x28a
>>  [<ffffffff80060fd6>] thread_return+0xad/0xeb
>>  [<ffffffff8005b28d>] tracesys+0xd5/0xe0
>>
>
> _______________________________________________
> rds-devel mailing list
> rds-devel at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/rds-devel
>   


