[rds-devel] Has anyone tried RDS on IA64 Linux?

Pradeep pradeep at cup.hp.com
Mon Apr 28 14:02:48 PDT 2008


Hello,

I've installed
http://www.openfabrics.org/downloads/OFED/ofed-1.3-daily/OFED-1.3-20080408-0623.tgz
on two IA-64 Linux systems (running Red Hat Enterprise Linux AS release 4)
and tried running rds-stress.
Occasionally I hit a kernel panic in the connection establishment path:

RDS/IB: incoming connect while connecting
Unable to handle kernel paging request at virtual address a0109f8899a5e430
krdsd[3885]: Oops 8813272891392 [1]
Modules linked in: rds(U) rdma_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) 
ib_addr(U) ib_ipoib(U) ib_cm(U) ib_sa(U) md5 ipv6 ib_uverbs(U) 
ib_umad(U) mlx4_ib(U) mlx4_core(U) vfat fat dm_multipath dm_mod sr_mod 
usb_storage button joydev ohci_hcd ehci_hcd ib_mthca(U) ib_mad(U) 
ib_core(U) shpchp cxgb3(U) tg3 ext3 jbd mptscsih mptsas mptspi mptfc 
mptscsi mptbase sd_mod scsi_mod

Pid: 3885, CPU 0, comm:                krdsd
psr : 0000101008126030 ifs : 8000000000000002 ip  : [<a00000010045c4a0>]    Not tainted
ip is at mark_clean+0x80/0xe0
unat: 0000000000000000 pfs : 0000000000000812 rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr  : 0000000000005a81
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a00000010045cca0 b6  : a00000010045da00 b7  : a00000010045da00
f6  : 10004bc00000000000000 f7  : 1003e000000000000002f
f8  : 000000000000000000000 f9  : 1000b8000000000000000
f10 : 1003e00000000000007f8 f11 : 1003e0000000000000000
r1  : a0000001009bb9e0 r2  : 526d2f677472c000 r3  : 000203f12f50bc86
r8  : 00024da5ecee8e50 r9  : 000049b4bd9dd1ca r10 : a0000001007bcbb8
r11 : e00000003fcba2a0 r12 : e00000003bf97cc0 r13 : e00000003bf90000
r14 : ffffffffffffc000 r15 : 526d2f677472ad2e r16 : a0007fff1f200000
r17 : a0109f8899a5e430 r18 : 0000000000000400 r19 : 526d2f67d7a0d4a3
r20 : a0000001007cec98 r21 : e00000003ed85838 r22 : e0000003ffdf6a80
r23 : a0007fff2000c078 r24 : a0007fff2006d1d0 r25 : a0007fff2000c070
r26 : 0000000000004000 r27 : a00000010045da00 r28 : a0000001007d27c0
r29 : e00000003ed85838 r30 : e0000003f0420000 r31 : 0000000000000000

Call Trace:
 [<a000000100016da0>] show_stack+0x80/0xa0
                                sp=e00000003bf97850 bsp=e00000003bf91158
 [<a0000001000176b0>] show_regs+0x890/0x8c0
                                sp=e00000003bf97a20 bsp=e00000003bf91110
 [<a00000010003e8f0>] die+0x150/0x240
                                sp=e00000003bf97a40 bsp=e00000003bf910d0
 [<a0000001000644a0>] ia64_do_page_fault+0x8c0/0xbc0
                                sp=e00000003bf97a40 bsp=e00000003bf91068
 [<a00000010000f600>] ia64_leave_kernel+0x0/0x260
                                sp=e00000003bf97af0 bsp=e00000003bf91068
 [<a00000010045c4a0>] mark_clean+0x80/0xe0
                                sp=e00000003bf97cc0 bsp=e00000003bf91058
 [<a00000010045cca0>] sba_unmap_single+0x5a0/0x800
                                sp=e00000003bf97cc0 bsp=e00000003bf90fd0
 [<a00000010045da70>] sba_unmap_sg+0x70/0xa0
                                sp=e00000003bf97cc0 bsp=e00000003bf90f98
 [<a000000200924d00>] rds_ib_send_clear_ring+0x160/0x1e0 [rds]
                                sp=e00000003bf97cc0 bsp=e00000003bf90f58
 [<a00000020091fd80>] rds_ib_conn_shutdown+0x9c0/0xc80 [rds]
                                sp=e00000003bf97cc0 bsp=e00000003bf90ef8
 [<a000000200916690>] rds_shutdown_worker+0x250/0x680 [rds]
                                sp=e00000003bf97db0 bsp=e00000003bf90ed0
 [<a0000001000a3b70>] worker_thread+0x430/0x580
                                sp=e00000003bf97db0 bsp=e00000003bf90e70
 [<a0000001000aec20>] kthread+0x1e0/0x240
                                sp=e00000003bf97e20 bsp=e00000003bf90e38
 [<a000000100018c70>] kernel_thread_helper+0x30/0x60
                                sp=e00000003bf97e30 bsp=e00000003bf90e10
 [<a000000100008c60>] start_kernel_thread+0x20/0x40
                                sp=e00000003bf97e30 bsp=e00000003bf90e10
Kernel panic - not syncing: Fatal exception
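
A minimal sketch for pulling the call chain out of a trace like the one above, so that the path from the RDS shutdown worker down to the faulting function is easier to read. The regular expression and the name parse_frames are illustrative only; they are not part of any RDS or kernel tool, and the sample is a few frames copied from the trace.

```python
import re

# Matches frames such as " [<a00000010045c4a0>] mark_clean+0x80/0xe0",
# with an optional trailing " [module]" tag.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<sym>[\w.]+)"
    r"\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)

def parse_frames(text):
    """Return (symbol, offset, module) tuples in the order they appear."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:  # the sp=/bsp= continuation lines do not match and are skipped
            frames.append((m.group("sym"), int(m.group("off"), 16), m.group("mod")))
    return frames

sample = """\
 [<a00000010045c4a0>] mark_clean+0x80/0xe0
                                sp=e00000003bf97cc0 bsp=e00000003bf91058
 [<a00000010045cca0>] sba_unmap_single+0x5a0/0x800
                                sp=e00000003bf97cc0 bsp=e00000003bf90fd0
 [<a00000010045da70>] sba_unmap_sg+0x70/0xa0
                                sp=e00000003bf97cc0 bsp=e00000003bf90f98
 [<a000000200924d00>] rds_ib_send_clear_ring+0x160/0x1e0 [rds]
                                sp=e00000003bf97cc0 bsp=e00000003bf90f58
"""

frames = parse_frames(sample)

# Print caller first, faulting function last.
for sym, off, mod in reversed(frames):
    print(f"{sym}+{off:#x}" + (f" [{mod}]" if mod else ""))
```

Read caller-first, these frames show rds_ib_send_clear_ring in the rds module calling into the sba_iommu unmap routines, with the fault in mark_clean.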

Config:
Two rx2660 systems with dual-core Montecitos
OS: Red Hat Enterprise Linux AS release 4
Kernel: 2.6.9-42.0.10.EL #1 SMP Fri Feb 16 17:01:51 EST 2007 ia64 ia64 ia64 GNU/Linux
IB card: ConnectX

Another panic occurs when large (1 MB) buffers are used for RDMA:

[root@hpatm125 ~]# rds-stress -r 10.0.0.125 -s 10.0.0.4 -p 4000 -t4 -d8 -D1048576 -T10
connecting to 10.0.0.4:4000
negotiated options, tasks will start in 2 seconds
Starting up..Kernel panic - not syncing: arch/ia64/hp/common/sba_iommu.c: I/O MMU @ c0000000fed01000 is out of mapping resources

[ No stack trace was printed in this case. ]
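
As a rough back-of-the-envelope check on the mapping demand of that command line, the sketch below estimates how many I/O MMU entries the RDMA buffers could need at once. It rests on assumptions that are not confirmed anywhere in this report: that -t4 and -d8 mean four tasks with a queue depth of eight, that every in-flight 1 MB buffer is mapped in full at the same time, and that the I/O MMU page size is 4 KiB. The function name mapping_entries is hypothetical.

```python
def mapping_entries(tasks, depth, rdma_bytes, iommu_page_bytes=4096):
    """Upper bound on I/O MMU entries if every in-flight RDMA buffer is mapped in full.

    iommu_page_bytes defaults to 4 KiB, which is an assumption and was not
    read from the systems described here.
    """
    pages_per_buffer = -(-rdma_bytes // iommu_page_bytes)  # ceiling division
    return tasks * depth * pages_per_buffer

# Values taken from the rds-stress command line: -t4 -d8 -D1048576
entries = mapping_entries(tasks=4, depth=8, rdma_bytes=1048576)
in_flight_mib = 4 * 8 * 1048576 // (1024 * 1024)

print(f"{entries} entries for {in_flight_mib} MiB of RDMA buffers in flight")
```

This only bounds the demand from one side of the test; it says nothing about how many entries the I/O MMU on these systems actually provides.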


Thanks,
Pradeep


