[rds-devel] Questions on the RDMA interface

Olaf Kirch olaf.kirch at oracle.com
Fri Nov 23 04:21:49 PST 2007


Hi,

while merging the zerocopy changes into rds-stress, I ran into a
couple of things:

 -	The RDS_BARRIER setsockopt takes a struct rds_barrier_args,
	which passes the *address* of an RDMA ID rather than the RDMA
	ID itself. It seems that's because we want to return the
	ID of the last RDMA op that completed. Correct?

 -	Various RDMA-related getsockopt calls pass pointers in a uint64_t,
	such as rdma_id_addr in rds_barrier_args, or key_addr and
	phy_addr in rds_get_mr_args. Why? Is that for 32-bit/64-bit user
	space/kernel combinations?

 -	The way we use the barrier call with MSG_DONTWAIT is a bit weird.

	Calling rds_barrier with an rdma_wait_id of 0 does what I
	would expect the call to do in non-blocking mode, i.e. it
	obtains the ID of the last RDMA operation that completed.

	Specifying MSG_DONTWAIT does much more: it tells the kernel
	to wake us up (from poll or whatever) when the given RDMA ID
	has completed. So at a minimum, we should name the flag
	differently (RDS_RDMA_WAKE_BARRIER or some such).

 -	rds_rdma_args can take an arbitrary number of local iovecs and a
	single remote iovec. Why?

 -	rds_iovec.bytes is a 64-bit quantity, but RDMA transfers are limited
	to 2^31 bytes, according to the spec. The kernel should enforce this
	limit, or we should reflect that in the typedef.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir at lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax


