[rds-devel] So what is an RDS rdma operation anyway ?

Wed Nov 14 17:31:11 PST 2007

RDS rdma operations allow RDS clients to issue a remote direct memory 
access (rdma) to a remote host's memory using an RDS socket and 
additional RDS services.

RDS rdma operations require an RDS transport capable of RDMA - such as 
IB or iWARP.

Today, RDS RDMA operations are not reliable. Yea, I know - then why are 
they performed by the RDS ULP ? Call it circumstance, or just a fact of 
life - either is fine.  The reason they are not reliable is more a 
practical issue - as it would be possible to make them reliable.  If we 
wanted them to be reliable (survive path and adapter failures) we'd need 
to deal with multiple rdma keys - as today rdma keys are scoped to 
individual adapters. So for simplicity they are not reliable.

Therefore, it is up to clients to either use synchronous RDMA operations 
which return RDMA status - or to submit rdma operations asynchronously 
and detect failed rdma operations themselves (no response from the rdma 
server before a timeout). Since it is uncommon for network paths to 
fail, there are many advantages to issuing rdma operations 
asynchronously - most notably IOP rates.
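
To make the "detect failed rdma operations" part concrete, here is a 
rough sketch of the bookkeeping a client might keep for async requests. 
It is plain Python and touches no RDS API at all: PendingRdma, submit, 
complete and expire are made-up names, and the timeout value is 
whatever the client decides is too long to wait.

```python
import time


class PendingRdma:
    """Tracks rdma requests awaiting a response (hypothetical helper)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.deadlines = {}  # request id -> absolute deadline

    def submit(self, req_id):
        # Record the point at which we stop waiting for the rdma server.
        self.deadlines[req_id] = self.clock() + self.timeout_s

    def complete(self, req_id):
        # Call when the server's response arrives. Returns False if the
        # request is unknown or has already been expired.
        return self.deadlines.pop(req_id, None) is not None

    def expire(self):
        # Return (and forget) every request whose deadline has passed;
        # the caller treats these as failed rdma operations.
        now = self.clock()
        failed = [r for r, d in self.deadlines.items() if d <= now]
        for r in failed:
            del self.deadlines[r]
        return failed
```

The client would call submit() when it sends the request, complete() 
when the response shows up, and expire() periodically to find requests 
it has to retry or fail.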

So how do we perform async RDMA operations ?

RDMA operations can be blocking or non-blocking. For blocking rdma 
operations, when the RDS rdma operation returns - ownership of the 
buffers is returned to the RDS client along with the status of the RDMA 
operation.

Assuming a non-blocking RDS socket, once the send with rdma passes arg 
checking - the send operation returns. When the service returns it only 
implies that the arguments have passed checking and the rdma operation 
is queued for delivery - it may or may not be on the wire. The service 
also returns an RDMA identifier which is relative to the destination of 
the rdma operation.
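
A toy model of those non-blocking semantics, again in plain Python 
with no RDS calls. RdmaQueue and send_nonblocking are hypothetical 
names, and reading "relative to the destination" as a per-destination 
counter is my assumption, not something RDS guarantees.

```python
from collections import defaultdict, deque


class RdmaQueue:
    """Toy model of a non-blocking send with rdma (not an RDS API)."""

    def __init__(self):
        self.next_id = defaultdict(int)  # destination -> last id issued
        self.queued = deque()            # accepted, maybe not on the wire

    def send_nonblocking(self, dest, length):
        # Argument checking is the only work done before returning.
        if not dest or length <= 0:
            raise ValueError("bad rdma arguments")
        # Assumption: identifiers are counted per destination.
        self.next_id[dest] += 1
        op_id = self.next_id[dest]
        # Queued for delivery; nothing has been transmitted yet.
        self.queued.append((dest, op_id, length))
        return op_id
```

The point is that a successful return tells the caller only that the 
request was accepted and queued, and hands back an identifier to match 
against later.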

Steps to perform an async RDS rdma:

1) create an RDS socket.
2) the rdma client requests RDS to create an rdma key for a local buffer 
to be used as either the source or destination of an rdma operation.
3) the rdma client sends the rdma key, along with the description of the 
local buffer and an indication of whether to read or write the buffer, 
to a remote rdma server (host).
4) the rdma server issues an rdma read or write via RDS providing the 
rdma key and remote buffer description and optionally provides immediate 
data to be delivered along with the rdma as a normal message to the 
requesting host. RDS returns an rdma operation identifier for the rdma 
initiated. While the rdma is in progress, ownership of the local source / 
destination buffers belongs to RDS.
5) the rdma server uses an RDS barrier operation to detect when RDS has 
returned ownership of the local buffers used in the rdma operation. When 
ownership is returned, the RDS client (here, the rdma server) can re-use 
the buffers.
6) the rdma client host receives the immediate data message, which 
indicates completion of the requested rdma.
7) the rdma client frees the rdma key.
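
The steps above can be walked through with a small in-memory 
simulation. This is only a sketch to show who does what and in which 
order: Host, get_key, rdma_write, barrier and free_key are invented 
names, nothing goes over a network, and the "rdma" completes instantly, 
so the barrier is trivially satisfied.

```python
import itertools


class Host:
    """In-memory stand-in for a host with an RDS socket (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.keys = {}          # rdma key -> registered buffer
        self.inbox = []         # normal messages delivered to this host
        self.in_flight = set()  # op ids whose buffers "RDS" still owns
        self._ids = itertools.count(1)

    def get_key(self, buf):
        # Step 2: register a local buffer and hand back a key for it.
        key = next(self._ids)
        self.keys[key] = buf
        return key

    def free_key(self, key):
        # Step 7: the key is no longer valid for rdma.
        del self.keys[key]

    def rdma_write(self, peer, key, local_buf, immediate):
        # Step 4: write local_buf into the peer buffer named by key and
        # deliver the immediate data to the peer as a normal message.
        target = peer.keys[key]
        if len(local_buf) > len(target):
            raise ValueError("rdma larger than registered buffer")
        op_id = next(self._ids)
        self.in_flight.add(op_id)
        target[:len(local_buf)] = local_buf
        peer.inbox.append(immediate)
        self.in_flight.discard(op_id)  # simulation: completes at once
        return op_id

    def barrier(self):
        # Step 5: True once no rdma still owns our local buffers.
        return not self.in_flight


def run():
    client, server = Host("client"), Host("server")  # step 1
    buf = bytearray(5)
    key = client.get_key(buf)                        # step 2
    # Step 3: the request travels to the server as a normal message.
    server.inbox.append({"key": key, "length": len(buf), "op": "write"})
    req = server.inbox.pop(0)
    server.rdma_write(client, req["key"], b"hello", immediate="done")
    buffers_back = server.barrier()                  # step 5
    msg = client.inbox.pop(0)                        # step 6
    client.free_key(key)                             # step 7
    return bytes(buf), msg, buffers_back, client.keys
```

After run(), the client's buffer holds the data the server wrote, the 
client has seen the completion message, and its key table is empty 
again.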