[rds-devel] So what is an RDS barrier good for ?

Richard Frank richard.frank at oracle.com
Wed Nov 14 17:32:04 PST 2007


RDS barriers are used to detect two things:

a) That ownership of the RDMA source buffers has been returned to the 
RDS client.
b) That all other RDMA operations to a destination with identifiers 
preceding the last completed RDMA are complete, both from a local buffer 
ownership perspective and from a remote completion perspective. Remote 
completion means that the RDMA has completed remotely to/from host 
memory.

An RDS barrier takes the rdma_id of an operation to test for completion 
and always returns the rdma_id of the last completed RDMA operation for 
the specified destination. It is legal to specify an rdma_id of zero, 
just to get the last RDMA operation that completed.
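
To make the return convention concrete, here is a toy model in Python. 
It is not the RDS interface: the class, the method names and the 
(status, last_completed) return shape are made up for illustration. It 
also assumes that rdma_ids are positive integers that complete in 
increasing order per destination, and that a query with id zero reports 
success, which the description above does not spell out.

```python
import errno


class BarrierModel:
    """Hypothetical model of barrier semantics; not the real RDS API."""

    def __init__(self):
        # Highest completed rdma_id, tracked separately per destination.
        self.last_completed = {}

    def complete(self, dest, rdma_id):
        # Record a completion (assumed to arrive in increasing id order).
        current = self.last_completed.get(dest, 0)
        self.last_completed[dest] = max(current, rdma_id)

    def barrier(self, dest, rdma_id):
        # Always report the last completed id for this destination.
        # rdma_id == 0 acts as a pure "what completed last?" query.
        last = self.last_completed.get(dest, 0)
        status = 0 if rdma_id <= last else errno.EAGAIN
        return status, last
```

In this model, after complete("x", 3), barrier("x", 2) and 
barrier("x", 0) both report status 0 with last completed id 3, while 
barrier("x", 5) reports EAGAIN with last completed id 3.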

When an RDMA operation is initiated, ownership of the source buffers is 
given to RDS.

After issuing an RDMA, a barrier is used to determine when RDS is done 
with the local source buffers, at which point ownership of the buffers 
is returned to the client.

The implementation of the barrier is very lightweight, and it always 
returns the last completed rdma_id for the specified destination.

Keep in mind that rdma_ids completed are specific to a destination. The 
completion of rdma_id 10 to destination x does not imply that rdma_id 9 
to destination y is complete.
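
A tiny sketch of that per-destination rule, using an invented table of 
last completed ids (the names and values are illustrative only):

```python
# Hypothetical snapshot: last completed rdma_id for each destination.
last_completed = {"x": 10, "y": 8}


def is_complete(dest, rdma_id):
    # Completion is judged only against the same destination's counter.
    return rdma_id <= last_completed.get(dest, 0)
```

Here is_complete("x", 9) is true because id 10 has completed to x, but 
is_complete("y", 9) is false: progress on x says nothing about y.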

Barrier operations can be async (nonblocking) or sync (wait for a 
specific RDMA operation to complete).

For blocking barrier operations, if the rdma_id specified is not 
complete, the caller blocks until the RDMA is complete. Note that the 
barrier can return early, even if the RDMA is not complete. A status of 
success indicates the RDMA completed, EAGAIN means try again, and any 
other status indicates that an error occurred.
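
Because a blocking barrier can return early, callers would typically 
loop on it. The sketch below shows one way to handle the three outcomes. 
The blocking_barrier callable, its (err, last_completed) return shape 
and the retry limit are assumptions for illustration, not the RDS 
interface; make_fake_barrier only exists to exercise the loop.

```python
import errno
import os


def wait_until_complete(blocking_barrier, dest, rdma_id, max_tries=100):
    """Retry a (hypothetical) blocking barrier until it reports success."""
    for _ in range(max_tries):
        err, last_completed = blocking_barrier(dest, rdma_id)
        if err == 0:
            # Success: rdma_id and everything before it is complete.
            return last_completed
        if err != errno.EAGAIN:
            # Anything other than EAGAIN is a real error.
            raise OSError(err, os.strerror(err))
        # EAGAIN: the barrier returned early, so call it again.
    raise TimeoutError(f"rdma_id {rdma_id} to {dest} still pending")


def make_fake_barrier(early_returns):
    """Stand-in barrier: returns EAGAIN a few times, then success."""
    remaining = [early_returns]

    def fake_barrier(dest, rdma_id):
        if remaining[0] > 0:
            remaining[0] -= 1
            return errno.EAGAIN, rdma_id - 1
        return 0, rdma_id

    return fake_barrier
```

For example, wait_until_complete(make_fake_barrier(2), "x", 7) calls the 
fake barrier three times and returns 7.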

For non-blocking barrier operations, if the rdma_id specified is not 
complete, EAGAIN is returned. Additionally, the local socket used in the 
barrier is armed with the rdma_id specified. Arming the socket allows a 
subsequent poll call to wake only when the armed rdma_id completes, 
whereas a non-armed socket would wake from poll when any RDMA completes.
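
The arming behaviour can be modelled roughly as follows. This is a 
sketch for a single destination with invented names, not the RDS 
implementation. It assumes in-order completion, and it assumes arming is 
one-shot (cleared once the armed id completes), which is not stated 
above.

```python
import errno


class ArmableSocketModel:
    """Hypothetical single-destination model of socket arming."""

    def __init__(self):
        self.last_completed = 0
        self.armed_id = None  # None means the socket is not armed

    def nonblocking_barrier(self, rdma_id):
        if rdma_id <= self.last_completed:
            return 0, self.last_completed
        # Not complete yet: arm the socket and tell the caller to retry.
        self.armed_id = rdma_id
        return errno.EAGAIN, self.last_completed

    def on_completion(self, rdma_id):
        """Record a completion; return True if poll would wake."""
        self.last_completed = rdma_id
        if self.armed_id is None:
            return True  # unarmed: any completion wakes poll
        if rdma_id >= self.armed_id:
            self.armed_id = None  # assumption: arming is one-shot
            return True
        return False  # armed, but this is not the id being waited for
```

With a fresh model, nonblocking_barrier(3) returns EAGAIN and arms the 
socket; on_completion(1) and on_completion(2) then return False, and 
on_completion(3) returns True.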

Consider the case of issuing ten RDMA operations to a destination and 
then calling poll to wait for the next RDMA to complete. In theory, poll 
would wake ten times, each wake followed by a call to the RDS barrier to 
see which RDMA has completed. By arming the socket with the id of the 
last RDMA operation initiated (the tenth), a subsequent poll call wakes 
only when the last RDMA completes. Additionally, a single call to the 
RDS barrier with the id of the tenth RDMA operation that returns success 
indicates that all ten RDMA operations are complete. To summarize: with 
socket arming via a barrier operation, a single poll wake and a single 
barrier call detect that all ten operations are complete, versus ten 
polls plus ten barrier calls.
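
The arithmetic of that example can be checked with a few lines of 
Python. The function is purely illustrative and assumes the ten 
operations complete in order with ids 1 through 10:

```python
def count_wakes(completions, armed_id=None):
    """Count poll wakes for a sequence of completed rdma_ids."""
    wakes = 0
    for rdma_id in completions:
        # Unarmed: every completion wakes poll.
        # Armed: only the armed id (or a later one) wakes poll.
        if armed_id is None or rdma_id >= armed_id:
            wakes += 1
    return wakes
```

count_wakes(range(1, 11)) gives 10 for the unarmed socket, while 
count_wakes(range(1, 11), armed_id=10) gives 1.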



