[rds-devel] Re: process death - with outstanding rdma operations

Tue Jan 22 05:07:28 PST 2008

Richard Frank wrote:
> Or Gerlitz wrote:

>> So a possible design you suggest here is that the client app would set 
>> a per key timer and revoke the key when the timer expires, OK. As for 
>> client process death making RDS revoke the key, it means that RDS 
>> has to manage per process bookkeeping for registrations (namely 
>> pages locked and keys) done by it on behalf of that process, correct?

> Yes - I believe we are doing this now.

It's unclear from your response whether you are referring to the first 
case (client app revokes) or the second case (RDS driver revokes).
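
To make sure we are talking about the same two cases, here is a rough 
user-space sketch of the bookkeeping I have in mind. It is only a model: 
the names (reg_entry, proc_regs, proc_regs_revoke_all) are made up for 
illustration and are not taken from the RDS code, and real page pinning 
and key invalidation are reduced to counters.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record of one registration done on behalf of a process. */
struct reg_entry {
    uint32_t key;            /* key handed out to the remote peer */
    size_t npages;           /* pages locked for this registration */
    struct reg_entry *next;
};

/* Hypothetical per-process bookkeeping: all keys and locked pages. */
struct proc_regs {
    struct reg_entry *head;
    size_t pinned_pages;
};

/* Record a new registration. Returns 0 on success, -1 if out of memory. */
int proc_regs_add(struct proc_regs *p, uint32_t key, size_t npages)
{
    struct reg_entry *e;

    assert(p != NULL);
    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->key = key;
    e->npages = npages;
    e->next = p->head;
    p->head = e;
    p->pinned_pages += npages;
    return 0;
}

/* First case: one key is revoked on request of the client app (for
 * example when its per key timer expires). Returns 0 if the key was
 * found, -1 otherwise. */
int proc_regs_revoke(struct proc_regs *p, uint32_t key)
{
    struct reg_entry **pp;

    assert(p != NULL);
    for (pp = &p->head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->key == key) {
            struct reg_entry *e = *pp;

            *pp = e->next;
            p->pinned_pages -= e->npages;
            free(e);
            return 0;
        }
    }
    return -1;
}

/* Second case: the driver revokes everything the process still holds
 * when the process dies. Returns the number of keys revoked. */
size_t proc_regs_revoke_all(struct proc_regs *p)
{
    size_t n = 0;

    assert(p != NULL);
    while (p->head != NULL) {
        struct reg_entry *e = p->head;

        p->head = e->next;
        p->pinned_pages -= e->npages;
        free(e);
        n++;
    }
    return n;
}
```

In this model the first case is a proc_regs_revoke() call issued by the 
app, and the second case is a single proc_regs_revoke_all() call made 
from the process teardown path.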

> longer valid resulting in an access error. In this case the RC is 
> silently reconnected by the transport. Ideally, only the rdmas with 
> access errors are dropped.

Does this mean that buddy rdmas which complete with a flush error are 
retried? I was thinking that rdmas are not reliable.

> The idea is to limit the potential bad behavior of processes (DOS) to 
> the set of processes within a group - or more importantly to exclude 
> processes outside of a group from affecting another group.
> 
> The RDS driver / transport - creates an RC per group - vs - system wide. 
> Each group has a private RC that it is sharing with all processes in 
> the group for send / rdma operations.

So with this design, from the viewpoint of the IB stack, each RDS 
process group is associated with a port (listener) etc. (pd, qp, fmr_pool) 
on the local node (versus the situation today, where rds calls 
rdma_listen once on one port).

How do you manage this? Today RDS connects to the remote side based on 
the --ip-- address provided to sendmsg() and ignores the port. So would 
the per-process-group ports now be exchanged out of band, and would RDS 
take into account the port provided by the app when it calls sendmsg()?
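
To illustrate the question, a small sketch of the two ways the 
destination passed to sendmsg() could select a connection. The helper 
names (make_dest, same_conn_ip_only, same_conn_ip_port) are invented for 
this mail and do not exist in RDS; the sketch assumes IPv4 addresses 
carried in a struct sockaddr_in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Build a destination address from host-order IP and port values. */
struct sockaddr_in make_dest(uint32_t ip, uint16_t port)
{
    struct sockaddr_in sa;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(ip);
    sa.sin_port = htons(port);
    return sa;
}

/* Behaviour as described for today: only the IP address selects the
 * connection, the port is ignored. */
bool same_conn_ip_only(const struct sockaddr_in *a,
                       const struct sockaddr_in *b)
{
    assert(a != NULL && b != NULL);
    return a->sin_addr.s_addr == b->sin_addr.s_addr;
}

/* Hypothetical per-group behaviour: the port takes part in the lookup,
 * so two groups on the same node map to different connections. */
bool same_conn_ip_port(const struct sockaddr_in *a,
                       const struct sockaddr_in *b)
{
    assert(a != NULL && b != NULL);
    return a->sin_addr.s_addr == b->sin_addr.s_addr &&
           a->sin_port == b->sin_port;
}
```

With the first predicate, two destinations on the same node that differ 
only in port would share one RC; with the second they would not, which 
is what a private RC per group seems to require.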

Or