[rds-devel] Re: process death - with outstanding rdma operations

Mon Jan 21 07:50:03 PST 2008

Or Gerlitz wrote:
> Richard Frank wrote:
>> Or Gerlitz wrote:
>
>> Good points Or - rdsv3.h describes the user interface and expected 
>> behavior - not the driver implementation..
>> We need to update this doc..
>
>>> 1. death of a remote process holding keys sent by local process/es
>
>> The rdma server does not hold the key - it's given permission to use 
>> it - and the permission can be revoked at anytime 1) explicitly by 
>> the client 2) if the client process dies - by the RDS driver...
>
> So a possible design you suggest here is that the client app would set
> a per-key timer and revoke the key when the timer expires, OK. As for
> client process death making RDS revoke the key, it means that RDS
> has to do per-process bookkeeping for registrations (namely
> pages locked and keys) done by it on behalf of that process, correct?
Yes - I believe we are doing this now.
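
To make the bookkeeping concrete, here is a rough sketch of the idea in Python. It is a toy model only, not the RDS driver code: the class name KeyRegistry and all of its methods are made up for illustration, and page pinning is left out. It tracks keys per process, revokes them on explicit request or on process exit, and shows the access error the rdma server would then see.

```python
from collections import defaultdict


class KeyRegistry:
    """Toy model of per-process key tracking (hypothetical, not RDS code)."""

    def __init__(self):
        self._next_key = 1
        self._keys_by_pid = defaultdict(set)  # pid -> keys registered for it
        self._valid = set()                   # keys a remote peer may still use

    def register(self, pid):
        """Hand out a new key on behalf of process `pid`."""
        key = self._next_key
        self._next_key += 1
        self._keys_by_pid[pid].add(key)
        self._valid.add(key)
        return key

    def revoke(self, pid, key):
        """Explicit revocation; only the owning process can revoke a key."""
        if key in self._keys_by_pid.get(pid, ()):
            self._keys_by_pid[pid].remove(key)
            self._valid.discard(key)

    def process_exit(self, pid):
        """Revoke every key the dead process still held."""
        for key in self._keys_by_pid.pop(pid, set()):
            self._valid.discard(key)

    def remote_access(self, key):
        """Model the rdma server using a key; fails once it is revoked."""
        if key not in self._valid:
            raise PermissionError(f"access error: key {key} is not valid")
        return True
```

With this model, a call to process_exit(pid) followed by remote_access() on one of that process's keys raises the access error, while keys owned by other processes keep working.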

>
>> It's up to the client side app to decide if / when to revoke keys 
>> (release the keys).
>
>>> 2. whether an IB RC connection should be broken b/c one of the 
>>> processes using it has died or leaked resources
>
>> When a process dies all resources held by the driver must be cleaned 
>> up - this should not need to be doc'd ?
>> If a client process dies - and its keys are released - and then
>> subsequent use by the rdma server gets an access error - that's what 
>> we want - right ?
>
> I am not sure that the best way to go here is to break this IB RC 
> connection (under IB RC each completion with error moves the QP into 
> the error state) as soon as the client process dies, maybe wait some 
> grace period before revoking the keys (unpinning the pages etc) in the 
> hope that the remote server is done with them?
>
Currently, we let the transport deal with this - in the case of IB - the 
RC is broken when / if the rdma server attempts to use a key that is no 
longer valid resulting in an access error. In this case the RC is 
silently reconnected by the transport. Ideally, only the rdmas with 
access errors are dropped.

>> One addition recently proposed (with patch) is to have per GID RCs - 
>> this isolates behavior to processes that agree to play together....
>
> Can you clarify this? I don't follow.
>
The idea is to limit the potential bad behavior of processes (DOS) to 
the set of processes within a group - or more importantly to exclude 
processes outside of a group from affecting another group.

The RDS driver / transport creates an RC per group instead of one system
wide. Each group has a private RC that it shares with all processes in
the group for send / rdma operations.
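
A small sketch of the isolation this buys, again as a toy Python model rather than the proposed patch. The Transport class, the group ids and the "generation" counter (standing in for a break / reconnect of the RC) are all assumptions made for the example.

```python
class Transport:
    """Toy model: one RC per group instead of one system-wide RC."""

    def __init__(self):
        self._rc_by_group = {}  # group id -> connection record

    def rc_for(self, group):
        """Return the group's private RC, creating it on first use."""
        rc = self._rc_by_group.get(group)
        if rc is None:
            rc = {"group": group, "generation": 1}
            self._rc_by_group[group] = rc
        return rc

    def access_error(self, group):
        """An access error breaks and reconnects only that group's RC."""
        rc = self.rc_for(group)
        rc["generation"] += 1
        return rc
```

All processes in group "a" get the same record from rc_for("a"), and an access error in group "a" bumps only that record; the RC for group "b" is untouched.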

>> Why would the RDS driver / ULP deal with a poorly behaved client / 
>> rdma server ? Beyond the resource (key) quota - what else would you 
>> do ? Seems like a app issue ?
>
> b/c user space is not reliable (ie allowed to do DOS attack on the
> system, if you like) and kernel modules are expected to be well
> behaved. The way to go here, as you agreed, seems to me to be imposing
> some resource limitation on user space process registrations, similar
> to the socket send_buf, etc.
>
Yes - understood - we need to limit the allocation of the "key" resource 
via a quota system..
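
As a sketch of what such a quota could look like, assuming a simple per-socket counter (the KeyQuota name, the limit value and the choice of ENOMEM as the error are all made up here, nothing has been decided):

```python
import errno


class KeyQuota:
    """Toy per-socket limit on outstanding keys (hypothetical)."""

    def __init__(self, limit):
        self.limit = limit  # max keys the socket may hold at once
        self.in_use = 0     # keys currently held

    def acquire(self):
        """Charge one key against the quota, or fail if it is used up."""
        if self.in_use >= self.limit:
            raise OSError(errno.ENOMEM, "key quota exceeded")
        self.in_use += 1

    def release(self):
        """Return one key to the quota when it is revoked / released."""
        if self.in_use > 0:
            self.in_use -= 1
```

Registration would call acquire() before handing out a key, and revocation (explicit or on process death) would call release().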

>>> 4. no limitation on how many registrations a process can do
>
>> We need to limit the keys allocated by a process - one proposal 
>> discussed is to overload mlock - an alternative is to have a per 
>> socket limit similar to so_snd/rcv... via a new ioctl... I prefer the
>> latter - as the key pool is a very limited resource - or is it ?
>
> I am not following here either, why not a setsockopt similar to
> so_sndbuf?
>
An alternative to using a new quota - say via a new ioctl to set the
limit and / or a proc interface to set default values - is to overload
the mlock quota. Some issues with the mlock approach are - 1) we'd need
to add a capability for non-root processes to set mlock 2) a single key
can represent 1 to n pages of vm - so how does one set a reasonable
mlock quota / limit for keys vs memory locked?
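
To put numbers on issue 2, a quick sketch in Python. The 4096-byte page size and the function name are assumptions for the example; the point is only that a byte limit bounds locked pages, not keys.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def key_count_bounds(mlock_limit_bytes, max_pages_per_key):
    """Return (fewest, most) keys that can exhaust a given mlock limit."""
    pages = mlock_limit_bytes // PAGE_SIZE
    most_keys = pages                         # every key covers a single page
    fewest_keys = pages // max_pages_per_key  # every key covers the maximum
    return fewest_keys, most_keys
```

For example, key_count_bounds(64 * 1024, 16) gives (1, 16): the same 64 KB mlock limit is used up by anywhere from 1 to 16 keys, so it says little about how much of the key pool a process can consume.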

> Or
>
>
>