[rds-devel] The meaning of MR invalidation

Or Gerlitz ogerlitz at voltaire.com
Thu Feb 14 06:31:29 PST 2008


Or Gerlitz wrote:
> On 2/13/08, Olaf Kirch <olaf.kirch at oracle.com> wrote:
>> That's one of the things I have no real understanding of. What is the
>> actual difference in performance when you use an FMR exactly once?

> Let me think about this and check with the Mellanox architects,

Hi Olaf,

For every incoming RDMA IB packet the HCA does a TPT cache lookup.

Hence, if the I/Os served by a specific mapping (rkey) are large, such 
that they span m >> 1 IB MTU sized packets (for example, the IB MTU is 2K 
and the I/O is 1M, so 512 IB packets are needed to serve the RDMA 
operation), then after one cache miss, on which the HCA has to issue a 
lookup in its network MMU, all the other packets may be served by the 
cache.
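
To put numbers on the example above, here is a small illustrative sketch. It assumes the best case described in the paragraph (only the first packet of the I/O misses the TPT cache); that is an assumption for the arithmetic, not a measurement of any HCA.

```python
# Best-case TPT cache behaviour for one large RDMA operation.
# Assumption: only the first packet of the I/O misses the cache.
IB_MTU = 2 * 1024        # 2K IB MTU, as in the example
IO_SIZE = 1024 * 1024    # 1M I/O, as in the example

packets = -(-IO_SIZE // IB_MTU)          # ceiling division: packets per I/O
misses = 1                               # the single lookup in the network MMU
hit_ratio = (packets - misses) / packets # fraction of packets served by the cache

print(packets, misses, hit_ratio)
```

The larger the I/O relative to the MTU, the more packets share the cost of that one miss.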

When multiple I/Os are running in parallel, each served by a 
different FMR --> different rkey --> different cache slot, 
first, they all compete for the cache, and second, since fmr_unmap does a 
SYNC_TPT, which flushes the cache, one has to try to avoid calling 
fmr_unmap whenever possible.

So basically, when each fmr is remapped n times you get, over time, fewer 
SYNC_TPT calls compared to the case where each fmr is mapped once before 
being moved to the unmap queue. However, if you use enough FMRs such that 
you don't call SYNC_TPT "too much", the use-once design should function 
quite well compared to the use-n design.

For example, a scheme where you have to serve 1000 1MB IOs/sec, you 
alloc 5K FMRs, and once every 4 seconds you unmap 4K FMRs from a 
background thread might work quite well, but this has to be validated 
of course.
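
A back-of-the-envelope check of that example, as a sketch only. It reads "K" as 1000, assumes one FMR mapping per I/O, and assumes one batched unmap pass costs one SYNC_TPT; all three are assumptions made for the arithmetic.

```python
# Rough sizing of the use-once scheme from the example above.
IO_RATE = 1000       # 1MB I/Os per second, one FMR mapping each (assumption)
POOL_SIZE = 5000     # FMRs allocated up front ("K" read as 1000)
UNMAP_BATCH = 4000   # FMRs unmapped per background pass
UNMAP_PERIOD = 4     # seconds between background passes

dirty_per_period = IO_RATE * UNMAP_PERIOD   # FMRs dirtied between two passes
headroom = POOL_SIZE - dirty_per_period     # FMRs still free just before a pass
ios_per_flush = IO_RATE * UNMAP_PERIOD      # I/Os served per SYNC_TPT (assumption)

print(dirty_per_period, headroom, ios_per_flush)
```

With these figures each pass drains exactly what was dirtied since the previous one, and the remaining FMRs are the slack that absorbs bursts while a pass is running.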


> the flow I see for rds in that case would be something like:

> rds_pool_start: alloc N FMRs

> rds_pool_get: get FMR from the free list and map it
> rds_pool_put: put the used FMR in the dirty list

> rds_unmap_background_thread:  if the dirty list size > M call
> fmr_unmap on the M FMRs in the dirty list and then return them to the free list

> rds_pool_stop: unalloc N FMRS
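
The flow quoted above can be modelled with a toy free-list / dirty-list pool. This is only a sketch: `FmrPoolSketch` and its methods are hypothetical names, no real verbs calls are made, and counting one flush per batched unmap is an assumption.

```python
from collections import deque


class FmrPoolSketch:
    """Toy model of the use-once flow; FMRs are just integer ids."""

    def __init__(self, n, m):
        self.free = deque(range(n))  # rds_pool_start: alloc N FMRs
        self.dirty = deque()         # used FMRs waiting for unmap
        self.m = m                   # unmap threshold M
        self.flushes = 0             # stands in for the SYNC_TPT count

    def get(self):
        # rds_pool_get: take an FMR from the free list (mapping is implied)
        if not self.free:
            return None              # caller has to wait for an unmap pass
        return self.free.popleft()

    def put(self, fmr):
        # rds_pool_put: a used FMR goes to the dirty list, not back to free
        self.dirty.append(fmr)

    def unmap_pass(self):
        # rds_unmap_background_thread: act only once more than M are dirty
        if len(self.dirty) <= self.m:
            return 0
        batch = [self.dirty.popleft() for _ in range(self.m)]
        self.flushes += 1            # one batched unmap, one flush (assumption)
        self.free.extend(batch)      # unmapped FMRs return to the free list
        return len(batch)


pool = FmrPoolSketch(n=8, m=4)
for _ in range(5):                   # five use-once I/Os
    pool.put(pool.get())
unmapped = pool.unmap_pass()
print(unmapped, len(pool.free), len(pool.dirty), pool.flushes)
```

The point of the model is that the flush counter grows with the number of batches, not with the number of I/Os.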

Now, if you are willing to go with that approach, it means that if the 
core fmr pool API is enhanced such that you can --specify-- how many 
times an fmr can be mapped before it's queued for unmap, RDS should be 
able to use this cache again and not have one of its own!

Or




