[rds-devel] RDS status and future plans

Wed Oct 28 17:55:03 PDT 2009

Hi all,

I wanted to send out an update email, in case all you lurkers are
interested in how things are going, and our future plans.

Our current status is that we have finally transitioned internally from
OFED 1.3.1-based code to 1.4.2-based code, which I am very happy about :)

The RDS changes for OFED 1.5 are pretty modest -- we reintroduced TCP
support and modularized things a bit. The 1.5 development cycle felt
pretty short, actually. RDS/TCP is not new code, but virtualization has
given it a new reason to exist that may make RDS in 1.5 worth the upgrade.

Things should get more interesting for RDS in OFED 1.6. We have started
hashing out its proposed feature set, and so far have come up with the
list below. Aside from the first task, atomics, we have not yet decided
which tasks from this list we're going to work on, so if you have any
comments either way on these, or suggestions for other improvements,
those would be most appreciated!

Here's the list so far:

# Atomics
  * Define and document additional CMSG-based API
  * Re-arch send path -- decompose into 1 rds_message per op, instead of
1 rds_message per syscall
  * iWARP and TCP transports will need emulation
# RDMA Emulation
  * Allows clients to assume RDMA and let us handle the details
  * Helps TCP transport, as well as iWARP loopback
# Multiple (num_cpus? num_sockets?) RDS connections between two nodes
  * Help alleviate head-of-line blocking
  * Includes SRQ to keep memory use in line.
  * May also improve locking overhead or cache locality
# Better SMP scaling
  * Make workqueue per-cpu - raise IOP rate by parallelizing
  * Do work in tasklets or workqueues so RDS can't hose the system
  * Multiple TX/RX queues enable handling datagrams on same CPU as
destination process?
# Netstat support
  * Use normal kernel mechanisms instead of rds-info, as much as possible
# More detailed dynamic tracing
  * Subsumed by new std kernel facilities?
# SWI - Send with Invalidate - Reduce manual key invalidations (sync TPT
thread)
  * After op with USE-ONCE key is complete, server invalidates key on
client with SWI
  * Protocol change, requires feature negotiation on connection
establishment
# Fast registration work requests
  * Possible external contribution?
  * Get us on standardized IB interfaces
  * IB/iWARP transport re-unification?
# rds-stress using multiple IPs
# Figure out why we see performance imbalance (1771).
# Zero copy bcopy send interface
# Protection domain interface: allows clients to define private
protection domains for isolation
# Reduced path lengths
  * Reduce or eliminate allocs in rds_rdma_map, rds_rdma_prepare for all
sizes if possible, but at least <= 8KB ops
  * Time TX/RX hot paths for additional path length shortening
opportunities
# Reduce interrupts per xfer
  * Look at local (SIGNALED) and remote (SOLICITED) notifications for
extraneous notifications
  * Eliminate per-msg local send complete notifications in favor of
every-N-WRs
  * For the send queue, switch from notify_cq(NEXT_COMP) to
notify_cq(SOLICITED)
# Dead/idle Connection cleanup
  * Connections to nonexistent machines are totally useless (WIP)
  * Tear down connection if idle for 5(?) minutes?
# Fix outstanding RDS bugs in bugzilla
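
To make the "every-N-WRs" item under "Reduce interrupts per xfer" a bit
more concrete, here is a rough sketch in plain C. This is not actual RDS
code: the names (send_state, want_signal) and the interval of 16 are made
up for illustration, and a real implementation would set the signaled
flag on the work request rather than return a bool.

```c
#include <stdbool.h>

/* Hypothetical value: ask for a completion on every 16th send WR.
 * In practice this would have to stay below the send queue depth,
 * since unsignaled WRs are only retired once a later signaled WR
 * completes. */
#define SIGNAL_INTERVAL 16

/* Hypothetical per-connection state: number of WRs posted since the
 * last signaled one. */
struct send_state {
    unsigned int unsignaled;
};

/*
 * Decide whether the WR about to be posted should request a local
 * send completion. `force` covers WRs that must always be signaled,
 * whatever the counter says.
 */
static bool want_signal(struct send_state *s, bool force)
{
    if (force || s->unsignaled + 1 >= SIGNAL_INTERVAL) {
        s->unsignaled = 0;   /* this WR is signaled; restart the count */
        return true;
    }
    s->unsignaled++;         /* this WR is posted unsignaled */
    return false;
}
```

With an interval of 16, posting 64 WRs back to back asks for 4
completions instead of 64.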

Regards -- Andy