[rds-devel] What is the road map for RDS ?

Thu Nov 15 09:27:59 PST 2007

_RDS V1 -> available for Oracle 10g R2:_

-  Low latency bcopy sends / recvs.
-  Proprietary stack built by SilverStorm (now QLogic).
-  Certified for use with Oracle 10g R2.
-  Parts of the stack were contributed / ported to Open IB.

This stack is still in use - it has been shown to be very stable - and 
provides outstanding performance for Oracle.

We achieved world record TPC-H performance and have several 
testimonials from customers.

Several presentations are online at www.openib.org -> past conferences - 
or just search on the web.

When we started RDS v1, our clear goal was to have a single open source 
RDS implementation for Linux and, ideally, to have all other platforms 
port it !

RDS V1 - and its success - were the genesis of OFED RDS v2.

_RDS v2 -> in certification for Oracle 11g R1._

From an interface and behavioral perspective, RDS v2 is essentially RDS 
v1 with very minor interface changes - in the direction of a purer 
socket interface - and some HA model simplifications.

However RDS v2 is all new code - in Open Source - available to all under 
dual licensing.

Oracle 11g R1 will only run on / support RDS v2 from OFED.

RDS v2 is in OFED 1.2.5.2.

Several customers are testing / evaluating RDS v2.

_RDS V3 -> under development_

- Zero copy rdma operations - this is not zero copy send / recv.

- Is the RDS v2 driver with additional interfaces for zero copy extensions.

- Maintains full compatibility with RDS v2 applications.

- Maintains full compatibility with RDS v2 wire protocol.

- Planned for OFED 1.3.

- Our original goal was to push for a documented wire protocol and 
interoperability between platforms (Linux to HP, etc). However, based on 
the time line for OFED 1.3, this work will not be in OFED 1.3 but is 
now planned for a later OFED release.

_RDS v3.n - interoperability and documented wire protocol_
- yep - it's time to get this done.

- Oracle requires interoperability - for example, Linux and AIX systems 
may be exchanging data over RDS, including zero copy operations.

- We may form a working group to meet on a regular basis for the purpose 
of producing specifications for wire protocol, interoperability testing 
suites, etc.

_RDS V4 - zero copy recvs (and possibly sends)._

Why do zero copy recvs -> please use the app buffers !

The largest consumer of memory is recv side message processing.

With bcopy - these buffers are staged in kernel memory - a precious and 
limited resource.

If we had unlimited recv side memory - we would not need the complex 
congestion management model in RDS today.

Today, the Oracle IPC clients are already pre-posting buffers - at the 
user level - to receive incoming messages.

However, we currently cannot pre-post recv buffers to the socket interface.

Further, the IPC clients are implementing flow control based on their 
buffer models.

If we can post the user buffers to RDS directly - then:

- we do not need to stage recv data into kernel memory (well maybe for 
some edge conditions).
- we can get rid of the congestion protocol.
- we get zero copy for recv messages.

To do pre-posting will require an async RDS recv buffer posting interface.
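
As a rough illustration of the pre-posting idea above, here is a toy Python model. It is not the RDS API: the class name PrepostedRecvQueue and its methods (post_recv, credits, deliver, reap) are hypothetical names invented for this sketch. It only shows the shape of the argument: data lands directly in an application buffer, and the number of posted buffers doubles as a flow control credit count.

```python
from collections import deque


class PrepostedRecvQueue:
    """Toy model of an async recv-posting interface (hypothetical, not RDS)."""

    def __init__(self):
        self._posted = deque()      # application buffers waiting for data
        self._completed = deque()   # (buffer, length) pairs ready to reap

    def post_recv(self, buf: bytearray) -> None:
        """Application hands a buffer to the transport ahead of time."""
        self._posted.append(buf)

    def credits(self) -> int:
        """How many messages can be absorbed without kernel staging."""
        return len(self._posted)

    def deliver(self, payload: bytes) -> bool:
        """Simulate a message arriving; place it straight into a posted buffer.

        Returns False when no suitable buffer is posted. That is the edge
        condition where a real implementation would have to stage the data
        or push back on the sender.
        """
        if not self._posted or len(payload) > len(self._posted[0]):
            return False
        buf = self._posted.popleft()
        buf[:len(payload)] = payload          # data lands in the app buffer
        self._completed.append((buf, len(payload)))
        return True

    def reap(self):
        """Return the next completed (buffer, length) pair, or None."""
        return self._completed.popleft() if self._completed else None


rq = PrepostedRecvQueue()
app_buf = bytearray(16)
rq.post_recv(app_buf)

assert rq.deliver(b"hello")             # lands directly in app_buf
buf, length = rq.reap()
assert buf is app_buf and bytes(buf[:length]) == b"hello"
assert not rq.deliver(b"again")         # nothing posted: no credit left
```

In this model the client's own buffer accounting is the flow control, which is the reason the separate congestion protocol could go away.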

Why are zero copy sends - less interesting ?

To post a buffer for zero copy send - requires an async posting 
interface and a way to reap async send completions. Not too bad - we can 
do this.

However, the real issue is that zero copy sends do not complete until 
they are posted on the wire - whereas with bcopy, the send is complete as 
soon as the data is copied into a kernel buffer.

Oracle clients gain significant performance advantages from this early 
bcopy completion - mostly due to their internal memory models. We 
investigated zero copy sends with uDAPL / uTAPI; the net result was 
very little gain in reduced CPU cycles - and increased latency for send 
completions.
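
The difference in completion semantics can be sketched with a toy Python model. It assumes nothing about the real RDS code: BcopySender and ZeroCopySender are hypothetical names, and transmit() simply stands in for the hardware putting data on the wire.

```python
class BcopySender:
    """Send completes as soon as the payload is copied to a staging buffer."""

    def __init__(self):
        self.staged = []

    def send(self, app_buf: bytearray) -> bool:
        self.staged.append(bytes(app_buf))   # the copy costs CPU cycles...
        return True                          # ...but app_buf is reusable now

    def transmit(self):
        wire, self.staged = self.staged, []
        return wire


class ZeroCopySender:
    """Send completes only after the data has gone out on the wire."""

    def __init__(self):
        self.pending = []       # references to app buffers, no copy made
        self.completions = []   # buffers the app may touch again

    def post_send(self, app_buf: bytearray) -> None:
        self.pending.append(app_buf)

    def transmit(self):
        # bytes(b) stands in for the adapter reading the buffer at send time
        wire = [bytes(b) for b in self.pending]
        self.completions.extend(self.pending)
        self.pending = []
        return wire


# bcopy: the app can overwrite its buffer right after send() returns
buf = bytearray(b"msg1")
bc = BcopySender()
assert bc.send(buf)
buf[:] = b"msg2"
assert bc.transmit() == [b"msg1"]

# zero copy: overwriting before the completion changes what is sent
buf = bytearray(b"msg1")
zc = ZeroCopySender()
zc.post_send(buf)
assert zc.completions == []      # not complete yet, buffer still owned
buf[:] = b"msg2"                 # premature reuse
assert zc.transmit() == [b"msg2"]
assert zc.completions == [buf]   # only now may the app reuse buf
```

The second half shows why a zero copy sender has to hold its buffer until the completion is reaped, which is where the extra send completion latency comes from.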