[rds-devel] [PATCH net-next 0/3] RDS: TCP: HA/Failover fixes

Sowmini Varadhan sowmini.varadhan at oracle.com
Wed Nov 16 13:29:47 PST 2016


This series contains a set of fixes for bugs exposed when
we ran the following in a loop between a test machine pair:

 while true; do
   # modprobe rds-tcp on test nodes
   # run rds-stress in bi-dir mode between test machine pair
   # modprobe -r rds-tcp on test nodes
 done

rds-stress in bi-dir mode causes both nodes to initiate
RDS-TCP connections at almost the same instant, which exposes
the bugs fixed in this series.

Without the fixes, rds-stress reports sporadic packet drops
and packets arriving out of sequence. After the fixes, we have
been able to run the test overnight without any issues.

Each patch has a detailed description of the root cause that
it fixes.


Sowmini Varadhan (3):
  RDS: TCP: set RDS_FLAG_RETRANSMITTED in cp_retrans list
  RDS: TCP: Track peer's connection generation number
  RDS: TCP: Force every connection to be initiated by numerically
    smaller IP address

 net/rds/af_rds.c      |    4 ++++
 net/rds/connection.c  |    3 +++
 net/rds/message.c     |    1 +
 net/rds/rds.h         |    8 +++++++-
 net/rds/recv.c        |   36 ++++++++++++++++++++++++++++++++++++
 net/rds/send.c        |    9 +++++++--
 net/rds/tcp_connect.c |   14 +++++++++++++-
 net/rds/tcp_listen.c  |   29 ++++++++++++-----------------
 net/rds/tcp_send.c    |    3 +++
 9 files changed, 86 insertions(+), 21 deletions(-)
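
As a rough illustration of the rule named in the third patch title
(connections initiated by the numerically smaller IP address), the
tie-break can be sketched in user-space C as below. This is only a
sketch and is not taken from the patches: the helper name
sketch_local_should_initiate() is hypothetical, it assumes IPv4
addresses held in network byte order, and it assumes the comparison is
made after converting to host byte order. The actual kernel code may
differ on all of these points.

```c
#include <arpa/inet.h>   /* ntohl(), htonl() */
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch series): decide whether the
 * local node should be the one to initiate the connection.
 *
 * Both arguments are IPv4 addresses in network byte order. They are
 * converted to host order so that the comparison is numeric rather
 * than dependent on the machine's endianness.
 *
 * For two distinct addresses exactly one side gets "true", so when
 * both nodes want to connect at the same instant only one of them
 * actually initiates.
 */
static bool sketch_local_should_initiate(uint32_t local_be, uint32_t peer_be)
{
    return ntohl(local_be) < ntohl(peer_be);
}
```

With 10.0.0.1 and 10.0.0.2 as the pair, only the 10.0.0.1 side would
initiate under this rule. The other side would wait for the incoming
connection, which is one way to avoid the simultaneous-connect case
described above.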



