[rds-devel] [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock

Sowmini Varadhan sowmini.varadhan at oracle.com
Mon May 2 09:37:15 PDT 2016


On (05/02/16 09:20), Santosh Shilimkar wrote:
> > 	rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING);
> >+	if (rs_tcp->t_sock) {
> >+		/* Need to resolve a duelling SYN between peers.
> >+		 * We have an outstanding SYN to this peer, which may
> >+		 * potentially have transitioned to the RDS_CONN_UP state,
> >+		 * so we must quiesce any send threads before resetting
> >+		 * c_transport_data.
> >+		 */
> >+		wait_event(conn->c_waitq,
> >+			   !test_bit(RDS_IN_XMIT, &conn->c_flags));
> Would it be good to check the return value of rds_conn_transition()
> since if CONN is already UP above will fail and then send message
> might again race and we will let message through even though passive
> hasn't finished its connection.

No, that was the original issue that I was running into, which needed
commit 241b2719. Prior to that commit, if the conn was already UP,
we'd end up doing a rds_conn_drop on a good connection, and both sides
would end up in a pair of infinite 3WH loops. Even if we don't do
a rds_conn_drop on the UP connection, we've just (before
rds_tcp_accept_one) sent out a syn-ack on the incoming syn, and now
need to RST that syn-ack. The other side is going to receive the RST
and get confused about what to clean up (since there's already an UP
connection going on).

In short, when there is a duel, it's cleanest to have a deterministic
arbitration: both sides use the numeric value of saddr and faddr to
figure out which side is active and which side is passive. (Thus the
BGP router-id based model as the basis for 241b2719.)
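
To illustrate the arbitration idea, here is a minimal userspace sketch,
not the code from 241b2719. The helper name local_is_active() is
hypothetical, and the choice that the numerically smaller address takes
the active role is an assumption made for the example; the point is only
that both peers evaluate the same comparison and so reach opposite,
consistent answers.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether the local end should keep the
 * active (outgoing) connection when SYNs cross.
 *
 * saddr_be: local IPv4 address, network byte order
 * faddr_be: foreign (peer) IPv4 address, network byte order
 *
 * Assumption for this sketch: the side with the numerically smaller
 * address is active. The comparison is done in host byte order so the
 * result does not depend on the endianness of either machine.
 */
static bool local_is_active(uint32_t saddr_be, uint32_t faddr_be)
{
	return ntohl(saddr_be) < ntohl(faddr_be);
}
```

Because the peer runs the same test with saddr and faddr swapped, exactly
one side sees "active" for any pair of distinct addresses, so neither
side has to guess which socket to reset.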

FWIW, much of this is actually a corner case: in practice, it's not
frequent to have syns crossing each other at "almost the same time".

--Sowmini



