[rds-devel] RE: [PATCH 1/2] RDS/IB: Handle connection request in case of failover.

Vladimir Sokolovsky vlad at mellanox.co.il
Sun Jun 10 07:59:30 PDT 2007


Hi Zach,
During testing I hit the BUG_ON(ic->i_cm_id) (file: net/rds/ib-cm.c,
line: 252).
So checking _CONNECTED and _CONNECTING is not enough.

To reproduce:

* Setup:
	Node A, connected by 2 IB ports to an IB switch
	Node B, connected by 2 IB ports to the same IB switch

* Test:
	Run rds-sink on node A
	Run rds-gen on node B

	Then disconnect the active IB port on node A
	After ~20 sec node A will call rds_shutdown_worker
	Node B will then get RDMA_CM_EVENT_CONNECT_REQUEST and hit
	BUG_ON(ic->i_cm_id)
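
To illustrate why the state test alone can miss this case, here is a
minimal userspace sketch. It is not the RDS code: struct conn, the enum
names, CONN_OTHER and both handler functions are hypothetical stand-ins,
and the second handler shows only one possible extra check, not
necessarily what the patch does. The assumption is simply that a
connection can be in some state other than _CONNECTING or _CONNECTED
while its cm_id is still set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; names are illustrative, not the RDS code. */
enum conn_state {
	CONN_DOWN,
	CONN_CONNECTING,
	CONN_CONNECTED,
	CONN_OTHER	/* assumed: any state that is neither of the two above */
};

struct conn {
	enum conn_state state;
	void *cm_id;	/* stands in for ic->i_cm_id */
};

enum req_result {
	REQ_ACCEPTED,		/* no stale cm_id, the request can proceed */
	REQ_SHUTDOWN_FIRST,	/* old connection is shut down first */
	REQ_WOULD_BUG		/* where BUG_ON(ic->i_cm_id) would fire */
};

/* State-only check: only _CONNECTING and _CONNECTED trigger a shutdown. */
enum req_result handle_req_state_only(const struct conn *c)
{
	assert(c != NULL);
	if (c->state == CONN_CONNECTING || c->state == CONN_CONNECTED)
		return REQ_SHUTDOWN_FIRST;
	/* Any other state falls through; a leftover cm_id hits the BUG_ON. */
	return c->cm_id ? REQ_WOULD_BUG : REQ_ACCEPTED;
}

/* Assumed variant: a leftover cm_id also counts as a previous connection. */
enum req_result handle_req_check_cm_id(const struct conn *c)
{
	assert(c != NULL);
	if (c->state == CONN_CONNECTING || c->state == CONN_CONNECTED ||
	    c->cm_id != NULL)
		return REQ_SHUTDOWN_FIRST;
	return REQ_ACCEPTED;
}
```

With a connection in CONN_OTHER and a non-NULL cm_id, the first handler
reaches the BUG_ON path while the second one requests a shutdown.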


So far I have not found a better way to fix this issue.

Regards,
Vladimir
 

> -----Original Message-----
> From: zab at oss.oracle.com [mailto:zab at oss.oracle.com] On 
> Behalf Of Zach Brown
> Sent: Wednesday, June 06, 2007 9:25 PM
> To: Vladimir Sokolovsky
> Cc: rds-devel at oss.oracle.com; Tziporet Koren; Chris Mason
> Subject: Re: [PATCH 1/2] RDS/IB: Handle connection request in 
> case of failover.
> 
> > Hi Zach,
> > Please review the following two patches:
> 
> Sure.  Your patches don't have complete descriptions, though, 
> so I'm forced to ask questions.
> 
> >         /*
> > +        * the connection request may occur while the
> > +        * previous connection exist. E.g. in case of failover
> > +        */
> 
> This logic seems to duplicate the logic under the test for 
> _CONNECTING, with the addition of setting i_wc_err.  Why do 
> these connection attempts get to the BUG_ON() instead of 
> noticing that _CONNECTING is set on the existing connection?  
> That should trigger a shutdown, just like these tests you added do.
> 
> - z
> 


