[rds-devel] [PATCH net] rds: ib: Fix NULL pointer dereference in debug code

David Miller davem at davemloft.net
Thu Nov 9 21:55:16 PST 2017


From: Håkon Bugge <Haakon.Bugge at oracle.com>
Date: Tue,  7 Nov 2017 16:33:34 +0100

> rds_ib_recv_refill() is a function that refills an IB receive
> queue. It can be called from both the CQE handler (tasklet) and a
> worker thread.
> 
> Just after the call to ib_post_recv(), a debug message is printed with
> rdsdebug():
> 
>             ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
>             rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
>                      recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
>                      (long) ib_sg_dma_address(
>                             ic->i_cm_id->device,
>                             &recv->r_frag->f_sg),
>                     ret);
> 
> Now consider an invocation of rds_ib_recv_refill() from the worker
> thread, which is preemptible. Further, assume that the worker thread
> is preempted between the ib_post_recv() and rdsdebug() statements.
> 
> Then, if the preemption is due to a receive CQE event, the
> rds_ib_recv_cqe_handler() will be invoked. This function processes
> receive completions, including freeing up data structures, such as the
> recv->r_frag.
> 
> In this scenario, rds_ib_recv_cqe_handler() will process the receive
> WR posted above. That implies, that the recv->r_frag has been freed
> before the above rdsdebug() statement has been executed. When it is
> later executed, we will have a NULL pointer dereference:
 ...
> This bug was provoked by compiling rds out-of-tree with
> EXTRA_CFLAGS="-DRDS_DEBUG -DDEBUG" and inserting an artificial delay
> between the rdsdebug() and ib_ib_port_recv() statements:
> 
>    	       /* XXX when can this fail? */
> 	       ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
> +		if (can_wait)
> +			usleep_range(1000, 5000);
> 	       rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
> 			recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
> 			(long) ib_sg_dma_address(
> 
> The fix is simply to move the rdsdebug() statement up before the
> ib_post_recv() and remove the printing of ret, which is taken care of
> anyway by the non-debug code.
> 
> Signed-off-by: Håkon Bugge <haakon.bugge at oracle.com>
> Reviewed-by: Knut Omang <knut.omang at oracle.com>
> Reviewed-by: Wei Lin Guay <wei.lin.guay at oracle.com>

Applied, thank you.



More information about the rds-devel mailing list