[rds-devel] [net, v2] net/rds: fix recursive lock in rds_tcp_conn_slots_available

Simon Horman horms at kernel.org
Wed Feb 18 10:02:06 UTC 2026


This is an AI-generated review of your patch. The human sending this
email says "I'm passing this on, not because of the first comment
regarding fan-out, which I think seems reasonable in its current
form, but because of the second comment, at the end, regarding the
receive path. I'm not sure about that one either way, but it seems
worth passing on to you."

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net/rds: fix recursive lock in rds_tcp_conn_slots_available

This commit addresses a recursive lock warning reported by syzbot in
rds_tcp_get_peer_sport() by replacing kernel_getpeername() calls with
direct socket struct access to read the peer port number.

Closes: https://syzkaller.appspot.com/bug?extid=5efae91f60932839f0a5

> diff --git a/net/rds/connection.c b/net/rds/connection.c
> index 185f73b01694..a542f94c0214 100644
> --- a/net/rds/connection.c
> +++ b/net/rds/connection.c
> @@ -455,6 +455,9 @@ void rds_conn_shutdown(struct rds_conn_path *cp)
>  		rcu_read_unlock();
>  	}
>
> +	/* we do not hold the socket lock here but it is safe because
> +	 * fan-out is disabled when calling conn_slots_available()
> +	 */

Is this comment potentially misleading about the safety reasoning?

The comment says "we do not hold the socket lock here but it is safe
because fan-out is disabled." This is factually accurate: with fan_out
set to false, the code path that reads from the socket (via
rds_tcp_get_peer_sport()) is never reached.

However, the phrasing could confuse readers about what would happen if
fan_out were true but the lock weren't held. In that scenario, the new
code would still be safe, because reading inet_dport directly is an
atomic-width load of a stable field.

The comment conflates two independent safety properties: (1) the socket
isn't accessed at all when fan_out is false, and (2) if it were accessed,
the new implementation would still be safe without the lock.

Could the comment more precisely explain the safety reasoning?
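
For example, a wording along these lines (just a sketch, not a demand
for this exact text) would keep the two properties separate:

	/* We do not hold the socket lock here.  That is safe for two
	 * independent reasons: conn_slots_available() is called with
	 * fan_out == false, so the socket is never read on this path;
	 * and even where it is read, inet_dport is a stable,
	 * atomic-width field that can be loaded without the lock.
	 */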

>  	if (conn->c_trans->conn_slots_available)
>  		conn->c_trans->conn_slots_available(conn, false);
>  }

[ ... ]

> @@ -59,30 +59,12 @@ void rds_tcp_keepalive(struct socket *sock)
>  static int
>  rds_tcp_get_peer_sport(struct socket *sock)
>  {
> -	union {
> -		struct sockaddr_storage storage;
> -		struct sockaddr addr;
> -		struct sockaddr_in sin;
> -		struct sockaddr_in6 sin6;
> -	} saddr;
> -	int sport;
> -
> -	if (kernel_getpeername(sock, &saddr.addr) >= 0) {
> -		switch (saddr.addr.sa_family) {
> -		case AF_INET:
> -			sport = ntohs(saddr.sin.sin_port);
> -			break;
> -		case AF_INET6:
> -			sport = ntohs(saddr.sin6.sin6_port);
> -			break;
> -		default:
> -			sport = -1;
> -		}
> -	} else {
> -		sport = -1;
> -	}
> +	struct sock *sk = sock->sk;
> +
> +	if (!sk)
> +		return -1;
>
> -	return sport;
> +	return ntohs(inet_sk(sk)->inet_dport);
>  }
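
If the maintainers want to make the lockless access self-documenting,
one option (a sketch only, not part of the posted patch; READ_ONCE()
changes nothing functionally here, since the load is already
atomic-width, it merely annotates the access for tools like KCSAN)
would be:

	static int
	rds_tcp_get_peer_sport(struct socket *sock)
	{
		struct sock *sk = sock->sk;

		if (!sk)
			return -1;

		/* inet_dport is set when the connection is established
		 * and stable thereafter, so an unlocked read is safe.
		 */
		return ntohs(READ_ONCE(inet_sk(sk)->inet_dport));
	}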

The commit message states "For rds_tcp_conn_slots_available() the lock is
already held because we are in the receiving path." Is this claim accurate
across all receive paths?

Looking at the two receive sub-paths:

1. In the backlog processing path (shown in the syzbot stack trace), the
socket IS owned because __release_sock() runs while the lock_sock
ownership bit is still set:

tcp_sock_set_cork() -> lock_sock() -> release_sock() -> __release_sock()
  -> sk_backlog_rcv() -> tcp_v6_do_rcv() -> tcp_rcv_established()
  -> tcp_data_ready() -> sk->sk_data_ready() -> rds_tcp_data_ready()
  -> rds_tcp_read_sock()

2. However, rds_tcp_data_ready() can also be called directly from the
normal softirq receive path via tcp_data_ready() -> sk->sk_data_ready(),
where the socket is NOT owned in the lock_sock() sense (the softirq
path holds only bh_lock_sock()). In this path, rds_tcp_read_sock()
calls tcp_read_sock() without taking lock_sock(); see the call chain
sketched below.
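
For comparison with the backlog trace above, the direct softirq chain
looks roughly like this (reconstructed from the source, not taken from
a syzbot report):

tcp_v4_rcv() / tcp_v6_rcv()  [softirq; bh_lock_sock only, no lock_sock]
  -> tcp_v4_do_rcv() / tcp_v6_do_rcv() -> tcp_rcv_established()
  -> tcp_data_ready() -> sk->sk_data_ready() -> rds_tcp_data_ready()
  -> rds_tcp_read_sock()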

The fix is still correct in both cases because inet_dport is a stable
atomic-width field, but the claim "the lock is already held" is not
universally true for all receiving paths.

Should the commit message be more precise about when the lock is held?
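
A possible, more precise wording (suggestion only): "In the backlog
processing path the socket is owned because __release_sock() runs with
the lock_sock() ownership bit set; in the direct softirq path it is
not, but the lockless read of inet_dport is safe in both cases."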


