[rds-devel] [External] : Re: [PATCH net v2] rds: filter RDS_INFO_* getsockopt by caller's netns
Maoyi Xie
maoyixie.tju at gmail.com
Mon May 11 06:41:58 UTC 2026
Hi Simon,
Thanks for the review.
> Does this early-out check using the global rds_sock_count break the
> namespace isolation and force callers to over-allocate memory?
Both effects are present. The size returned via lens to a probing
caller is still the global count. A caller in an isolated ns that
sizes its buffer to that value can also see ENOSPC on the second
call. The precheck compares against the global count. The data
written only covers entries in the caller's ns.
v3 addresses this. Each handler now does a first pass to count
entries in the caller's ns. The precheck uses that count. A second
pass fills the buffer. The change applies to four handlers:
rds_sock_info and rds6_sock_info in net/rds/af_rds.c, plus
rds_tcp_tc_info and rds6_tcp_tc_info in net/rds/tcp.c. lens->nr
now reflects the ns scoped count on both probe and full reads.
Re-verified on a KASAN VM. One AF_RDS socket is bound in init_net
to 127.0.0.1:4242. The attacker is the same process after
unshare(CLONE_NEWUSER | CLONE_NEWNET) and uid_map "0 0 1".
[init] count-probe rc=-1 errno=ENOSPC optlen_after=28 entries=1
[init] full-read rc=0 len=28 entries=1 (127.0.0.1:4242)
[attacker] count-probe rc=0 optlen_after=0 entries=0
[attacker] full-read rc=0 len=0 entries=0
Pre-v3 the precheck returned the global count of 1 to the attacker
via lens->nr on the zero-length probe. v3 returns 0.
> Can concurrent getsockopt calls trigger a NULL pointer dereference
> here?
Yes, the window looks reachable.
The writer takes rds_tcp_tc_list_lock and calls list_add_tail. It
releases the lock. Only after that it assigns tc->t_sock = sock.
The reader takes the same lock, walks the list, and dereferences
inet_sk(tc->t_sock->sk). There is no NULL check on the read side.
A reader that enters between the writer's spin_unlock and the
t_sock store observes a list entry whose t_sock is still NULL.
The companion restore_callbacks path is safe. list_del_init is
inside the lock. A reader holding the lock cannot observe the
unlinked entry. The matching tc->t_sock = NULL outside the lock
is then harmless. Another reader in the same file at line 676
already checks !tc->t_sock before use.
The smallest fix is to move tc->t_sock = sock into the
rds_tcp_tc_list_lock critical section in rds_tcp_set_callbacks,
just before list_add_tail. The list insertion and the t_sock
store then become atomic from the reader's view. The diff is
one line moved. It does not affect the callback_lock side
effects below.
This is independent of the netns filter. I have not built a
runtime PoC. The window is short. Does the code analysis above
match your reading? If yes, I can send this as a separate patch
with a Fixes tag.
Thanks,
Maoyi
More information about the rds-devel
mailing list