[rds-devel] [net-next, v4, 2/2] net/rds: Give each connection path its own workqueue
Jakub Kicinski
kuba at kernel.org
Fri Jan 9 02:51:52 UTC 2026
This is an automated AI-generated review of your patch.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
commit 95959b9817fcb113ceda9dcddc66514d2407af44
Author: Allison Henderson <allison.henderson at oracle.com>
net/rds: Give each connection path its own workqueue
This patch allocates per-connection-path workqueues instead of using
the shared rds_wq. This improves RDS failover performance by allowing
connections to process work independently.
> diff --git a/net/rds/connection.c b/net/rds/connection.c
> index dc7323707f45..3743940423c8 100644
> --- a/net/rds/connection.c
> +++ b/net/rds/connection.c
> @@ -269,7 +269,11 @@ static struct rds_connection *__rds_conn_create(struct net *net,
> __rds_conn_path_init(conn, &conn->c_path[i],
> is_outgoing);
> conn->c_path[i].cp_index = i;
> - conn->c_path[i].cp_wq = rds_wq;
> + conn->c_path[i].cp_wq =
> + alloc_ordered_workqueue("krds_cp_wq#%lu/%d", 0,
> + rds_conn_count, i);
> + if (!conn->c_path[i].cp_wq)
> + conn->c_path[i].cp_wq = rds_wq;
> }
> rcu_read_lock();
> if (rds_destroy_pending(conn))
> @@ -278,6 +282,9 @@ static struct rds_connection *__rds_conn_create(struct net *net,
> ret = trans->conn_alloc(conn, GFP_ATOMIC);
> if (ret) {
> rcu_read_unlock();
> + for (i = 0; i < npaths; i++)
> + if (conn->c_path[i].cp_wq != rds_wq)
> + destroy_workqueue(conn->c_path[i].cp_wq);
> kfree(conn->c_path);
The workqueues allocated above are properly destroyed here when
trans->conn_alloc() fails, but what happens in the "race to create"
paths later in this function?
When two threads race to create the same connection, one will win and
the other must discard its connection. Looking at the existing code
after this diff (around the "Creating passive conn" and "Creating
normal conn" comments), those cleanup paths call trans->conn_free(),
kfree(conn->c_path), and kmem_cache_free(), but they do not destroy
the per-path workqueues.
For example, in the "found" path for normal connections:
if (found) {
for (i = 0; i < npaths; i++) {
cp = &conn->c_path[i];
if (cp->cp_transport_data)
trans->conn_free(cp->cp_transport_data);
}
kfree(conn->c_path);
kmem_cache_free(rds_conn_slab, conn);
conn = found;
}
Could this leak the workqueues allocated earlier in the function?
Similar cleanup appears needed for the passive connection race path.
Note that destroy_workqueue() cannot be called while holding the
rds_conn_lock spinlock since it may sleep, so the fix would require
restructuring the cleanup to destroy workqueues before acquiring
the lock, similar to the trans->conn_alloc() failure path above.
[ ... ]
More information about the rds-devel
mailing list