[rds-devel] [PATCH RFC RFC] rds: Use NETDEV_UNREGISTER in rds_tcp_dev_event() (then kill NETDEV_UNREGISTER_FINAL)

Sowmini Varadhan sowmini.varadhan at oracle.com
Fri Mar 16 10:29:18 PDT 2018


I had taken some of this offline, but it occurs to me
that some of these notes should be saved to the netdev archives, 
in case this question pops up again in a few years.

When I run your patch, I get a repeatable panic by doing
  modprobe rds-tcp
  ip netns create blue
the panic is because we are finding a null trn in rds_tcp_init_net.

I think there's something very disturbed about calling
register_pernet_operations() twice, once via 
register_pernet_device() and again via register_pernet_subsys().

I suspect this has everything to do with the panic but I have
not had time to debug every little detail here.

In general, rds_tcp is not a network device, it is a kernel
module.  That is the fundamental problem here. 

To repeat the comments form net_namespace.h:
 * Network interfaces need to be removed from a dying netns _before_
 * subsys notifiers can be called, as most of the network code cleanup
 * (which is done from subsys notifiers) runs with the assumption that
 * dev_remove_pack has been called so no new packets will arrive during
 * and after the cleanup functions have been called.  dev_remove_pack
 * is not per namespace so instead the guarantee of no more packets
 * arriving in a network namespace is provided by ensuring that all
 * network devices and all sockets have left the network namespace
 * before the cleanup methods are called.

when the "blue" netns starts up, it creates at least one kernel listen
socket on *.16385. This socket, and any other child/client sockets 
created must be cleaned up before the cleanup_net can happen.

This is why I chose to call regster_pernet_subsys. Again, as per
comments in net_namespace.h:

 * Use these carefully.  If you implement a network device and it
 * needs per network namespace operations use device pernet operations,
 * otherwise use pernet subsys operations.

On (03/16/18 18:51), Kirill Tkhai wrote:
> > Let's find another approach. Could you tell what problem we have in 
> > case of rds_tcp_dev_ops is declared as pernet_device?

As above, rds-tcp is not a network device.

> One more question. Which time we take a reference of loopback device?
> Is it possible before we created a net completely?

We dont take a reference on the loopback device. 
We make sure none of the kernel sockets does a get_net() so
that we dont block the cleanup_net, and then, when all
the network interfaces have been taken down (loopback is
the last one) we know there are no more packets coming in
and out, so it is safe to dismantle all kernel sockets 
created by rds-tcp.

Hope that helps.

--Sowmini



More information about the rds-devel mailing list