[DTrace-devel] test time-out signal

Kris Van Hees kris.van.hees at oracle.com
Mon Jun 29 22:50:39 PDT 2020


On Mon, Jun 29, 2020 at 10:26:12PM -0700, Eugene Loh wrote:
> We create uprobes and kprobes for some providers (dtrace and fbt) and 
> try to clean these probes up when we finish, even if a job was 
> terminated with Ctrl-C.  There turn out to be some limitations on this, 
> and if you run the test suite you'll find lots of orphaned uprobes and 
> kprobes.  I think I understand the basic issues and am addressing them.
> 
> There aren't too many things going on, really.  One of them is that the 
> dtrace tool sets a signal handler for SIGINT and SIGTERM.  So if you 
> Ctrl-C dtrace, it captures the signal, sets a flag (g_intr++), and 
> proceeds with an orderly shutdown, including cleaning up uprobes and 
> kprobes.  However, if the test suite runs and a test times out, then 
> "timeout --signal=KILL" sends a signal that is not captured.  The job is 
> killed abruptly, and probes are *NOT* cleaned up.
> 
> How should this problem be addressed?
> *)  Just let the uprobes and kprobes accumulate?
> *)  Have the test suite clean the probes up?
> *)  Have dtrace capture KILL the same way it captures INT and TERM?
> *)  Have the test suite time out tests with INT or TERM rather than KILL?
> 
> I would suggest a choice if one of them felt much better than the others.

Some input from Nick would be appropriate here, I think, since he wrote
the testsuite mechanism.  My thought is that it might be reasonable for
the testsuite engine to first try to send TERM, and then KILL if dtrace
has not exited.  That gives dtrace a chance to clean up, and if it is
truly hanging, the KILL will take care of that.  If we need to go in for
the real KILL, it seems OK to me that probes are left behind (they are
mostly harmless anyway).  A KILL signal isn't a clean way to terminate a
program - it's brute force, and it cannot be caught by a handler.
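
A minimal sketch of the TERM-then-KILL escalation described above, in
Python for brevity.  This is not the testsuite's actual code: the
function name run_with_escalation and the parameters timeout_s and
grace_s are made up for illustration.  GNU timeout can do the same
thing on its own with --signal=TERM plus --kill-after=DURATION, which
may be the simpler change if the testsuite keeps using timeout.

```python
import signal
import subprocess
import sys


def run_with_escalation(cmd, timeout_s, grace_s):
    """Hypothetical helper: run cmd with a time limit.

    On timeout, send SIGTERM so the child can clean up; if it is still
    running grace_s seconds later, send SIGKILL.

    Returns (returncode, escalation), where escalation is None if the
    child finished in time, "TERM" if SIGTERM was enough, or "KILL" if
    SIGKILL was needed.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s), None
    except subprocess.TimeoutExpired:
        pass

    proc.terminate()  # SIGTERM: the child's handler can shut down cleanly
    try:
        return proc.wait(timeout=grace_s), "TERM"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL: cannot be caught, no cleanup happens
        return proc.wait(), "KILL"


if __name__ == "__main__":
    # Stand-in for a hanging test; a real caller would pass the dtrace
    # command line here.
    child = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_escalation(child, timeout_s=1, grace_s=5))
```

Returning which signal was needed lets the caller log the KILL cases,
which are the only ones where leftover probes would be expected.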


