[DTrace-devel] [PATCH 5/6] dtprobed: retry DOF parses at least once
Eugene Loh
eugene.loh at oracle.com
Tue May 23 17:55:10 UTC 2023
Another patch I don't understand. (My fault, not yours.) But I'm
guessing it's good.
Any chance of a test?
And, some extremely minor comments...
On 5/22/23 16:20, Nick Alcock via DTrace-devel wrote:
> The whole point of jailing the DOF parser is so that failures are
> harmless (to do harm to anything other than itself an attacking process
> would need to compromise the state of the parser process such that it
> not only doesn't crash but returns malicious results affecting
> *subsequent calls*, which is quite difficult). This implies that DOF
Grammatically, that parenthetical remark is a complete sentence. So how
about a full stop after "harmless," giving the reader a chance to catch
their breath.
> parser crashes can be because of corruption induced by *previous*
> invocations -- but right now we penalize the crashed process by not
> parsing its DOF if the parser crashed.
>
> Give all processes at least one fresh try in a new instance before
> giving up. (This also means that if memory fragmentation or something
> forces a malloc to make forbidden syscalls -- and thus a crash -- we try
> again with a fresh copy which won't need to malloc.)
>
> (In the process, fix one place where we were triggering a double-kill()
> of the DOF parser child, which on a very busy system could theoretically
> lead us to kill something we shouldn't.)
>
> Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
> ---
> dtprobed/dtprobed.c | 37 ++++++++++++++++++++++++-------------
> 1 file changed, 24 insertions(+), 13 deletions(-)
>
> diff --git a/dtprobed/dtprobed.c b/dtprobed/dtprobed.c
> index 1bf621e200d58..29cddba1d9658 100644
> --- a/dtprobed/dtprobed.c
> +++ b/dtprobed/dtprobed.c
> @@ -592,21 +592,33 @@ process_dof(fuse_req_t req, int out, int in, pid_t pid,
> dof_parsed_t *provider;
> const char *errmsg;
> size_t i;
> + size_t tries = 0;
>
> - errmsg = "DOF parser write failed";
> - while ((errno = dof_parser_host_write(out, dh,
> - (dof_hdr_t *) in_buf)) == EAGAIN);
> - if (errno != 0)
> - goto err;
> + do {
> + errmsg = "DOF parser write failed";
> + while ((errno = dof_parser_host_write(out, dh,
> + (dof_hdr_t *) in_buf)) == EAGAIN);
> + if (errno != 0)
> + goto err;
>
> - /*
> - * Wait for parsed reply.
> - */
> + /*
> + * Wait for parsed reply. If it fails, try once more; possibly
> + * the child was killed due to problems induced by a previous
> + * probe, and a retry will work.
> + */
>
> - errmsg = "parsed DOF read failed";
> - provider = dof_read(req, parser_out_pipe[0]);
> - if (!provider || provider->type != DIT_PROVIDER)
> - goto err;
> + errmsg = "parsed DOF read failed";
> + provider = dof_read(req, parser_out_pipe[0]);
> + if (!provider) {
> + if (tries++ > 1)
> + goto err;
> + dof_parser_tidy(1);
> + continue;
> + }
> + if (provider->type != DIT_PROVIDER)
> + goto err;
> + break;
> + } while (!provider);
FWIW, I assume that test will always be true and could be written
while(1)? But perhaps that would be no clearer?
> for (i = 0; i < provider->provider.nprobes; i++) {
> dof_parsed_t *probe = dof_read(req, in);
> @@ -638,7 +650,6 @@ process_dof(fuse_req_t req, int out, int in, pid_t pid,
>
> err:
> fuse_log(FUSE_LOG_ERR, "%i: dtprobed: parser error: %s\n", pid, errmsg);
> - kill(parser_pid, SIGKILL);
> dof_parser_tidy(1);
> return -1;
> }
More information about the DTrace-devel
mailing list