[DTrace-devel] [PATCH 5/6] dtprobed: retry DOF parses at least once

Eugene Loh eugene.loh at oracle.com
Tue May 23 17:55:10 UTC 2023


Another patch I don't understand.  (My fault, not yours.)  But I'm 
guessing it's good.

Any chance of a test?

And, some extremely minor comments...

On 5/22/23 16:20, Nick Alcock via DTrace-devel wrote:
> The whole point of jailing the DOF parser is so that failures are
> harmless (to do harm to anything other than itself an attacking process
> would need to compromise the state of the parser process such that it
> not only doesn't crash but returns malicious results affecting
> *subsequent calls*, which is quite difficult).  This implies that DOF

Grammatically, that parenthetical remark is a complete sentence.  So how 
about a full stop after "harmless," giving the reader a chance to catch 
their breath.

> parser crashes can be because of corruption induced by *previous*
> invocations -- but right now we penalize the crashed process by not
> parsing its DOF if the parser crashed.
>
> Give all processes at least one fresh try in a new instance before
> giving up.  (This also means that if memory fragmentation or something
> forces a malloc to make forbidden syscalls -- and thus a crash -- we try
> again with a fresh copy which won't need to malloc.)
>
> (In the process, fix one place where we were triggering a double-kill()
> of the DOF parser child, which on a very busy system could theoretically
> lead us to kill something we shouldn't.)
>
> Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
> ---
>   dtprobed/dtprobed.c | 37 ++++++++++++++++++++++++-------------
>   1 file changed, 24 insertions(+), 13 deletions(-)
>
> diff --git a/dtprobed/dtprobed.c b/dtprobed/dtprobed.c
> index 1bf621e200d58..29cddba1d9658 100644
> --- a/dtprobed/dtprobed.c
> +++ b/dtprobed/dtprobed.c
> @@ -592,21 +592,33 @@ process_dof(fuse_req_t req, int out, int in, pid_t pid,
>   	dof_parsed_t *provider;
>   	const char *errmsg;
>   	size_t i;
> +	size_t tries = 0;
>   
> -	errmsg = "DOF parser write failed";
> -	while ((errno = dof_parser_host_write(out, dh,
> -					      (dof_hdr_t *) in_buf)) == EAGAIN);
> -	if (errno != 0)
> -		goto err;
> +	do {
> +		errmsg = "DOF parser write failed";
> +		while ((errno = dof_parser_host_write(out, dh,
> +						      (dof_hdr_t *) in_buf)) == EAGAIN);
> +		if (errno != 0)
> +			goto err;
>   
> -	/*
> -	 * Wait for parsed reply.
> -	 */
> +		/*
> +		 * Wait for parsed reply.  If it fails, try once more; possibly
> +		 * the child was killed due to problems induced by a previous
> +		 * probe, and a retry will work.
> +		 */
>   
> -	errmsg = "parsed DOF read failed";
> -	provider = dof_read(req, parser_out_pipe[0]);
> -	if (!provider || provider->type != DIT_PROVIDER)
> -		goto err;
> +		errmsg = "parsed DOF read failed";
> +		provider = dof_read(req, parser_out_pipe[0]);
> +		if (!provider) {
> +			if (tries++ > 1)
> +				goto err;
> +			dof_parser_tidy(1);
> +			continue;
> +		}
> +		if (provider->type != DIT_PROVIDER)
> +			goto err;
> +		break;
> +	} while (!provider);

FWIW, I assume that test will always be true and could be written 
while(1)?  But perhaps that would be no clearer?

>   	for (i = 0; i < provider->provider.nprobes; i++) {
>   		dof_parsed_t *probe = dof_read(req, in);
> @@ -638,7 +650,6 @@ process_dof(fuse_req_t req, int out, int in, pid_t pid,
>   
>   err:
>   	fuse_log(FUSE_LOG_ERR, "%i: dtprobed: parser error: %s\n", pid, errmsg);
> -	kill(parser_pid, SIGKILL);
>   	dof_parser_tidy(1);
>   	return -1;
>   }	



More information about the DTrace-devel mailing list