[DTrace-devel] [PATCH v3 15/16 was 12/12] Implement the io provider

Eugene Loh eugene.loh at oracle.com
Thu Jan 18 21:14:16 UTC 2024


I would not want to bet my life on this patch (it's pretty 
complicated... the patch, I mean), but
Reviewed-by: Eugene Loh <eugene.loh at oracle.com>

My main concern is the testing.  This provider is kind of complicated, 
with lots of different code paths, and so deserves robust tests.  My 
attempts with local2 and nfs2 fail on occasion. I sent you (personally, 
off list) some test patches (5-patch series), I think on Dec 7.  Those 
patches can be squashed back into this patch.  The reason for the patch 
series is so that it'll be easier for you to see what's going on, but I 
think the patches should be squashed back into here.  Actually, I'll 
just set up a branch with the patches on it rather than messing with all 
that over email.  The branch will have a few more patches beyond the Dec 
7 series.

I still find that tests occasionally fail.  It looks like something is 
getting corrupted... might be as simple as garbled output, but might be 
something deeper than that.  I don't know.  I try another test with 
io::: {@[probename]=count()} and if it runs a little while it gets 
garbage probe names.  Stuff like "doart" (some combination of "done" and 
"start").  I cannot reproduce these problems with other providers, but I 
am still not sure that the problems are related to the io provider.  
Under investigation.

Nonetheless, the io probes seem to be good "nearly all the time" and so 
this looks like a meaningful first release of the io provider.

Some extremely minor comments:

On 1/12/24 19:24, Kris Van Hees via DTrace-devel wrote:
> diff --git a/libdtrace/dt_open.c b/libdtrace/dt_open.c
> +	dtp->dt_maxlvaralloc = 1;	/* force the lvars map to be created */

I assume we cannot do this conditionally, based on whether the io 
provider needs this?

> diff --git a/libdtrace/dt_prov_io.c b/libdtrace/dt_prov_io.c
> +/* List of provider-specific variables. */
> +static dt_ident_t v_bio = { "-io-bio", DT_IDENT_SCALAR,
> +			    DT_IDFLG_LOCAL | DT_IDFLG_WRITE, 0, DT_ATTR_STABCMN,			    DT_VERS_2_0, &dt_idops_type, "struct bio" };

Missing line break.

> +static dt_ident_t v_biop = { "-io-bio-ptr", DT_IDENT_SCALAR,
> +			     DT_IDFLG_TLS | DT_IDFLG_WRITE, 0, DT_ATTR_STABCMN,
> +			     DT_VERS_2_0, &dt_idops_type, "struct bio *" };
> +
> +/*
> + * Generate BPF instructions to dereference the pointer in %r3 (after applying
> + * an optional addend) and read a value of the given 'width'.  The result isu

s/isu/is/

> + * stored in register 'reg' (where BPF_REG_0 <= reg <= BPF_REG_5).
> + *
> + * Registers %r0-%r5 will be clobbered.  Register 'reg' holds the value.
> + */
> +static void deref_r3(dt_irlist_t *dlp, uint_t exitlbl, int addend, int width,
> +		     int reg)
> +{
> +	assert(reg >= BPF_REG_0 && reg <= BPF_REG_5);
> +
> +	/* Use slot 0 as temporary storage. */
> +	emit(dlp, BPF_MOV_REG(BPF_REG_1, BPF_REG_FP));
> +	emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, DT_TRAMP_SP_SLOT(0)));
> +
> +	/* Specify the width of the scalar. */
> +	emit(dlp, BPF_MOV_IMM(BPF_REG_2, width));
> +
> +	/* The source address is already in %r3, but add addend, if any. */
> +	if (addend)
> +		emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, addend));
> +
> +	/* Perform the copy and check for success. */
> +	emit(dlp, BPF_CALL_HELPER(BPF_FUNC_probe_read));
> +	emit(dlp, BPF_BRANCH_IMM(BPF_JSLT, BPF_REG_0, 0, exitlbl));
> +
> +	/* Load the result into the specified register. */
> +	width = bpf_ldst_size(width, 0);
> +	emit(dlp, BPF_LOAD(width, reg, BPF_REG_FP, DT_TRAMP_SP_SLOT(0)));
> +}
> +
> +static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
> +{
> +
> +	/* Handle the non-XFS wait-done flavor. */
> +	if (strcmp(prp->desc->prb, "wait-done") == 0) {
> +		/*
> +		 * We need instrument submit_bio_wait(struct bio *):

need instrument
->
need to instrument

> +		 *   - on entry, store bio in a TLS var
> +		 *   - on return, get bio (and store 0 to delete the TLS var)

Can omit "store 0 to" since that's a detail inside of del_var().

> +		 * We use a TLS var to distinguish among possible concurrent
> +		 * submit_bio_wait() on the CPU.
> +		 */
> +		dt_cg_tramp_decl_var(pcb, &v_biop);
> +		if (strcmp(uprp->desc->prb, "entry") == 0) {
> +			dt_cg_tramp_get_var(pcb, "self->-io-bio-ptr", 1, BPF_REG_3);
> +			emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_7, DMST_ARG(0)));
> +			emit(dlp, BPF_STORE(BPF_DW, BPF_REG_3, 0, BPF_REG_0));
> +			return 1;
> +		} else {
> +			dt_cg_tramp_get_var(pcb, "self->-io-bio-ptr", 0, BPF_REG_0);
> +			emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_0));
> +			dt_cg_tramp_del_var(pcb, "self->-io-bio-ptr");
> +		}
> +
> +	}
> +
> +done:
> +	/*
> +	 * Note: DTrace does not currently support the use of fileinfo_t with
> +	 * io probes.  In Oracle Linux, there is no information about the file
> +	 * where the I/O request originated at the point where the io probes
> +	 * fire.
> +	 */
> +	emit(dlp, BPF_STORE_IMM(BPF_DW, BPF_REG_7, DMST_ARG(1), 0));
> +
> +	return 0;
> +}
> diff --git a/test/unittest/io/tst.nfs2.sh b/test/unittest/io/tst.nfs2.sh
> +
> +mkdir $exdir
> +  # what is the fsid?
> +  exportfs -i -v -o "rw,sync,no_root_squash,insecure,fsid=8434437287" 127.0.0.1:$exdir > /dev/null
> +    mkdir $iodir
> +        mount -t nfs -o nfsvers=3 127.0.0.1:$exdir $iodir
> +            $rundt "dd if=/dev/urandom of=$tempfile count=$filesize bs=1 status=none" -o log.write
> +            myinode=`stat $tempfile  | awk '/	Inode: / {print $4}'`
> +        umount $iodir
> +        # fliush caches and remount to force IO

s/fliush/flush/

> +	echo 3 > /proc/sys/vm/drop_caches
> +        mount -t nfs -o nfsvers=3 127.0.0.1:$exdir $iodir
> +            $rundt "sum $tempfile"                                                    -o log.read
> +            rm -f $tempfile
> +        umount $iodir
> +    rmdir $iodir
> +  exportfs -u 127.0.0.1:$exdir
> +rmdir $exdir



More information about the DTrace-devel mailing list