[DTrace-devel] [PATCH 1/2] dt_link: retry elf_update() without mmaps if writing fails

Steven Sistare steven.sistare at oracle.com
Tue May 23 16:26:14 UTC 2023


On 5/22/2023 6:24 PM, Nick Alcock wrote:
> elfutils has an option ELF_C_RDWR_MMAP which uses mmap() to update ELF
> files, which dtrace uses in dtrace -G's process_obj().  This is usually
> desirable because most of the file is not altered, so the writeout is
> (for large files, much) faster than doing anything else (it's even
> faster than the update-in-place which elfutils otherwise does).
> Unfortunately, elfutils calls mremap() to expand the file if needed, and
> does not pass MREMAP_MAYMOVE, probably because it maintains pointers
> into the ELF buffers and cannot cope with them moving.
> 
> Unfortunately this failure happens only at elf_update() time, which
> means that process_obj() finds out that it has to reopen a file without
> mmap only when it's about ready to close it.  Adapt to this by havinb
> process_obj() reinvoke itself when this error is found, forcing the
> non-use of mmap() in this case.  Fortunately the link operation is
> almost entirely idempotent: everything we do is either done to the ELF
> file (and thus thrown away when elf_update() fails) or is allocations
> that are discarded as process_obj() exits, or is dependent on the ELF
> file (like offsets, etc).  So we don't need to reset the eprobesp
> boolean parameter (if it was changed before, it will be changed
> identically again).  The only thing we must suppress is the call to
> dt_probe_define(), which actually does the DOF-creation work: that is
> tied to the probe definition and dtrace handle, and thus is not
> discarded when process_obj() ends (indeed, we need it not to be, because
> that's what creates the data that dtrace_dof_create() writes out).
> 
> But we know that calls in a subsequent process_obj() invocation on the
> same file will be identical to the calls that already happened: it's
> just that dt_probe_define() doesn't check for existing offsets.  Maybe
> it should, but doing so efficiently is quite a lot more work than just
> noting that this is a reinvocation of process_obj() and skipping the
> define on the grounds that it already happened.
> 
> Orabug: 35416271
> Signed-off-by: Nick Alcock <nick.alcock at oracle.com>

Just a heads up, I applied this patch and dlink has munged the file that previously
failed with ELF_C_RDWR_MMAP.

/usr/bin/ld: nbd/server.o: attempt to load strings from a non-string section (number 35)
nbd/server.o: file not recognized: Bad value

I will debug it.

- Steve

> ---
>  libdtrace/dt_link.c               | 51 +++++++++++++++++----
>  test/unittest/usdt/tst.no-mmap.r  |  2 +
>  test/unittest/usdt/tst.no-mmap.sh | 75 +++++++++++++++++++++++++++++++
>  3 files changed, 120 insertions(+), 8 deletions(-)
>  create mode 100644 test/unittest/usdt/tst.no-mmap.r
>  create mode 100755 test/unittest/usdt/tst.no-mmap.sh
> 
> diff --git a/libdtrace/dt_link.c b/libdtrace/dt_link.c
> index 3346720fcf62e..e3646d92ad0ca 100644
> --- a/libdtrace/dt_link.c
> +++ b/libdtrace/dt_link.c
> @@ -1019,7 +1019,8 @@ dt_link_error(dtrace_hdl_t *dtp, Elf *elf, int fd, dt_link_pair_t *bufs,
>  }
>  
>  static int
> -process_obj(dtrace_hdl_t *dtp, const char *obj, int *eprobesp)
> +process_obj(dtrace_hdl_t *dtp, const char *obj, int *eprobesp,
> +    int retry_no_mmap)
>  {
>  	static const char dt_prefix[] = "__dtrace";
>  	static const char dt_enabled[] = "enabled";
> @@ -1042,13 +1043,15 @@ process_obj(dtrace_hdl_t *dtp, const char *obj, int *eprobesp)
>  	key_t objkey;
>  	dt_link_pair_t *pair, *bufs = NULL;
>  	dt_strtab_t *strtab;
> +	int oflags = retry_no_mmap ? ELF_C_RDWR : ELF_C_RDWR_MMAP;
> +	int do_retry = 0;
>  
>  	if ((fd = open64(obj, O_RDWR)) == -1) {
>  		return dt_link_error(dtp, elf, fd, bufs,
>  		    "failed to open %s: %s", obj, strerror(errno));
>  	}
>  
> -	if ((elf = elf_begin(fd, ELF_C_RDWR_MMAP, NULL)) == NULL) {
> +	if ((elf = elf_begin(fd, oflags, NULL)) == NULL) {
>  		return dt_link_error(dtp, elf, fd, bufs,
>  		    "failed to process %s: %s", obj, elf_errmsg(elf_errno()));
>  	}
> @@ -1427,9 +1430,16 @@ process_obj(dtrace_hdl_t *dtp, const char *obj, int *eprobesp)
>  			if (dt_modtext(dtp, data_tgt->d_buf, &rela, &off) != 0)
>  				goto err;
>  
> -			if (dt_probe_define(pvp, prp, s, r, off, eprobe) != 0)
> -				return dt_link_error(dtp, elf, fd, bufs,
> -				    "failed to allocate space for probe");
> +			/*
> +			 * If this is a retry of a previously-scanned binary
> +			 * with mmap disabled, avoid redefining the probe: it
> +			 * will already be defined from the previous run through
> +			 * this function.
> +			 */
> +			if (!retry_no_mmap)
> +				if (dt_probe_define(pvp, prp, s, r, off, eprobe) != 0)
> +					return dt_link_error(dtp, elf, fd, bufs,
> +					    "failed to allocate space for probe");
>  
>  			mod = 1;
>  			elf_flagdata(data_tgt, ELF_C_SET, ELF_F_DIRTY);
> @@ -1466,8 +1476,30 @@ process_obj(dtrace_hdl_t *dtp, const char *obj, int *eprobesp)
>  		}
>  	}
>  
> -	if (mod && elf_update(elf, ELF_C_WRITE) == -1)
> -		goto err;
> +	/*
> +	 * libelf calls mremap() but does not allow the mapping to move (because
> +	 * it may have outstanding pointers into it).  Thus the remapping can
> +	 * fail, which has the effect of causing elf_update() to fail,  Invoke
> +	 * ourselves again and force non-use of mmap() in this situation.  (It
> +	 * is expected to be quite rare, and non-use of mmap() can require much
> +	 * larger rewrites: so we prefer mmapping if possible.)
> +	 *
> +	 * Unfortunately libelf has no useful error machinery (all you can get
> +	 * out of it is a translated string), so interpret any error whatsoever
> +	 * at write time as a reason to retry without mmap().
> +	 *
> +	 * Add a debugging hack that forces us to act more or less as if we are
> +	 * failing mmap to make it possible to practically test this code path.
> +	 */
> +
> +	if (_dt_unlikely_(!retry_no_mmap && getenv("_DTRACE_TESTING_FAIL_MMAP")))
> +		do_retry = 1;
> +	else if (mod && elf_update(elf, ELF_C_WRITE) == -1) {
> +		if (!retry_no_mmap)
> +			do_retry = 1;
> +		else
> +			goto err;
> +	}
>  
>  	elf_end(elf);
>  	close(fd);
> @@ -1479,6 +1511,9 @@ process_obj(dtrace_hdl_t *dtp, const char *obj, int *eprobesp)
>  		dt_free(dtp, pair);
>  	}
>  
> +	if (do_retry)
> +		return process_obj(dtp, obj, eprobesp, 1);
> +
>  	return 0;
>  
>  err:
> @@ -1540,7 +1575,7 @@ dtrace_program_link(dtrace_hdl_t *dtp, dtrace_prog_t *pgp, uint_t dflags,
>  	}
>  
>  	for (i = 0; i < objc; i++) {
> -		if (process_obj(dtp, objv[i], &eprobes) != 0)
> +		if (process_obj(dtp, objv[i], &eprobes, 0) != 0)
>  			return -1; /* errno is set for us */
>  
>  		/*
> diff --git a/test/unittest/usdt/tst.no-mmap.r b/test/unittest/usdt/tst.no-mmap.r
> new file mode 100644
> index 0000000000000..b856cc59f44c1
> --- /dev/null
> +++ b/test/unittest/usdt/tst.no-mmap.r
> @@ -0,0 +1,2 @@
> +test:main:go
> +
> diff --git a/test/unittest/usdt/tst.no-mmap.sh b/test/unittest/usdt/tst.no-mmap.sh
> new file mode 100755
> index 0000000000000..f5605047cee7c
> --- /dev/null
> +++ b/test/unittest/usdt/tst.no-mmap.sh
> @@ -0,0 +1,75 @@
> +#!/bin/bash
> +#
> +# Oracle Linux DTrace.
> +# Copyright (c) 2006, 2023, Oracle and/or its affiliates. All rights reserved.
> +# Licensed under the Universal Permissive License v 1.0 as shown at
> +# http://oss.oracle.com/licenses/upl.
> +#
> +if [ $# != 1 ]; then
> +	echo expected one argument: '<'dtrace-path'>'
> +	exit 2
> +fi
> +
> +dtrace=$1
> +CC=/usr/bin/gcc
> +CFLAGS=
> +
> +DIRNAME="$tmpdir/usdt-no-mmap.$$.$RANDOM"
> +mkdir -p $DIRNAME
> +cd $DIRNAME
> +
> +cat > prov.d <<EOF
> +provider test_prov {
> +	probe go();
> +};
> +EOF
> +
> +$dtrace -h -s prov.d
> +if [ $? -ne 0 ]; then
> +	echo "failed to generate header file" >& 2
> +	exit 1
> +fi
> +
> +cat > test.c <<EOF
> +#include <sys/types.h>
> +#include "prov.h"
> +
> +int
> +main(int argc, char **argv)
> +{
> +	if (TEST_PROV_GO_ENABLED()) {
> +		TEST_PROV_GO();
> +	}
> +}
> +EOF
> +
> +${CC} ${CFLAGS} -c test.c
> +if [ $? -ne 0 ]; then
> +	echo "failed to compile test.c" >& 2
> +	exit 1
> +fi
> +_DTRACE_TESTING_FAIL_MMAP=1 $dtrace -G -s prov.d test.o
> +if [ $? -ne 0 ]; then
> +	echo "failed to create DOF" >& 2
> +	exit 1
> +fi
> +${CC} ${CFLAGS} -o test test.o prov.o
> +if [ $? -ne 0 ]; then
> +	echo "failed to link final executable" >& 2
> +	exit 1
> +fi
> +
> +script()
> +{
> +	$dtrace -c ./test -qs /dev/stdin <<EOF
> +	test_prov\$target:::
> +	{
> +		printf("%s:%s:%s\n", probemod, probefunc, probename);
> +	}
> +EOF
> +}
> +
> +script
> +status=$?
> +
> +exit $status
> 
> base-commit: f543fa6706c0b31364356c01bf3de63e3cce8ad1
> prerequisite-patch-id: fa667248b7b02e92ee6b3807123ebcf2ac0acc38
> prerequisite-patch-id: 0cec5f1a31d528a493d6143644c1e6b78c7f403b
> prerequisite-patch-id: 1e1ced5fcec01a0c2e38d12f1ae5a124c8f93860
> prerequisite-patch-id: eb290f098e38981181d85240150e1713979dc505
> prerequisite-patch-id: c25b38ab68e815a63f064aff7496408684f94a36
> prerequisite-patch-id: b86eb7e00b5883a586bd4dc7143ab25094912288
> prerequisite-patch-id: 85d3c176dc4a7105e4f04dbcdb4ccbdf18741795
> prerequisite-patch-id: 94206b53a67820bb1c9e828c11e57a0cb36949b6
> prerequisite-patch-id: c5a64d246ff26047723e5e1a9c16b42a1d5fbf87
> prerequisite-patch-id: 89cf4c6b9242525883ac02bb8ef90856ef908d0f
> prerequisite-patch-id: 0992642ac657a1da617588a6d7b28f3669960374
> prerequisite-patch-id: e8a65c309b3eaf3341ceac421ef602bcc3f057fa
> prerequisite-patch-id: 4d38462a4bf2b242f8fcec3bf28f1ad97fa40ca7
> prerequisite-patch-id: 6430030eaa9b5f09eedf6d41b698549096905c56
> prerequisite-patch-id: aefef52e1438f34dd49e0cde010e791d96ef06d9
> prerequisite-patch-id: f9ca834410ab770ec2d0b4f4f1ea70d5bea37c84
> prerequisite-patch-id: a2e41bc39d3eb5bea3c1543928d1a695320cce52
> prerequisite-patch-id: 5998403c58e4b48f0cb0d87971e553d2999421fc
> prerequisite-patch-id: 8766128335d133bdd548ed5978aa9eefeb6f7b6d
> prerequisite-patch-id: 35e83414bbe54be31fe7cc15c7d6368543cd4e8f



More information about the DTrace-devel mailing list