[DTrace-devel] [PATCH 1/2] dt_link: retry elf_update() without mmaps if writing fails
Steven Sistare
steven.sistare at oracle.com
Tue May 23 16:26:14 UTC 2023
On 5/22/2023 6:24 PM, Nick Alcock wrote:
> elfutils has an option ELF_C_RDWR_MMAP which uses mmap() to update ELF
> files, which dtrace uses in dtrace -G's process_obj(). This is usually
> desirable because most of the file is not altered, so the writeout is
> (for large files, much) faster than doing anything else (it's even
> faster than the update-in-place which elfutils otherwise does).
> Unfortunately, elfutils calls mremap() to expand the file if needed, and
> does not pass MREMAP_MAYMOVE, probably because it maintains pointers
> into the ELF buffers and cannot cope with them moving.
>
> Unfortunately this failure happens only at elf_update() time, which
> means that process_obj() finds out that it has to reopen a file without
> mmap only when it's about ready to close it. Adapt to this by havinb
> process_obj() reinvoke itself when this error is found, forcing the
> non-use of mmap() in this case. Fortunately the link operation is
> almost entirely idempotent: everything we do is either done to the ELF
> file (and thus thrown away when elf_update() fails) or is allocations
> that are discarded as process_obj() exits, or is dependent on the ELF
> file (like offsets, etc). So we don't need to reset the eprobesp
> boolean parameter (if it was changed before, it will be changed
> identically again). The only thing we must suppress is the call to
> dt_probe_define(), which actually does the DOF-creation work: that is
> tied to the probe definition and dtrace handle, and thus is not
> discarded when process_obj() ends (indeed, we need it not to be, because
> that's what creates the data that dtrace_dof_create() writes out).
>
> But we know that calls in a subsequent process_obj() invocation on the
> same file will be identical to the calls that already happened: it's
> just that dt_probe_define() doesn't check for existing offsets. Maybe
> it should, but doing so efficiently is quite a lot more work than just
> noting that this is a reinvocation of process_obj() and skipping the
> define on the grounds that it already happened.
>
> Orabug: 35416271
> Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Just a heads up, I applied this patch and dlink has munged the file that previously
failed with ELF_C_RDWR_MMAP.
/usr/bin/ld: nbd/server.o: attempt to load strings from a non-string section (number 35)
nbd/server.o: file not recognized: Bad value
I will debug it.
- Steve
> ---
> libdtrace/dt_link.c | 51 +++++++++++++++++----
> test/unittest/usdt/tst.no-mmap.r | 2 +
> test/unittest/usdt/tst.no-mmap.sh | 75 +++++++++++++++++++++++++++++++
> 3 files changed, 120 insertions(+), 8 deletions(-)
> create mode 100644 test/unittest/usdt/tst.no-mmap.r
> create mode 100755 test/unittest/usdt/tst.no-mmap.sh
>
> diff --git a/libdtrace/dt_link.c b/libdtrace/dt_link.c
> index 3346720fcf62e..e3646d92ad0ca 100644
> --- a/libdtrace/dt_link.c
> +++ b/libdtrace/dt_link.c
> @@ -1019,7 +1019,8 @@ dt_link_error(dtrace_hdl_t *dtp, Elf *elf, int fd, dt_link_pair_t *bufs,
> }
>
> static int
> -process_obj(dtrace_hdl_t *dtp, const char *obj, int *eprobesp)
> +process_obj(dtrace_hdl_t *dtp, const char *obj, int *eprobesp,
> + int retry_no_mmap)
> {
> static const char dt_prefix[] = "__dtrace";
> static const char dt_enabled[] = "enabled";
> @@ -1042,13 +1043,15 @@ process_obj(dtrace_hdl_t *dtp, const char *obj, int *eprobesp)
> key_t objkey;
> dt_link_pair_t *pair, *bufs = NULL;
> dt_strtab_t *strtab;
> + int oflags = retry_no_mmap ? ELF_C_RDWR : ELF_C_RDWR_MMAP;
> + int do_retry = 0;
>
> if ((fd = open64(obj, O_RDWR)) == -1) {
> return dt_link_error(dtp, elf, fd, bufs,
> "failed to open %s: %s", obj, strerror(errno));
> }
>
> - if ((elf = elf_begin(fd, ELF_C_RDWR_MMAP, NULL)) == NULL) {
> + if ((elf = elf_begin(fd, oflags, NULL)) == NULL) {
> return dt_link_error(dtp, elf, fd, bufs,
> "failed to process %s: %s", obj, elf_errmsg(elf_errno()));
> }
> @@ -1427,9 +1430,16 @@ process_obj(dtrace_hdl_t *dtp, const char *obj, int *eprobesp)
> if (dt_modtext(dtp, data_tgt->d_buf, &rela, &off) != 0)
> goto err;
>
> - if (dt_probe_define(pvp, prp, s, r, off, eprobe) != 0)
> - return dt_link_error(dtp, elf, fd, bufs,
> - "failed to allocate space for probe");
> + /*
> + * If this is a retry of a previously-scanned binary
> + * with mmap disabled, avoid redefining the probe: it
> + * will already be defined from the previous run through
> + * this function.
> + */
> + if (!retry_no_mmap)
> + if (dt_probe_define(pvp, prp, s, r, off, eprobe) != 0)
> + return dt_link_error(dtp, elf, fd, bufs,
> + "failed to allocate space for probe");
>
> mod = 1;
> elf_flagdata(data_tgt, ELF_C_SET, ELF_F_DIRTY);
> @@ -1466,8 +1476,30 @@ process_obj(dtrace_hdl_t *dtp, const char *obj, int *eprobesp)
> }
> }
>
> - if (mod && elf_update(elf, ELF_C_WRITE) == -1)
> - goto err;
> + /*
> + * libelf calls mremap() but does not allow the mapping to move (because
> + * it may have outstanding pointers into it). Thus the remapping can
> + * fail, which has the effect of causing elf_update() to fail, Invoke
> + * ourselves again and force non-use of mmap() in this situation. (It
> + * is expected to be quite rare, and non-use of mmap() can require much
> + * larger rewrites: so we prefer mmapping if possible.)
> + *
> + * Unfortunately libelf has no useful error machinery (all you can get
> + * out of it is a translated string), so interpret any error whatsoever
> + * at write time as a reason to retry without mmap().
> + *
> + * Add a debugging hack that forces us to act more or less as if we are
> + * failing mmap to make it possible to practically test this code path.
> + */
> +
> + if (_dt_unlikely_(!retry_no_mmap && getenv("_DTRACE_TESTING_FAIL_MMAP")))
> + do_retry = 1;
> + else if (mod && elf_update(elf, ELF_C_WRITE) == -1) {
> + if (!retry_no_mmap)
> + do_retry = 1;
> + else
> + goto err;
> + }
>
> elf_end(elf);
> close(fd);
> @@ -1479,6 +1511,9 @@ process_obj(dtrace_hdl_t *dtp, const char *obj, int *eprobesp)
> dt_free(dtp, pair);
> }
>
> + if (do_retry)
> + return process_obj(dtp, obj, eprobesp, 1);
> +
> return 0;
>
> err:
> @@ -1540,7 +1575,7 @@ dtrace_program_link(dtrace_hdl_t *dtp, dtrace_prog_t *pgp, uint_t dflags,
> }
>
> for (i = 0; i < objc; i++) {
> - if (process_obj(dtp, objv[i], &eprobes) != 0)
> + if (process_obj(dtp, objv[i], &eprobes, 0) != 0)
> return -1; /* errno is set for us */
>
> /*
> diff --git a/test/unittest/usdt/tst.no-mmap.r b/test/unittest/usdt/tst.no-mmap.r
> new file mode 100644
> index 0000000000000..b856cc59f44c1
> --- /dev/null
> +++ b/test/unittest/usdt/tst.no-mmap.r
> @@ -0,0 +1,2 @@
> +test:main:go
> +
> diff --git a/test/unittest/usdt/tst.no-mmap.sh b/test/unittest/usdt/tst.no-mmap.sh
> new file mode 100755
> index 0000000000000..f5605047cee7c
> --- /dev/null
> +++ b/test/unittest/usdt/tst.no-mmap.sh
> @@ -0,0 +1,75 @@
> +#!/bin/bash
> +#
> +# Oracle Linux DTrace.
> +# Copyright (c) 2006, 2023, Oracle and/or its affiliates. All rights reserved.
> +# Licensed under the Universal Permissive License v 1.0 as shown at
> +# http://oss.oracle.com/licenses/upl.
> +#
> +if [ $# != 1 ]; then
> + echo expected one argument: '<'dtrace-path'>'
> + exit 2
> +fi
> +
> +dtrace=$1
> +CC=/usr/bin/gcc
> +CFLAGS=
> +
> +DIRNAME="$tmpdir/usdt-no-mmap.$$.$RANDOM"
> +mkdir -p $DIRNAME
> +cd $DIRNAME
> +
> +cat > prov.d <<EOF
> +provider test_prov {
> + probe go();
> +};
> +EOF
> +
> +$dtrace -h -s prov.d
> +if [ $? -ne 0 ]; then
> + echo "failed to generate header file" >& 2
> + exit 1
> +fi
> +
> +cat > test.c <<EOF
> +#include <sys/types.h>
> +#include "prov.h"
> +
> +int
> +main(int argc, char **argv)
> +{
> + if (TEST_PROV_GO_ENABLED()) {
> + TEST_PROV_GO();
> + }
> +}
> +EOF
> +
> +${CC} ${CFLAGS} -c test.c
> +if [ $? -ne 0 ]; then
> + echo "failed to compile test.c" >& 2
> + exit 1
> +fi
> +_DTRACE_TESTING_FAIL_MMAP=1 $dtrace -G -s prov.d test.o
> +if [ $? -ne 0 ]; then
> + echo "failed to create DOF" >& 2
> + exit 1
> +fi
> +${CC} ${CFLAGS} -o test test.o prov.o
> +if [ $? -ne 0 ]; then
> + echo "failed to link final executable" >& 2
> + exit 1
> +fi
> +
> +script()
> +{
> + $dtrace -c ./test -qs /dev/stdin <<EOF
> + test_prov\$target:::
> + {
> + printf("%s:%s:%s\n", probemod, probefunc, probename);
> + }
> +EOF
> +}
> +
> +script
> +status=$?
> +
> +exit $status
>
> base-commit: f543fa6706c0b31364356c01bf3de63e3cce8ad1
> prerequisite-patch-id: fa667248b7b02e92ee6b3807123ebcf2ac0acc38
> prerequisite-patch-id: 0cec5f1a31d528a493d6143644c1e6b78c7f403b
> prerequisite-patch-id: 1e1ced5fcec01a0c2e38d12f1ae5a124c8f93860
> prerequisite-patch-id: eb290f098e38981181d85240150e1713979dc505
> prerequisite-patch-id: c25b38ab68e815a63f064aff7496408684f94a36
> prerequisite-patch-id: b86eb7e00b5883a586bd4dc7143ab25094912288
> prerequisite-patch-id: 85d3c176dc4a7105e4f04dbcdb4ccbdf18741795
> prerequisite-patch-id: 94206b53a67820bb1c9e828c11e57a0cb36949b6
> prerequisite-patch-id: c5a64d246ff26047723e5e1a9c16b42a1d5fbf87
> prerequisite-patch-id: 89cf4c6b9242525883ac02bb8ef90856ef908d0f
> prerequisite-patch-id: 0992642ac657a1da617588a6d7b28f3669960374
> prerequisite-patch-id: e8a65c309b3eaf3341ceac421ef602bcc3f057fa
> prerequisite-patch-id: 4d38462a4bf2b242f8fcec3bf28f1ad97fa40ca7
> prerequisite-patch-id: 6430030eaa9b5f09eedf6d41b698549096905c56
> prerequisite-patch-id: aefef52e1438f34dd49e0cde010e791d96ef06d9
> prerequisite-patch-id: f9ca834410ab770ec2d0b4f4f1ea70d5bea37c84
> prerequisite-patch-id: a2e41bc39d3eb5bea3c1543928d1a695320cce52
> prerequisite-patch-id: 5998403c58e4b48f0cb0d87971e553d2999421fc
> prerequisite-patch-id: 8766128335d133bdd548ed5978aa9eefeb6f7b6d
> prerequisite-patch-id: 35e83414bbe54be31fe7cc15c7d6368543cd4e8f
More information about the DTrace-devel
mailing list