From kris.van.hees at oracle.com Thu Mar 6 14:09:24 2025
From: kris.van.hees at oracle.com (Kris Van Hees)
Date: Thu, 6 Mar 2025 09:09:24 -0500
Subject: [DTrace-devel] [PATCH] Refactor the versioning handling system
In-Reply-To: <430ab6de-3ddf-35b2-692d-5e3a386e17c0@oracle.com>
References: <430ab6de-3ddf-35b2-692d-5e3a386e17c0@oracle.com>
Message-ID:

Nick: see question below (mentions your name to find it easily)

On Fri, Feb 28, 2025 at 03:55:03PM -0500, Eugene Loh via DTrace-devel wrote:
> How about eliminating DT_VERS_LATEST? It seems cumbersome and unnecessary
> to maintain manually.

True, though then we should probably define a global variable for that and
initialize it as _dtrace_versions[ARRAY_SIZE(_dtrace_versions) - 1]. That
can be used as default value for dt_vmax (in dt_open.c).

> Do tests pass with this patch? How about:
>         test/unittest/options/tst.version.sh

Yes

>         test/unittest/dtrace-util/tst.APIVersion.d

No, because that test foolishly expects a specific version. We probably
should auto-generate the .r file for it at build time so that it gets
populated with the correct value.

> Is the plan to update such tests with each new version of DTrace?

I would hope that we will develop a comprehensive set of tests that actually
verify that the version information associated with each identifier is
correct.

> Ideally (well, in my opinion), it'd be nice if the patch were factored into
> multiple patches. The big code movement that merely moves stuff from one
> file to another would come first. Then, the smaller changes that actually
> change how things are done can be in a smaller patch. (Yeah, I know:
> subjective and a question of whether to make things more tedious for the
> submitter or the reviewer of the changes.)

Works for me. I didn't because the code movement seemed to be rather obvious
and relatively small. But I don't mind doing it as a patch series if you
prefer that.

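To make the DT_VERS_LATEST suggestion above concrete, a minimal sketch of
taking the latest version from the last array entry follows. The names
versions[] and latest_version() are hypothetical and the table is shortened
for illustration; only the DT_VERSION_NUMBER() encoding is taken from the
patch. A file-scope variable initialized directly from an array element is
generally not accepted as a constant expression in C, so the sketch uses a
small accessor instead of a global initializer.

```c
#include <stdint.h>
#include <stddef.h>

typedef uint32_t dt_version_t;

#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

/* Encoding as in the patch: 8 bits major, 12 bits minor, 12 bits micro. */
#define DT_VERSION_NUMBER(M, m, u) \
	((((M) & 0xFF) << 24) | (((m) & 0xFFF) << 12) | ((u) & 0xFFF))

/*
 * Hypothetical, shortened version table (assumption: entries are kept in
 * ascending order, with no terminating 0 sentinel).
 */
static const dt_version_t versions[] = {
	DT_VERSION_NUMBER(1, 0, 0),
	DT_VERSION_NUMBER(2, 0, 0),
	DT_VERSION_NUMBER(2, 0, 2),
};

/* Hypothetical accessor: the newest version is whatever is listed last. */
static dt_version_t
latest_version(void)
{
	return versions[ARRAY_SIZE(versions) - 1];
}
```

With something along these lines, adding a version means appending one entry
to the table, and the dt_vmax default mentioned above could call
latest_version() rather than rely on a hand-maintained macro.
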
> Did it use to be the case that when you ran DTrace you would want to know
> both what version of the tool is this and what version of the API? But now
> there is only one version number? We've been going that way anyhow, but
> maybe if we're committing to that simpler versioning, we should say so?
> (The whole topic confused me, in any case.)

To my knowledge (and based on the code) there has never been a way to print
any version other than the _dtrace_version which is the API version. It is
possible that prior to Linux a packaging version was assigned other than the
API version, but that seems doubtful because it would essentially be quite
meaningless and confusing.

Theoretically, it could be envisioned that there be a tool version
(cmd/dtrace.c), library version (libdtrace), API version (exposed API of
libdtrace), and D language version. But since we are not dealing with a
complex frontend, multiple consumers, etc... I think that a single version
suffices for now. And that seems to be what has been done all along.

> Also...
>
> On 2/27/25 22:09, Kris Van Hees via DTrace-devel wrote:
>
> > DTrace was handlings versioning data in multiple locations, causing
> > common mistakes in not consistently updating versions in allplaces.
>
> s/allplaces/all places/

Thanks.

> > By consolidating all versioning data in dt_version.h a single file
>
> For me, a comma after dt_version.h would help with readability.

Sure.

> > need to be updated (as far as the source tree is concerned) when a
>
> s/need/needs/?

Thanks.

> > new version is introduced.
> >
> > For building, the GNUmakefile and dtrace.spec files will also need
> > to be updated with the new version number.
>
> Maybe this comment about GNUmakefile and dtrace.spec should be in
> dt_version.h? Also, what about libdtrace/Build (and libdtrace_VERSION)?

Hm, yes, I can do that (and make it more generic so it refers to all packaging
config files that distros might need to update).

As far as libdtrace_VERSION is concerned, I think Nick is best positioned to answer whether that should follow the DTrace version or whether keeping it at 2.0.0 is the most appropriate action, > > Signed-off-by: Kris Van Hees > > --- > > libdtrace/Build | 1 + > > libdtrace/dt_impl.h | 48 ++++++++----------- > > libdtrace/dt_open.c | 19 -------- > > libdtrace/dt_subr.c | 80 -------------------------------- > > libdtrace/dt_version.c | 93 +++++++++++++++++++++++++++++++++++++ > > libdtrace/dt_version.h | 102 ++++++++++++++++++++++++++++------------- > > 6 files changed, 181 insertions(+), 162 deletions(-) > > create mode 100644 libdtrace/dt_version.c > > > > diff --git a/libdtrace/Build b/libdtrace/Build > > index 51e0f078..57804f55 100644 > > --- a/libdtrace/Build > > +++ b/libdtrace/Build > > @@ -70,6 +70,7 @@ libdtrace-build_SOURCES = dt_aggregate.c \ > > dt_strtab.c \ > > dt_subr.c \ > > dt_symtab.c \ > > + dt_version.c \ > > dt_work.c \ > > dt_xlator.c > > diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h > > index 68fb8ec5..60e9b0c9 100644 > > --- a/libdtrace/dt_impl.h > > +++ b/libdtrace/dt_impl.h > > @@ -256,8 +256,6 @@ typedef struct dt_percpu_drops { > > */ > > #define DT_MAX_NSPECS 16 /* sanity upper bound on speculations */ > > -typedef uint32_t dt_version_t; /* encoded version (see below) */ > > - > > struct dtrace_hdl { > > const dtrace_vector_t *dt_vector; /* library vector, if vectored open */ > > void *dt_varg; /* vector argument, if vectored open */ > > @@ -645,6 +643,24 @@ enum { > > EDT_PRINT, /* missing or corrupt print() record */ > > }; > > +/* > > + * Stability definitions > > + * > > + * These #defines are used in the tables of identifiers below to fill in the > > + * attribute fields associated with each identifier. The DT_ATTR_* macros are > > + * a convenience to permit more concise declarations of common attributes such > > + * as Stable/Stable/Common. 
> > + * > > + * Refer to the Solaris Dynamic Tracing Guide Stability chapter respectively > > + * for an explanation of these DTrace features and their values. > > + */ > > +#define DT_ATTR_STABCMN { DTRACE_STABILITY_STABLE, \ > > + DTRACE_STABILITY_STABLE, DTRACE_CLASS_COMMON } > > + > > +#define DT_ATTR_EVOLCMN { DTRACE_STABILITY_EVOLVING, \ > > + DTRACE_STABILITY_EVOLVING, DTRACE_CLASS_COMMON \ > > +} > > + > > /* > > * Interfaces for parsing and comparing DTrace attribute tuples, which describe > > * stability and architectural binding information. > > @@ -654,31 +670,6 @@ extern dtrace_attribute_t dt_attr_max(dtrace_attribute_t, dtrace_attribute_t); > > extern char *dt_attr_str(dtrace_attribute_t, char *, size_t); > > extern int dt_attr_cmp(dtrace_attribute_t, dtrace_attribute_t); > > -/* > > - * Interfaces for parsing and handling DTrace version strings. Version binding > > - * is a feature of the D compiler that is handled completely independently of > > - * the DTrace kernel infrastructure, so the definitions are here in libdtrace. > > - * Version strings are compiled into an encoded uint32_t which can be compared > > - * using C comparison operators. Version definitions are found in dt_open.c. 
> > - */ > > -#define DT_VERSION_STRMAX 16 /* enough for "255.4095.4095\0" */ > > -#define DT_VERSION_MAJMAX 0xFF /* maximum major version number */ > > -#define DT_VERSION_MINMAX 0xFFF /* maximum minor version number */ > > -#define DT_VERSION_MICMAX 0xFFF /* maximum micro version number */ > > - > > -#define DT_VERSION_NUMBER(M, m, u) \ > > - ((((M) & 0xFF) << 24) | (((m) & 0xFFF) << 12) | ((u) & 0xFFF)) > > - > > -#define DT_VERSION_MAJOR(v) (((v) & 0xFF000000) >> 24) > > -#define DT_VERSION_MINOR(v) (((v) & 0x00FFF000) >> 12) > > -#define DT_VERSION_MICRO(v) ((v) & 0x00000FFF) > > - > > -extern char *dt_version_num2str(dt_version_t, char *, size_t); > > -extern int dt_version_str2num(const char *, dt_version_t *); > > -extern int dt_version_defined(dt_version_t); > > - > > -extern int dt_str2kver(const char *, dt_version_t *); > > - > > extern uint32_t dt_gen_hval(const char *, uint32_t, size_t); > > /* > > @@ -816,9 +807,6 @@ extern const dtrace_attribute_t _dtrace_typattr; /* type ref attributes */ > > extern const dtrace_attribute_t _dtrace_prvattr; /* provider attributes */ > > extern const dtrace_pattr_t _dtrace_prvdesc; /* provider attribute bundle */ > > -extern const dt_version_t _dtrace_versions[]; /* array of valid versions */ > > -extern const char *const _dtrace_version; /* current version string */ > > - > > extern int _dtrace_strbuckets; /* number of hash buckets for strings */ > > extern uint_t _dtrace_stkindent; /* default indent for stack/ustack */ > > extern uint_t _dtrace_pidbuckets; /* number of hash buckets for pids */ > > diff --git a/libdtrace/dt_open.c b/libdtrace/dt_open.c > > index 51c056b2..72c138ce 100644 > > --- a/libdtrace/dt_open.c > > +++ b/libdtrace/dt_open.c > > @@ -42,25 +42,6 @@ > > #include > > #include > > -const dt_version_t _dtrace_versions[] = { > > - DT_VERS_1_0, /* D API 1.0.0 (PSARC 2001/466) Solaris 10 FCS */ > > - DT_VERS_1_1, /* D API 1.1.0 Solaris Express 6/05 */ > > - DT_VERS_1_2, /* D API 1.2.0 Solaris 10 Update 
1 */ > > - DT_VERS_1_2_1, /* D API 1.2.1 Solaris Express 4/06 */ > > - DT_VERS_1_2_2, /* D API 1.2.2 Solaris Express 6/06 */ > > - DT_VERS_1_3, /* D API 1.3 Solaris Express 10/06 */ > > - DT_VERS_1_4, /* D API 1.4 Solaris Express 2/07 */ > > - DT_VERS_1_4_1, /* D API 1.4.1 Solaris Express 4/07 */ > > - DT_VERS_1_5, /* D API 1.5 Solaris Express 7/07 */ > > - DT_VERS_1_6, /* D API 1.6 */ > > - DT_VERS_1_6_1, /* D API 1.6.1 */ > > - DT_VERS_1_6_2, /* D API 1.6.2 */ > > - DT_VERS_1_6_3, /* D API 1.6.3 */ > > - DT_VERS_1_6_4, /* D API 1.6.4 */ > > - DT_VERS_2_0, /* D API 2.0 */ > > - 0 > > -}; > > - > > /* > > * Table of global identifiers. This is used to populate the global identifier > > * hash when a new dtrace client open occurs. For more info see dt_ident.h. > > diff --git a/libdtrace/dt_subr.c b/libdtrace/dt_subr.c > > index d5dca164..40b66c7d 100644 > > --- a/libdtrace/dt_subr.c > > +++ b/libdtrace/dt_subr.c > > @@ -369,58 +369,6 @@ dt_attr_str(dtrace_attribute_t a, char *buf, size_t len) > > return buf; > > } > > -char * > > -dt_version_num2str(dt_version_t v, char *buf, size_t len) > > -{ > > - uint_t M = DT_VERSION_MAJOR(v); > > - uint_t m = DT_VERSION_MINOR(v); > > - uint_t u = DT_VERSION_MICRO(v); > > - > > - if (u == 0) > > - snprintf(buf, len, "%u.%u", M, m); > > - else > > - snprintf(buf, len, "%u.%u.%u", M, m, u); > > - > > - return buf; > > -} > > - > > -int > > -dt_version_str2num(const char *s, dt_version_t *vp) > > -{ > > - int i = 0, n[3] = { 0, 0, 0 }; > > - char c; > > - > > - while ((c = *s++) != '\0') { > > - if (isdigit(c)) > > - n[i] = n[i] * 10 + c - '0'; > > - else if (c != '.' 
|| i++ >= sizeof(n) / sizeof(n[0]) - 1) > > - return -1; > > - } > > - > > - if (n[0] > DT_VERSION_MAJMAX || > > - n[1] > DT_VERSION_MINMAX || > > - n[2] > DT_VERSION_MICMAX) > > - return -1; > > - > > - if (vp != NULL) > > - *vp = DT_VERSION_NUMBER(n[0], n[1], n[2]); > > - > > - return 0; > > -} > > - > > -int > > -dt_version_defined(dt_version_t v) > > -{ > > - int i; > > - > > - for (i = 0; _dtrace_versions[i] != 0; i++) { > > - if (_dtrace_versions[i] == v) > > - return 1; > > - } > > - > > - return 0; > > -} > > - > > char * > > dt_cpp_add_arg(dtrace_hdl_t *dtp, const char *str) > > { > > @@ -949,34 +897,6 @@ dtrace_uaddr2str(dtrace_hdl_t *dtp, pid_t pid, uint64_t addr, char *str, > > return dt_string2str(c, str, nbytes); > > } > > -/* > > - * The function converts string representation of kernel version > > - * into the dt_version_t type. > > - */ > > -int > > -dt_str2kver(const char *kverstr, dt_version_t *vp) > > -{ > > - int kv1, kv2, kv3; > > - int rval; > > - > > - rval = sscanf(kverstr, "%d.%d.%d", &kv1, &kv2, &kv3); > > - > > - switch (rval) { > > - case 2: > > - kv3 = 0; > > - break; > > - case 3: > > - break; > > - default: > > - return -1; > > - } > > - > > - if (vp) > > - *vp = DT_VERSION_NUMBER(kv1, kv2, kv3); > > - > > - return 0; > > -} > > - > > /* > > * Compute a 32-bit hash value for a memory block of given size. > > */ > > diff --git a/libdtrace/dt_version.c b/libdtrace/dt_version.c > > new file mode 100644 > > index 00000000..e9ac88be > > --- /dev/null > > +++ b/libdtrace/dt_version.c > > @@ -0,0 +1,93 @@ > > +/* > > + * Oracle Linux DTrace. > > + * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved. > > + * Licensed under the Universal Permissive License v 1.0 as shown at > > + * http://oss.oracle.com/licenses/upl. 
> > + */ > > + > > +#include > > +#include > > + > > +#include > > + > > +const dt_version_t _dtrace_versions[] = DTRACE_VERSIONS; > > + > > +char * > > +dt_version_num2str(dt_version_t v, char *buf, size_t len) > > +{ > > + uint_t M = DT_VERSION_MAJOR(v); > > + uint_t m = DT_VERSION_MINOR(v); > > + uint_t u = DT_VERSION_MICRO(v); > > + > > + if (u == 0) > > + snprintf(buf, len, "%u.%u", M, m); > > + else > > + snprintf(buf, len, "%u.%u.%u", M, m, u); > > + > > + return buf; > > +} > > + > > +int > > +dt_version_str2num(const char *s, dt_version_t *vp) > > +{ > > + int i = 0, n[3] = { 0, 0, 0 }; > > + char c; > > + > > + while ((c = *s++) != '\0') { > > + if (isdigit(c)) > > + n[i] = n[i] * 10 + c - '0'; > > + else if (c != '.' || i++ >= ARRAY_SIZE(n) - 1) > > + return -1; > > + } > > + > > + if (n[0] > DT_VERSION_MAJMAX || > > + n[1] > DT_VERSION_MINMAX || > > + n[2] > DT_VERSION_MICMAX) > > + return -1; > > + > > + if (vp != NULL) > > + *vp = DT_VERSION_NUMBER(n[0], n[1], n[2]); > > + > > + return 0; > > +} > > + > > +int > > +dt_version_defined(dt_version_t v) > > +{ > > + int i; > > + > > + for (i = 0; i < ARRAY_SIZE(_dtrace_versions); i++) { > > + if (_dtrace_versions[i] == v) > > + return 1; > > + } > > + > > + return 0; > > +} > > + > > +/* > > + * Convert a string representation of a kernel version string into the > > + * a dt_version_t value. 
> > + */ > > +int > > +dt_str2kver(const char *kverstr, dt_version_t *vp) > > +{ > > + int kv1, kv2, kv3; > > + int rval; > > + > > + rval = sscanf(kverstr, "%d.%d.%d", &kv1, &kv2, &kv3); > > + > > + switch (rval) { > > + case 2: > > + kv3 = 0; > > + break; > > + case 3: > > + break; > > + default: > > + return -1; > > + } > > + > > + if (vp) > > + *vp = DT_VERSION_NUMBER(kv1, kv2, kv3); > > + > > + return 0; > > +} > > diff --git a/libdtrace/dt_version.h b/libdtrace/dt_version.h > > index 3fd1b3d1..967e22cc 100644 > > --- a/libdtrace/dt_version.h > > +++ b/libdtrace/dt_version.h > > @@ -15,24 +15,6 @@ extern "C" { > > #include > > #include > > -/* > > - * Stability definitions > > - * > > - * These #defines are used in the tables of identifiers below to fill in the > > - * attribute fields associated with each identifier. The DT_ATTR_* macros are > > - * a convenience to permit more concise declarations of common attributes such > > - * as Stable/Stable/Common. > > - * > > - * Refer to the Solaris Dynamic Tracing Guide Stability chapter respectively > > - * for an explanation of these DTrace features and their values. > > - */ > > -#define DT_ATTR_STABCMN { DTRACE_STABILITY_STABLE, \ > > - DTRACE_STABILITY_STABLE, DTRACE_CLASS_COMMON } > > - > > -#define DT_ATTR_EVOLCMN { DTRACE_STABILITY_EVOLVING, \ > > - DTRACE_STABILITY_EVOLVING, DTRACE_CLASS_COMMON \ > > -} > > - > > /* > > * Versioning definitions > > * > > @@ -46,9 +28,11 @@ extern "C" { > > * Refer to the Solaris Dynamic Tracing Guide Versioning chapter for an > > * explanation of these DTrace features and their values. > > * > > - * You must update DT_VERS_LATEST and DT_VERS_STRING when adding a new version, > > - * and then add the new version to the _dtrace_versions[] array declared in > > - * dt_open.c.. 
> > + * When adding a new version: > > + * - Add a new DT_VERS_* macro > > + * - Add the new DT_VERS_* macro at the end of the DTRACE_VERSIONS macro > > + * - Set DT_VERS_LATEST to the new DT_VERS_* > > + * - Update DT_VERS_STRING to reflect the new version > > * > > * NOTE: Although the DTrace versioning scheme supports the labeling and > > * introduction of incompatible changes (e.g. dropping an interface in a > > @@ -57,16 +41,18 @@ extern "C" { > > * we ever need to provide divergent interfaces, this will need work. > > * > > * The version number should be increased for every customer visible release > > - * of Solaris. The major number should be incremented when a fundamental > > - * change has been made that would affect all consumers, and would reflect > > - * sweeping changes to DTrace or the D language. The minor number should be > > - * incremented when a change is introduced that could break scripts that had > > - * previously worked; for example, adding a new built-in variable could break > > - * a script which was already using that identifier. The micro number should > > - * be changed when introducing functionality changes or major bug fixes that > > - * do not affect backward compatibility -- this is merely to make capabilities > > - * easily determined from the version number. Minor bugs do not require any > > - * modification to the version number. > > + * of Linux. > > of Linux? But we do not release with Linux versions. Oops, "of DTrace for Linux" was meant to be there. > > + * - The major number should be incremented when a fundamental change has been > > + * made that would affect all consumers, and would reflect sweeping changes > > + * to DTrace or the D language. > > + * - The minor number should be incremented when a change is introduced that > > + * could break scripts that had previously worked; for example, adding a new > > + * built-in variable could break a script which was already using that > > + * identifier. 
> > + * - The micro number should be changed when introducing functionality changes > > + * or major bug fixes that do not affect backward compatibility -- this is > > + * merely to make capabilities easily determined from the version number. > > + * Minor bugs do not require any modification to the version number. > > */ > > #define DT_VERS_1_0 DT_VERSION_NUMBER(1, 0, 0) > > #define DT_VERS_1_1 DT_VERSION_NUMBER(1, 1, 0) > > @@ -84,9 +70,59 @@ extern "C" { > > #define DT_VERS_1_6_4 DT_VERSION_NUMBER(1, 6, 4) > > #define DT_VERS_2_0 DT_VERSION_NUMBER(2, 0, 0) > > #define DT_VERS_2_0_1 DT_VERSION_NUMBER(2, 0, 1) > > +#define DT_VERS_2_0_2 DT_VERSION_NUMBER(2, 0, 2) > > + > > +#define DTRACE_VERSIONS { \ > > Why was DT_VERS_1_0 dropped from this list? Accidental deletion - fixed. > > + DT_VERS_1_1, /* D API 1.1.0 Solaris Express 6/05 */ \ > > + DT_VERS_1_2, /* D API 1.2.0 Solaris 10 Update 1 */ \ > > + DT_VERS_1_2_1, /* D API 1.2.1 Solaris Express 4/06 */ \ > > + DT_VERS_1_2_2, /* D API 1.2.2 Solaris Express 6/06 */ \ > > + DT_VERS_1_3, /* D API 1.3 Solaris Express 10/06 */ \ > > + DT_VERS_1_4, /* D API 1.4 Solaris Express 2/07 */ \ > > + DT_VERS_1_4_1, /* D API 1.4.1 Solaris Express 4/07 */ \ > > + DT_VERS_1_5, /* D API 1.5 Solaris Express 7/07 */ \ > > + DT_VERS_1_6, /* D API 1.6 */ \ > > + DT_VERS_1_6_1, /* D API 1.6.1 */ \ > > + DT_VERS_1_6_2, /* D API 1.6.2 */ \ > > + DT_VERS_1_6_3, /* D API 1.6.3 */ \ > > + DT_VERS_1_6_4, /* D API 1.6.4 */ \ > > + DT_VERS_2_0, /* D API 2.0 */ \ > > + DT_VERS_2_0_1, /* D API 2.0.1 */ \ > > + DT_VERS_2_0_2, /* D API 2.0.2 */ \ > > +} > > + > > +#define DT_VERS_LATEST DT_VERS_2_0_2 > > +#define DT_VERS_STRING "Oracle D 2.0.2" > > + > > +/* > > + * Interfaces for parsing and handling DTrace version strings. Version binding > > + * is a feature of the D compiler that is handled completely independently of > > + * the DTrace kernel infrastructure, so the definitions are here in libdtrace. 
> > + * Version strings are compiled into an encoded uint32_t which can be compared > > + * using C comparison operators. > > + */ > > +#define DT_VERSION_STRMAX 16 /* enough for "255.4095.4095\0" */ > > +#define DT_VERSION_MAJMAX 0xFF /* maximum major version number */ > > +#define DT_VERSION_MINMAX 0xFFF /* maximum minor version number */ > > +#define DT_VERSION_MICMAX 0xFFF /* maximum micro version number */ > > + > > +#define DT_VERSION_NUMBER(M, m, u) \ > > + ((((M) & 0xFF) << 24) | (((m) & 0xFFF) << 12) | ((u) & 0xFFF)) > > + > > +#define DT_VERSION_MAJOR(v) (((v) & 0xFF000000) >> 24) > > +#define DT_VERSION_MINOR(v) (((v) & 0x00FFF000) >> 12) > > +#define DT_VERSION_MICRO(v) ((v) & 0x00000FFF) > > + > > +typedef uint32_t dt_version_t; > > + > > +extern const dt_version_t _dtrace_versions[]; > > +extern const char *const _dtrace_version; > > + > > +extern char *dt_version_num2str(dt_version_t, char *, size_t); > > +extern int dt_version_str2num(const char *, dt_version_t *); > > +extern int dt_version_defined(dt_version_t); > > -#define DT_VERS_LATEST DT_VERS_2_0_1 > > -#define DT_VERS_STRING "Oracle D 2.0" > > +extern int dt_str2kver(const char *, dt_version_t *); > > #ifdef __cplusplus > > } > > _______________________________________________ > DTrace-devel mailing list > DTrace-devel at oss.oracle.com > https://oss.oracle.com/mailman/listinfo/dtrace-devel From nick.alcock at oracle.com Fri Mar 7 14:42:24 2025 From: nick.alcock at oracle.com (Nick Alcock) Date: Fri, 07 Mar 2025 14:42:24 +0000 Subject: [DTrace-devel] [PATCH] Refactor the versioning handling system In-Reply-To: (Kris Van Hees via DTrace-devel's message of "Thu, 6 Mar 2025 09:09:24 -0500") References: <430ab6de-3ddf-35b2-692d-5e3a386e17c0@oracle.com> Message-ID: <875xkkhpgv.fsf@esperi.org.uk> On 6 Mar 2025, Kris Van Hees via DTrace-devel outgrape: > Nick: see question below (mentions your name to find it easily) I love the idea of this! (but haven't looked at it closely). 
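
As a side note on the encoding that the quoted patch moves into dt_version.h,
here is a minimal standalone sketch of how the packed value behaves. The
macros and the formatting logic are copied from the quoted patch; the helper
name version_to_str() is hypothetical and only mirrors dt_version_num2str()
so that the example builds without libdtrace.

```c
#include <stdint.h>
#include <stdio.h>

typedef uint32_t dt_version_t;

/* Layout: bits 24-31 major, bits 12-23 minor, bits 0-11 micro. */
#define DT_VERSION_NUMBER(M, m, u) \
	((((M) & 0xFF) << 24) | (((m) & 0xFFF) << 12) | ((u) & 0xFFF))

#define DT_VERSION_MAJOR(v)	(((v) & 0xFF000000) >> 24)
#define DT_VERSION_MINOR(v)	(((v) & 0x00FFF000) >> 12)
#define DT_VERSION_MICRO(v)	((v) & 0x00000FFF)

/*
 * Hypothetical stand-in for dt_version_num2str(): a zero micro number is
 * omitted, so 2.0.0 is rendered as "2.0".
 */
static char *
version_to_str(dt_version_t v, char *buf, size_t len)
{
	unsigned int M = DT_VERSION_MAJOR(v);
	unsigned int m = DT_VERSION_MINOR(v);
	unsigned int u = DT_VERSION_MICRO(v);

	if (u == 0)
		snprintf(buf, len, "%u.%u", M, m);
	else
		snprintf(buf, len, "%u.%u.%u", M, m, u);

	return buf;
}
```

Because the major number occupies the top bits, plain integer comparison
orders versions the same way as comparing (major, minor, micro) tuples, which
is what lets the D compiler compare encoded versions with ordinary C
operators.
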

> On Fri, Feb 28, 2025 at 03:55:03PM -0500, Eugene Loh via DTrace-devel wrote:
>> Did it use to be the case that when you ran DTrace you would want to know
>> both what version of the tool is this and what version of the API? But now
>> there is only one version number? We've been going that way anyhow, but
>> maybe if we're committing to that simpler versioning, we should say so?
>> (The whole topic confused me, in any case.)
>
> To my knowledge (and based on the code) there has never been a way to print
> any version other than the _dtrace_version which is the API version. It is
> possible that prior to Linux a packaging version was assigned other than the
> API version, but that seems doubtful because it would essentially be quite
> meaningless and confusing.
>
> Theoretically, it could be envisioned that there be a tool version
> (cmd/dtrace.c), library version (libdtrace), API version (exposed API of
> libdtrace), and D language version. But since we are not dealing with a
> complex frontend, multiple consumers, etc... I think that a single version
> suffices for now. And that seems to be what has been done all along.

Well, yes, but we already spit out more version info with dtrace -vV (git
commit ID info). It is possible that we might stuff more in there too -- but
right now they're all going to advance more or less in lockstep, and
printing the one of them that changes with the highest frequency (as we are)
should do.

>> > new version is introduced.
>> >
>> > For building, the GNUmakefile and dtrace.spec files will also need
>> > to be updated with the new version number.
>>
>> Maybe this comment about GNUmakefile and dtrace.spec should be in
>> dt_version.h? Also, what about libdtrace/Build (and libdtrace_VERSION)?
>
> Hm, yes, I can do that (and make it more generic so it refers to all packaging
> config files that distros might need to update).
> > As far as libdtrace_VERSION is concerned, I think Nick is best positioned to > answer whether that should follow the DTrace version or whether keeping it at > 2.0.0 is the most appropriate action, Only the SONAME (i.e. usually the first digit) really matters. The rest is up to individual projects. Some projects bump on every release, but honestly I think it makes more sense to bump the middle version number when we add libdtrace API functions (resetting the last to 0) and the last version on every release. That way, people can tell whether two library versions will be compatible if you upgrade by looking at the first digit, and if they'll be compatible if you *downgrade* by looking at the first and second. This is rare enough that it shouldn't be a burden (OK, OK, I'm sure we'll forget it every time and it'll end up exactly the same as what we're doing now, but in a perfect world...) From nick.alcock at oracle.com Fri Mar 7 14:43:00 2025 From: nick.alcock at oracle.com (Nick Alcock) Date: Fri, 7 Mar 2025 14:43:00 +0000 Subject: [DTrace-devel] [PATCH] dt_pid: pid grabs should be shortlived Message-ID: <20250307144300.230034-1-nick.alcock@oracle.com> If we use long-lived grabs for this, we are requiring that the process is ptraceable, and thus preventing pid tracing of system daemons, init, processes already being debugged or traced by others, etc. Signed-off-by: Nick Alcock --- libdtrace/dt_pid.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/libdtrace/dt_pid.c b/libdtrace/dt_pid.c index 76608f6904fee..4135c3ea656ec 100644 --- a/libdtrace/dt_pid.c +++ b/libdtrace/dt_pid.c @@ -1243,7 +1243,8 @@ dt_pid_create_pid_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *p return 0; /* Grab the process. 
*/ - if (dt_proc_grab_lock(dtp, pid, DTRACE_PROC_WAITING) < 0) { + if (dt_proc_grab_lock(dtp, pid, DTRACE_PROC_WAITING + | DTRACE_PROC_SHORTLIVED) < 0) { dt_pid_error(dtp, pcb, NULL, D_PROC_GRAB, "failed to grab process %d", (int)pid); return -1; base-commit: 39a5e0a8866b38679619fa357bb3082bc245aada -- 2.48.1.283.g18c60a128c From kris.van.hees at oracle.com Fri Mar 7 21:33:20 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Fri, 7 Mar 2025 16:33:20 -0500 Subject: [DTrace-devel] [PATCH 1/8] proc: convert to use standard SDT provider implementation Message-ID: <20250307213320.9439-1-kris.van.hees@oracle.com> The proc provider was the first SDT-based provider implemented in this version, and therefore handled the enabling of probes with custom code. When the other SDT-based providers (sched, ...) were implemented, a generic SDT-framework was developed. The proc provider now uses the SDT-framework. Signed-off-by: Kris Van Hees --- libdtrace/dt_prov_proc.c | 316 +++++---------------------------------- 1 file changed, 36 insertions(+), 280 deletions(-) diff --git a/libdtrace/dt_prov_proc.c b/libdtrace/dt_prov_proc.c index 2e514860..15fde6c9 100644 --- a/libdtrace/dt_prov_proc.c +++ b/libdtrace/dt_prov_proc.c @@ -8,77 +8,45 @@ */ #include #include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include #include "dt_dctx.h" #include "dt_cg.h" +#include "dt_provider_sdt.h" #include "dt_probe.h" -#include "dt_pt_regs.h" static const char prvname[] = "proc"; static const char modname[] = "vmlinux"; -/* - * The proc-provider probes make use of probes that are already provided by - * other providers. As such, the proc probes are 'dependent probes' because - * they depend on underlying probes to get triggered and they also depend on - * argument data provided by the underlying probe to manufacture their own - * arguments. 
- * - * As a type of SDT probes, proc probes are defined with a signature (list of - * arguments - possibly empty) that may use translator support to provide the - * actual argument values. Therefore, obtaining the value of arguments for - * a proc probe goes through two layers of processing: - * - * (1) the arguments of the underlying probe are reworked to match the - * expected layout of raw arguments for the proc probe - * (2) an argument mapping table (and supporting translators) is used to get - * the value of an arguument based on the raw variable data of the proc - * probe - * - * To accomplish this, proc probes generate a trampoline that rewrites the - * arguments of the underlying probe. (The dependent probe support code in the - * underlying probe saves the arguments of the underying probe in the mstate - * before executing the trampoline and clauses of the dependent probe, and it - * restores them afterwards in case there are multiple dependent probes.) - * - * Because proc probes dependent on an underlying probe that may be too generic - * (e.g. proc:::exec-success depending on syscall::execve*:return), the - * trampoline code can include a pre-condition (much like a predicate) that can - * bypass execution unless the condition is met (e.g. proc:::exec-success - * requires syscall::execve*:return's arg1 to be 0). - * - * FIXME: - * The dependent probe support should include a priority specification to drive - * the order in which dependent probes are added to the underlying probe. This - * is needed to enforce specific probe firing semantics (e.g. proc:::start must - * always precede proc:::lwp-start). 
- */ - -typedef struct probe_arg { - const char *name; /* name of probe */ - int argno; /* argument number */ - dt_argdesc_t argdesc; /* argument description */ -} probe_arg_t; +static probe_dep_t probes[] = { + { "create", + DTRACE_PROBESPEC_NAME, "rawtp:sched::sched_process_fork" }, + { "exec", + DTRACE_PROBESPEC_NAME, "syscall::execve*:entry" }, + { "exec-failure", + DTRACE_PROBESPEC_NAME, "syscall::execve*:return" }, + { "exec-success", + DTRACE_PROBESPEC_NAME, "syscall::execve*:return" }, + { "exit", + DTRACE_PROBESPEC_NAME, "rawtp:sched::sched_process_exit" }, + { "lwp-create", + DTRACE_PROBESPEC_NAME, "rawtp:sched::sched_process_fork" }, + { "lwp-exit", + DTRACE_PROBESPEC_NAME, "rawtp:sched::sched_process_exit" }, + { "lwp-start", + DTRACE_PROBESPEC_NAME, "fbt::schedule_tail:return" }, + { "signal-clear", + DTRACE_PROBESPEC_NAME, "syscall::rt_sigtimedwait:return" }, + { "signal-discard", + DTRACE_PROBESPEC_NAME, "rawtp:signal::signal_generate" }, + { "signal-handle", + DTRACE_PROBESPEC_NAME, "rawtp:signal::signal_deliver" }, + { "signal-send", + DTRACE_PROBESPEC_NAME, "fbt::complete_signal:entry" }, + { "start", + DTRACE_PROBESPEC_NAME, "fbt::schedule_tail:return" }, + { NULL, } +}; -/* - * Probe signature specifications - * - * This table *must* group the arguments of probes. I.e. the arguments of a - * given probe must be listed in consecutive records. - * A single probe entry that mentions only name of the probe indicates a probe - * that provides no arguments. 
- */ static probe_arg_t probe_args[] = { { "create", 0, { 0, 0, "struct task_struct *", "psinfo_t *" } }, { "exec", 0, { 0, DT_NF_USERLAND, "string", } }, @@ -100,6 +68,7 @@ static probe_arg_t probe_args[] = { { "signal-send", 1, { 0, 0, "struct task_struct *", "psinfo_t *" } }, { "signal-send", 2, { 1, 0, "int", } }, { "start", }, + { NULL, }, }; static const dtrace_pattr_t pattr = { @@ -115,173 +84,8 @@ static const dtrace_pattr_t pattr = { */ static int populate(dtrace_hdl_t *dtp) { - dt_provider_t *prv; - int i; - int n = 0; - - prv = dt_provider_create(dtp, prvname, &dt_proc, &pattr, NULL); - if (prv == NULL) - return -1; /* errno already set */ - - /* - * Create "proc" probes based on the probe_args list. Since each probe - * will have at least one entry (with argno == 0), we can use those - * entries to identify the probe names. - */ - for (i = 0; i < ARRAY_SIZE(probe_args); i++) { - probe_arg_t *arg = &probe_args[i]; - - if (arg->argno == 0 && - dt_probe_insert(dtp, prv, prvname, modname, "", arg->name, - NULL)) - n++; - } - - return n; -} - -static void enable(dtrace_hdl_t *dtp, dt_probe_t *prp) -{ - dt_probe_t *uprp = NULL; - dtrace_probedesc_t pd; - - if (strcmp(prp->desc->prb, "create") == 0 || - strcmp(prp->desc->prb, "lwp-create") == 0) { - pd.id = DTRACE_IDNONE; - pd.prv = "rawtp"; - pd.mod = "sched"; - pd.fun = ""; - pd.prb = "sched_process_fork"; - - uprp = dt_probe_lookup(dtp, &pd); - assert(uprp != NULL); - - dt_probe_add_dependent(dtp, uprp, prp); - dt_probe_enable(dtp, uprp); - } else if (strcmp(prp->desc->prb, "exec") == 0) { - pd.id = DTRACE_IDNONE; - pd.prv = "syscall"; - pd.mod = ""; - pd.fun = "execve"; - pd.prb = "entry"; - - uprp = dt_probe_lookup(dtp, &pd); - assert(uprp != NULL); - - dt_probe_add_dependent(dtp, uprp, prp); - dt_probe_enable(dtp, uprp); - - pd.fun = "execveat"; - - uprp = dt_probe_lookup(dtp, &pd); - assert(uprp != NULL); - - dt_probe_add_dependent(dtp, uprp, prp); - dt_probe_enable(dtp, uprp); - } else if 
(strcmp(prp->desc->prb, "exec-failure") == 0 || - strcmp(prp->desc->prb, "exec-success") == 0) { - pd.id = DTRACE_IDNONE; - pd.prv = "syscall"; - pd.mod = ""; - pd.fun = "execve"; - pd.prb = "return"; - - uprp = dt_probe_lookup(dtp, &pd); - assert(uprp != NULL); - - dt_probe_add_dependent(dtp, uprp, prp); - dt_probe_enable(dtp, uprp); - - pd.fun = "execveat"; - - uprp = dt_probe_lookup(dtp, &pd); - assert(uprp != NULL); - - dt_probe_add_dependent(dtp, uprp, prp); - dt_probe_enable(dtp, uprp); - } else if (strcmp(prp->desc->prb, "exit") == 0 || - strcmp(prp->desc->prb, "lwp-exit") == 0) { - pd.id = DTRACE_IDNONE; - pd.prv = "rawtp"; - pd.mod = ""; - pd.fun = ""; - pd.prb = "sched_process_exit"; - - uprp = dt_probe_lookup(dtp, &pd); - assert(uprp != NULL); - - dt_probe_add_dependent(dtp, uprp, prp); - dt_probe_enable(dtp, uprp); - } else if (strcmp(prp->desc->prb, "signal-clear") == 0) { - pd.id = DTRACE_IDNONE; - pd.prv = "syscall"; - pd.mod = ""; - pd.fun = "rt_sigtimedwait"; - pd.prb = "return"; - - uprp = dt_probe_lookup(dtp, &pd); - assert(uprp != NULL); - - dt_probe_add_dependent(dtp, uprp, prp); - dt_probe_enable(dtp, uprp); - } else if (strcmp(prp->desc->prb, "signal-discard") == 0) { - pd.id = DTRACE_IDNONE; - pd.prv = "rawtp"; - pd.mod = "signal"; - pd.fun = ""; - pd.prb = "signal_generate"; - - uprp = dt_probe_lookup(dtp, &pd); - assert(uprp != NULL); - - dt_probe_add_dependent(dtp, uprp, prp); - dt_probe_enable(dtp, uprp); - } else if (strcmp(prp->desc->prb, "signal-handle") == 0) { - pd.id = DTRACE_IDNONE; - pd.prv = "rawtp"; - pd.mod = "signal"; - pd.fun = ""; - pd.prb = "signal_deliver"; - - uprp = dt_probe_lookup(dtp, &pd); - assert(uprp != NULL); - - dt_probe_add_dependent(dtp, uprp, prp); - dt_probe_enable(dtp, uprp); - } else if (strcmp(prp->desc->prb, "signal-send") == 0) { - pd.id = DTRACE_IDNONE; - pd.prv = "fbt"; - pd.mod = ""; - pd.fun = "complete_signal"; - pd.prb = "entry"; - - uprp = dt_probe_lookup(dtp, &pd); - assert(uprp != NULL); - - 
dt_probe_add_dependent(dtp, uprp, prp); - dt_probe_enable(dtp, uprp); - } else if (strcmp(prp->desc->prb, "start") == 0 || - strcmp(prp->desc->prb, "lwp-start") == 0) { - pd.id = DTRACE_IDNONE; - pd.prv = "fbt"; - pd.mod = ""; - pd.fun = "schedule_tail"; - pd.prb = "return"; - - uprp = dt_probe_lookup(dtp, &pd); - assert(uprp != NULL); - - dt_probe_add_dependent(dtp, uprp, prp); - dt_probe_enable(dtp, uprp); - } - - /* - * Finally, ensure we're in the list of enablings as well. - * (This ensures that, among other things, the probes map - * gains entries for us.) - */ - if (!dt_in_list(&dtp->dt_enablings, prp)) - dt_list_append(&dtp->dt_enablings, prp); + return dt_sdt_populate(dtp, prvname, modname, &dt_proc, &pattr, + probe_args, probes); } /* @@ -434,61 +238,13 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl) return 0; } -static int probe_info(dtrace_hdl_t *dtp, const dt_probe_t *prp, - int *argcp, dt_argdesc_t **argvp) -{ - int i; - int pidx = -1; - int argc = 0; - dt_argdesc_t *argv = NULL; - - for (i = 0; i < ARRAY_SIZE(probe_args); i++) { - probe_arg_t *arg = &probe_args[i]; - - if (strcmp(arg->name, prp->desc->prb) == 0) { - if (pidx == -1) { - pidx = i; - - if (arg->argdesc.native == NULL) - break; - } - - argc++; - } - } - - if (argc == 0) - goto done; - - argv = dt_zalloc(dtp, argc * sizeof(dt_argdesc_t)); - if (!argv) - return -ENOMEM; - - for (i = pidx; i < pidx + argc; i++) { - probe_arg_t *arg = &probe_args[i]; - dt_argdesc_t *argd = &arg->argdesc; - dt_argdesc_t *parg = &argv[arg->argno]; - - *parg = *argd; - if (argd->native) - parg->native = strdup(argd->native); - if (argd->xlate) - parg->xlate = strdup(argd->xlate); - } - -done: - *argcp = argc; - *argvp = argv; - - return 0; -} - dt_provimpl_t dt_proc = { .name = prvname, .prog_type = BPF_PROG_TYPE_UNSPEC, .populate = &populate, - .enable = &enable, + .enable = &dt_sdt_enable, .load_prog = &dt_bpf_prog_load, .trampoline = &trampoline, - .probe_info = &probe_info, + .probe_info = 
&dt_sdt_probe_info, + .destroy = &dt_sdt_destroy, }; -- 2.45.2 From kris.van.hees at oracle.com Fri Mar 7 21:34:35 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Fri, 7 Mar 2025 16:34:35 -0500 Subject: [DTrace-devel] [PATCH 2/8] sched: clean up unnecessary includes and functions Message-ID: <20250307213441.9495-1-kris.van.hees@oracle.com> Signed-off-by: Kris Van Hees --- libdtrace/dt_prov_sched.c | 30 ++---------------------------- 1 file changed, 2 insertions(+), 28 deletions(-) diff --git a/libdtrace/dt_prov_sched.c b/libdtrace/dt_prov_sched.c index e05ef246..125d5891 100644 --- a/libdtrace/dt_prov_sched.c +++ b/libdtrace/dt_prov_sched.c @@ -1,6 +1,6 @@ /* * Oracle Linux DTrace. - * Copyright (c) 2023, 2024, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2023, 2025, Oracle and/or its affiliates. All rights reserved. * Licensed under the Universal Permissive License v 1.0 as shown at * http://oss.oracle.com/licenses/upl. * @@ -9,9 +9,6 @@ #include #include -#include -#include - #include "dt_dctx.h" #include "dt_cg.h" #include "dt_provider_sdt.h" @@ -146,36 +143,13 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl) return 0; } -/* - * We need a custom enabling for on-cpu probes to ensure that the fbt function - * __perf_event_task_sched_in is called. __perf_event_task_sched_in will not - * be called unless context switch perf events have been enabled, so we do that - * here by opening a context switch count perf event but not attaching anything - * to it to minimize overhead. The alternative - attaching to - * cpc:::context_switches-all-1 and weeding out on- versus off-cpu events via a - * trampoline is too expensive. This approach works stably across kernels - * because __perf_event_task_sched_in() is not static, so not potentially - * subject to inlining or other optimizations. 
- */ -static void enable(dtrace_hdl_t *dtp, dt_probe_t *prp) -{ - return dt_sdt_enable(dtp, prp); -} - -static void detach(dtrace_hdl_t *dtp, const dt_probe_t *prp) -{ - if (prp->prv_data) - close((int)(long)prp->prv_data); -} - dt_provimpl_t dt_sched = { .name = prvname, .prog_type = BPF_PROG_TYPE_UNSPEC, .populate = &populate, - .enable = &enable, + .enable = &dt_sdt_enable, .load_prog = &dt_bpf_prog_load, .trampoline = &trampoline, .probe_info = &dt_sdt_probe_info, - .detach = &detach, .destroy = &dt_sdt_destroy, }; -- 2.45.2 From kris.van.hees at oracle.com Fri Mar 7 21:34:36 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Fri, 7 Mar 2025 16:34:36 -0500 Subject: [DTrace-devel] [PATCH 3/8] rawfbt: perform lookup on true symbol names In-Reply-To: <20250307213441.9495-1-kris.van.hees@oracle.com> References: <20250307213441.9495-1-kris.van.hees@oracle.com> Message-ID: <20250307213441.9495-2-kris.van.hees@oracle.com> When encountering a . symbol, a symbol lookup was done for instead of . under the assumption that names with . in them were not listed in kallsyms. But that is not true. Signed-off-by: Kris Van Hees --- libdtrace/dt_prov_rawfbt.c | 18 ------------------ 1 file changed, 18 deletions(-) diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c index 4c8e8130..62f2f4f0 100644 --- a/libdtrace/dt_prov_rawfbt.c +++ b/libdtrace/dt_prov_rawfbt.c @@ -122,27 +122,9 @@ static int populate(dtrace_hdl_t *dtp) * try to determine the module name. */ if (!p) { - char *q; - - /* - * For synthetic symbol names (those containing '.'), - * we need to use the base name (before the '.') for - * module name lookup, because the synthetic forms are - * not recorded in kallsyms information. - * - * We replace the first '.' with a 0 to terminate the - * string, and after the lookup, we put it back. 
- */ - q = strchr(buf, '.'); - if (q != NULL) - *q = '\0'; - if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf, NULL, &sip) == 0) mod = sip.object; - - if (q != NULL) - *q = '.'; } else mod = p; -- 2.45.2 From kris.van.hees at oracle.com Fri Mar 7 21:34:37 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Fri, 7 Mar 2025 16:34:37 -0500 Subject: [DTrace-devel] [PATCH 4/8] ksyms: make symbol name filters less picky In-Reply-To: <20250307213441.9495-1-kris.van.hees@oracle.com> References: <20250307213441.9495-1-kris.van.hees@oracle.com> Message-ID: <20250307213441.9495-3-kris.van.hees@oracle.com> Some symbols were being filtered out even though they represent symbols that can actually be probed. Signed-off-by: Kris Van Hees --- libdtrace/dt_module.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c index dc00aa88..2e915e2f 100644 --- a/libdtrace/dt_module.c +++ b/libdtrace/dt_module.c @@ -1215,7 +1215,7 @@ dt_modsym_addsym(dtrace_hdl_t *dtp, dt_module_t *dmp, dt_kallsym_t *sym, (strstarts(sym->name, "__syscall_meta__")) || (strstarts(sym->name, "__p_syscall_meta__")) || (strstarts(sym->name, "__event_")) || - (strstarts(sym->name, "event_")) || + (strstarts(sym->name, "event_") && sym->type == 'd') || (strstarts(sym->name, "ftrace_event_")) || (strstarts(sym->name, "types__")) || (strstarts(sym->name, "args__")) || @@ -1223,7 +1223,6 @@ dt_modsym_addsym(dtrace_hdl_t *dtp, dt_module_t *dmp, dt_kallsym_t *sym, (strstarts(sym->name, "__tpstrtab_")) || (strstarts(sym->name, "__tpstrtab__")) || (strstarts(sym->name, "__initcall_")) || - (strstarts(sym->name, "__setup_")) || (strstarts(sym->name, "__pci_fixup_"))) skip = 1; #undef strstarts -- 2.45.2 From kris.van.hees at oracle.com Fri Mar 7 21:34:38 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Fri, 7 Mar 2025 16:34:38 -0500 Subject: [DTrace-devel] [PATCH 5/8] symtab: add support for 'traceable' flag In-Reply-To: 
<20250307213441.9495-1-kris.van.hees@oracle.com> References: <20250307213441.9495-1-kris.van.hees@oracle.com> Message-ID: <20250307213441.9495-4-kris.van.hees@oracle.com> Signed-off-by: Kris Van Hees --- libdtrace/dt_symtab.c | 53 +++++++++++++++++++++++++++++++++++++++---- libdtrace/dt_symtab.h | 6 +++++ 2 files changed, 55 insertions(+), 4 deletions(-) diff --git a/libdtrace/dt_symtab.c b/libdtrace/dt_symtab.c index db63cc88..4e46f280 100644 --- a/libdtrace/dt_symtab.c +++ b/libdtrace/dt_symtab.c @@ -23,9 +23,12 @@ #include #include -#define DT_ST_SORTED 0x01 /* Sorted, ready for searching. */ -#define DT_ST_PACKED 0x02 /* Symbol table packed +#define DT_ST_SORTED 0x01 /* Sorted, ready for searching. */ +#define DT_ST_PACKED 0x02 /* Symbol table packed * (necessarily sorted too) */ +#define DT_ST_TRACEABLE 0x04 /* Symbols have traceable flag */ + +#define DT_STB_TRACE 8 /* traceable symbol */ struct dt_symbol { dt_list_t dts_list; /* list forward/back pointers */ @@ -275,6 +278,12 @@ dt_symbol_by_name(dtrace_hdl_t *dtp, const char *name) return dt_htab_lookup(dtp->dt_kernsyms, &tmpl); } +dt_symbol_t * +dt_symbol_by_name_next(const dt_symbol_t *symbol) +{ + return symbol ? (dt_symbol_t *)symbol->dts_he.next : NULL; +} + /* Find a symbol in a given module. 
*/ dt_symbol_t * dt_module_symbol_by_name(dtrace_hdl_t *dtp, dt_module_t *dmp, const char *name) @@ -548,7 +557,7 @@ dt_symbol_name(const dt_symbol_t *symbol) void dt_symbol_to_elfsym64(dtrace_hdl_t *dtp, dt_symbol_t *symbol, Elf64_Sym *elf_symp) { - elf_symp->st_info = symbol->dts_info; + elf_symp->st_info = symbol->dts_info & ~GELF_ST_INFO(DT_STB_TRACE, 0); elf_symp->st_value = symbol->dts_addr; elf_symp->st_size = symbol->dts_size; elf_symp->st_shndx = 1; /* 'not SHN_UNDEF' is all we guarantee */ @@ -557,7 +566,7 @@ dt_symbol_to_elfsym64(dtrace_hdl_t *dtp, dt_symbol_t *symbol, Elf64_Sym *elf_sym void dt_symbol_to_elfsym32(dtrace_hdl_t *dtp, dt_symbol_t *symbol, Elf32_Sym *elf_symp) { - elf_symp->st_info = symbol->dts_info; + elf_symp->st_info = symbol->dts_info & ~GELF_ST_INFO(DT_STB_TRACE, 0); elf_symp->st_value = symbol->dts_addr; elf_symp->st_size = symbol->dts_size; elf_symp->st_shndx = 1; /* 'not SHN_UNDEF' is all we guarantee */ @@ -581,3 +590,39 @@ dt_symbol_module(dt_symbol_t *symbol) { return symbol->dts_dmp; } + +/* + * Mark the symtab annotated with traceable flags on symbols. + */ +void +dt_symtab_set_traceable(dt_symtab_t *symtab) +{ + symtab->dtst_flags |= DT_ST_TRACEABLE; +} + +/* + * Return whether symbols have traceable flags. + */ +int +dt_symtab_traceable(const dt_symtab_t *symtab) +{ + return symtab->dtst_flags & DT_ST_TRACEABLE; +} + +/* + * Mark a symbol as traceable. + */ +void +dt_symbol_set_traceable(dt_symbol_t *symbol) +{ + symbol->dts_info |= GELF_ST_INFO(DT_STB_TRACE, 0); +} + +/* + * Return true if the symbol can be traced. 
+ */ +int +dt_symbol_traceable(const dt_symbol_t *symbol) +{ + return GELF_ST_BIND(symbol->dts_info) & DT_STB_TRACE; +} diff --git a/libdtrace/dt_symtab.h b/libdtrace/dt_symtab.h index 8d396c46..9ee60c38 100644 --- a/libdtrace/dt_symtab.h +++ b/libdtrace/dt_symtab.h @@ -39,6 +39,7 @@ extern dt_symbol_t *dt_symbol_insert(dtrace_hdl_t *dtp, dt_symtab_t *symtab, struct dt_module *dmp, const char *name, GElf_Addr addr, GElf_Xword size, unsigned char info); extern dt_symbol_t *dt_symbol_by_name(dtrace_hdl_t *dtp, const char *name); +extern dt_symbol_t *dt_symbol_by_name_next(const dt_symbol_t *symbol); extern dt_symbol_t *dt_module_symbol_by_name(dtrace_hdl_t *dtp, struct dt_module *dmp, const char *name); extern dt_symbol_t *dt_symbol_by_addr(dt_symtab_t *symtab, GElf_Addr dts_addr); @@ -51,6 +52,11 @@ extern void dt_symbol_to_elfsym(dtrace_hdl_t *dtp, dt_symbol_t *symbol, GElf_Sym *elf_symp); extern struct dt_module *dt_symbol_module(dt_symbol_t *symbol); +extern void dt_symtab_set_traceable(dt_symtab_t *symtab); +extern int dt_symtab_traceable(const dt_symtab_t *symtab); +extern void dt_symbol_set_traceable(dt_symbol_t *symbol); +extern int dt_symbol_traceable(const dt_symbol_t *symbol); + #ifdef __cplusplus } #endif -- 2.45.2 From kris.van.hees at oracle.com Fri Mar 7 21:34:39 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Fri, 7 Mar 2025 16:34:39 -0500 Subject: [DTrace-devel] [PATCH 6/8] fbt: performance improvements In-Reply-To: <20250307213441.9495-1-kris.van.hees@oracle.com> References: <20250307213441.9495-1-kris.van.hees@oracle.com> Message-ID: <20250307213441.9495-5-kris.van.hees@oracle.com> Up until now, FBT probes were registered for every symbol that was listed as traceable. Most tracing sessions do not use most or even any of these, and the process of registering them all was quite slow. Going forward, FBT probes are registered on demand.
If any FBT probes are to be registered, the first will incur the cost of reading the entire list of traceable symbols. Any further FBT probe registration will be able to be satisfied based on that initial processing. The performance improvement is therefore quite significant for tracing sessions that do not trigger any FBT probe registration, and if FBT probes are used, the improvement is still quite noticeable because only the probes that are actually needed get registered. Signed-off-by: Kris Van Hees --- libdtrace/dt_module.c | 78 +++++++++++++++ libdtrace/dt_module.h | 2 + libdtrace/dt_prov_fbt.c | 217 +++++++++++++++++++++++++++------------- 3 files changed, 228 insertions(+), 69 deletions(-) diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c index 2e915e2f..e7553a07 100644 --- a/libdtrace/dt_module.c +++ b/libdtrace/dt_module.c @@ -22,6 +22,7 @@ #include #include +#include #include #include @@ -1044,6 +1045,83 @@ dt_kern_module_find_ctf(dtrace_hdl_t *dtp, dt_module_t *dmp) } } +#define PROBE_LIST TRACEFS "available_filter_functions" + +/* + * Determine which kernel functions are traceable and mark them. + */ +void +dt_modsym_mark_traceable(dtrace_hdl_t *dtp) +{ + FILE *f; + char *buf = NULL; + size_t len = 0; + + if (dt_symtab_traceable(dtp->dt_exec->dm_kernsyms)) + return; + + f = fopen(PROBE_LIST, "r"); + if (f == NULL) + return; + + while (getline(&buf, &len, f) >= 0) { + char *p; + dt_symbol_t *sym = NULL; + + /* + * Here buf is either "funcname\n" or "funcname [modname]\n". + * The last line may not have a linefeed. + */ + p = strchr(buf, '\n'); + if (p) { + *p = '\0'; + if (p > buf && *(--p) == ']') + *p = '\0'; + } + + /* + * Now buf is either "funcname" or "funcname [modname". If + * there is no module name provided, we will use the default. + */ + p = strchr(buf, ' '); + if (p) { + *p++ = '\0'; + if (*p == '[') + p++; + } + +#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0) + /* Weed out __ftrace_invalid_address___* entries.
*/ + if (strstarts(buf, "__ftrace_invalid_address__") || + strstarts(buf, "__probestub_") || + strstarts(buf, "__traceiter_")) + continue; +#undef strstarts + + /* + * If we have a module name, look for the symbol in that + * module. + * If not, perform a general symbol lookup to find its first + * instance. + */ + if (p) { + dt_module_t *dmp = dt_module_lookup_by_name(dtp, p); + + if (dmp) + sym = dt_module_symbol_by_name(dtp, dmp, buf); + } else + sym = dt_symbol_by_name(dtp, buf); + + if (sym) + dt_symbol_set_traceable(sym); + } + + free(buf); + fclose(f); + + dt_symtab_set_traceable(dtp->dt_exec->dm_kernsyms); +} + /* * Symbol data can be collected in three ways: * - kallmodsyms diff --git a/libdtrace/dt_module.h b/libdtrace/dt_module.h index 56df17a6..dd3ad17c 100644 --- a/libdtrace/dt_module.h +++ b/libdtrace/dt_module.h @@ -25,6 +25,8 @@ extern dt_ident_t *dt_module_extern(dtrace_hdl_t *, dt_module_t *, extern const char *dt_module_modelname(dt_module_t *); +extern void dt_modsym_mark_traceable(dtrace_hdl_t *); + #ifdef __cplusplus } #endif diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c index eef93879..d837e14d 100644 --- a/libdtrace/dt_prov_fbt.c +++ b/libdtrace/dt_prov_fbt.c @@ -41,10 +41,8 @@ #include "dt_pt_regs.h" static const char prvname[] = "fbt"; -static const char modname[] = "vmlinux"; #define KPROBE_EVENTS TRACEFS "kprobe_events" -#define PROBE_LIST TRACEFS "available_filter_functions" #define FBT_GROUP_FMT GROUP_FMT "_%s" #define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb @@ -61,19 +59,11 @@ dt_provimpl_t dt_fbt_fprobe; dt_provimpl_t dt_fbt_kprobe; /* - * Scan the PROBE_LIST file and add entry and return probes for every function - * that is listed. + * Create the fbt provider. */ static int populate(dtrace_hdl_t *dtp) { dt_provider_t *prv; - FILE *f; - char *buf = NULL; - char *p; - const char *mod = modname; - size_t n; - dtrace_syminfo_t sip; - dtrace_probedesc_t pd; dt_fbt = BPF_HAS(dtp, BPF_FEAT_FENTRY) ? 
dt_fbt_fprobe : dt_fbt_kprobe; @@ -81,79 +71,166 @@ static int populate(dtrace_hdl_t *dtp) if (prv == NULL) return -1; /* errno already set */ - f = fopen(PROBE_LIST, "r"); - if (f == NULL) + return 0; +} + +/* Create a probe (if it does not exist yet). */ +static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) +{ + dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv); + + if (prv == NULL) + return 0; + if (dt_probe_lookup(dtp, pdp) != NULL) return 0; + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) + return 1; - while (getline(&buf, &n, f) >= 0) { - /* - * Here buf is either "funcname\n" or "funcname [modname]\n". - * The last line may not have a linefeed. - */ - p = strchr(buf, '\n'); - if (p) { - *p = '\0'; - if (p > buf && *(--p) == ']') - *p = '\0'; + return 0; +} + +/* + * Try to provide probes for the given probe description. The caller ensures + * that the provider name in probe description (if any) is a match for this + * provider. When this is called, we already know that this provider matches + * the provider component of the probe specification. + */ +#define FBT_ENTRY 1 +#define FBT_RETURN 2 + +static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) +{ + int n = 0; + int prb = 0; + dt_module_t *dmp = NULL; + dt_symbol_t *sym = NULL; + dt_htab_next_t *it = NULL; + dtrace_probedesc_t pd; + + dt_modsym_mark_traceable(dtp); + + /* + * Nothing to do if a probe name is specified and cannot match 'entry' + * or 'return'. + */ + if (dt_gmatch("entry", pdp->prb)) + prb |= FBT_ENTRY; + if (dt_gmatch("return", pdp->prb)) + prb |= FBT_RETURN; + if (prb == 0) + return 0; + + /* Synthetic function names are not supported for FBT. */ + if (strchr(pdp->fun, '.')) + return 0; + + /* + * If we have an explicit module name, check it. If not found, we can + * ignore this request.
+ */ + if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) { + dmp = dt_module_lookup_by_name(dtp, pdp->mod); + if (dmp == NULL) + return 0; + } + + /* + * If we have an explicit function name, we start with a basic symbol + * name lookup. + */ + if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) { + /* If we have a module, use it. */ + if (dmp != NULL) { + sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun); + if (sym == NULL) + return 0; + if (!dt_symbol_traceable(sym)) + return 0; + + pd.id = DTRACE_IDNONE; + pd.prv = pdp->prv; + pd.mod = dmp->dm_name; + pd.fun = pdp->fun; + + if (prb & FBT_ENTRY) { + pd.prb = "entry"; + n += provide_probe(dtp, &pd); + } + if (prb & FBT_RETURN) { + pd.prb = "return"; + n += provide_probe(dtp, &pd); + } + + return n; } - /* - * Now buf is either "funcname" or "funcname [modname". If - * there is no module name provided, we will use the default. - */ - p = strchr(buf, ' '); - if (p) { - *p++ = '\0'; - if (*p == '[') - p++; + sym = dt_symbol_by_name(dtp, pdp->fun); + while (sym != NULL) { + const char *mod = dt_symbol_module(sym)->dm_name; + + if (dt_symbol_traceable(sym) && + dt_gmatch(mod, pdp->mod)) { + pd.id = DTRACE_IDNONE; + pd.prv = pdp->prv; + pd.mod = mod; + pd.fun = pdp->fun; + + if (prb & FBT_ENTRY) { + pd.prb = "entry"; + n += provide_probe(dtp, &pd); + } + if (prb & FBT_RETURN) { + pd.prb = "return"; + n += provide_probe(dtp, &pd); + } + + } + sym = dt_symbol_by_name_next(sym); } - /* Weed out synthetic symbol names (that are invalid). */ - if (strchr(buf, '.') != NULL) + return n; + } + + /* + * No explicit function name. We need to go through all possible + * symbol names and see if they match. + */ + while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) { + dt_module_t *smp; + const char *fun; + + /* Ensure the symbol can be traced. */ + if (!dt_symbol_traceable(sym)) continue; -#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0) - /* Weed out __ftrace_invalid_address___* entries. 
*/ - if (strstarts(buf, "__ftrace_invalid_address__") || - strstarts(buf, "__probestub_") || - strstarts(buf, "__traceiter_")) + /* Match the function name. */ + fun = dt_symbol_name(sym); + if (!dt_gmatch(fun, pdp->fun)) continue; -#undef strstarts - /* - * If we did not see a module name, perform a symbol lookup to - * try to determine the module name. - */ - if (!p) { - if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf, - NULL, &sip) == 0) - mod = sip.object; - } else - mod = p; + /* Validate the module name. */ + smp = dt_symbol_module(sym); + if (dmp) { + if (smp != dmp) + continue; + } else if (!dt_gmatch(smp->dm_name, pdp->mod)) + continue; - /* - * Due to the lack of module names in - * TRACEFS/available_filter_functions, there are some duplicate - * function names. We need to make sure that we do not create - * duplicate probes for these. - */ pd.id = DTRACE_IDNONE; - pd.prv = prvname; - pd.mod = mod; - pd.fun = buf; - pd.prb = "entry"; - if (dt_probe_lookup(dtp, &pd) != NULL) - continue; + pd.prv = pdp->prv; + pd.mod = smp->dm_name; + pd.fun = fun; - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "entry")) - n++; - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "return")) - n++; + if (prb & FBT_ENTRY) { + pd.prb = "entry"; + n += provide_probe(dtp, &pd); + } + if (prb & FBT_RETURN) { + pd.prb = "return"; + n += provide_probe(dtp, &pd); + } } - free(buf); - fclose(f); - return n; } @@ -447,6 +524,7 @@ dt_provimpl_t dt_fbt_fprobe = { .prog_type = BPF_PROG_TYPE_TRACING, .stack_skip = 4, .populate = &populate, + .provide = &provide, .load_prog = &fprobe_prog_load, .trampoline = &fprobe_trampoline, .attach = &dt_tp_probe_attach_raw, @@ -459,6 +537,7 @@ dt_provimpl_t dt_fbt_kprobe = { .name = prvname, .prog_type = BPF_PROG_TYPE_KPROBE, .populate = &populate, + .provide = &provide, .load_prog = &dt_bpf_prog_load, .trampoline = &kprobe_trampoline, .attach = &kprobe_attach, -- 2.45.2 From kris.van.hees at oracle.com Fri Mar 7 21:34:40 2025 From: 
kris.van.hees at oracle.com (Kris Van Hees) Date: Fri, 7 Mar 2025 16:34:40 -0500 Subject: [DTrace-devel] [PATCH 7/8] rawfbt: performance improvements In-Reply-To: <20250307213441.9495-1-kris.van.hees@oracle.com> References: <20250307213441.9495-1-kris.van.hees@oracle.com> Message-ID: <20250307213441.9495-6-kris.van.hees@oracle.com> Signed-off-by: Kris Van Hees --- libdtrace/dt_prov_rawfbt.c | 223 +++++++++++++++++++++++++------------ 1 file changed, 151 insertions(+), 72 deletions(-) diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c index 62f2f4f0..52152655 100644 --- a/libdtrace/dt_prov_rawfbt.c +++ b/libdtrace/dt_prov_rawfbt.c @@ -44,10 +44,8 @@ #include "dt_pt_regs.h" static const char prvname[] = "rawfbt"; -static const char modname[] = "vmlinux"; #define KPROBE_EVENTS TRACEFS "kprobe_events" -#define PROBE_LIST TRACEFS "available_filter_functions" #define FBT_GROUP_FMT GROUP_FMT "_%s" #define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb @@ -61,98 +59,178 @@ static const dtrace_pattr_t pattr = { }; /* - * Scan the PROBE_LIST file and add entry and return probes for every function - * that is listed. + * Create the rawfbt provider. */ static int populate(dtrace_hdl_t *dtp) { dt_provider_t *prv; - FILE *f; - char *buf = NULL; - size_t len = 0; - size_t n = 0; - dtrace_syminfo_t sip; - dtrace_probedesc_t pd; prv = dt_provider_create(dtp, prvname, &dt_rawfbt, &pattr, NULL); if (prv == NULL) return -1; /* errno already set */ - f = fopen(PROBE_LIST, "r"); - if (f == NULL) + return 0; +} + +/* Create a probe (if it does not exist yet). 
*/ +static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) +{ + dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv); + + if (prv == NULL) return 0; + if (dt_probe_lookup(dtp, pdp) != NULL) + return 0; +#ifdef DEBUG_FBT + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) { + fprintf(stderr, "%s(..., PROVIDE %s:%s:%s:%s) - ...\n", __func__, pdp->prv, pdp->mod, pdp->fun, pdp->prb); + return 1; + } +#else + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) + return 1; +#endif - while (getline(&buf, &len, f) >= 0) { - char *p; - const char *mod = modname; - dt_probe_t *prp; + return 0; +} - /* - * Here buf is either "funcname\n" or "funcname [modname]\n". - * The last line may not have a linefeed. - */ - p = strchr(buf, '\n'); - if (p) { - *p = '\0'; - if (p > buf && *(--p) == ']') - *p = '\0'; +/* + * Try to provide probes for the given probe description. The caller ensures + * that the provider name in probe description (if any) is a match for this + * provider. When this is called, we already know that this provider matches + * the provider component of the probe specification. + */ +#define FBT_ENTRY 1 +#define FBT_RETURN 2 + +static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) +{ + int n = 0; + int prb = 0; + dt_module_t *dmp = NULL; + dt_symbol_t *sym = NULL; + dt_htab_next_t *it = NULL; + dtrace_probedesc_t pd; + + dt_modsym_mark_traceable(dtp); + + /* + * Nothing to do if a probe name is specified and cannot match 'entry' + * or 'return'. + */ + if (dt_gmatch("entry", pdp->prb)) + prb |= FBT_ENTRY; + if (dt_gmatch("return", pdp->prb)) + prb |= FBT_RETURN; + if (prb == 0) + return 0; + + /* + * If we have an explicit module name, check it. If not found, we can + * ignore this request.
+ */ + if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) { + dmp = dt_module_lookup_by_name(dtp, pdp->mod); + if (dmp == NULL) + return 0; + } + + /* + * If we have an explicit function name, we start with a basic symbol + * name lookup. + */ + if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) { + /* If we have a module, use it. */ + if (dmp != NULL) { + sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun); + if (sym == NULL) + return 0; + if (!dt_symbol_traceable(sym)) + return 0; + + pd.id = DTRACE_IDNONE; + pd.prv = pdp->prv; + pd.mod = dmp->dm_name; + pd.fun = pdp->fun; + + if (prb & FBT_ENTRY) { + pd.prb = "entry"; + n += provide_probe(dtp, &pd); + } + if (prb & FBT_RETURN) { + pd.prb = "return"; + n += provide_probe(dtp, &pd); + } + + return n; } - /* - * Now buf is either "funcname" or "funcname [modname". If - * there is no module name provided, we will use the default. - */ - p = strchr(buf, ' '); - if (p) { - *p++ = '\0'; - if (*p == '[') - p++; + sym = dt_symbol_by_name(dtp, pdp->fun); + while (sym != NULL) { + const char *mod = dt_symbol_module(sym)->dm_name; + + if (dt_symbol_traceable(sym) && + dt_gmatch(mod, pdp->mod)) { + pd.id = DTRACE_IDNONE; + pd.prv = pdp->prv; + pd.mod = mod; + pd.fun = pdp->fun; + + if (prb & FBT_ENTRY) { + pd.prb = "entry"; + n += provide_probe(dtp, &pd); + } + if (prb & FBT_RETURN) { + pd.prb = "return"; + n += provide_probe(dtp, &pd); + } + + } + sym = dt_symbol_by_name_next(sym); } -#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0) - /* Weed out __ftrace_invalid_address___* entries. */ - if (strstarts(buf, "__ftrace_invalid_address__") || - strstarts(buf, "__probestub_") || - strstarts(buf, "__traceiter_")) + return n; + } + + /* + * No explicit function name. We need to go through all possible + * symbol names and see if they match. + */ + while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) { + dt_module_t *smp; + const char *fun; + + /* Ensure the symbol can be traced. 
*/ + if (!dt_symbol_traceable(sym)) continue; -#undef strstarts - /* - * If we did not see a module name, perform a symbol lookup to - * try to determine the module name. - */ - if (!p) { - if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf, - NULL, &sip) == 0) - mod = sip.object; - } else - mod = p; + /* Match the function name. */ + fun = dt_symbol_name(sym); + if (!dt_gmatch(fun, pdp->fun)) + continue; - /* - * Due to the lack of module names in - * TRACEFS/available_filter_functions, there are some duplicate - * function names. The kernel does not let us trace functions - * that have duplicates, so we need to remove the existing one. - */ - pd.id = DTRACE_IDNONE; - pd.prv = prvname; - pd.mod = mod; - pd.fun = buf; - pd.prb = "entry"; - prp = dt_probe_lookup(dtp, &pd); - if (prp != NULL) { - dt_probe_destroy(prp); + /* Validate the module name. */ + smp = dt_symbol_module(sym); + if (dmp) { + if (smp != dmp) + continue; + } else if (!dt_gmatch(smp->dm_name, pdp->mod)) continue; - } - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "entry")) - n++; - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "return")) - n++; - } + pd.id = DTRACE_IDNONE; + pd.prv = pdp->prv; + pd.mod = smp->dm_name; + pd.fun = fun; - free(buf); - fclose(f); + if (prb & FBT_ENTRY) { + pd.prb = "entry"; + n += provide_probe(dtp, &pd); + } + if (prb & FBT_RETURN) { + pd.prb = "return"; + n += provide_probe(dtp, &pd); + } + } return n; } @@ -306,6 +384,7 @@ dt_provimpl_t dt_rawfbt = { .name = prvname, .prog_type = BPF_PROG_TYPE_KPROBE, .populate = &populate, + .provide = &provide, .load_prog = &dt_bpf_prog_load, .trampoline = &trampoline, .attach = &attach, -- 2.45.2 From kris.van.hees at oracle.com Fri Mar 7 21:34:41 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Fri, 7 Mar 2025 16:34:41 -0500 Subject: [DTrace-devel] [PATCH 8/8] fbt, rawfbt: consolidate code to avoid duplication In-Reply-To: <20250307213441.9495-1-kris.van.hees@oracle.com> References: 
<20250307213441.9495-1-kris.van.hees@oracle.com> Message-ID: <20250307213441.9495-7-kris.van.hees@oracle.com> After optimizing both fbt and rawfbt providers, the resulting code has a significant amount of duplication. The rawfbt provider can now be defined in terms of the kprobe-based fbt provider functions. Signed-off-by: Kris Van Hees --- libdtrace/Build | 1 - libdtrace/dt_prov_fbt.c | 131 +++++++++---- libdtrace/dt_prov_rawfbt.c | 393 ------------------------------------- 3 files changed, 96 insertions(+), 429 deletions(-) delete mode 100644 libdtrace/dt_prov_rawfbt.c diff --git a/libdtrace/Build b/libdtrace/Build index 51e0f078..7e6e8a38 100644 --- a/libdtrace/Build +++ b/libdtrace/Build @@ -55,7 +55,6 @@ libdtrace-build_SOURCES = dt_aggregate.c \ dt_prov_lockstat.c \ dt_prov_proc.c \ dt_prov_profile.c \ - dt_prov_rawfbt.c \ dt_prov_rawtp.c \ dt_prov_sched.c \ dt_prov_sdt.c \ diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c index d837e14d..93ed270e 100644 --- a/libdtrace/dt_prov_fbt.c +++ b/libdtrace/dt_prov_fbt.c @@ -6,17 +6,26 @@ * * The Function Boundary Tracing (FBT) provider for DTrace. * - * FBT probes are exposed by the kernel as kprobes. They are listed in the - * TRACEFS/available_filter_functions file. Some kprobes are associated with - * a specific kernel module, while most are in the core kernel. + * Kernel functions can be traced through fentry/fexit probes (when available) + * and kprobes. The FBT provider supports both implementations and will use + * fentry/fexit probes if the kernel supports them, and fall back to kprobes + * otherwise. The FBT provider does not support tracing synthetic functions + * (i.e. compiler-generated functions with a . in their name). + * + * The rawfbt provider implements a variant of the FBT provider and always uses + * kprobes. This provider allows tracing of synthetic functions.
* * Mapping from event name to DTrace probe name: * * fbt:vmlinux::entry * fbt:vmlinux::return + * rawfbt:vmlinux::entry + * rawfbt:vmlinux::return * or * [] fbt:::entry * fbt:::return + * rawfbt:::entry + * rawfbt:::return */ #include #include @@ -57,18 +66,19 @@ static const dtrace_pattr_t pattr = { dt_provimpl_t dt_fbt_fprobe; dt_provimpl_t dt_fbt_kprobe; +dt_provimpl_t dt_rawfbt; /* - * Create the fbt provider. + * Create the fbt and rawfbt providers. */ static int populate(dtrace_hdl_t *dtp) { - dt_provider_t *prv; - dt_fbt = BPF_HAS(dtp, BPF_FEAT_FENTRY) ? dt_fbt_fprobe : dt_fbt_kprobe; - prv = dt_provider_create(dtp, prvname, &dt_fbt, &pattr, NULL); - if (prv == NULL) + if (dt_provider_create(dtp, dt_fbt.name, &dt_fbt, &pattr, + NULL) == NULL || + dt_provider_create(dtp, dt_rawfbt.name, &dt_rawfbt, &pattr, + NULL) == NULL) return -1; /* errno already set */ return 0; @@ -107,8 +117,6 @@ static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) dt_htab_next_t *it = NULL; dtrace_probedesc_t pd; - dt_modsym_mark_traceable(dtp); - /* * Nothing to do if a probe name is specified and cannot match 'entry' * or 'return'. @@ -120,8 +128,11 @@ static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) if (prb == 0) return 0; - /* Synthetic function names are not supported for FBT. */ - if (strchr(pdp->fun, '.')) + /* + * Unless we are dealing with a rawfbt probe, synthetic functions are + * not supported. + */ + if (strcmp(pdp->prv, dt_rawfbt.name) != 0 && strchr(pdp->fun, '.')) return 0; /* @@ -134,6 +145,14 @@ static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) return 0; } + /* + * Ensure that kernel symbols that are FBT-traceable are marked as + * such. We don't do this earlier in this function so that the + * preceding tests have the greatest opportunity to avoid doing this + * unnecessarily. + */ + dt_modsym_mark_traceable(dtp); + /* * If we have an explicit function name, we start with a basic symbol * name lookup. 
@@ -396,12 +415,12 @@ static int fprobe_prog_load(dtrace_hdl_t *dtp, const dt_probe_t *prp, \*******************************/ /* - * Generate a BPF trampoline for a FBT probe. + * Generate a BPF trampoline for a FBT (or rawfbt) probe. * * The trampoline function is called when a FBT probe triggers, and it must * satisfy the following prototype: * - * int dt_fbt(dt_pt_regs *regs) + * int dt_(raw)fbt(dt_pt_regs *regs) * * The trampoline will populate a dt_dctx_t struct and then call the function * that implements the compiled D clause. It returns 0 to the caller. @@ -422,7 +441,7 @@ static int kprobe_trampoline(dt_pcb_t *pcb, uint_t exitlbl) dt_cg_tramp_copy_rval_from_regs(pcb); /* - * fbt:::return arg0 should be the function offset for + * (raw)fbt:::return arg0 should be the function offset for * return instruction. Since we use kretprobes, however, * which do not fire until the function has returned to * its caller, information about the returning instruction @@ -441,11 +460,28 @@ static int kprobe_trampoline(dt_pcb_t *pcb, uint_t exitlbl) static int kprobe_attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd) { + const char *fun = prp->desc->fun; + char *tpn = (char *)fun; + int rc = -1; + if (!dt_tp_probe_has_info(prp)) { char *fn; FILE *f; - size_t len; - int fd, rc = -1; + int fd; + + /* + * For rawfbt probes, we need to apply a . -> _ conversion to + * ensure the tracepoint name is valid. + */ + if (strcmp(prp->desc->prv, dt_rawfbt.name) == 0) { + char *p; + + tpn = strdup(fun); + for (p = tpn; *p; p++) { + if (*p == '.') + *p = '_'; + } + } /* * Register the kprobe with the tracing subsystem. This will @@ -453,41 +489,42 @@ static int kprobe_attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd) */ fd = open(KPROBE_EVENTS, O_WRONLY | O_APPEND); if (fd == -1) - return -ENOENT; + goto out; rc = dprintf(fd, "%c:" FBT_GROUP_FMT "/%s %s\n", prp->desc->prb[0] == 'e' ? 
'p' : 'r', - FBT_GROUP_DATA, prp->desc->fun, prp->desc->fun); + FBT_GROUP_DATA, tpn, fun); close(fd); if (rc == -1) - return -ENOENT; + goto out; /* create format file name */ - len = snprintf(NULL, 0, "%s" FBT_GROUP_FMT "/%s/format", - EVENTSFS, FBT_GROUP_DATA, prp->desc->fun) + 1; - fn = dt_alloc(dtp, len); - if (fn == NULL) - return -ENOENT; - - snprintf(fn, len, "%s" FBT_GROUP_FMT "/%s/format", EVENTSFS, - FBT_GROUP_DATA, prp->desc->fun); + if (asprintf(&fn, "%s" FBT_GROUP_FMT "/%s/format", EVENTSFS, + FBT_GROUP_DATA, tpn) == -1) + goto out; /* open format file */ f = fopen(fn, "r"); - dt_free(dtp, fn); + free(fn); if (f == NULL) - return -ENOENT; + goto out; /* read event id from format file */ rc = dt_tp_probe_info(dtp, f, 0, prp, NULL, NULL); fclose(f); if (rc < 0) - return -ENOENT; + goto out; } /* attach BPF program to the probe */ - return dt_tp_probe_attach(dtp, prp, bpf_fd); + rc = dt_tp_probe_attach(dtp, prp, bpf_fd); + +out: + if (tpn != prp->desc->fun) + free(tpn); + + return rc == -1 ? 
-ENOENT : rc; } /* @@ -503,7 +540,8 @@ static int kprobe_attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd) */ static void kprobe_detach(dtrace_hdl_t *dtp, const dt_probe_t *prp) { - int fd; + int fd; + char *tpn = (char *)prp->desc->fun; if (!dt_tp_probe_has_info(prp)) return; @@ -514,9 +552,20 @@ static void kprobe_detach(dtrace_hdl_t *dtp, const dt_probe_t *prp) if (fd == -1) return; - dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA, - prp->desc->fun); + if (strcmp(prp->desc->prv, dt_rawfbt.name) == 0) { + char *p; + + for (p = tpn; *p; p++) { + if (*p == '.') + *p = '_'; + } + } + + dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA, tpn); close(fd); + + if (tpn != prp->desc->fun) + free(tpn); } dt_provimpl_t dt_fbt_fprobe = { @@ -549,3 +598,15 @@ dt_provimpl_t dt_fbt = { .name = prvname, .populate = &populate, }; + +dt_provimpl_t dt_rawfbt = { + .name = "rawfbt", + .prog_type = BPF_PROG_TYPE_KPROBE, + .populate = &populate, + .provide = &provide, + .load_prog = &dt_bpf_prog_load, + .trampoline = &kprobe_trampoline, + .attach = &kprobe_attach, + .detach = &kprobe_detach, + .probe_destroy = &dt_tp_probe_destroy, +}; diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c deleted file mode 100644 index 52152655..00000000 --- a/libdtrace/dt_prov_rawfbt.c +++ /dev/null @@ -1,393 +0,0 @@ -/* - * Oracle Linux DTrace. - * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. - * Licensed under the Universal Permissive License v 1.0 as shown at - * http://oss.oracle.com/licenses/upl. - * - * The Raw Function Boundary Tracing provider for DTrace. - * - * The kernel provides kprobes to trace specific symbols. They are listed in - * the TRACEFS/available_filter_functions file. Kprobes may be associated with - * a symbol in the core kernel or with a symbol in a specific kernel module. - * Whereas the fbt provider supports tracing regular symbols only, the rawfbt - * provider also provides access to synthetic symbols, i.e. 
symbols created by - * compiler optimizations. - * - * Mapping from event name to DTrace probe name: - * - * rawfbt:vmlinux::entry - * rawfbt:vmlinux::return - * or - * [] rawfbt:::entry - * rawfbt:::return - */ -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -#include "dt_btf.h" -#include "dt_dctx.h" -#include "dt_cg.h" -#include "dt_module.h" -#include "dt_provider_tp.h" -#include "dt_probe.h" -#include "dt_pt_regs.h" - -static const char prvname[] = "rawfbt"; - -#define KPROBE_EVENTS TRACEFS "kprobe_events" - -#define FBT_GROUP_FMT GROUP_FMT "_%s" -#define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb - -static const dtrace_pattr_t pattr = { -{ DTRACE_STABILITY_EVOLVING, DTRACE_STABILITY_EVOLVING, DTRACE_CLASS_COMMON }, -{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_UNKNOWN }, -{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_ISA }, -{ DTRACE_STABILITY_EVOLVING, DTRACE_STABILITY_EVOLVING, DTRACE_CLASS_COMMON }, -{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_ISA }, -}; - -/* - * Create the rawfbt provider. - */ -static int populate(dtrace_hdl_t *dtp) -{ - dt_provider_t *prv; - - prv = dt_provider_create(dtp, prvname, &dt_rawfbt, &pattr, NULL); - if (prv == NULL) - return -1; /* errno already set */ - - return 0; -} - -/* Create a probe (if it does not exist yet). 
*/ -static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) -{ - dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv); - - if (prv == NULL) - return 0; - if (dt_probe_lookup(dtp, pdp) != NULL) - return 0; -#ifdef DEBUG_FBT - if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) { - fprintf(stderr, "%s(..., PROVIDE %s:%s:%s:%s) - ...\n", __func__, pdp->prv, pdp->mod, pdp->fun, pdp->prb); - return 1; - } -#else - if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) - return 1; -#endif - - return 0; -} - -/* - * Try to provide probes for the given probe description. The caller ensures - * that the provider name in probe desxcription (if any) is a match for this - * provider. When this is called, we already know that this provider matches - * the provider component of the probe specification. - */ -#define FBT_ENTRY 1 -#define FBT_RETURN 2 - -static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) -{ - int n = 0; - int prb = 0; - dt_module_t *dmp = NULL; - dt_symbol_t *sym = NULL; - dt_htab_next_t *it = NULL; - dtrace_probedesc_t pd; - - dt_modsym_mark_traceable(dtp); - - /* - * Nothing to do if a probe name is specified and cannot match 'entry' - * or 'return'. - */ - if (dt_gmatch("entry", pdp->prb)) - prb |= FBT_ENTRY; - if (dt_gmatch("return", pdp->prb)) - prb |= FBT_RETURN; - if (prb == 0) - return 0; - - /* - * If we have an explicit module name, check it. If not found, we can - * ignore this request. - */ - if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) { - dmp = dt_module_lookup_by_name(dtp, pdp->mod); - if (dmp == NULL) - return 0; - } - - /* - * If we have an explicit function name, we start with a basic symbol - * name lookup. - */ - if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) { - /* If we have a module, use it. 
*/ - if (dmp != NULL) { - sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun); - if (sym == NULL) - return 0; - if (!dt_symbol_traceable(sym)) - return 0; - - pd.id = DTRACE_IDNONE; - pd.prv = pdp->prv; - pd.mod = dmp->dm_name; - pd.fun = pdp->fun; - - if (prb & FBT_ENTRY) { - pd.prb = "entry"; - n += provide_probe(dtp, &pd); - } - if (prb & FBT_RETURN) { - pd.prb = "return"; - n += provide_probe(dtp, &pd); - } - - return n; - } - - sym = dt_symbol_by_name(dtp, pdp->fun); - while (sym != NULL) { - const char *mod = dt_symbol_module(sym)->dm_name; - - if (dt_symbol_traceable(sym) && - dt_gmatch(mod, pdp->mod)) { - pd.id = DTRACE_IDNONE; - pd.prv = pdp->prv; - pd.mod = mod; - pd.fun = pdp->fun; - - if (prb & FBT_ENTRY) { - pd.prb = "entry"; - n += provide_probe(dtp, &pd); - } - if (prb & FBT_RETURN) { - pd.prb = "return"; - n += provide_probe(dtp, &pd); - } - - } - sym = dt_symbol_by_name_next(sym); - } - - return n; - } - - /* - * No explicit function name. We need to go through all possible - * symbol names and see if they match. - */ - while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) { - dt_module_t *smp; - const char *fun; - - /* Ensure the symbol can be traced. */ - if (!dt_symbol_traceable(sym)) - continue; - - /* Match the function name. */ - fun = dt_symbol_name(sym); - if (!dt_gmatch(fun, pdp->fun)) - continue; - - /* Validate the module name. */ - smp = dt_symbol_module(sym); - if (dmp) { - if (smp != dmp) - continue; - } else if (!dt_gmatch(smp->dm_name, pdp->mod)) - continue; - - pd.id = DTRACE_IDNONE; - pd.prv = pdp->prv; - pd.mod = smp->dm_name; - pd.fun = fun; - - if (prb & FBT_ENTRY) { - pd.prb = "entry"; - n += provide_probe(dtp, &pd); - } - if (prb & FBT_RETURN) { - pd.prb = "return"; - n += provide_probe(dtp, &pd); - } - } - - return n; -} - -/* - * Generate a BPF trampoline for a FBT probe. 
- * - * The trampoline function is called when a FBT probe triggers, and it must - * satisfy the following prototype: - * - * int dt_rawfbt(dt_pt_regs *regs) - * - * The trampoline will populate a dt_dctx_t struct and then call the function - * that implements the compiled D clause. It returns 0 to the caller. - */ -static int trampoline(dt_pcb_t *pcb, uint_t exitlbl) -{ - dt_cg_tramp_prologue(pcb); - - /* - * After the dt_cg_tramp_prologue() call, we have: - * // (%r7 = dctx->mst) - * // (%r8 = dctx->ctx) - */ - dt_cg_tramp_copy_regs(pcb); - if (strcmp(pcb->pcb_probe->desc->prb, "return") == 0) { - dt_irlist_t *dlp = &pcb->pcb_ir; - - dt_cg_tramp_copy_rval_from_regs(pcb); - - /* - * fbt:::return arg0 should be the function offset for - * return instruction. Since we use kretprobes, however, - * which do not fire until the function has returned to - * its caller, information about the returning instruction - * in the callee has been lost. - * - * Set arg0=-1 to indicate that we do not know the value. - */ - dt_cg_xsetx(dlp, NULL, DT_LBL_NONE, BPF_REG_0, -1); - emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_0)); - } else - dt_cg_tramp_copy_args_from_regs(pcb, 1); - dt_cg_tramp_epilogue(pcb); - - return 0; -} - -static int attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd) -{ - char *prb = NULL; - - if (!dt_tp_probe_has_info(prp)) { - char *fn, *p; - FILE *f; - int fd, rc = -1; - - /* - * The tracepoint event we will be creating needs to have a - * valid name. We use a copy of the probe name, with . -> _ - * conversion. - */ - prb = strdup(prp->desc->fun); - for (p = prb; *p; p++) { - if (*p == '.') - *p = '_'; - } - - /* - * Register the kprobe with the tracing subsystem. This will - * create a tracepoint event. - */ - fd = open(KPROBE_EVENTS, O_WRONLY | O_APPEND); - if (fd == -1) - goto fail; - - rc = dprintf(fd, "%c:" FBT_GROUP_FMT "/%s %s\n", - prp->desc->prb[0] == 'e' ? 
'p' : 'r', - FBT_GROUP_DATA, prb, prp->desc->fun); - close(fd); - if (rc == -1) - goto fail; - - /* create format file name */ - if (asprintf(&fn, "%s" FBT_GROUP_FMT "/%s/format", EVENTSFS, - FBT_GROUP_DATA, prb) == -1) - goto fail; - - /* open format file */ - f = fopen(fn, "r"); - free(fn); - if (f == NULL) - goto fail; - - /* read event id from format file */ - rc = dt_tp_probe_info(dtp, f, 0, prp, NULL, NULL); - fclose(f); - - if (rc < 0) - goto fail; - - free(prb); - } - - /* attach BPF program to the probe */ - return dt_tp_probe_attach(dtp, prp, bpf_fd); - -fail: - free(prb); - return -ENOENT; -} - -/* - * Try to clean up system resources that may have been allocated for this - * probe. - * - * If there is an event FD, we close it. - * - * We also try to remove any kprobe that may have been created for the probe. - * This is harmless for probes that didn't get created. If the removal fails - * for some reason we are out of luck - fortunately it is not harmful to the - * system as a whole. - */ -static void detach(dtrace_hdl_t *dtp, const dt_probe_t *prp) -{ - int fd; - char *prb, *p; - - if (!dt_tp_probe_has_info(prp)) - return; - - dt_tp_probe_detach(dtp, prp); - - fd = open(KPROBE_EVENTS, O_WRONLY | O_APPEND); - if (fd == -1) - return; - - /* The tracepoint event is the probe nam, with . -> _ conversion. 
*/
-	prb = strdup(prp->desc->fun);
-	for (p = prb; *p; p++) {
-		if (*p == '.')
-			*p = '_';
-	}
-
-	dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA, prb);
-	free(prb);
-	close(fd);
-}
-
-dt_provimpl_t dt_rawfbt = {
-	.name = prvname,
-	.prog_type = BPF_PROG_TYPE_KPROBE,
-	.populate = &populate,
-	.provide = &provide,
-	.load_prog = &dt_bpf_prog_load,
-	.trampoline = &trampoline,
-	.attach = &attach,
-	.detach = &detach,
-	.probe_destroy = &dt_tp_probe_destroy,
-};
-- 
2.45.2

From eugene.loh at oracle.com Mon Mar 10 21:47:58 2025
From: eugene.loh at oracle.com (Eugene Loh)
Date: Mon, 10 Mar 2025 17:47:58 -0400
Subject: [DTrace-devel] [PATCH 1/8] proc: convert to use standard SDT provider implementation
In-Reply-To: <20250307213320.9439-1-kris.van.hees@oracle.com>
References: <20250307213320.9439-1-kris.van.hees@oracle.com>
Message-ID:

Reviewed-by: Eugene Loh
albeit a few minor comments on the commit message.

On 3/7/25 16:33, Kris Van Hees via DTrace-devel wrote:
> The prov provider was the first SDT-based provider implememted in this

s/prov /proc /
s/implememted/implemented/

> version, and therefore handled the enabling of probes with custom code.

"this version" seems rather ambiguous.  version of what?  How about
s/this version/this port of DTrace to eBPF/
or something?

> When the other SDT-based providers (sched, ...) were implemented, a
> generic SDT-framework was developed.
>
> The proc provider now uses the SDT-framework.

"Now" sometimes means "up until this patch."  How about using the Linux
convention of describing changes "in imperative mood."  E.g., "Switch
the proc provider over to the SDT framework."
> Signed-off-by: Kris Van Hees > --- > libdtrace/dt_prov_proc.c | 316 +++++---------------------------------- > 1 file changed, 36 insertions(+), 280 deletions(-) > > diff --git a/libdtrace/dt_prov_proc.c b/libdtrace/dt_prov_proc.c > index 2e514860..15fde6c9 100644 > --- a/libdtrace/dt_prov_proc.c > +++ b/libdtrace/dt_prov_proc.c > @@ -8,77 +8,45 @@ > */ > #include > #include > -#include > -#include > -#include > -#include > -#include > -#include > -#include > -#include > -#include > - > -#include > > #include "dt_dctx.h" > #include "dt_cg.h" > +#include "dt_provider_sdt.h" > #include "dt_probe.h" > -#include "dt_pt_regs.h" > > static const char prvname[] = "proc"; > static const char modname[] = "vmlinux"; > > -/* > - * The proc-provider probes make use of probes that are already provided by > - * other providers. As such, the proc probes are 'dependent probes' because > - * they depend on underlying probes to get triggered and they also depend on > - * argument data provided by the underlying probe to manufacture their own > - * arguments. > - * > - * As a type of SDT probes, proc probes are defined with a signature (list of > - * arguments - possibly empty) that may use translator support to provide the > - * actual argument values. Therefore, obtaining the value of arguments for > - * a proc probe goes through two layers of processing: > - * > - * (1) the arguments of the underlying probe are reworked to match the > - * expected layout of raw arguments for the proc probe > - * (2) an argument mapping table (and supporting translators) is used to get > - * the value of an arguument based on the raw variable data of the proc > - * probe > - * > - * To accomplish this, proc probes generate a trampoline that rewrites the > - * arguments of the underlying probe. 
(The dependent probe support code in the > - * underlying probe saves the arguments of the underying probe in the mstate > - * before executing the trampoline and clauses of the dependent probe, and it > - * restores them afterwards in case there are multiple dependent probes.) > - * > - * Because proc probes dependent on an underlying probe that may be too generic > - * (e.g. proc:::exec-success depending on syscall::execve*:return), the > - * trampoline code can include a pre-condition (much like a predicate) that can > - * bypass execution unless the condition is met (e.g. proc:::exec-success > - * requires syscall::execve*:return's arg1 to be 0). > - * > - * FIXME: > - * The dependent probe support should include a priority specification to drive > - * the order in which dependent probes are added to the underlying probe. This > - * is needed to enforce specific probe firing semantics (e.g. proc:::start must > - * always precede proc:::lwp-start). > - */ > - > -typedef struct probe_arg { > - const char *name; /* name of probe */ > - int argno; /* argument number */ > - dt_argdesc_t argdesc; /* argument description */ > -} probe_arg_t; > +static probe_dep_t probes[] = { > + { "create", > + DTRACE_PROBESPEC_NAME, "rawtp:sched::sched_process_fork" }, > + { "exec", > + DTRACE_PROBESPEC_NAME, "syscall::execve*:entry" }, > + { "exec-failure", > + DTRACE_PROBESPEC_NAME, "syscall::execve*:return" }, > + { "exec-success", > + DTRACE_PROBESPEC_NAME, "syscall::execve*:return" }, > + { "exit", > + DTRACE_PROBESPEC_NAME, "rawtp:sched::sched_process_exit" }, > + { "lwp-create", > + DTRACE_PROBESPEC_NAME, "rawtp:sched::sched_process_fork" }, > + { "lwp-exit", > + DTRACE_PROBESPEC_NAME, "rawtp:sched::sched_process_exit" }, > + { "lwp-start", > + DTRACE_PROBESPEC_NAME, "fbt::schedule_tail:return" }, > + { "signal-clear", > + DTRACE_PROBESPEC_NAME, "syscall::rt_sigtimedwait:return" }, > + { "signal-discard", > + DTRACE_PROBESPEC_NAME, "rawtp:signal::signal_generate" }, > + { 
"signal-handle", > + DTRACE_PROBESPEC_NAME, "rawtp:signal::signal_deliver" }, > + { "signal-send", > + DTRACE_PROBESPEC_NAME, "fbt::complete_signal:entry" }, > + { "start", > + DTRACE_PROBESPEC_NAME, "fbt::schedule_tail:return" }, > + { NULL, } > +}; > > -/* > - * Probe signature specifications > - * > - * This table *must* group the arguments of probes. I.e. the arguments of a > - * given probe must be listed in consecutive records. > - * A single probe entry that mentions only name of the probe indicates a probe > - * that provides no arguments. > - */ > static probe_arg_t probe_args[] = { > { "create", 0, { 0, 0, "struct task_struct *", "psinfo_t *" } }, > { "exec", 0, { 0, DT_NF_USERLAND, "string", } }, > @@ -100,6 +68,7 @@ static probe_arg_t probe_args[] = { > { "signal-send", 1, { 0, 0, "struct task_struct *", "psinfo_t *" } }, > { "signal-send", 2, { 1, 0, "int", } }, > { "start", }, > + { NULL, }, > }; > > static const dtrace_pattr_t pattr = { > @@ -115,173 +84,8 @@ static const dtrace_pattr_t pattr = { > */ > static int populate(dtrace_hdl_t *dtp) > { > - dt_provider_t *prv; > - int i; > - int n = 0; > - > - prv = dt_provider_create(dtp, prvname, &dt_proc, &pattr, NULL); > - if (prv == NULL) > - return -1; /* errno already set */ > - > - /* > - * Create "proc" probes based on the probe_args list. Since each probe > - * will have at least one entry (with argno == 0), we can use those > - * entries to identify the probe names. 
> - */ > - for (i = 0; i < ARRAY_SIZE(probe_args); i++) { > - probe_arg_t *arg = &probe_args[i]; > - > - if (arg->argno == 0 && > - dt_probe_insert(dtp, prv, prvname, modname, "", arg->name, > - NULL)) > - n++; > - } > - > - return n; > -} > - > -static void enable(dtrace_hdl_t *dtp, dt_probe_t *prp) > -{ > - dt_probe_t *uprp = NULL; > - dtrace_probedesc_t pd; > - > - if (strcmp(prp->desc->prb, "create") == 0 || > - strcmp(prp->desc->prb, "lwp-create") == 0) { > - pd.id = DTRACE_IDNONE; > - pd.prv = "rawtp"; > - pd.mod = "sched"; > - pd.fun = ""; > - pd.prb = "sched_process_fork"; > - > - uprp = dt_probe_lookup(dtp, &pd); > - assert(uprp != NULL); > - > - dt_probe_add_dependent(dtp, uprp, prp); > - dt_probe_enable(dtp, uprp); > - } else if (strcmp(prp->desc->prb, "exec") == 0) { > - pd.id = DTRACE_IDNONE; > - pd.prv = "syscall"; > - pd.mod = ""; > - pd.fun = "execve"; > - pd.prb = "entry"; > - > - uprp = dt_probe_lookup(dtp, &pd); > - assert(uprp != NULL); > - > - dt_probe_add_dependent(dtp, uprp, prp); > - dt_probe_enable(dtp, uprp); > - > - pd.fun = "execveat"; > - > - uprp = dt_probe_lookup(dtp, &pd); > - assert(uprp != NULL); > - > - dt_probe_add_dependent(dtp, uprp, prp); > - dt_probe_enable(dtp, uprp); > - } else if (strcmp(prp->desc->prb, "exec-failure") == 0 || > - strcmp(prp->desc->prb, "exec-success") == 0) { > - pd.id = DTRACE_IDNONE; > - pd.prv = "syscall"; > - pd.mod = ""; > - pd.fun = "execve"; > - pd.prb = "return"; > - > - uprp = dt_probe_lookup(dtp, &pd); > - assert(uprp != NULL); > - > - dt_probe_add_dependent(dtp, uprp, prp); > - dt_probe_enable(dtp, uprp); > - > - pd.fun = "execveat"; > - > - uprp = dt_probe_lookup(dtp, &pd); > - assert(uprp != NULL); > - > - dt_probe_add_dependent(dtp, uprp, prp); > - dt_probe_enable(dtp, uprp); > - } else if (strcmp(prp->desc->prb, "exit") == 0 || > - strcmp(prp->desc->prb, "lwp-exit") == 0) { > - pd.id = DTRACE_IDNONE; > - pd.prv = "rawtp"; > - pd.mod = ""; > - pd.fun = ""; > - pd.prb = "sched_process_exit"; 
> - > - uprp = dt_probe_lookup(dtp, &pd); > - assert(uprp != NULL); > - > - dt_probe_add_dependent(dtp, uprp, prp); > - dt_probe_enable(dtp, uprp); > - } else if (strcmp(prp->desc->prb, "signal-clear") == 0) { > - pd.id = DTRACE_IDNONE; > - pd.prv = "syscall"; > - pd.mod = ""; > - pd.fun = "rt_sigtimedwait"; > - pd.prb = "return"; > - > - uprp = dt_probe_lookup(dtp, &pd); > - assert(uprp != NULL); > - > - dt_probe_add_dependent(dtp, uprp, prp); > - dt_probe_enable(dtp, uprp); > - } else if (strcmp(prp->desc->prb, "signal-discard") == 0) { > - pd.id = DTRACE_IDNONE; > - pd.prv = "rawtp"; > - pd.mod = "signal"; > - pd.fun = ""; > - pd.prb = "signal_generate"; > - > - uprp = dt_probe_lookup(dtp, &pd); > - assert(uprp != NULL); > - > - dt_probe_add_dependent(dtp, uprp, prp); > - dt_probe_enable(dtp, uprp); > - } else if (strcmp(prp->desc->prb, "signal-handle") == 0) { > - pd.id = DTRACE_IDNONE; > - pd.prv = "rawtp"; > - pd.mod = "signal"; > - pd.fun = ""; > - pd.prb = "signal_deliver"; > - > - uprp = dt_probe_lookup(dtp, &pd); > - assert(uprp != NULL); > - > - dt_probe_add_dependent(dtp, uprp, prp); > - dt_probe_enable(dtp, uprp); > - } else if (strcmp(prp->desc->prb, "signal-send") == 0) { > - pd.id = DTRACE_IDNONE; > - pd.prv = "fbt"; > - pd.mod = ""; > - pd.fun = "complete_signal"; > - pd.prb = "entry"; > - > - uprp = dt_probe_lookup(dtp, &pd); > - assert(uprp != NULL); > - > - dt_probe_add_dependent(dtp, uprp, prp); > - dt_probe_enable(dtp, uprp); > - } else if (strcmp(prp->desc->prb, "start") == 0 || > - strcmp(prp->desc->prb, "lwp-start") == 0) { > - pd.id = DTRACE_IDNONE; > - pd.prv = "fbt"; > - pd.mod = ""; > - pd.fun = "schedule_tail"; > - pd.prb = "return"; > - > - uprp = dt_probe_lookup(dtp, &pd); > - assert(uprp != NULL); > - > - dt_probe_add_dependent(dtp, uprp, prp); > - dt_probe_enable(dtp, uprp); > - } > - > - /* > - * Finally, ensure we're in the list of enablings as well. 
> - * (This ensures that, among other things, the probes map > - * gains entries for us.) > - */ > - if (!dt_in_list(&dtp->dt_enablings, prp)) > - dt_list_append(&dtp->dt_enablings, prp); > + return dt_sdt_populate(dtp, prvname, modname, &dt_proc, &pattr, > + probe_args, probes); > } > > /* > @@ -434,61 +238,13 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl) > return 0; > } > > -static int probe_info(dtrace_hdl_t *dtp, const dt_probe_t *prp, > - int *argcp, dt_argdesc_t **argvp) > -{ > - int i; > - int pidx = -1; > - int argc = 0; > - dt_argdesc_t *argv = NULL; > - > - for (i = 0; i < ARRAY_SIZE(probe_args); i++) { > - probe_arg_t *arg = &probe_args[i]; > - > - if (strcmp(arg->name, prp->desc->prb) == 0) { > - if (pidx == -1) { > - pidx = i; > - > - if (arg->argdesc.native == NULL) > - break; > - } > - > - argc++; > - } > - } > - > - if (argc == 0) > - goto done; > - > - argv = dt_zalloc(dtp, argc * sizeof(dt_argdesc_t)); > - if (!argv) > - return -ENOMEM; > - > - for (i = pidx; i < pidx + argc; i++) { > - probe_arg_t *arg = &probe_args[i]; > - dt_argdesc_t *argd = &arg->argdesc; > - dt_argdesc_t *parg = &argv[arg->argno]; > - > - *parg = *argd; > - if (argd->native) > - parg->native = strdup(argd->native); > - if (argd->xlate) > - parg->xlate = strdup(argd->xlate); > - } > - > -done: > - *argcp = argc; > - *argvp = argv; > - > - return 0; > -} > - > dt_provimpl_t dt_proc = { > .name = prvname, > .prog_type = BPF_PROG_TYPE_UNSPEC, > .populate = &populate, > - .enable = &enable, > + .enable = &dt_sdt_enable, > .load_prog = &dt_bpf_prog_load, > .trampoline = &trampoline, > - .probe_info = &probe_info, > + .probe_info = &dt_sdt_probe_info, > + .destroy = &dt_sdt_destroy, > }; From eugene.loh at oracle.com Mon Mar 10 21:54:33 2025 From: eugene.loh at oracle.com (Eugene Loh) Date: Mon, 10 Mar 2025 17:54:33 -0400 Subject: [DTrace-devel] [PATCH 2/8] sched: clean up unnecessary includes and functions In-Reply-To: <20250307213441.9495-1-kris.van.hees@oracle.com> 
References: <20250307213441.9495-1-kris.van.hees@oracle.com>
Message-ID: <1ccaa89f-229d-410b-ae0d-2f2d8d564083@oracle.com>

Reviewed-by: Eugene Loh

On 3/7/25 16:34, Kris Van Hees via DTrace-devel wrote:
> Signed-off-by: Kris Van Hees
> ---
>  libdtrace/dt_prov_sched.c | 30 ++----------------------------
>  1 file changed, 2 insertions(+), 28 deletions(-)
>
> diff --git a/libdtrace/dt_prov_sched.c b/libdtrace/dt_prov_sched.c
> index e05ef246..125d5891 100644
> --- a/libdtrace/dt_prov_sched.c
> +++ b/libdtrace/dt_prov_sched.c
> @@ -1,6 +1,6 @@
>  /*
>   * Oracle Linux DTrace.
> - * Copyright (c) 2023, 2024, Oracle and/or its affiliates. All rights reserved.
> + * Copyright (c) 2023, 2025, Oracle and/or its affiliates. All rights reserved.
>   * Licensed under the Universal Permissive License v 1.0 as shown at
>   * http://oss.oracle.com/licenses/upl.
>   *
> @@ -9,9 +9,6 @@
>  #include
>  #include
>
> -#include
> -#include
> -
>  #include "dt_dctx.h"
>  #include "dt_cg.h"
>  #include "dt_provider_sdt.h"
> @@ -146,36 +143,13 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
>  	return 0;
>  }
>
> -/*
> - * We need a custom enabling for on-cpu probes to ensure that the fbt function
> - * __perf_event_task_sched_in is called. __perf_event_task_sched_in will not
> - * be called unless context switch perf events have been enabled, so we do that
> - * here by opening a context switch count perf event but not attaching anything
> - * to it to minimize overhead. The alternative - attaching to
> - * cpc:::context_switches-all-1 and weeding out on- versus off-cpu events via a
> - * trampoline is too expensive. This approach works stably across kernels
> - * because __perf_event_task_sched_in() is not static, so not potentially
> - * subject to inlining or other optimizations.
> - */
> -static void enable(dtrace_hdl_t *dtp, dt_probe_t *prp)
> -{
> -	return dt_sdt_enable(dtp, prp);
> -}
> -
> -static void detach(dtrace_hdl_t *dtp, const dt_probe_t *prp)
> -{
> -	if (prp->prv_data)
> -		close((int)(long)prp->prv_data);
> -}
> -
>  dt_provimpl_t dt_sched = {
>  	.name = prvname,
>  	.prog_type = BPF_PROG_TYPE_UNSPEC,
>  	.populate = &populate,
> -	.enable = &enable,
> +	.enable = &dt_sdt_enable,
>  	.load_prog = &dt_bpf_prog_load,
>  	.trampoline = &trampoline,
>  	.probe_info = &dt_sdt_probe_info,
> -	.detach = &detach,
>  	.destroy = &dt_sdt_destroy,
>  };

From eugene.loh at oracle.com Mon Mar 10 22:03:03 2025
From: eugene.loh at oracle.com (Eugene Loh)
Date: Mon, 10 Mar 2025 18:03:03 -0400
Subject: [DTrace-devel] [PATCH 3/8] rawfbt: perform lookup on true symbol names
In-Reply-To: <20250307213441.9495-2-kris.van.hees@oracle.com>
References: <20250307213441.9495-1-kris.van.hees@oracle.com> <20250307213441.9495-2-kris.van.hees@oracle.com>
Message-ID:

Reviewed-by: Eugene Loh

On 3/7/25 16:34, Kris Van Hees wrote:
> When encountering a . symbol, a symbol lookup was done for
> instead of . under the assumption that names with .
> in them were not listed in kallsyms. But that is not true.
>
> Signed-off-by: Kris Van Hees
> ---
>  libdtrace/dt_prov_rawfbt.c | 18 ------------------
>  1 file changed, 18 deletions(-)
>
> diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c
> index 4c8e8130..62f2f4f0 100644
> --- a/libdtrace/dt_prov_rawfbt.c
> +++ b/libdtrace/dt_prov_rawfbt.c
> @@ -122,27 +122,9 @@ static int populate(dtrace_hdl_t *dtp)
>  		 * try to determine the module name.
>  		 */
>  		if (!p) {
> -			char *q;
> -
> -			/*
> -			 * For synthetic symbol names (those containing '.'),
> -			 * we need to use the base name (before the '.') for
> -			 * module name lookup, because the synthetic forms are
> -			 * not recorded in kallsyms information.
> -			 *
> -			 * We replace the first '.' with a 0 to terminate the
> -			 * string, and after the lookup, we put it back.
> - */ > - q = strchr(buf, '.'); > - if (q != NULL) > - *q = '\0'; > - > if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf, > NULL, &sip) == 0) > mod = sip.object; > - > - if (q != NULL) > - *q = '.'; > } else > mod = p; >

From eugene.loh at oracle.com Mon Mar 10 22:04:13 2025
From: eugene.loh at oracle.com (Eugene Loh)
Date: Mon, 10 Mar 2025 18:04:13 -0400
Subject: [DTrace-devel] [PATCH 4/8] ksyms: make symbol name filters less picky
In-Reply-To: <20250307213441.9495-3-kris.van.hees@oracle.com>
References: <20250307213441.9495-1-kris.van.hees@oracle.com>
 <20250307213441.9495-3-kris.van.hees@oracle.com>
Message-ID:

Reviewed-by: Eugene Loh

On 3/7/25 16:34, Kris Van Hees wrote:
> Some symbols were being filtered out even though they represent symbols
> that can actually be probed.
>
> Signed-off-by: Kris Van Hees
> ---
>  libdtrace/dt_module.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c
> index dc00aa88..2e915e2f 100644
> --- a/libdtrace/dt_module.c
> +++ b/libdtrace/dt_module.c
> @@ -1215,7 +1215,7 @@ dt_modsym_addsym(dtrace_hdl_t *dtp, dt_module_t *dmp, dt_kallsym_t *sym,
>  	    (strstarts(sym->name, "__syscall_meta__")) ||
>  	    (strstarts(sym->name, "__p_syscall_meta__")) ||
>  	    (strstarts(sym->name, "__event_")) ||
> -	    (strstarts(sym->name, "event_")) ||
> +	    (strstarts(sym->name, "event_") && sym->type == 'd') ||
>  	    (strstarts(sym->name, "ftrace_event_")) ||
>  	    (strstarts(sym->name, "types__")) ||
>  	    (strstarts(sym->name, "args__")) ||
> @@ -1223,7 +1223,6 @@ dt_modsym_addsym(dtrace_hdl_t *dtp, dt_module_t *dmp, dt_kallsym_t *sym,
>  	    (strstarts(sym->name, "__tpstrtab_")) ||
>  	    (strstarts(sym->name, "__tpstrtab__")) ||
>  	    (strstarts(sym->name, "__initcall_")) ||
> -	    (strstarts(sym->name, "__setup_")) ||
>  	    (strstarts(sym->name, "__pci_fixup_")))
>  		skip = 1;
>  #undef strstarts

From eugene.loh at oracle.com Wed Mar 12 05:17:51 2025
From: eugene.loh at oracle.com (Eugene Loh)
Date: Wed, 12 Mar 2025 01:17:51 -0400
Subject: [DTrace-devel] [PATCH 6/8] fbt: performance improvements
In-Reply-To: <20250307213441.9495-5-kris.van.hees@oracle.com>
References: <20250307213441.9495-1-kris.van.hees@oracle.com>
 <20250307213441.9495-5-kris.van.hees@oracle.com>
Message-ID:

Sorry for the slow progress.  Anyhow, with this patch, I get failures on
test/unittest/lockstat/tst.lockstat-summary.d on x86 UEK7 systems.  Stuff
like:

        dtrace: could not enable tracing: BPF program load for 'fbt:vmlinux:native_queued_spin_lock_slowp: Invalid argument

Well, in dt_prov_lockstat.c, I see:

        { "spin-spin", DTRACE_PROBESPEC_FUNC, "fbt::queued_spin_lock_*" },
        { "spin-spin", DTRACE_PROBESPEC_FUNC, "fbt::native_queued_spin_lock_*" },

And on those problematic systems, I see:

        $ sudo build/run-dtrace -lP fbt |& grep native_queued
        98429        fbt           vmlinux native_queued_spin_lock_slowpath return
        98428        fbt           vmlinux native_queued_spin_lock_slowpath entry
         9433        fbt           vmlinux native_queued_spin_lock_slowpath.part.0 return
         9432        fbt           vmlinux native_queued_spin_lock_slowpath.part.0 entry

In contrast, on UEK8, "dtrace -lP fbt" does not include the .part.0 probes.

Back on the UEK7 systems, if I modify dt_prov_lockstat.c like this:

    -   { "spin-spin", DTRACE_PROBESPEC_FUNC, "fbt::native_queued_spin_lock_*" },
    +   { "spin-spin", DTRACE_PROBESPEC_FUNC, "fbt::native_queued_spin_lock_slowpath" },

the test passes.

Is that the right change to make?  If so, shall I submit a patch, or should
it go with your patch series?

On 3/7/25 16:34, Kris Van Hees wrote:
> Up until now, FBT probes were registered for every symbol that was
> listed as traceable. Most tracing session do not use most or even
> any of these, and the process of registering them all was quite
> slow.
>
> Going forward, FBT probes are registered on demand.
> > If any FBT probes are to be registered, the first will incur the > cost of reading the entire list of traceable symbols. Any further > FBT probe registration will be able to be satisfied based on that > initial processing. The performance improvement is therefore quite > significant for tracing sessions that do not trigger any FBT probe > registration, and if FBT probes are used, the improvement is still > quite noticable because only the probes that are actually needed > get registered. > > Signed-off-by: Kris Van Hees > --- > libdtrace/dt_module.c | 78 +++++++++++++++ > libdtrace/dt_module.h | 2 + > libdtrace/dt_prov_fbt.c | 217 +++++++++++++++++++++++++++------------- > 3 files changed, 228 insertions(+), 69 deletions(-) > > diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c > index 2e915e2f..e7553a07 100644 > --- a/libdtrace/dt_module.c > +++ b/libdtrace/dt_module.c > @@ -22,6 +22,7 @@ > #include > > #include > +#include > > #include > #include > @@ -1044,6 +1045,83 @@ dt_kern_module_find_ctf(dtrace_hdl_t *dtp, dt_module_t *dmp) > } > } > > +#define PROBE_LIST TRACEFS "available_filter_functions" > + > +/* > + * Determine which kernel functions are traceable and mark them. > + */ > +void > +dt_modsym_mark_traceable(dtrace_hdl_t *dtp) > +{ > + FILE *f; > + char *buf = NULL; > + size_t len = 0; > + > + if (dt_symtab_traceable(dtp->dt_exec->dm_kernsyms)) > + return; > + > + f = fopen(PROBE_LIST, "r"); > + if (f == NULL) > + return; > + > + while (getline(&buf, &len, f) >= 0) { > + char *p; > + dt_symbol_t *sym = NULL; > + > + /* > + * Here buf is either "funcname\n" or "funcname [modname]\n". > + * The last line may not have a linefeed. > + */ > + p = strchr(buf, '\n'); > + if (p) { > + *p = '\0'; > + if (p > buf && *(--p) == ']') > + *p = '\0'; > + } > + > + /* > + * Now buf is either "funcname" or "funcname [modname". If > + * there is no module name provided, we will use the default. 
> + */ > + p = strchr(buf, ' '); > + if (p) { > + *p++ = '\0'; > + if (*p == '[') > + p++; > + } > + > +#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0) > + /* Weed out __ftrace_invalid_address___* entries. */ > + if (strstarts(buf, "__ftrace_invalid_address__") || > + strstarts(buf, "__probestub_") || > + strstarts(buf, "__traceiter_")) > + continue; > +#undef strstarts > + > + /* > + * If we have a module name, look for the symbol in that > + * module. > + * If not, perform a general symbol lookup to find its first > + * instance. > + */ > + if (p) { > + dt_module_t *dmp = dt_module_lookup_by_name(dtp, p); > + > + if (dmp) > + sym = dt_module_symbol_by_name(dtp, dmp, buf); > + } else > + sym = dt_symbol_by_name(dtp, buf); > + > + if (sym) > + dt_symbol_set_traceable(sym); > + } > + > + free(buf); > + fclose(f); > + > + dt_symtab_set_traceable(dtp->dt_exec->dm_kernsyms); > +} > + > /* > * Symbol data can be collected in three ways: > * - kallmodsyms > diff --git a/libdtrace/dt_module.h b/libdtrace/dt_module.h > index 56df17a6..dd3ad17c 100644 > --- a/libdtrace/dt_module.h > +++ b/libdtrace/dt_module.h > @@ -25,6 +25,8 @@ extern dt_ident_t *dt_module_extern(dtrace_hdl_t *, dt_module_t *, > > extern const char *dt_module_modelname(dt_module_t *); > > +extern void dt_modsym_mark_traceable(dtrace_hdl_t *); > + > #ifdef __cplusplus > } > #endif > diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c > index eef93879..d837e14d 100644 > --- a/libdtrace/dt_prov_fbt.c > +++ b/libdtrace/dt_prov_fbt.c > @@ -41,10 +41,8 @@ > #include "dt_pt_regs.h" > > static const char prvname[] = "fbt"; > -static const char modname[] = "vmlinux"; > > #define KPROBE_EVENTS TRACEFS "kprobe_events" > -#define PROBE_LIST TRACEFS "available_filter_functions" > > #define FBT_GROUP_FMT GROUP_FMT "_%s" > #define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb > @@ -61,19 +59,11 @@ dt_provimpl_t dt_fbt_fprobe; > dt_provimpl_t dt_fbt_kprobe; > > /* > - * Scan the PROBE_LIST file and 
add entry and return probes for every function > - * that is listed. > + * Create the fbt provider. > */ > static int populate(dtrace_hdl_t *dtp) > { > dt_provider_t *prv; > - FILE *f; > - char *buf = NULL; > - char *p; > - const char *mod = modname; > - size_t n; > - dtrace_syminfo_t sip; > - dtrace_probedesc_t pd; > > dt_fbt = BPF_HAS(dtp, BPF_FEAT_FENTRY) ? dt_fbt_fprobe : dt_fbt_kprobe; > > @@ -81,79 +71,166 @@ static int populate(dtrace_hdl_t *dtp) > if (prv == NULL) > return -1; /* errno already set */ > > - f = fopen(PROBE_LIST, "r"); > - if (f == NULL) > + return 0; > +} > + > +/* Create a probe (if it does not exist yet). */ > +static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) > +{ > + dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv); > + > + if (prv == NULL) > + return 0; > + if (dt_probe_lookup(dtp, pdp) != NULL) > return 0; > + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) > + return 1; > > - while (getline(&buf, &n, f) >= 0) { > - /* > - * Here buf is either "funcname\n" or "funcname [modname]\n". > - * The last line may not have a linefeed. > - */ > - p = strchr(buf, '\n'); > - if (p) { > - *p = '\0'; > - if (p > buf && *(--p) == ']') > - *p = '\0'; > + return 0; > +} > + > +/* > + * Try to provide probes for the given probe description. The caller ensures > + * that the provider name in probe desxcription (if any) is a match for this > + * provider. When this is called, we already know that this provider matches > + * the provider component of the probe specification. > + */ > +#define FBT_ENTRY 1 > +#define FBT_RETURN 2 > + > +static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) > +{ > + int n = 0; > + int prb = 0; > + dt_module_t *dmp = NULL; > + dt_symbol_t *sym = NULL; > + dt_htab_next_t *it = NULL; > + dtrace_probedesc_t pd; > + > + dt_modsym_mark_traceable(dtp); > + > + /* > + * Nothing to do if a probe name is specified and cannot match 'entry' > + * or 'return'. 
> + */ > + if (dt_gmatch("entry", pdp->prb)) > + prb |= FBT_ENTRY; > + if (dt_gmatch("return", pdp->prb)) > + prb |= FBT_RETURN; > + if (prb == 0) > + return 0; > + > + /* Synthetic function names are not supported for FBT. */ > + if (strchr(pdp->fun, '.')) > + return 0; > + > + /* > + * If we have an explicit module name, check it. If not found, we can > + * ignore this request. > + */ > + if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) { > + dmp = dt_module_lookup_by_name(dtp, pdp->mod); > + if (dmp == NULL) > + return 0; > + } > + > + /* > + * If we have an explicit function name, we start with a basic symbol > + * name lookup. > + */ > + if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) { > + /* If we have a module, use it. */ > + if (dmp != NULL) { > + sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun); > + if (sym == NULL) > + return 0; > + if (!dt_symbol_traceable(sym)) > + return 0; > + > + pd.id = DTRACE_IDNONE; > + pd.prv = pdp->prv; > + pd.mod = dmp->dm_name; > + pd.fun = pdp->fun; > + > + if (prb & FBT_ENTRY) { > + pd.prb = "entry"; > + n += provide_probe(dtp, &pd); > + } > + if (prb & FBT_RETURN) { > + pd.prb = "return"; > + n += provide_probe(dtp, &pd); > + } > + > + return n; > } > > - /* > - * Now buf is either "funcname" or "funcname [modname". If > - * there is no module name provided, we will use the default. 
> - */ > - p = strchr(buf, ' '); > - if (p) { > - *p++ = '\0'; > - if (*p == '[') > - p++; > + sym = dt_symbol_by_name(dtp, pdp->fun); > + while (sym != NULL) { > + const char *mod = dt_symbol_module(sym)->dm_name; > + > + if (dt_symbol_traceable(sym) && > + dt_gmatch(mod, pdp->mod)) { > + pd.id = DTRACE_IDNONE; > + pd.prv = pdp->prv; > + pd.mod = mod; > + pd.fun = pdp->fun; > + > + if (prb & FBT_ENTRY) { > + pd.prb = "entry"; > + n += provide_probe(dtp, &pd); > + } > + if (prb & FBT_RETURN) { > + pd.prb = "return"; > + n += provide_probe(dtp, &pd); > + } > + > + } > + sym = dt_symbol_by_name_next(sym); > } > > - /* Weed out synthetic symbol names (that are invalid). */ > - if (strchr(buf, '.') != NULL) > + return n; > + } > + > + /* > + * No explicit function name. We need to go through all possible > + * symbol names and see if they match. > + */ > + while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) { > + dt_module_t *smp; > + const char *fun; > + > + /* Ensure the symbol can be traced. */ > + if (!dt_symbol_traceable(sym)) > continue; > > -#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0) > - /* Weed out __ftrace_invalid_address___* entries. */ > - if (strstarts(buf, "__ftrace_invalid_address__") || > - strstarts(buf, "__probestub_") || > - strstarts(buf, "__traceiter_")) > + /* Match the function name. */ > + fun = dt_symbol_name(sym); > + if (!dt_gmatch(fun, pdp->fun)) > continue; > -#undef strstarts > > - /* > - * If we did not see a module name, perform a symbol lookup to > - * try to determine the module name. > - */ > - if (!p) { > - if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf, > - NULL, &sip) == 0) > - mod = sip.object; > - } else > - mod = p; > + /* Validate the module name. 
*/ > + smp = dt_symbol_module(sym); > + if (dmp) { > + if (smp != dmp) > + continue; > + } else if (!dt_gmatch(smp->dm_name, pdp->mod)) > + continue; > > - /* > - * Due to the lack of module names in > - * TRACEFS/available_filter_functions, there are some duplicate > - * function names. We need to make sure that we do not create > - * duplicate probes for these. > - */ > pd.id = DTRACE_IDNONE; > - pd.prv = prvname; > - pd.mod = mod; > - pd.fun = buf; > - pd.prb = "entry"; > - if (dt_probe_lookup(dtp, &pd) != NULL) > - continue; > + pd.prv = pdp->prv; > + pd.mod = smp->dm_name; > + pd.fun = fun; > > - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "entry")) > - n++; > - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "return")) > - n++; > + if (prb & FBT_ENTRY) { > + pd.prb = "entry"; > + n += provide_probe(dtp, &pd); > + } > + if (prb & FBT_RETURN) { > + pd.prb = "return"; > + n += provide_probe(dtp, &pd); > + } > } > > - free(buf); > - fclose(f); > - > return n; > } > > @@ -447,6 +524,7 @@ dt_provimpl_t dt_fbt_fprobe = { > .prog_type = BPF_PROG_TYPE_TRACING, > .stack_skip = 4, > .populate = &populate, > + .provide = &provide, > .load_prog = &fprobe_prog_load, > .trampoline = &fprobe_trampoline, > .attach = &dt_tp_probe_attach_raw, > @@ -459,6 +537,7 @@ dt_provimpl_t dt_fbt_kprobe = { > .name = prvname, > .prog_type = BPF_PROG_TYPE_KPROBE, > .populate = &populate, > + .provide = &provide, > .load_prog = &dt_bpf_prog_load, > .trampoline = &kprobe_trampoline, > .attach = &kprobe_attach, From kris.van.hees at oracle.com Wed Mar 12 05:33:46 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 12 Mar 2025 01:33:46 -0400 Subject: [DTrace-devel] [PATCH 6/8] fbt: performance improvements In-Reply-To: References: <20250307213441.9495-1-kris.van.hees@oracle.com> <20250307213441.9495-5-kris.van.hees@oracle.com> Message-ID: On Wed, Mar 12, 2025 at 01:17:51AM -0400, Eugene Loh wrote: > Sorry for the slow progress.? 
Anyhow, with this patch, I get failures on
> test/unittest/lockstat/tst.lockstat-summary.d on x86 UEK7 systems.  Stuff
> like:
>
>         dtrace: could not enable tracing: BPF program load for
> 'fbt:vmlinux:native_queued_spin_lock_slowp: Invalid argument
>
> Well, in dt_prov_lockstat.c, I see:
>         { "spin-spin", DTRACE_PROBESPEC_FUNC, "fbt::queued_spin_lock_*" },
>         { "spin-spin", DTRACE_PROBESPEC_FUNC, "fbt::native_queued_spin_lock_*" },
>
> And on those problematic systems, I see:
>         $ sudo build/run-dtrace -lP fbt |& grep native_queued
>         98429        fbt           vmlinux native_queued_spin_lock_slowpath return
>         98428        fbt           vmlinux native_queued_spin_lock_slowpath entry
>          9433        fbt           vmlinux native_queued_spin_lock_slowpath.part.0 return
>          9432        fbt           vmlinux native_queued_spin_lock_slowpath.part.0 entry
>
> In contrast, on UEK8, "dtrace -lP fbt" does not include the .part.0 probes.
>
> Back on the UEK7 systems, if I modify dt_prov_lockstat.c like this:
>     -   { "spin-spin", DTRACE_PROBESPEC_FUNC, "fbt::native_queued_spin_lock_*" },
>     +   { "spin-spin", DTRACE_PROBESPEC_FUNC, "fbt::native_queued_spin_lock_slowpath" },
> the test passes.
>
> Is that the right change to make?  If so, shall I submit a patch, or should
> it go with your patch series?

No change should be needed to the lockstat provider.  You uncovered a bug in
my patch - I'll fix it.  In short, while the provide() function filters out
function names that have a '.' in them for the pdp->fun specification string,
it fails to do so for the case where globbing requires us to loop over
possible function names that could match pdp->fun (or all function names if
pdp->fun == "").  We need to exclude names with "." in them there also because
the FBT provider does not allow probing of those symbols.
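
A minimal sketch of the gap described above, not the actual libdtrace code:
checking only the specification string for '.' rejects an explicit synthetic
name, but a glob such as native_queued_spin_lock_* contains no '.' and still
matches the .part.0 symbol, so each matched name has to be checked as well.
The symbol list, is_synthetic() and count_provided() are hypothetical names
invented for this illustration, and POSIX fnmatch() stands in for dt_gmatch().

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol list, modelled on the UEK7 listing quoted above. */
static const char *const example_syms[] = {
	"native_queued_spin_lock_slowpath",
	"native_queued_spin_lock_slowpath.part.0",
	"queued_spin_lock_slowpath",
};

#define NSYMS	(sizeof(example_syms) / sizeof(example_syms[0]))

/* Synthetic (compiler-generated) symbol names contain a '.'. */
static int is_synthetic(const char *name)
{
	return strchr(name, '.') != NULL;
}

/*
 * Count the example symbols that a glob pattern would provide.
 * If check_matches is 0, only the pattern itself is tested for '.'.
 * If check_matches is non-zero, every matched name is tested as well.
 */
static int count_provided(const char *pattern, int check_matches)
{
	size_t	i;
	int	n = 0;

	assert(pattern != NULL);

	/* Test on the specification string: catches explicit names only. */
	if (is_synthetic(pattern))
		return 0;

	for (i = 0; i < NSYMS; i++) {
		if (fnmatch(pattern, example_syms[i], 0) != 0)
			continue;

		/* Test on the matched name: needed once globbing is used. */
		if (check_matches && is_synthetic(example_syms[i]))
			continue;

		n++;
	}

	return n;
}
```

With this list, count_provided("native_queued_spin_lock_*", 0) counts the
.part.0 name along with the real function, while passing 1 as the second
argument skips it.
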
I think I'll just defer that check to provide_probe() since I can do it in a
single place for all cases.

Incidentally, I also just noticed that dt_modsym_mark_traceable(dtp); is being
done too early.  We only really need that to be done once we get to looking at
function symbols.  I'll move it - that way we avoid marking functions traceable
for probes that cannot be FBT probes because of probe name or module name.

I'll send out a v2 tomorrow with that fix.

From eugene.loh at oracle.com Wed Mar 12 05:45:57 2025
From: eugene.loh at oracle.com (Eugene Loh)
Date: Wed, 12 Mar 2025 01:45:57 -0400
Subject: [DTrace-devel] [PATCH 7/8] rawfbt: performance improvements
In-Reply-To: <20250307213441.9495-6-kris.van.hees@oracle.com>
References: <20250307213441.9495-1-kris.van.hees@oracle.com>
 <20250307213441.9495-6-kris.van.hees@oracle.com>
Message-ID: <257942be-0eef-86f4-5617-789455ac7a91@oracle.com>

Preliminary comment on this one...  a couple of USDT tests start to fail
reproducibly with this patch.  The explanation is weird. The tests filter
out specific run-dependent probe ID values before checking with their .r
results files.  But because some of these probe IDs get a lot smaller
(narrower, fewer digits) with this fbt patch, the filters break.  I have a
patch for those USDT tests.  I suppose that patch should land before these
fbt patches.  So, my fault.  Sorry.

On 3/7/25 16:34, Kris Van Hees via DTrace-devel wrote:
> Signed-off-by: Kris Van Hees
> ---
>  libdtrace/dt_prov_rawfbt.c | 223 +++++++++++++++++++++++++------------
>  1 file changed, 151 insertions(+), 72 deletions(-)
>
> diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c > index 62f2f4f0..52152655 100644 > --- a/libdtrace/dt_prov_rawfbt.c > +++ b/libdtrace/dt_prov_rawfbt.c > @@ -44,10 +44,8 @@ > #include "dt_pt_regs.h" > > static const char prvname[] = "rawfbt"; > -static const char modname[] = "vmlinux"; > > #define KPROBE_EVENTS TRACEFS "kprobe_events" > -#define PROBE_LIST TRACEFS "available_filter_functions" > > #define FBT_GROUP_FMT GROUP_FMT "_%s" > #define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb > @@ -61,98 +59,178 @@ static const dtrace_pattr_t pattr = { > }; > > /* > - * Scan the PROBE_LIST file and add entry and return probes for every function > - * that is listed. > + * Create the rawfbt provider. > */ > static int populate(dtrace_hdl_t *dtp) > { > dt_provider_t *prv; > - FILE *f; > - char *buf = NULL; > - size_t len = 0; > - size_t n = 0; > - dtrace_syminfo_t sip; > - dtrace_probedesc_t pd; > > prv = dt_provider_create(dtp, prvname, &dt_rawfbt, &pattr, NULL); > if (prv == NULL) > return -1; /* errno already set */ > > - f = fopen(PROBE_LIST, "r"); > - if (f == NULL) > + return 0; > +} > + > +/* Create a probe (if it does not exist yet).
*/ > +static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) > +{ > + dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv); > + > + if (prv == NULL) > return 0; > + if (dt_probe_lookup(dtp, pdp) != NULL) > + return 0; > +#ifdef DEBUG_FBT > + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) { > + fprintf(stderr, "%s(..., PROVIDE %s:%s:%s:%s) - ...\n", __func__, pdp->prv, pdp->mod, pdp->fun, pdp->prb); > + return 1; > + } > +#else > + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) > + return 1; > +#endif > > - while (getline(&buf, &len, f) >= 0) { > - char *p; > - const char *mod = modname; > - dt_probe_t *prp; > + return 0; > +} > > - /* > - * Here buf is either "funcname\n" or "funcname [modname]\n". > - * The last line may not have a linefeed. > - */ > - p = strchr(buf, '\n'); > - if (p) { > - *p = '\0'; > - if (p > buf && *(--p) == ']') > - *p = '\0'; > +/* > + * Try to provide probes for the given probe description. The caller ensures > + * that the provider name in probe desxcription (if any) is a match for this > + * provider. When this is called, we already know that this provider matches > + * the provider component of the probe specification. > + */ > +#define FBT_ENTRY 1 > +#define FBT_RETURN 2 > + > +static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) > +{ > + int n = 0; > + int prb = 0; > + dt_module_t *dmp = NULL; > + dt_symbol_t *sym = NULL; > + dt_htab_next_t *it = NULL; > + dtrace_probedesc_t pd; > + > + dt_modsym_mark_traceable(dtp); > + > + /* > + * Nothing to do if a probe name is specified and cannot match 'entry' > + * or 'return'. > + */ > + if (dt_gmatch("entry", pdp->prb)) > + prb |= FBT_ENTRY; > + if (dt_gmatch("return", pdp->prb)) > + prb |= FBT_RETURN; > + if (prb == 0) > + return 0; > + > + /* > + * If we have an explicit module name, check it. If not found, we can > + * ignore this request. 
> + */ > + if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) { > + dmp = dt_module_lookup_by_name(dtp, pdp->mod); > + if (dmp == NULL) > + return 0; > + } > + > + /* > + * If we have an explicit function name, we start with a basic symbol > + * name lookup. > + */ > + if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) { > + /* If we have a module, use it. */ > + if (dmp != NULL) { > + sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun); > + if (sym == NULL) > + return 0; > + if (!dt_symbol_traceable(sym)) > + return 0; > + > + pd.id = DTRACE_IDNONE; > + pd.prv = pdp->prv; > + pd.mod = dmp->dm_name; > + pd.fun = pdp->fun; > + > + if (prb & FBT_ENTRY) { > + pd.prb = "entry"; > + n += provide_probe(dtp, &pd); > + } > + if (prb & FBT_RETURN) { > + pd.prb = "return"; > + n += provide_probe(dtp, &pd); > + } > + > + return n; > } > > - /* > - * Now buf is either "funcname" or "funcname [modname". If > - * there is no module name provided, we will use the default. > - */ > - p = strchr(buf, ' '); > - if (p) { > - *p++ = '\0'; > - if (*p == '[') > - p++; > + sym = dt_symbol_by_name(dtp, pdp->fun); > + while (sym != NULL) { > + const char *mod = dt_symbol_module(sym)->dm_name; > + > + if (dt_symbol_traceable(sym) && > + dt_gmatch(mod, pdp->mod)) { > + pd.id = DTRACE_IDNONE; > + pd.prv = pdp->prv; > + pd.mod = mod; > + pd.fun = pdp->fun; > + > + if (prb & FBT_ENTRY) { > + pd.prb = "entry"; > + n += provide_probe(dtp, &pd); > + } > + if (prb & FBT_RETURN) { > + pd.prb = "return"; > + n += provide_probe(dtp, &pd); > + } > + > + } > + sym = dt_symbol_by_name_next(sym); > } > > -#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0) > - /* Weed out __ftrace_invalid_address___* entries. */ > - if (strstarts(buf, "__ftrace_invalid_address__") || > - strstarts(buf, "__probestub_") || > - strstarts(buf, "__traceiter_")) > + return n; > + } > + > + /* > + * No explicit function name. We need to go through all possible > + * symbol names and see if they match. 
> + */ > + while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) { > + dt_module_t *smp; > + const char *fun; > + > + /* Ensure the symbol can be traced. */ > + if (!dt_symbol_traceable(sym)) > continue; > -#undef strstarts > > - /* > - * If we did not see a module name, perform a symbol lookup to > - * try to determine the module name. > - */ > - if (!p) { > - if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf, > - NULL, &sip) == 0) > - mod = sip.object; > - } else > - mod = p; > + /* Match the function name. */ > + fun = dt_symbol_name(sym); > + if (!dt_gmatch(fun, pdp->fun)) > + continue; > > - /* > - * Due to the lack of module names in > - * TRACEFS/available_filter_functions, there are some duplicate > - * function names. The kernel does not let us trace functions > - * that have duplicates, so we need to remove the existing one. > - */ > - pd.id = DTRACE_IDNONE; > - pd.prv = prvname; > - pd.mod = mod; > - pd.fun = buf; > - pd.prb = "entry"; > - prp = dt_probe_lookup(dtp, &pd); > - if (prp != NULL) { > - dt_probe_destroy(prp); > + /* Validate the module name. 
*/ > + smp = dt_symbol_module(sym); > + if (dmp) { > + if (smp != dmp) > + continue; > + } else if (!dt_gmatch(smp->dm_name, pdp->mod)) > continue; > - } > > - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "entry")) > - n++; > - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "return")) > - n++; > - } > + pd.id = DTRACE_IDNONE; > + pd.prv = pdp->prv; > + pd.mod = smp->dm_name; > + pd.fun = fun; > > - free(buf); > - fclose(f); > + if (prb & FBT_ENTRY) { > + pd.prb = "entry"; > + n += provide_probe(dtp, &pd); > + } > + if (prb & FBT_RETURN) { > + pd.prb = "return"; > + n += provide_probe(dtp, &pd); > + } > + } > > return n; > } > @@ -306,6 +384,7 @@ dt_provimpl_t dt_rawfbt = { > .name = prvname, > .prog_type = BPF_PROG_TYPE_KPROBE, > .populate = &populate, > + .provide = &provide, > .load_prog = &dt_bpf_prog_load, > .trampoline = &trampoline, > .attach = &attach, From kris.van.hees at oracle.com Wed Mar 12 06:38:35 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 12 Mar 2025 02:38:35 -0400 Subject: [DTrace-devel] [PATCH 7/8] rawfbt: performance improvements In-Reply-To: <257942be-0eef-86f4-5617-789455ac7a91@oracle.com> References: <20250307213441.9495-1-kris.van.hees@oracle.com> <20250307213441.9495-6-kris.van.hees@oracle.com> <257942be-0eef-86f4-5617-789455ac7a91@oracle.com> Message-ID: On Wed, Mar 12, 2025 at 01:45:57AM -0400, Eugene Loh via DTrace-devel wrote: > Preliminary comment on this one... a couple of USDT tests start to fail > reproducibly with this patch. The explanation is weird. The tests filter > out specific run-dependent probe ID values before checking with their .r > results files. But because some of these probe IDs get a lot smaller > (narrower, fewer digits) with this fbt patch, the filters break. I have a > patch for those USDT tests. I suppose that patch should land before these > fbt patches. So, my fault. Sorry.
> > On 3/7/25 16:34, Kris Van Hees via DTrace-devel wrote: > > Signed-off-by: Kris Van Hees > > --- > > libdtrace/dt_prov_rawfbt.c | 223 +++++++++++++++++++++++++------------ > > 1 file changed, 151 insertions(+), 72 deletions(-) > > > > diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c > > index 62f2f4f0..52152655 100644 > > --- a/libdtrace/dt_prov_rawfbt.c > > +++ b/libdtrace/dt_prov_rawfbt.c > > @@ -44,10 +44,8 @@ > > #include "dt_pt_regs.h" > > static const char prvname[] = "rawfbt"; > > -static const char modname[] = "vmlinux"; > > #define KPROBE_EVENTS TRACEFS "kprobe_events" > > -#define PROBE_LIST TRACEFS "available_filter_functions" > > #define FBT_GROUP_FMT GROUP_FMT "_%s" > > #define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb > > @@ -61,98 +59,178 @@ static const dtrace_pattr_t pattr = { > > }; > > /* > > - * Scan the PROBE_LIST file and add entry and return probes for every function > > - * that is listed. > > + * Create the rawfbt provider. > > */ > > static int populate(dtrace_hdl_t *dtp) > > { > > dt_provider_t *prv; > > - FILE *f; > > - char *buf = NULL; > > - size_t len = 0; > > - size_t n = 0; > > - dtrace_syminfo_t sip; > > - dtrace_probedesc_t pd; > > prv = dt_provider_create(dtp, prvname, &dt_rawfbt, &pattr, NULL); > > if (prv == NULL) > > return -1; /* errno already set */ > > - f = fopen(PROBE_LIST, "r"); > > - if (f == NULL) > > + return 0; > > +} > > + > > +/* Create a probe (if it does not exist yet). */ > > +static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) > > +{ > > + dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv); > > + > > + if (prv == NULL) > > return 0; > > + if (dt_probe_lookup(dtp, pdp) != NULL) > > + return 0; > > +#ifdef DEBUG_FBT Hm, something tells me I mailed out a slightly earlier patch than I intended. I am pretty sure I made a version of this patch without the DEBUG_FBT stuff.
> > + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) { > > + fprintf(stderr, "%s(..., PROVIDE %s:%s:%s:%s) - ...\n", __func__, pdp->prv, pdp->mod, pdp->fun, pdp->prb); > > + return 1; > > + } > > +#else > > + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) > > + return 1; > > +#endif > > - while (getline(&buf, &len, f) >= 0) { > > - char *p; > > - const char *mod = modname; > > - dt_probe_t *prp; > > + return 0; > > +} > > - /* > > - * Here buf is either "funcname\n" or "funcname [modname]\n". > > - * The last line may not have a linefeed. > > - */ > > - p = strchr(buf, '\n'); > > - if (p) { > > - *p = '\0'; > > - if (p > buf && *(--p) == ']') > > - *p = '\0'; > > +/* > > + * Try to provide probes for the given probe description. The caller ensures > > + * that the provider name in probe desxcription (if any) is a match for this > > + * provider. When this is called, we already know that this provider matches > > + * the provider component of the probe specification. > > + */ > > +#define FBT_ENTRY 1 > > +#define FBT_RETURN 2 > > + > > +static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) > > +{ > > + int n = 0; > > + int prb = 0; > > + dt_module_t *dmp = NULL; > > + dt_symbol_t *sym = NULL; > > + dt_htab_next_t *it = NULL; > > + dtrace_probedesc_t pd; > > + > > + dt_modsym_mark_traceable(dtp); Again, this should be moved lower. > > + > > + /* > > + * Nothing to do if a probe name is specified and cannot match 'entry' > > + * or 'return'. > > + */ > > + if (dt_gmatch("entry", pdp->prb)) > > + prb |= FBT_ENTRY; > > + if (dt_gmatch("return", pdp->prb)) > > + prb |= FBT_RETURN; > > + if (prb == 0) > > + return 0; > > + > > + /* > > + * If we have an explicit module name, check it. If not found, we can > > + * ignore this request. 
> > + */ > > + if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) { > > + dmp = dt_module_lookup_by_name(dtp, pdp->mod); > > + if (dmp == NULL) > > + return 0; > > + } > > + > > + /* > > + * If we have an explicit function name, we start with a basic symbol > > + * name lookup. > > + */ > > + if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) { > > + /* If we have a module, use it. */ > > + if (dmp != NULL) { > > + sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun); > > + if (sym == NULL) > > + return 0; > > + if (!dt_symbol_traceable(sym)) > > + return 0; > > + > > + pd.id = DTRACE_IDNONE; > > + pd.prv = pdp->prv; > > + pd.mod = dmp->dm_name; > > + pd.fun = pdp->fun; > > + > > + if (prb & FBT_ENTRY) { > > + pd.prb = "entry"; > > + n += provide_probe(dtp, &pd); > > + } > > + if (prb & FBT_RETURN) { > > + pd.prb = "return"; > > + n += provide_probe(dtp, &pd); > > + } > > + > > + return n; > > } > > - /* > > - * Now buf is either "funcname" or "funcname [modname". If > > - * there is no module name provided, we will use the default. > > - */ > > - p = strchr(buf, ' '); > > - if (p) { > > - *p++ = '\0'; > > - if (*p == '[') > > - p++; > > + sym = dt_symbol_by_name(dtp, pdp->fun); > > + while (sym != NULL) { > > + const char *mod = dt_symbol_module(sym)->dm_name; > > + > > + if (dt_symbol_traceable(sym) && > > + dt_gmatch(mod, pdp->mod)) { > > + pd.id = DTRACE_IDNONE; > > + pd.prv = pdp->prv; > > + pd.mod = mod; > > + pd.fun = pdp->fun; > > + > > + if (prb & FBT_ENTRY) { > > + pd.prb = "entry"; > > + n += provide_probe(dtp, &pd); > > + } > > + if (prb & FBT_RETURN) { > > + pd.prb = "return"; > > + n += provide_probe(dtp, &pd); > > + } > > + > > + } > > + sym = dt_symbol_by_name_next(sym); > > } > > -#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0) > > - /* Weed out __ftrace_invalid_address___* entries. 
*/ > > - if (strstarts(buf, "__ftrace_invalid_address__") || > > - strstarts(buf, "__probestub_") || > > - strstarts(buf, "__traceiter_")) > > + return n; > > + } > > + > > + /* > > + * No explicit function name. We need to go through all possible > > + * symbol names and see if they match. > > + */ > > + while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) { > > + dt_module_t *smp; > > + const char *fun; > > + > > + /* Ensure the symbol can be traced. */ > > + if (!dt_symbol_traceable(sym)) > > continue; > > -#undef strstarts > > - /* > > - * If we did not see a module name, perform a symbol lookup to > > - * try to determine the module name. > > - */ > > - if (!p) { > > - if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf, > > - NULL, &sip) == 0) > > - mod = sip.object; > > - } else > > - mod = p; > > + /* Match the function name. */ > > + fun = dt_symbol_name(sym); > > + if (!dt_gmatch(fun, pdp->fun)) > > + continue; > > - /* > > - * Due to the lack of module names in > > - * TRACEFS/available_filter_functions, there are some duplicate > > - * function names. The kernel does not let us trace functions > > - * that have duplicates, so we need to remove the existing one. > > - */ > > - pd.id = DTRACE_IDNONE; > > - pd.prv = prvname; > > - pd.mod = mod; > > - pd.fun = buf; > > - pd.prb = "entry"; > > - prp = dt_probe_lookup(dtp, &pd); > > - if (prp != NULL) { > > - dt_probe_destroy(prp); > > + /* Validate the module name. 
*/ > > + smp = dt_symbol_module(sym); > > + if (dmp) { > > + if (smp != dmp) > > + continue; > > + } else if (!dt_gmatch(smp->dm_name, pdp->mod)) > > continue; > > - } > > - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "entry")) > > - n++; > > - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "return")) > > - n++; > > - } > > + pd.id = DTRACE_IDNONE; > > + pd.prv = pdp->prv; > > + pd.mod = smp->dm_name; > > + pd.fun = fun; > > - free(buf); > > - fclose(f); > > + if (prb & FBT_ENTRY) { > > + pd.prb = "entry"; > > + n += provide_probe(dtp, &pd); > > + } > > + if (prb & FBT_RETURN) { > > + pd.prb = "return"; > > + n += provide_probe(dtp, &pd); > > + } > > + } > > return n; > > } > > @@ -306,6 +384,7 @@ dt_provimpl_t dt_rawfbt = { > > .name = prvname, > > .prog_type = BPF_PROG_TYPE_KPROBE, > > .populate = &populate, > > + .provide = &provide, > > .load_prog = &dt_bpf_prog_load, > > .trampoline = &trampoline, > > .attach = &attach, > > _______________________________________________ > DTrace-devel mailing list > DTrace-devel at oss.oracle.com > https://oss.oracle.com/mailman/listinfo/dtrace-devel From kris.van.hees at oracle.com Wed Mar 12 15:08:22 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 12 Mar 2025 15:08:22 -0000 Subject: [DTrace-devel] [PATCH v2 6/8] fbt: performance improvements Message-ID: Up until now, FBT probes were registered for every symbol that was listed as traceable. Most tracing sessions do not use most or even any of these, and the process of registering them all was quite slow. Going forward, FBT probes are registered on demand. If any FBT probes are to be registered, the first will incur the cost of reading the entire list of traceable symbols. Any further FBT probe registration will be able to be satisfied based on that initial processing.
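As a rough illustration of the on-demand scheme described above (not the patch's actual code), the one-time scan can be sketched as a lazily run function guarded by a flag. Everything here is a hypothetical stand-in: `struct sym`, `syms`, `traceable_list`, `mark_traceable()` and this simplified `provide()` are invented for the sketch and do not reflect the libdtrace types or signatures.

```c
#include <string.h>

/* Hypothetical stand-in for a kernel symbol entry (not a libdtrace type). */
struct sym {
	const char *name;
	int traceable;
};

/* Hypothetical symbol table. */
static struct sym syms[] = {
	{ "vfs_read", 0 },
	{ "vfs_write", 0 },
	{ "do_exit", 0 },
};
#define NSYMS (sizeof(syms) / sizeof(syms[0]))

/* Hypothetical stand-in for the list of traceable function names. */
static const char *traceable_list[] = { "vfs_read", "do_exit" };
#define NLIST (sizeof(traceable_list) / sizeof(traceable_list[0]))

static int scanned;	/* set once the list has been processed */
static int scan_count;	/* how many times the expensive scan actually ran */

/* Mark traceable symbols; the expensive scan runs at most once. */
void mark_traceable(void)
{
	size_t i, j;

	if (scanned)
		return;

	scan_count++;
	for (i = 0; i < NLIST; i++) {
		for (j = 0; j < NSYMS; j++) {
			if (strcmp(traceable_list[i], syms[j].name) == 0)
				syms[j].traceable = 1;
		}
	}

	scanned = 1;
}

/*
 * Handle one on-demand request for a function name.  Returns 1 if the
 * function is known and traceable, 0 otherwise.  Only the first call pays
 * for the scan; later calls reuse the marks.
 */
int provide(const char *fun)
{
	size_t i;

	mark_traceable();

	for (i = 0; i < NSYMS; i++) {
		if (strcmp(fun, syms[i].name) == 0)
			return syms[i].traceable;
	}

	return 0;
}
```

In this sketch, a session that never calls `provide()` never pays for the scan at all, and repeated calls leave `scan_count` at 1.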
The performance improvement is therefore quite significant for tracing sessions that do not trigger any FBT probe registration, and if FBT probes are used, the improvement is still quite noticeable because only the probes that are actually needed get registered. Signed-off-by: Kris Van Hees --- libdtrace/dt_module.c | 78 ++++++++++++++ libdtrace/dt_module.h | 2 + libdtrace/dt_prov_fbt.c | 219 +++++++++++++++++++++++++++------------- 3 files changed, 229 insertions(+), 70 deletions(-) diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c index 2e915e2f..e7553a07 100644 --- a/libdtrace/dt_module.c +++ b/libdtrace/dt_module.c @@ -22,6 +22,7 @@ #include #include +#include #include #include @@ -1044,6 +1045,83 @@ dt_kern_module_find_ctf(dtrace_hdl_t *dtp, dt_module_t *dmp) } } +#define PROBE_LIST TRACEFS "available_filter_functions" + +/* + * Determine which kernel functions are traceable and mark them. + */ +void +dt_modsym_mark_traceable(dtrace_hdl_t *dtp) +{ + FILE *f; + char *buf = NULL; + size_t len = 0; + + if (dt_symtab_traceable(dtp->dt_exec->dm_kernsyms)) + return; + + f = fopen(PROBE_LIST, "r"); + if (f == NULL) + return; + + while (getline(&buf, &len, f) >= 0) { + char *p; + dt_symbol_t *sym = NULL; + + /* + * Here buf is either "funcname\n" or "funcname [modname]\n". + * The last line may not have a linefeed. + */ + p = strchr(buf, '\n'); + if (p) { + *p = '\0'; + if (p > buf && *(--p) == ']') + *p = '\0'; + } + + /* + * Now buf is either "funcname" or "funcname [modname". If + * there is no module name provided, we will use the default. + */ + p = strchr(buf, ' '); + if (p) { + *p++ = '\0'; + if (*p == '[') + p++; + } + +#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0) + /* Weed out __ftrace_invalid_address___* entries.
*/ + if (strstarts(buf, "__ftrace_invalid_address__") || + strstarts(buf, "__probestub_") || + strstarts(buf, "__traceiter_")) + continue; +#undef strstarts + + /* + * If we have a module name, look for the symbol in that + * module. + * If not, perform a general symbol lookup to find its first + * instance. + */ + if (p) { + dt_module_t *dmp = dt_module_lookup_by_name(dtp, p); + + if (dmp) + sym = dt_module_symbol_by_name(dtp, dmp, buf); + } else + sym = dt_symbol_by_name(dtp, buf); + + if (sym) + dt_symbol_set_traceable(sym); + } + + free(buf); + fclose(f); + + dt_symtab_set_traceable(dtp->dt_exec->dm_kernsyms); +} + /* * Symbol data can be collected in three ways: * - kallmodsyms diff --git a/libdtrace/dt_module.h b/libdtrace/dt_module.h index 56df17a6..dd3ad17c 100644 --- a/libdtrace/dt_module.h +++ b/libdtrace/dt_module.h @@ -25,6 +25,8 @@ extern dt_ident_t *dt_module_extern(dtrace_hdl_t *, dt_module_t *, extern const char *dt_module_modelname(dt_module_t *); +extern void dt_modsym_mark_traceable(dtrace_hdl_t *); + #ifdef __cplusplus } #endif diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c index eef93879..827156cc 100644 --- a/libdtrace/dt_prov_fbt.c +++ b/libdtrace/dt_prov_fbt.c @@ -1,6 +1,6 @@ /* * Oracle Linux DTrace. - * Copyright (c) 2019, 2024, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2019, 2025, Oracle and/or its affiliates. All rights reserved. * Licensed under the Universal Permissive License v 1.0 as shown at * http://oss.oracle.com/licenses/upl. 
* @@ -41,10 +41,8 @@ #include "dt_pt_regs.h" static const char prvname[] = "fbt"; -static const char modname[] = "vmlinux"; #define KPROBE_EVENTS TRACEFS "kprobe_events" -#define PROBE_LIST TRACEFS "available_filter_functions" #define FBT_GROUP_FMT GROUP_FMT "_%s" #define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb @@ -61,19 +59,11 @@ dt_provimpl_t dt_fbt_fprobe; dt_provimpl_t dt_fbt_kprobe; /* - * Scan the PROBE_LIST file and add entry and return probes for every function - * that is listed. + * Create the fbt provider. */ static int populate(dtrace_hdl_t *dtp) { dt_provider_t *prv; - FILE *f; - char *buf = NULL; - char *p; - const char *mod = modname; - size_t n; - dtrace_syminfo_t sip; - dtrace_probedesc_t pd; dt_fbt = BPF_HAS(dtp, BPF_FEAT_FENTRY) ? dt_fbt_fprobe : dt_fbt_kprobe; @@ -81,79 +71,166 @@ static int populate(dtrace_hdl_t *dtp) if (prv == NULL) return -1; /* errno already set */ - f = fopen(PROBE_LIST, "r"); - if (f == NULL) + return 0; +} + +/* Create a probe (if it does not exist yet). */ +static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) +{ + dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv); + + if (prv == NULL) + return 0; + if (dt_probe_lookup(dtp, pdp) != NULL) return 0; + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) + return 1; - while (getline(&buf, &n, f) >= 0) { - /* - * Here buf is either "funcname\n" or "funcname [modname]\n". - * The last line may not have a linefeed. - */ - p = strchr(buf, '\n'); - if (p) { - *p = '\0'; - if (p > buf && *(--p) == ']') - *p = '\0'; + return 0; +} + +/* + * Try to provide probes for the given probe description. The caller ensures + * that the provider name in probe desxcription (if any) is a match for this + * provider. When this is called, we already know that this provider matches + * the provider component of the probe specification. 
+ */ +#define FBT_ENTRY 1 +#define FBT_RETURN 2 + +static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) +{ + int n = 0; + int prb = 0; + dt_module_t *dmp = NULL; + dt_symbol_t *sym = NULL; + dt_htab_next_t *it = NULL; + dtrace_probedesc_t pd; + + dt_modsym_mark_traceable(dtp); + + /* + * Nothing to do if a probe name is specified and cannot match 'entry' + * or 'return'. + */ + if (dt_gmatch("entry", pdp->prb)) + prb |= FBT_ENTRY; + if (dt_gmatch("return", pdp->prb)) + prb |= FBT_RETURN; + if (prb == 0) + return 0; + + /* Synthetic function names are not supported for FBT. */ + if (strchr(pdp->fun, '.')) + return 0; + + /* + * If we have an explicit module name, check it. If not found, we can + * ignore this request. + */ + if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) { + dmp = dt_module_lookup_by_name(dtp, pdp->mod); + if (dmp == NULL) + return 0; + } + + /* + * If we have an explicit function name, we start with a basic symbol + * name lookup. + */ + if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) { + /* If we have a module, use it. */ + if (dmp != NULL) { + sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun); + if (sym == NULL) + return 0; + if (!dt_symbol_traceable(sym)) + return 0; + + pd.id = DTRACE_IDNONE; + pd.prv = pdp->prv; + pd.mod = dmp->dm_name; + pd.fun = pdp->fun; + + if (prb & FBT_ENTRY) { + pd.prb = "entry"; + n += provide_probe(dtp, &pd); + } + if (prb & FBT_RETURN) { + pd.prb = "return"; + n += provide_probe(dtp, &pd); + } + + return n; } - /* - * Now buf is either "funcname" or "funcname [modname". If - * there is no module name provided, we will use the default. 
- */ - p = strchr(buf, ' '); - if (p) { - *p++ = '\0'; - if (*p == '[') - p++; + sym = dt_symbol_by_name(dtp, pdp->fun); + while (sym != NULL) { + const char *mod = dt_symbol_module(sym)->dm_name; + + if (dt_symbol_traceable(sym) && + dt_gmatch(mod, pdp->mod)) { + pd.id = DTRACE_IDNONE; + pd.prv = pdp->prv; + pd.mod = mod; + pd.fun = pdp->fun; + + if (prb & FBT_ENTRY) { + pd.prb = "entry"; + n += provide_probe(dtp, &pd); + } + if (prb & FBT_RETURN) { + pd.prb = "return"; + n += provide_probe(dtp, &pd); + } + + } + sym = dt_symbol_by_name_next(sym); } - /* Weed out synthetic symbol names (that are invalid). */ - if (strchr(buf, '.') != NULL) + return n; + } + + /* + * No explicit function name. We need to go through all possible + * symbol names and see if they match. + */ + while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) { + dt_module_t *smp; + const char *fun; + + /* Ensure the symbol can be traced. */ + if (!dt_symbol_traceable(sym)) continue; -#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0) - /* Weed out __ftrace_invalid_address___* entries. */ - if (strstarts(buf, "__ftrace_invalid_address__") || - strstarts(buf, "__probestub_") || - strstarts(buf, "__traceiter_")) + /* Function name cannot be synthetic and must match. */ + fun = dt_symbol_name(sym); + if (strchr(fun, '.') || !dt_gmatch(fun, pdp->fun)) continue; -#undef strstarts - /* - * If we did not see a module name, perform a symbol lookup to - * try to determine the module name. - */ - if (!p) { - if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf, - NULL, &sip) == 0) - mod = sip.object; - } else - mod = p; + /* Validate the module name. */ + smp = dt_symbol_module(sym); + if (dmp) { + if (smp != dmp) + continue; + } else if (!dt_gmatch(smp->dm_name, pdp->mod)) + continue; - /* - * Due to the lack of module names in - * TRACEFS/available_filter_functions, there are some duplicate - * function names. We need to make sure that we do not create - * duplicate probes for these. 
- */ pd.id = DTRACE_IDNONE; - pd.prv = prvname; - pd.mod = mod; - pd.fun = buf; - pd.prb = "entry"; - if (dt_probe_lookup(dtp, &pd) != NULL) - continue; + pd.prv = pdp->prv; + pd.mod = smp->dm_name; + pd.fun = fun; - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "entry")) - n++; - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "return")) - n++; + if (prb & FBT_ENTRY) { + pd.prb = "entry"; + n += provide_probe(dtp, &pd); + } + if (prb & FBT_RETURN) { + pd.prb = "return"; + n += provide_probe(dtp, &pd); + } } - free(buf); - fclose(f); - return n; } @@ -447,6 +524,7 @@ dt_provimpl_t dt_fbt_fprobe = { .prog_type = BPF_PROG_TYPE_TRACING, .stack_skip = 4, .populate = &populate, + .provide = &provide, .load_prog = &fprobe_prog_load, .trampoline = &fprobe_trampoline, .attach = &dt_tp_probe_attach_raw, @@ -459,6 +537,7 @@ dt_provimpl_t dt_fbt_kprobe = { .name = prvname, .prog_type = BPF_PROG_TYPE_KPROBE, .populate = &populate, + .provide = &provide, .load_prog = &dt_bpf_prog_load, .trampoline = &kprobe_trampoline, .attach = &kprobe_attach, -- 2.45.2 From kris.van.hees at oracle.com Mon Mar 3 05:38:48 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Mon, 3 Mar 2025 00:38:48 -0500 Subject: [DTrace-devel] [PATCH v2 7/8] rawfbt: performance improvements Message-ID: <09d06928f75f3fbd757909f703fa576f.kris.van.hees@oracle.com> Signed-off-by: Kris Van Hees --- libdtrace/dt_prov_rawfbt.c | 218 ++++++++++++++++++++++++------------- 1 file changed, 145 insertions(+), 73 deletions(-) diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c index 62f2f4f0..ebcd1a16 100644 --- a/libdtrace/dt_prov_rawfbt.c +++ b/libdtrace/dt_prov_rawfbt.c @@ -1,6 +1,6 @@ /* * Oracle Linux DTrace. - * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2024, 2025, Oracle and/or its affiliates. All rights reserved. * Licensed under the Universal Permissive License v 1.0 as shown at * http://oss.oracle.com/licenses/upl. 
* @@ -44,10 +44,8 @@ #include "dt_pt_regs.h" static const char prvname[] = "rawfbt"; -static const char modname[] = "vmlinux"; #define KPROBE_EVENTS TRACEFS "kprobe_events" -#define PROBE_LIST TRACEFS "available_filter_functions" #define FBT_GROUP_FMT GROUP_FMT "_%s" #define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb @@ -61,98 +59,171 @@ static const dtrace_pattr_t pattr = { }; /* - * Scan the PROBE_LIST file and add entry and return probes for every function - * that is listed. + * Create the rawfbt provider. */ static int populate(dtrace_hdl_t *dtp) { dt_provider_t *prv; - FILE *f; - char *buf = NULL; - size_t len = 0; - size_t n = 0; - dtrace_syminfo_t sip; - dtrace_probedesc_t pd; prv = dt_provider_create(dtp, prvname, &dt_rawfbt, &pattr, NULL); if (prv == NULL) return -1; /* errno already set */ - f = fopen(PROBE_LIST, "r"); - if (f == NULL) + return 0; +} + +/* Create a probe (if it does not exist yet). */ +static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) +{ + dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv); + + if (prv == NULL) + return 0; + if (dt_probe_lookup(dtp, pdp) != NULL) return 0; + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) + return 1; - while (getline(&buf, &len, f) >= 0) { - char *p; - const char *mod = modname; - dt_probe_t *prp; + return 0; +} - /* - * Here buf is either "funcname\n" or "funcname [modname]\n". - * The last line may not have a linefeed. - */ - p = strchr(buf, '\n'); - if (p) { - *p = '\0'; - if (p > buf && *(--p) == ']') - *p = '\0'; +/* + * Try to provide probes for the given probe description. The caller ensures + * that the provider name in probe desxcription (if any) is a match for this + * provider. When this is called, we already know that this provider matches + * the provider component of the probe specification. 
+ */ +#define FBT_ENTRY 1 +#define FBT_RETURN 2 + +static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) +{ + int n = 0; + int prb = 0; + dt_module_t *dmp = NULL; + dt_symbol_t *sym = NULL; + dt_htab_next_t *it = NULL; + dtrace_probedesc_t pd; + + dt_modsym_mark_traceable(dtp); + + /* + * Nothing to do if a probe name is specified and cannot match 'entry' + * or 'return'. + */ + if (dt_gmatch("entry", pdp->prb)) + prb |= FBT_ENTRY; + if (dt_gmatch("return", pdp->prb)) + prb |= FBT_RETURN; + if (prb == 0) + return 0; + + /* + * If we have an explicit module name, check it. If not found, we can + * ignore this request. + */ + if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) { + dmp = dt_module_lookup_by_name(dtp, pdp->mod); + if (dmp == NULL) + return 0; + } + + /* + * If we have an explicit function name, we start with a basic symbol + * name lookup. + */ + if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) { + /* If we have a module, use it. */ + if (dmp != NULL) { + sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun); + if (sym == NULL) + return 0; + if (!dt_symbol_traceable(sym)) + return 0; + + pd.id = DTRACE_IDNONE; + pd.prv = pdp->prv; + pd.mod = dmp->dm_name; + pd.fun = pdp->fun; + + if (prb & FBT_ENTRY) { + pd.prb = "entry"; + n += provide_probe(dtp, &pd); + } + if (prb & FBT_RETURN) { + pd.prb = "return"; + n += provide_probe(dtp, &pd); + } + + return n; } - /* - * Now buf is either "funcname" or "funcname [modname". If - * there is no module name provided, we will use the default. 
- */ - p = strchr(buf, ' '); - if (p) { - *p++ = '\0'; - if (*p == '[') - p++; + sym = dt_symbol_by_name(dtp, pdp->fun); + while (sym != NULL) { + const char *mod = dt_symbol_module(sym)->dm_name; + + if (dt_symbol_traceable(sym) && + dt_gmatch(mod, pdp->mod)) { + pd.id = DTRACE_IDNONE; + pd.prv = pdp->prv; + pd.mod = mod; + pd.fun = pdp->fun; + + if (prb & FBT_ENTRY) { + pd.prb = "entry"; + n += provide_probe(dtp, &pd); + } + if (prb & FBT_RETURN) { + pd.prb = "return"; + n += provide_probe(dtp, &pd); + } + + } + sym = dt_symbol_by_name_next(sym); } -#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0) - /* Weed out __ftrace_invalid_address___* entries. */ - if (strstarts(buf, "__ftrace_invalid_address__") || - strstarts(buf, "__probestub_") || - strstarts(buf, "__traceiter_")) + return n; + } + + /* + * No explicit function name. We need to go through all possible + * symbol names and see if they match. + */ + while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) { + dt_module_t *smp; + const char *fun; + + /* Ensure the symbol can be traced. */ + if (!dt_symbol_traceable(sym)) continue; -#undef strstarts - /* - * If we did not see a module name, perform a symbol lookup to - * try to determine the module name. - */ - if (!p) { - if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf, - NULL, &sip) == 0) - mod = sip.object; - } else - mod = p; + /* Match the function name. */ + fun = dt_symbol_name(sym); + if (!dt_gmatch(fun, pdp->fun)) + continue; - /* - * Due to the lack of module names in - * TRACEFS/available_filter_functions, there are some duplicate - * function names. The kernel does not let us trace functions - * that have duplicates, so we need to remove the existing one. - */ - pd.id = DTRACE_IDNONE; - pd.prv = prvname; - pd.mod = mod; - pd.fun = buf; - pd.prb = "entry"; - prp = dt_probe_lookup(dtp, &pd); - if (prp != NULL) { - dt_probe_destroy(prp); + /* Validate the module name. 
*/ + smp = dt_symbol_module(sym); + if (dmp) { + if (smp != dmp) + continue; + } else if (!dt_gmatch(smp->dm_name, pdp->mod)) continue; - } - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "entry")) - n++; - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "return")) - n++; - } + pd.id = DTRACE_IDNONE; + pd.prv = pdp->prv; + pd.mod = smp->dm_name; + pd.fun = fun; - free(buf); - fclose(f); + if (prb & FBT_ENTRY) { + pd.prb = "entry"; + n += provide_probe(dtp, &pd); + } + if (prb & FBT_RETURN) { + pd.prb = "return"; + n += provide_probe(dtp, &pd); + } + } return n; } @@ -306,6 +377,7 @@ dt_provimpl_t dt_rawfbt = { .name = prvname, .prog_type = BPF_PROG_TYPE_KPROBE, .populate = &populate, + .provide = &provide, .load_prog = &dt_bpf_prog_load, .trampoline = &trampoline, .attach = &attach, -- 2.45.2 From kris.van.hees at oracle.com Thu Mar 6 19:03:47 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Thu, 6 Mar 2025 14:03:47 -0500 Subject: [DTrace-devel] [PATCH v2 8/8] fbt, rawfbt: consolidate code to avoid duplication Message-ID: <206f9d172e4bfcb53acf6cae86ddc77f.kris.van.hees@oracle.com> After optimizing both fbt and rawfbt providers, the resulting code has a significant amount of duplication. The rawfbt provider can now be defined in terms of the kprobe-based fbt provider functions. 
Signed-off-by: Kris Van Hees --- libdtrace/Build | 1 - libdtrace/dt_prov_fbt.c | 139 +++++++++---- libdtrace/dt_prov_rawfbt.c | 386 ------------------------------------- 3 files changed, 102 insertions(+), 424 deletions(-) delete mode 100644 libdtrace/dt_prov_rawfbt.c diff --git a/libdtrace/Build b/libdtrace/Build index 51e0f078..7e6e8a38 100644 --- a/libdtrace/Build +++ b/libdtrace/Build @@ -55,7 +55,6 @@ libdtrace-build_SOURCES = dt_aggregate.c \ dt_prov_lockstat.c \ dt_prov_proc.c \ dt_prov_profile.c \ - dt_prov_rawfbt.c \ dt_prov_rawtp.c \ dt_prov_sched.c \ dt_prov_sdt.c \ diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c index 827156cc..1489275a 100644 --- a/libdtrace/dt_prov_fbt.c +++ b/libdtrace/dt_prov_fbt.c @@ -6,17 +6,26 @@ * * The Function Boundary Tracing (FBT) provider for DTrace. * - * FBT probes are exposed by the kernel as kprobes. They are listed in the - * TRACEFS/available_filter_functions file. Some kprobes are associated with - * a specific kernel module, while most are in the core kernel. + * Kernnel functions can be traced through fentry/fexit probes (when available) + * and kprobes. The FBT provider supports both implementations and will use + * fentry/fexit probes if the kernel supports them, and fallback to kprobes + * otherwise. The FBT provider does not support tracing synthetic functions + * (i.e. compiler-generated functions with a . in their name). + * + * The rawfbt provider implements a variant of the FBT provider and always uses + * kprobes. This provider allow tracing of synthetic function. * * Mapping from event name to DTrace probe name: * * fbt:vmlinux::entry * fbt:vmlinux::return + * rawfbt:vmlinux::entry + * rawfbt:vmlinux::return * or * [] fbt:::entry * fbt:::return + * rawfbt:::entry + * rawfbt:::return */ #include #include @@ -57,18 +66,19 @@ static const dtrace_pattr_t pattr = { dt_provimpl_t dt_fbt_fprobe; dt_provimpl_t dt_fbt_kprobe; +dt_provimpl_t dt_rawfbt; /* - * Create the fbt provider. 
+ * Create the fbt and rawfbt providers. */ static int populate(dtrace_hdl_t *dtp) { - dt_provider_t *prv; - dt_fbt = BPF_HAS(dtp, BPF_FEAT_FENTRY) ? dt_fbt_fprobe : dt_fbt_kprobe; - prv = dt_provider_create(dtp, prvname, &dt_fbt, &pattr, NULL); - if (prv == NULL) + if (dt_provider_create(dtp, dt_fbt.name, &dt_fbt, &pattr, + NULL) == NULL || + dt_provider_create(dtp, dt_rawfbt.name, &dt_rawfbt, &pattr, + NULL) == NULL) return -1; /* errno already set */ return 0; @@ -102,13 +112,12 @@ static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) { int n = 0; int prb = 0; + int rawfbt = 0; dt_module_t *dmp = NULL; dt_symbol_t *sym = NULL; dt_htab_next_t *it = NULL; dtrace_probedesc_t pd; - dt_modsym_mark_traceable(dtp); - /* * Nothing to do if a probe name is specified and cannot match 'entry' * or 'return'. @@ -120,9 +129,15 @@ static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) if (prb == 0) return 0; - /* Synthetic function names are not supported for FBT. */ - if (strchr(pdp->fun, '.')) - return 0; + /* + * Unless we are dealing with a rawfbt probe, synthetic functions are + * not supported. + */ + if (strcmp(pdp->prv, dt_rawfbt.name) != 0) { + if (strchr(pdp->fun, '.')) + return 0; + } else + rawfbt = 1; /* * If we have an explicit module name, check it. If not found, we can @@ -134,6 +149,14 @@ static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) return 0; } + /* + * Ensure that kernel symbols that are FBT-traceable are marked as + * such. We don't do this earlier in this function so that the + * preceding tests have the greatest opportunity to avoid doing this + * unnecessarily. + */ + dt_modsym_mark_traceable(dtp); + /* * If we have an explicit function name, we start with a basic symbol * name lookup. @@ -205,7 +228,7 @@ static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) /* Function name cannot be synthetic and must match. 
*/ fun = dt_symbol_name(sym); - if (strchr(fun, '.') || !dt_gmatch(fun, pdp->fun)) + if ((!rawfbt && strchr(fun, '.')) || !dt_gmatch(fun, pdp->fun)) continue; /* Validate the module name. */ @@ -396,12 +419,12 @@ static int fprobe_prog_load(dtrace_hdl_t *dtp, const dt_probe_t *prp, \*******************************/ /* - * Generate a BPF trampoline for a FBT probe. + * Generate a BPF trampoline for a FBT (or rawfbt) probe. * * The trampoline function is called when a FBT probe triggers, and it must * satisfy the following prototype: * - * int dt_fbt(dt_pt_regs *regs) + * int dt_(raw)fbt(dt_pt_regs *regs) * * The trampoline will populate a dt_dctx_t struct and then call the function * that implements the compiled D clause. It returns 0 to the caller. @@ -422,7 +445,7 @@ static int kprobe_trampoline(dt_pcb_t *pcb, uint_t exitlbl) dt_cg_tramp_copy_rval_from_regs(pcb); /* - * fbt:::return arg0 should be the function offset for + * (raw)fbt:::return arg0 should be the function offset for * return instruction. Since we use kretprobes, however, * which do not fire until the function has returned to * its caller, information about the returning instruction @@ -441,11 +464,28 @@ static int kprobe_trampoline(dt_pcb_t *pcb, uint_t exitlbl) static int kprobe_attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd) { + const char *fun = prp->desc->fun; + char *tpn = (char *)fun; + int rc = -1; + if (!dt_tp_probe_has_info(prp)) { char *fn; FILE *f; - size_t len; - int fd, rc = -1; + int fd; + + /* + * For rawfbt probes, we need to apply a . -> _ conversion to + * ensure the tracepoint name is valid. + */ + if (strcmp(prp->desc->prv, dt_rawfbt.name) == 0) { + char *p; + + tpn = strdup(fun); + for (p = tpn; *p; p++) { + if (*p == '.') + *p = '_'; + } + } /* * Register the kprobe with the tracing subsystem. 
This will @@ -453,41 +493,42 @@ static int kprobe_attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd) */ fd = open(KPROBE_EVENTS, O_WRONLY | O_APPEND); if (fd == -1) - return -ENOENT; + goto out; rc = dprintf(fd, "%c:" FBT_GROUP_FMT "/%s %s\n", prp->desc->prb[0] == 'e' ? 'p' : 'r', - FBT_GROUP_DATA, prp->desc->fun, prp->desc->fun); + FBT_GROUP_DATA, tpn, fun); close(fd); if (rc == -1) - return -ENOENT; + goto out; /* create format file name */ - len = snprintf(NULL, 0, "%s" FBT_GROUP_FMT "/%s/format", - EVENTSFS, FBT_GROUP_DATA, prp->desc->fun) + 1; - fn = dt_alloc(dtp, len); - if (fn == NULL) - return -ENOENT; - - snprintf(fn, len, "%s" FBT_GROUP_FMT "/%s/format", EVENTSFS, - FBT_GROUP_DATA, prp->desc->fun); + if (asprintf(&fn, "%s" FBT_GROUP_FMT "/%s/format", EVENTSFS, + FBT_GROUP_DATA, tpn) == -1) + goto out; /* open format file */ f = fopen(fn, "r"); - dt_free(dtp, fn); + free(fn); if (f == NULL) - return -ENOENT; + goto out; /* read event id from format file */ rc = dt_tp_probe_info(dtp, f, 0, prp, NULL, NULL); fclose(f); if (rc < 0) - return -ENOENT; + goto out; } /* attach BPF program to the probe */ - return dt_tp_probe_attach(dtp, prp, bpf_fd); + rc = dt_tp_probe_attach(dtp, prp, bpf_fd); + +out: + if (tpn != prp->desc->fun) + free(tpn); + + return rc == -1 ? 
-ENOENT : rc; } /* @@ -503,7 +544,8 @@ static int kprobe_attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd) */ static void kprobe_detach(dtrace_hdl_t *dtp, const dt_probe_t *prp) { - int fd; + int fd; + char *tpn = (char *)prp->desc->fun; if (!dt_tp_probe_has_info(prp)) return; @@ -514,9 +556,20 @@ static void kprobe_detach(dtrace_hdl_t *dtp, const dt_probe_t *prp) if (fd == -1) return; - dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA, - prp->desc->fun); + if (strcmp(prp->desc->prv, dt_rawfbt.name) == 0) { + char *p; + + for (p = tpn; *p; p++) { + if (*p == '.') + *p = '_'; + } + } + + dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA, tpn); close(fd); + + if (tpn != prp->desc->fun) + free(tpn); } dt_provimpl_t dt_fbt_fprobe = { @@ -549,3 +602,15 @@ dt_provimpl_t dt_fbt = { .name = prvname, .populate = &populate, }; + +dt_provimpl_t dt_rawfbt = { + .name = "rawfbt", + .prog_type = BPF_PROG_TYPE_KPROBE, + .populate = &populate, + .provide = &provide, + .load_prog = &dt_bpf_prog_load, + .trampoline = &kprobe_trampoline, + .attach = &kprobe_attach, + .detach = &kprobe_detach, + .probe_destroy = &dt_tp_probe_destroy, +}; diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c deleted file mode 100644 index ebcd1a16..00000000 --- a/libdtrace/dt_prov_rawfbt.c +++ /dev/null @@ -1,386 +0,0 @@ -/* - * Oracle Linux DTrace. - * Copyright (c) 2024, 2025, Oracle and/or its affiliates. All rights reserved. - * Licensed under the Universal Permissive License v 1.0 as shown at - * http://oss.oracle.com/licenses/upl. - * - * The Raw Function Boundary Tracing provider for DTrace. - * - * The kernel provides kprobes to trace specific symbols. They are listed in - * the TRACEFS/available_filter_functions file. Kprobes may be associated with - * a symbol in the core kernel or with a symbol in a specific kernel module. 
- * Whereas the fbt provider supports tracing regular symbols only, the rawfbt - * provider also provides access to synthetic symbols, i.e. symbols created by - * compiler optimizations. - * - * Mapping from event name to DTrace probe name: - * - * rawfbt:vmlinux::entry - * rawfbt:vmlinux::return - * or - * [] rawfbt:::entry - * rawfbt:::return - */ -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -#include "dt_btf.h" -#include "dt_dctx.h" -#include "dt_cg.h" -#include "dt_module.h" -#include "dt_provider_tp.h" -#include "dt_probe.h" -#include "dt_pt_regs.h" - -static const char prvname[] = "rawfbt"; - -#define KPROBE_EVENTS TRACEFS "kprobe_events" - -#define FBT_GROUP_FMT GROUP_FMT "_%s" -#define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb - -static const dtrace_pattr_t pattr = { -{ DTRACE_STABILITY_EVOLVING, DTRACE_STABILITY_EVOLVING, DTRACE_CLASS_COMMON }, -{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_UNKNOWN }, -{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_ISA }, -{ DTRACE_STABILITY_EVOLVING, DTRACE_STABILITY_EVOLVING, DTRACE_CLASS_COMMON }, -{ DTRACE_STABILITY_PRIVATE, DTRACE_STABILITY_PRIVATE, DTRACE_CLASS_ISA }, -}; - -/* - * Create the rawfbt provider. - */ -static int populate(dtrace_hdl_t *dtp) -{ - dt_provider_t *prv; - - prv = dt_provider_create(dtp, prvname, &dt_rawfbt, &pattr, NULL); - if (prv == NULL) - return -1; /* errno already set */ - - return 0; -} - -/* Create a probe (if it does not exist yet). */ -static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) -{ - dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv); - - if (prv == NULL) - return 0; - if (dt_probe_lookup(dtp, pdp) != NULL) - return 0; - if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) - return 1; - - return 0; -} - -/* - * Try to provide probes for the given probe description. 
The caller ensures - * that the provider name in probe desxcription (if any) is a match for this - * provider. When this is called, we already know that this provider matches - * the provider component of the probe specification. - */ -#define FBT_ENTRY 1 -#define FBT_RETURN 2 - -static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) -{ - int n = 0; - int prb = 0; - dt_module_t *dmp = NULL; - dt_symbol_t *sym = NULL; - dt_htab_next_t *it = NULL; - dtrace_probedesc_t pd; - - dt_modsym_mark_traceable(dtp); - - /* - * Nothing to do if a probe name is specified and cannot match 'entry' - * or 'return'. - */ - if (dt_gmatch("entry", pdp->prb)) - prb |= FBT_ENTRY; - if (dt_gmatch("return", pdp->prb)) - prb |= FBT_RETURN; - if (prb == 0) - return 0; - - /* - * If we have an explicit module name, check it. If not found, we can - * ignore this request. - */ - if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) { - dmp = dt_module_lookup_by_name(dtp, pdp->mod); - if (dmp == NULL) - return 0; - } - - /* - * If we have an explicit function name, we start with a basic symbol - * name lookup. - */ - if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) { - /* If we have a module, use it. 
*/ - if (dmp != NULL) { - sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun); - if (sym == NULL) - return 0; - if (!dt_symbol_traceable(sym)) - return 0; - - pd.id = DTRACE_IDNONE; - pd.prv = pdp->prv; - pd.mod = dmp->dm_name; - pd.fun = pdp->fun; - - if (prb & FBT_ENTRY) { - pd.prb = "entry"; - n += provide_probe(dtp, &pd); - } - if (prb & FBT_RETURN) { - pd.prb = "return"; - n += provide_probe(dtp, &pd); - } - - return n; - } - - sym = dt_symbol_by_name(dtp, pdp->fun); - while (sym != NULL) { - const char *mod = dt_symbol_module(sym)->dm_name; - - if (dt_symbol_traceable(sym) && - dt_gmatch(mod, pdp->mod)) { - pd.id = DTRACE_IDNONE; - pd.prv = pdp->prv; - pd.mod = mod; - pd.fun = pdp->fun; - - if (prb & FBT_ENTRY) { - pd.prb = "entry"; - n += provide_probe(dtp, &pd); - } - if (prb & FBT_RETURN) { - pd.prb = "return"; - n += provide_probe(dtp, &pd); - } - - } - sym = dt_symbol_by_name_next(sym); - } - - return n; - } - - /* - * No explicit function name. We need to go through all possible - * symbol names and see if they match. - */ - while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) { - dt_module_t *smp; - const char *fun; - - /* Ensure the symbol can be traced. */ - if (!dt_symbol_traceable(sym)) - continue; - - /* Match the function name. */ - fun = dt_symbol_name(sym); - if (!dt_gmatch(fun, pdp->fun)) - continue; - - /* Validate the module name. */ - smp = dt_symbol_module(sym); - if (dmp) { - if (smp != dmp) - continue; - } else if (!dt_gmatch(smp->dm_name, pdp->mod)) - continue; - - pd.id = DTRACE_IDNONE; - pd.prv = pdp->prv; - pd.mod = smp->dm_name; - pd.fun = fun; - - if (prb & FBT_ENTRY) { - pd.prb = "entry"; - n += provide_probe(dtp, &pd); - } - if (prb & FBT_RETURN) { - pd.prb = "return"; - n += provide_probe(dtp, &pd); - } - } - - return n; -} - -/* - * Generate a BPF trampoline for a FBT probe. 
- * - * The trampoline function is called when a FBT probe triggers, and it must - * satisfy the following prototype: - * - * int dt_rawfbt(dt_pt_regs *regs) - * - * The trampoline will populate a dt_dctx_t struct and then call the function - * that implements the compiled D clause. It returns 0 to the caller. - */ -static int trampoline(dt_pcb_t *pcb, uint_t exitlbl) -{ - dt_cg_tramp_prologue(pcb); - - /* - * After the dt_cg_tramp_prologue() call, we have: - * // (%r7 = dctx->mst) - * // (%r8 = dctx->ctx) - */ - dt_cg_tramp_copy_regs(pcb); - if (strcmp(pcb->pcb_probe->desc->prb, "return") == 0) { - dt_irlist_t *dlp = &pcb->pcb_ir; - - dt_cg_tramp_copy_rval_from_regs(pcb); - - /* - * fbt:::return arg0 should be the function offset for - * return instruction. Since we use kretprobes, however, - * which do not fire until the function has returned to - * its caller, information about the returning instruction - * in the callee has been lost. - * - * Set arg0=-1 to indicate that we do not know the value. - */ - dt_cg_xsetx(dlp, NULL, DT_LBL_NONE, BPF_REG_0, -1); - emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_0)); - } else - dt_cg_tramp_copy_args_from_regs(pcb, 1); - dt_cg_tramp_epilogue(pcb); - - return 0; -} - -static int attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd) -{ - char *prb = NULL; - - if (!dt_tp_probe_has_info(prp)) { - char *fn, *p; - FILE *f; - int fd, rc = -1; - - /* - * The tracepoint event we will be creating needs to have a - * valid name. We use a copy of the probe name, with . -> _ - * conversion. - */ - prb = strdup(prp->desc->fun); - for (p = prb; *p; p++) { - if (*p == '.') - *p = '_'; - } - - /* - * Register the kprobe with the tracing subsystem. This will - * create a tracepoint event. - */ - fd = open(KPROBE_EVENTS, O_WRONLY | O_APPEND); - if (fd == -1) - goto fail; - - rc = dprintf(fd, "%c:" FBT_GROUP_FMT "/%s %s\n", - prp->desc->prb[0] == 'e' ? 
'p' : 'r', - FBT_GROUP_DATA, prb, prp->desc->fun); - close(fd); - if (rc == -1) - goto fail; - - /* create format file name */ - if (asprintf(&fn, "%s" FBT_GROUP_FMT "/%s/format", EVENTSFS, - FBT_GROUP_DATA, prb) == -1) - goto fail; - - /* open format file */ - f = fopen(fn, "r"); - free(fn); - if (f == NULL) - goto fail; - - /* read event id from format file */ - rc = dt_tp_probe_info(dtp, f, 0, prp, NULL, NULL); - fclose(f); - - if (rc < 0) - goto fail; - - free(prb); - } - - /* attach BPF program to the probe */ - return dt_tp_probe_attach(dtp, prp, bpf_fd); - -fail: - free(prb); - return -ENOENT; -} - -/* - * Try to clean up system resources that may have been allocated for this - * probe. - * - * If there is an event FD, we close it. - * - * We also try to remove any kprobe that may have been created for the probe. - * This is harmless for probes that didn't get created. If the removal fails - * for some reason we are out of luck - fortunately it is not harmful to the - * system as a whole. - */ -static void detach(dtrace_hdl_t *dtp, const dt_probe_t *prp) -{ - int fd; - char *prb, *p; - - if (!dt_tp_probe_has_info(prp)) - return; - - dt_tp_probe_detach(dtp, prp); - - fd = open(KPROBE_EVENTS, O_WRONLY | O_APPEND); - if (fd == -1) - return; - - /* The tracepoint event is the probe nam, with . -> _ conversion. 
*/ - prb = strdup(prp->desc->fun); - for (p = prb; *p; p++) { - if (*p == '.') - *p = '_'; - } - - dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA, prb); - free(prb); - close(fd); -} - -dt_provimpl_t dt_rawfbt = { - .name = prvname, - .prog_type = BPF_PROG_TYPE_KPROBE, - .populate = &populate, - .provide = &provide, - .load_prog = &dt_bpf_prog_load, - .trampoline = &trampoline, - .attach = &attach, - .detach = &detach, - .probe_destroy = &dt_tp_probe_destroy, -}; -- 2.45.2 From eugene.loh at oracle.com Thu Mar 13 00:39:46 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Wed, 12 Mar 2025 20:39:46 -0400 Subject: [DTrace-devel] [PATCH] test: Make tests more resilient to different prid widths Message-ID: <20250313003946.11074-1-eugene.loh@oracle.com> From: Eugene Loh Various tests convert run-dependent values -- like PIDs and probe IDs -- to run-independent strings before checking against their .r results files. But the conversions could be remarkably sensitive to the width of probe IDs. E.g., some conversions assumed probe IDs were flush with the beginning of the line, but if they were narrower they were preceded by white space and were not detected. This will be important in up-coming fbt work, where probe IDs for fbt probes can be much lower in value (fewer digits). Also, these conversions were being carried out by a hodgepodge of scripts -- sed, awk, and grep; some using run-independent strings like "NNN" or "XXXX" instead of more informative "PID" and "PRID" strings; some incorrectly using "PID" for PRIDs, etc. Replace these .r.p postprocessing scripts with a single script that is more resilient to PRID widths and is commented. 
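To make the width sensitivity concrete, here is a minimal C rendering of the probe-ID rule. It is an illustration only and not part of the patch: normalize_prid() is a hypothetical name, and the actual conversion is done by the awk script this patch adds. The point it shows is that leading blanks are skipped before looking for digits, so a narrow, right-aligned ID is caught just like one that is flush with the start of the line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): if LINE starts with optional
 * blanks followed by digits, write "PRID" plus the rest of the line to OUT.
 * Otherwise copy LINE unchanged.  Returns 1 if an ID was replaced, else 0.
 */
int normalize_prid(const char *line, char *out, size_t outsz)
{
	const char *p;
	size_t ndigits;

	assert(line != NULL && out != NULL && outsz > 0);

	/* Tolerate any amount of left padding in front of the ID. */
	p = line + strspn(line, " ");

	/* Count the run-dependent digits that follow the padding. */
	ndigits = strspn(p, "0123456789");
	if (ndigits == 0) {
		snprintf(out, outsz, "%s", line);
		return 0;
	}

	/* Drop padding and digits, keep everything after them. */
	snprintf(out, outsz, "PRID%s", p + ndigits);
	return 1;
}
```

A rule anchored at column 0 without the blank-skipping step would leave a padded line untouched, which is the failure mode the commit message describes.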
Signed-off-by: Eugene Loh --- test/unittest/usdt/convert_PID_and_PRID.awk | 20 +++++++++++++++++ test/unittest/usdt/err.argmap-null.r | 2 +- test/unittest/usdt/err.argmap-null.r.p | 3 +-- test/unittest/usdt/tst.dlclose1.r | 8 +++---- test/unittest/usdt/tst.dlclose1.r.p | 13 +---------- test/unittest/usdt/tst.enable_pid.r | 22 +++++++++---------- test/unittest/usdt/tst.enable_pid.r.p | 8 +------ test/unittest/usdt/tst.exec-dof-replacement.r | 2 +- .../usdt/tst.exec-dof-replacement.r.p | 3 +-- .../usdt/tst.multiprov-dupprobe-fire.r.p | 3 +-- test/unittest/usdt/tst.multiprov-dupprobe.r.p | 6 +---- test/unittest/usdt/tst.multiprovider-fire.r.p | 3 +-- test/unittest/usdt/tst.multiprovider.r.p | 6 +---- 13 files changed, 44 insertions(+), 55 deletions(-) create mode 100755 test/unittest/usdt/convert_PID_and_PRID.awk mode change 100755 => 120000 test/unittest/usdt/err.argmap-null.r.p mode change 100755 => 120000 test/unittest/usdt/tst.dlclose1.r.p mode change 100755 => 120000 test/unittest/usdt/tst.enable_pid.r.p mode change 100755 => 120000 test/unittest/usdt/tst.exec-dof-replacement.r.p mode change 100755 => 120000 test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p mode change 100755 => 120000 test/unittest/usdt/tst.multiprov-dupprobe.r.p mode change 100755 => 120000 test/unittest/usdt/tst.multiprovider-fire.r.p mode change 100755 => 120000 test/unittest/usdt/tst.multiprovider.r.p diff --git a/test/unittest/usdt/convert_PID_and_PRID.awk b/test/unittest/usdt/convert_PID_and_PRID.awk new file mode 100755 index 000000000..1dbb31301 --- /dev/null +++ b/test/unittest/usdt/convert_PID_and_PRID.awk @@ -0,0 +1,20 @@ +#!/usr/bin/gawk -f + +# ignore the banner +/^ *ID *PROVIDER *MODULE *FUNCTION *NAME *$/ { next; } + +# process other lines +{ + # convert run-dependent PID values to "PID" + $0 = gensub("prov([abc]?)[0-9]+", "prov\\1PID", "g"); + sub("pid [0-9]+", "pid PID"); + + # convert run-dependent probe ID values to "PRID" + sub("^ *[0-9]+", "PRID"); + + # squash blanks + 
gsub(" +", " "); + + # print + print; +} diff --git a/test/unittest/usdt/err.argmap-null.r b/test/unittest/usdt/err.argmap-null.r index 215475e39..97b1850de 100644 --- a/test/unittest/usdt/err.argmap-null.r +++ b/test/unittest/usdt/err.argmap-null.r @@ -1,2 +1,2 @@ -- @@stderr -- -dtrace: failed to compile script test/unittest/usdt/err.argmap-null.d: line 24: index 0 is out of range for test_provXXXX:::place4 args[ ] +dtrace: failed to compile script test/unittest/usdt/err.argmap-null.d: line 24: index 0 is out of range for test_provPID:::place4 args[ ] diff --git a/test/unittest/usdt/err.argmap-null.r.p b/test/unittest/usdt/err.argmap-null.r.p deleted file mode 100755 index c575983ad..000000000 --- a/test/unittest/usdt/err.argmap-null.r.p +++ /dev/null @@ -1,2 +0,0 @@ -#!/bin/sed -rf -s,test_prov[0-9]*,test_provXXXX,g; s,^ *[0-9]+, XX,g; diff --git a/test/unittest/usdt/err.argmap-null.r.p b/test/unittest/usdt/err.argmap-null.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/err.argmap-null.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.dlclose1.r b/test/unittest/usdt/tst.dlclose1.r index 7873cb51f..70bb50d76 100644 --- a/test/unittest/usdt/tst.dlclose1.r +++ b/test/unittest/usdt/tst.dlclose1.r @@ -1,6 +1,4 @@ -started pid NNN - ID PROVIDER MODULE FUNCTION NAME -NNN test_provNNN livelib.so go go - ID PROVIDER MODULE FUNCTION NAME +started pid PID +PRID test_provPID livelib.so go go -- @@stderr -- -dtrace: failed to match test_provNNN:::: No probe matches description +dtrace: failed to match test_provPID:::: No probe matches description diff --git a/test/unittest/usdt/tst.dlclose1.r.p b/test/unittest/usdt/tst.dlclose1.r.p deleted file mode 100755 index 85725f3bb..000000000 --- a/test/unittest/usdt/tst.dlclose1.r.p +++ /dev/null @@ -1,12 +0,0 @@ -#!/usr/bin/gawk -f -{ - # ignore the specific probe ID or process ID - # (the script ensures the process ID is consistent) 
- gsub(/[0-9]+/, "NNN"); - - # ignore the numbers of spaces for alignment - # (they depend on the ID widths) - gsub(/ +/, " "); - - print; -} diff --git a/test/unittest/usdt/tst.dlclose1.r.p b/test/unittest/usdt/tst.dlclose1.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.dlclose1.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.enable_pid.r b/test/unittest/usdt/tst.enable_pid.r index 675fcdd6f..9241202d7 100644 --- a/test/unittest/usdt/tst.enable_pid.r +++ b/test/unittest/usdt/tst.enable_pid.r @@ -1,14 +1,14 @@ - FUNCTION:NAME - :tick-1s + FUNCTION:NAME + :tick-1s - FUNCTION:NAME - :tick-1s + FUNCTION:NAME + :tick-1s - FUNCTION:NAME - :tick-1s + FUNCTION:NAME + :tick-1s - FUNCTION:NAME - :tick-1s + FUNCTION:NAME + :tick-1s done ========== out 1 @@ -39,7 +39,7 @@ is not enabled === epoch === success -- @@stderr -- -dtrace: description 'test_provNNN:::go ' matched 1 probe -dtrace: description 'test_provNNN:::go ' matched 2 probes -dtrace: description 'test_provNNN:::go ' matched 2 probes +dtrace: description 'test_provPID:::go ' matched 1 probe +dtrace: description 'test_provPID:::go ' matched 2 probes +dtrace: description 'test_provPID:::go ' matched 2 probes dtrace: description 'test_prov*:::go ' matched 3 probes diff --git a/test/unittest/usdt/tst.enable_pid.r.p b/test/unittest/usdt/tst.enable_pid.r.p deleted file mode 100755 index baf9d2a90..000000000 --- a/test/unittest/usdt/tst.enable_pid.r.p +++ /dev/null @@ -1,7 +0,0 @@ -#!/usr/bin/awk -f -{ - # ignore the specific process ID - gsub(/test_prov[0-9]+/, "test_provNNN"); - - print; -} diff --git a/test/unittest/usdt/tst.enable_pid.r.p b/test/unittest/usdt/tst.enable_pid.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.enable_pid.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.exec-dof-replacement.r 
b/test/unittest/usdt/tst.exec-dof-replacement.r index 7547f85e5..226ab7c8a 100644 --- a/test/unittest/usdt/tst.exec-dof-replacement.r +++ b/test/unittest/usdt/tst.exec-dof-replacement.r @@ -1 +1 @@ -PID test_prov test2 main succeeded +PRID test_provPID test2 main succeeded diff --git a/test/unittest/usdt/tst.exec-dof-replacement.r.p b/test/unittest/usdt/tst.exec-dof-replacement.r.p deleted file mode 100755 index 1a5871f73..000000000 --- a/test/unittest/usdt/tst.exec-dof-replacement.r.p +++ /dev/null @@ -1,2 +0,0 @@ -#!/bin/sh -grep -v '^ *ID' | sed 's,^[0-9]*,PID,; s,prov[0-9]*,prov,g; s, *, ,g' diff --git a/test/unittest/usdt/tst.exec-dof-replacement.r.p b/test/unittest/usdt/tst.exec-dof-replacement.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.exec-dof-replacement.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p b/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p deleted file mode 100755 index bdbce0189..000000000 --- a/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p +++ /dev/null @@ -1,2 +0,0 @@ -#!/bin/sh -sed 's,prov\(.\)[0-9]*,prov\1PID,; s, *, ,g' diff --git a/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p b/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.multiprov-dupprobe.r.p b/test/unittest/usdt/tst.multiprov-dupprobe.r.p deleted file mode 100755 index 5d11db2d4..000000000 --- a/test/unittest/usdt/tst.multiprov-dupprobe.r.p +++ /dev/null @@ -1,5 +0,0 @@ -#!/bin/sh - -# Remove banner. -# Replace numerical values with generic PRID and PID labels. 
-grep -v '^ *ID' | sed 's,^[0-9][0-9]*,PRID,; s,prov\(.\)[0-9]*,prov\1PID,; s, *, ,g' diff --git a/test/unittest/usdt/tst.multiprov-dupprobe.r.p b/test/unittest/usdt/tst.multiprov-dupprobe.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.multiprov-dupprobe.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.multiprovider-fire.r.p b/test/unittest/usdt/tst.multiprovider-fire.r.p deleted file mode 100755 index bdbce0189..000000000 --- a/test/unittest/usdt/tst.multiprovider-fire.r.p +++ /dev/null @@ -1,2 +0,0 @@ -#!/bin/sh -sed 's,prov\(.\)[0-9]*,prov\1PID,; s, *, ,g' diff --git a/test/unittest/usdt/tst.multiprovider-fire.r.p b/test/unittest/usdt/tst.multiprovider-fire.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.multiprovider-fire.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.multiprovider.r.p b/test/unittest/usdt/tst.multiprovider.r.p deleted file mode 100755 index 5d11db2d4..000000000 --- a/test/unittest/usdt/tst.multiprovider.r.p +++ /dev/null @@ -1,5 +0,0 @@ -#!/bin/sh - -# Remove banner. -# Replace numerical values with generic PRID and PID labels. 
-grep -v '^ *ID' | sed 's,^[0-9][0-9]*,PRID,; s,prov\(.\)[0-9]*,prov\1PID,; s, *, ,g' diff --git a/test/unittest/usdt/tst.multiprovider.r.p b/test/unittest/usdt/tst.multiprovider.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.multiprovider.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file -- 2.43.5 From kris.van.hees at oracle.com Mon Mar 17 18:51:05 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Mon, 17 Mar 2025 18:51:05 -0000 Subject: [DTrace-devel] [PATCH 1/2] error: report probe name on failed enabling error Message-ID: Signed-off-by: Kris Van Hees --- libdtrace/dt_bpf.c | 35 +++++++++++++++++++++++------------ libdtrace/dt_bpf.h | 2 ++ libdtrace/dt_error.c | 2 ++ libdtrace/dt_prov_uprobe.c | 15 +++------------ 4 files changed, 30 insertions(+), 24 deletions(-) diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c index 662fd81a..e50bb536 100644 --- a/libdtrace/dt_bpf.c +++ b/libdtrace/dt_bpf.c @@ -64,6 +64,25 @@ dt_bpf_error(dtrace_hdl_t *dtp, const char *fmt, ...) return dt_set_errno(dtp, EDT_BPF); } +int +dt_attach_error(dtrace_hdl_t *dtp, int rc, ...) 
+{ + va_list ap, apc; + char *fmt; + + if (asprintf(&fmt, "Failed to enable %%s:%%s:%%s:%%s: %s", + dtrace_errmsg(dtp, -rc)) > 0) { + va_start(ap, rc); + va_copy(apc, ap); + dt_set_errmsg(dtp, NULL, NULL, NULL, 0, fmt, ap); + va_end(ap); + dt_debug_printf("bpf", "Failed to enable %s:%s:%s:%s", apc); + va_end(apc); + } + + return dt_set_errno(dtp, EDT_ENABLING_ERR); +} + int dt_bpf_lockmem_error(dtrace_hdl_t *dtp, const char *msg) { @@ -1335,19 +1354,11 @@ dt_bpf_load_progs(dtrace_hdl_t *dtp, uint_t cflags) if (prp->prov->impl->attach) rc = prp->prov->impl->attach(dtp, prp, fd); - if (rc == -ENOTSUPP) { - char *s; - - close(fd); - if (asprintf(&s, "Failed to enable %s:%s:%s:%s", - prp->desc->prv, prp->desc->mod, - prp->desc->fun, prp->desc->prb) == -1) - return dt_set_errno(dtp, EDT_ENABLING_ERR); - dt_handle_rawerr(dtp, s); - free(s); - } else if (rc < 0) { + if (rc < 0) { close(fd); - return dt_set_errno(dtp, EDT_ENABLING_ERR); + return dt_attach_error(dtp, rc, + prp->desc->prv, prp->desc->mod, + prp->desc->fun, prp->desc->prb); } } diff --git a/libdtrace/dt_bpf.h b/libdtrace/dt_bpf.h index 85934d2d..464f0189 100644 --- a/libdtrace/dt_bpf.h +++ b/libdtrace/dt_bpf.h @@ -67,6 +67,8 @@ extern int dt_perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu, int group_fd, unsigned long flags); extern int dt_bpf(enum bpf_cmd cmd, union bpf_attr *attr); +extern int dt_attach_error(struct dtrace_hdl *, int, ...); + extern int dt_bpf_gmap_create(struct dtrace_hdl *); extern int dt_bpf_lockmem_error(struct dtrace_hdl *dtp, const char *msg); diff --git a/libdtrace/dt_error.c b/libdtrace/dt_error.c index 213f0d9e..6721e8e4 100644 --- a/libdtrace/dt_error.c +++ b/libdtrace/dt_error.c @@ -111,6 +111,8 @@ dtrace_errmsg(dtrace_hdl_t *dtp, int error) if (error == EDT_COMPILER && dtp != NULL && dtp->dt_errmsg[0] != '\0') str = dtp->dt_errmsg; + if (error == EDT_ENABLING_ERR && dtp != NULL && dtp->dt_errmsg[0] != '\0') + str = dtp->dt_errmsg; else if (error == EDT_BPF && dtp 
!= NULL && dtp->dt_errmsg[0] != '\0') str = dtp->dt_errmsg; else if (error == EDT_CTF && dtp != NULL && dtp->dt_ctferr != 0) diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c index 17f2f7ee..96a59aef 100644 --- a/libdtrace/dt_prov_uprobe.c +++ b/libdtrace/dt_prov_uprobe.c @@ -385,19 +385,10 @@ static int add_probe_uprobe(dtrace_hdl_t *dtp, dt_probe_t *prp) if (prp->prov->impl->attach) rc = prp->prov->impl->attach(dtp, prp, fd); - if (rc == -ENOTSUPP) { - char *s; - - close(fd); - if (asprintf(&s, "Failed to enable %s:%s:%s:%s", - prp->desc->prv, prp->desc->mod, - prp->desc->fun, prp->desc->prb) == -1) - return dt_set_errno(dtp, EDT_ENABLING_ERR); - dt_handle_rawerr(dtp, s); - free(s); - } else if (rc < 0) { + if (rc < 0) { close(fd); - return dt_set_errno(dtp, EDT_ENABLING_ERR); + return dt_attach_error(dtp, rc, prp->desc->prv, prp->desc->mod, + prp->desc->fun, prp->desc->prb); } return 0; -- 2.43.5 From kris.van.hees at oracle.com Mon Mar 17 18:41:27 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Mon, 17 Mar 2025 14:41:27 -0400 Subject: [DTrace-devel] [PATCH 2/2] bpf: fix have_attach_type() detection Message-ID: <34f1beb7ed7b330dcc9031f23483f546.kris.van.hees@oracle.com> There are kernel versions that support the BPF_TRACE_FENTRY attach type at program load, but do not support opening the attachment point (a kernel symbol by BTF id) as a raw tracepoint. The cause is that the support for fentry as a raw tracepoint was initially only implemented on x86_64. We now test both program load *and* opening the raw tracepoint to know if BPF_TRACE_FENTRY as attach type is supported. 
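The check described above amounts to a two-stage probe: a feature counts as usable only if both the load and the open succeed, and whatever was acquired gets released on every path. The sketch below shows that shape in isolation. It is not the libdtrace code: the callback types, feature_usable(), and the fake_* stubs are all hypothetical stand-ins for the BPF program load and the raw tracepoint open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks: return a handle >= 0, or a negative error. */
typedef int (*stage_fn)(int arg);
typedef void (*release_fn)(int handle);

/* Return 1 only if both stages succeed; release every handle acquired. */
int feature_usable(stage_fn load, stage_fn open_tp, release_fn release)
{
	int pfd, tfd;

	assert(load != NULL && open_tp != NULL && release != NULL);

	pfd = load(0);
	if (pfd < 0)
		return 0;		/* stage 1 failed: nothing to release */

	tfd = open_tp(pfd);
	if (tfd < 0) {
		release(pfd);		/* stage 2 failed: undo stage 1 */
		return 0;
	}

	release(tfd);
	release(pfd);
	return 1;
}

/* Demo stubs (assumptions for illustration only). */
int live_handles;			/* handles acquired but not yet released */

int fake_load_ok(int unused)   { (void)unused; live_handles++; return 3; }
int fake_load_fail(int unused) { (void)unused; return -1; }
int fake_open_ok(int pfd)      { (void)pfd; live_handles++; return 4; }
int fake_open_fail(int pfd)    { (void)pfd; return -1; /* stands in for -ENOTSUPP */ }
void fake_release(int handle)  { (void)handle; live_handles--; }
```

With these stubs, the "load succeeds, open fails" combination reports the feature as unavailable and still leaves no handle open, which is the case the commit message singles out.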
Signed-off-by: Kris Van Hees --- libdtrace/dt_bpf.c | 29 ++++++++++++++++++++++------- libdtrace/dt_prov_fbt.c | 2 ++ 2 files changed, 24 insertions(+), 7 deletions(-) diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c index e50bb536..9ee32e8b 100644 --- a/libdtrace/dt_bpf.c +++ b/libdtrace/dt_bpf.c @@ -486,19 +486,34 @@ have_attach_type(enum bpf_prog_type ptype, enum bpf_attach_type atype, BPF_RETURN() }; dtrace_difo_t dp; - int fd; + int pfd, tfd = -1; dp.dtdo_buf = insns; dp.dtdo_len = ARRAY_SIZE(insns); - fd = dt_bpf_prog_attach(ptype, atype, 0, btf_id, &dp, 0, NULL, 0); - /* If the program loads, we can use the attach type. */ - if (fd > 0) { - close(fd); - return 1; - } + pfd = dt_bpf_prog_attach(ptype, atype, 0, btf_id, &dp, 0, NULL, 0); + /* If the program load fails, we cannot use the attach type. */ + if (pfd < 0) + goto fail; + /* + * If the program loads, we still need to verify that probe can be + * opened as a raw tracepoint. Some kernels allow the program load + * but return -ENOTSUPP when you try to open the raw tracepoint. + */ + tfd = dt_bpf_raw_tracepoint_open(NULL, pfd); + if (tfd < 0) + goto fail; + + close(tfd); + close(pfd); + return 1; + +fail: /* Failed -> attach type not available to us */ + if (pfd >= 0) + close(pfd); + return 0; } diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c index 1489275a..00f9174c 100644 --- a/libdtrace/dt_prov_fbt.c +++ b/libdtrace/dt_prov_fbt.c @@ -74,6 +74,8 @@ dt_provimpl_t dt_rawfbt; static int populate(dtrace_hdl_t *dtp) { dt_fbt = BPF_HAS(dtp, BPF_FEAT_FENTRY) ? dt_fbt_fprobe : dt_fbt_kprobe; + dt_dprintf("fbt: Using %s implementation\n", + BPF_HAS(dtp, BPF_FEAT_FENTRY) ?
"fentry/fexit" : "kprobe"); if (dt_provider_create(dtp, dt_fbt.name, &dt_fbt, &pattr, NULL) == NULL ||
-- 
2.43.5

From eugene.loh at oracle.com Mon Mar 17 19:57:18 2025
From: eugene.loh at oracle.com (Eugene Loh)
Date: Mon, 17 Mar 2025 15:57:18 -0400
Subject: [DTrace-devel] [PATCH 1/2] error: report probe name on failed enabling error
In-Reply-To:
References:
Message-ID:

I don't 100% get our error reporting, but
Reviewed-by: Eugene Loh

A few things...

The various files need updated copyright years.

On 2/24/25 13:43, Kris Van Hees wrote:

I don't know if anyone else sees this, but this is the second patch in a few
days that has gotten buried in my inbox due to a stale date. (On Friday, I
got one dated 1/24.) Anyhow, I suppose I now know to look for such emails.

> diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c
> +int
> +dt_attach_error(dtrace_hdl_t *dtp, int rc, ...)
> +{
> + va_list ap, apc;
> + char *fmt;

If you asprintf(&fmt), do you want a matching free() to prevent a memory
leak? (Clearly not a big deal, but...)

and finally...

> diff --git a/libdtrace/dt_error.c b/libdtrace/dt_error.c
> @@ -111,6 +111,8 @@ dtrace_errmsg(dtrace_hdl_t *dtp, int error)
>
> if (error == EDT_COMPILER && dtp != NULL && dtp->dt_errmsg[0] != '\0')
> str = dtp->dt_errmsg;
> + if (error == EDT_ENABLING_ERR && dtp != NULL && dtp->dt_errmsg[0] != '\0')
> + str = dtp->dt_errmsg;
> else if (error == EDT_BPF && dtp != NULL && dtp->dt_errmsg[0] != '\0')
> str = dtp->dt_errmsg;
> else if (error == EDT_CTF && dtp != NULL && dtp->dt_ctferr != 0)

Should that be "else if"? Otherwise, if error==EDT_COMPILER, you trigger
both that clause and the later "else" clause.
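On the asprintf() question raised in this review, the sketch below shows the ownership pattern in a self-contained form: build a format string on the heap, use it once with the caller's varargs, then free it. It is not the code that was committed. format_attach_error() and ATTACH_FMT are hypothetical names, and because asprintf() is a GNU extension the sketch uses snprintf() plus malloc() to play the same role.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* First-pass template: each "%%s" becomes a literal "%s" for the second pass. */
#define ATTACH_FMT	"Failed to enable %%s:%%s:%%s:%%s: %s"

/*
 * Hypothetical helper: format "Failed to enable PRV:MOD:FUN:PRB: REASON"
 * into BUF.  The four probe name components are passed as varargs.
 * Returns the vsnprintf() result, or -1 on error.
 */
int format_attach_error(char *buf, size_t bufsz, const char *reason, ...)
{
	va_list ap;
	char *fmt;
	int len, n;

	assert(buf != NULL && reason != NULL);

	/* REASON ends up inside a format string, so it must not contain '%'. */
	if (strchr(reason, '%') != NULL)
		return -1;

	/* Pass 1: size and build the heap-allocated format string. */
	len = snprintf(NULL, 0, ATTACH_FMT, reason);
	if (len < 0)
		return -1;
	fmt = malloc((size_t)len + 1);
	if (fmt == NULL)
		return -1;
	snprintf(fmt, (size_t)len + 1, ATTACH_FMT, reason);

	/* Pass 2: substitute the four name components. */
	va_start(ap, reason);
	n = vsnprintf(buf, bufsz, fmt, ap);
	va_end(ap);

	free(fmt);		/* the matching free() the review asks about */
	return n;
}
```

Every exit path after the allocation goes through the single free(), so the format string cannot leak regardless of what vsnprintf() returns.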
From kris.van.hees at oracle.com Mon Mar 17 20:03:40 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Mon, 17 Mar 2025 16:03:40 -0400 Subject: [DTrace-devel] [PATCH 1/2] error: report probe name on failed enabling error In-Reply-To: References: Message-ID: On Mon, Mar 17, 2025 at 03:57:18PM -0400, Eugene Loh wrote: > I don't 100% get our error reporting, but > Reviewed-by: Eugene Loh > > A few things... > > The various files need updated copyright years. Good point. > On 2/24/25 13:43, Kris Van Hees wrote: > > I don't know if anyone else sees this, but this is the second patch in a few > days that has gotten buried in my inbox due to a stale date. (On Friday, I > got one dated 1/24.) Anyhow, I suppose I now know to look for such emails. My fault - due to how I was sending the patches, it took the date stamp on the actual patch and this one was written end of last month, but never sent because I was still working on some other things and it wasn't needed yet. Will avoid that in the future. > > diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c > > +int > > +dt_attach_error(dtrace_hdl_t *dtp, int rc, ...) > > +{ > > + va_list ap, apc; > > + char *fmt; > > If you asprintf(&fmt), do you want a matching free() to prevent a memory > leak? (Clearly not a big deal, but...) > and finally... Good catch - fixed. > > diff --git a/libdtrace/dt_error.c b/libdtrace/dt_error.c > > @@ -111,6 +111,8 @@ dtrace_errmsg(dtrace_hdl_t *dtp, int error) > > if (error == EDT_COMPILER && dtp != NULL && dtp->dt_errmsg[0] != '\0') > > str = dtp->dt_errmsg; > > + if (error == EDT_ENABLING_ERR && dtp != NULL && dtp->dt_errmsg[0] != '\0') > > + str = dtp->dt_errmsg; > > else if (error == EDT_BPF && dtp != NULL && dtp->dt_errmsg[0] != '\0') > > str = dtp->dt_errmsg; > > else if (error == EDT_CTF && dtp != NULL && dtp->dt_ctferr != 0) > > Should that be "else if"? Otherwise, if error==EDT_COMPILER, you trigger > both that clause and the later "else" clause. Absolutely - fixed.
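For readers following the "else if" remark in the exchange above, the effect can be reproduced in isolation. This is only a minimal sketch: the ERR_* codes, the two function names, and the "generic message" fallback are hypothetical stand-ins and not the libdtrace definitions, and it assumes the real chain ends in a plain "else" that picks a default string, as the review comment implies.

```c
/* Hypothetical error codes standing in for the EDT_* values. */
enum { ERR_COMPILER = 1, ERR_ENABLING, ERR_BPF };

/*
 * Chain shaped like the quoted hunk: the second "if" starts a new chain,
 * so the final "else" also runs when the first "if" already matched.
 */
const char *
errmsg_split(int error, const char *custom)
{
	const char	*str = "unset";

	if (error == ERR_COMPILER && custom[0] != '\0')
		str = custom;
	if (error == ERR_ENABLING && custom[0] != '\0')
		str = custom;
	else if (error == ERR_BPF && custom[0] != '\0')
		str = custom;
	else
		str = "generic message";	/* overwrites the ERR_COMPILER result */

	return str;
}

/* Same tests joined into a single chain with "else if". */
const char *
errmsg_chained(int error, const char *custom)
{
	const char	*str = "unset";

	if (error == ERR_COMPILER && custom[0] != '\0')
		str = custom;
	else if (error == ERR_ENABLING && custom[0] != '\0')
		str = custom;
	else if (error == ERR_BPF && custom[0] != '\0')
		str = custom;
	else
		str = "generic message";

	return str;
}
```

In the split version a compiler error that carries a custom message still ends up with the fallback string, because the final "else" belongs to the second chain. In the chained version the custom message survives.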
From eugene.loh at oracle.com Mon Mar 17 20:16:01 2025 From: eugene.loh at oracle.com (Eugene Loh) Date: Mon, 17 Mar 2025 16:16:01 -0400 Subject: [DTrace-devel] [PATCH 2/2] bpf: fix have_attach_type() detection In-Reply-To: <34f1beb7ed7b330dcc9031f23483f546.kris.van.hees@oracle.com> References: <34f1beb7ed7b330dcc9031f23483f546.kris.van.hees@oracle.com> Message-ID: Reviewed-by: Eugene Loh I'll try to run tests tonight. In what order will patches be applied? I vote for these two ahead of the 8-patch perf series. dt_prov_fbt.c needs updated copyright year Would it be possible/practical in the commit message to mention kernel commits or version numbers perhaps? Not a big deal, perhaps. On 3/17/25 14:41, Kris Van Hees wrote: > There are kernel versions that support the BPF_TRACE_FENTRY attach type > at program load, but do not support opening the attachment point (a > kernel symbol by BTF id) as a raw tracepoint. The cause is that the > support for fentry as a raw tracepoint was initially only implemented > on x86_64. > > We now test both program load *and* opening the raw tracepoint to know > if BPF_TRACE_FENTRY as attach type is supported. > > Signed-off-by: Kris Van Hees > --- > libdtrace/dt_bpf.c | 29 ++++++++++++++++++++++------- > libdtrace/dt_prov_fbt.c | 2 ++ > 2 files changed, 24 insertions(+), 7 deletions(-) > > diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c > index e50bb536..9ee32e8b 100644 > --- a/libdtrace/dt_bpf.c > +++ b/libdtrace/dt_bpf.c > @@ -486,19 +486,34 @@ have_attach_type(enum bpf_prog_type ptype, enum bpf_attach_type atype, > BPF_RETURN() > }; > dtrace_difo_t dp; > - int fd; > + int pfd, tfd = -1; > > dp.dtdo_buf = insns; > dp.dtdo_len = ARRAY_SIZE(insns); > > - fd = dt_bpf_prog_attach(ptype, atype, 0, btf_id, &dp, 0, NULL, 0); > - /* If the program loads, we can use the attach type.
*/ > - if (fd > 0) { > - close(fd); > - return 1; > - } > + pfd = dt_bpf_prog_attach(ptype, atype, 0, btf_id, &dp, 0, NULL, 0); > + /* If the program load fails, we cannot iuse the attach type. */ > + if (pfd < 0) > + goto fail; > > + /* > + * If the program loads, we still need to verify that probe can be > + * opened as a raw tracepoint. Some kernels allow the program load > + * but return -ENOTSUPP when you try to open the raw tracepoint. > + */ > + tfd = dt_bpf_raw_tracepoint_open(NULL, pfd); > + if (tfd < 0) > + goto fail; > + > + close(tfd); > + close(pfd); > + return 1; > + > +fail: > /* Failed -> attach type not available to us */ > + if (pfd >= 0) > + close(pfd); > + > return 0; > } > > diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c > index 1489275a..00f9174c 100644 > --- a/libdtrace/dt_prov_fbt.c > +++ b/libdtrace/dt_prov_fbt.c > @@ -74,6 +74,8 @@ dt_provimpl_t dt_rawfbt; > static int populate(dtrace_hdl_t *dtp) > { > dt_fbt = BPF_HAS(dtp, BPF_FEAT_FENTRY) ? dt_fbt_fprobe : dt_fbt_kprobe; > + dt_dprintf("fbt: Using %s implementation\n", > + BPF_HAS(dtp, BPF_FEAT_FENTRY) ? "fentry/fexit" : "kprobe"); > > if (dt_provider_create(dtp, dt_fbt.name, &dt_fbt, &pattr, > NULL) == NULL || From kris.van.hees at oracle.com Mon Mar 17 20:20:41 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Mon, 17 Mar 2025 16:20:41 -0400 Subject: [DTrace-devel] [PATCH 2/2] bpf: fix have_attach_type() detection In-Reply-To: References: <34f1beb7ed7b330dcc9031f23483f546.kris.van.hees@oracle.com> Message-ID: On Mon, Mar 17, 2025 at 04:16:01PM -0400, Eugene Loh wrote: > Reviewed-by: Eugene Loh Thanks. > I'll try to run tests tonight. > > In what order will patches be applied?? I vote for these two ahead of the > 8-patch perf series. I really prefer the opposite order, mainly because the performance improvements really help with speeding up the testsuite runs. > dt_prov_fbt.c needs updated copyright year The perf series handles that. 
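As an aside for readers of the have_attach_type() hunk quoted above, the load-then-open check can be sketched without any BPF calls. Everything named here (struct probe_ops, have_feature(), and the stub functions) is hypothetical and exists only for illustration; in the patch the corresponding steps are dt_bpf_prog_attach(), dt_bpf_raw_tracepoint_open() and close().

```c
/*
 * Hypothetical stand-ins for the three kernel interactions, so that the
 * control flow can be exercised without a BPF-capable kernel.
 */
struct probe_ops {
	int	(*load)(void);		/* program fd, or < 0 on failure */
	int	(*open)(int pfd);	/* tracepoint fd, or < 0 on failure */
	void	(*release)(int fd);	/* plays the role of close() */
};

/* Return 1 only if both the load and the open succeed. */
int
have_feature(const struct probe_ops *ops)
{
	int	pfd, tfd;

	pfd = ops->load();
	if (pfd < 0)			/* 0 is a valid descriptor: test < 0 */
		return 0;

	tfd = ops->open(pfd);
	if (tfd < 0) {
		ops->release(pfd);	/* do not leak the loaded program */
		return 0;
	}

	ops->release(tfd);
	ops->release(pfd);
	return 1;
}

/* Stubs: fake descriptors plus a counter of release calls. */
static int	released;

int load_ok(void) { return 3; }
int load_fail(void) { return -1; }
int open_ok(int pfd) { (void)pfd; return 4; }
int open_unsupported(int pfd) { (void)pfd; return -1; }
void count_release(int fd) { (void)fd; released++; }
```

Pairing load_ok with open_unsupported models the kernels described in the commit message, where the program load is accepted but the raw tracepoint cannot be opened. In that case have_feature() reports 0 and still releases the program descriptor exactly once.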
> Would it be possible/practical in the commit message to mention kernel > commits or version numbers perhaps?? Not a big deal, perhaps. Yes, because it is a bit of work to hunt it down, and not really relevant. > On 3/17/25 14:41, Kris Van Hees wrote: > > There are kernel versions that support the BPF_TRACE_FENTRY attach type > > at program load, but do not support opening the attachment point (a > > kernel symbol by BTF id) as a raw tracepoint. The cause is that the > > support for fentry as a raw tracepoint was initially only implemented > > on x86_64. > > > > We now test both program load *and* opening the raw tracepoint to know > > if BPF_TRACE_FENTRY as attach type is supported. > > > > Signed-off-by: Kris Van Hees > > --- > > libdtrace/dt_bpf.c | 29 ++++++++++++++++++++++------- > > libdtrace/dt_prov_fbt.c | 2 ++ > > 2 files changed, 24 insertions(+), 7 deletions(-) > > > > diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c > > index e50bb536..9ee32e8b 100644 > > --- a/libdtrace/dt_bpf.c > > +++ b/libdtrace/dt_bpf.c > > @@ -486,19 +486,34 @@ have_attach_type(enum bpf_prog_type ptype, enum bpf_attach_type atype, > > BPF_RETURN() > > }; > > dtrace_difo_t dp; > > - int fd; > > + int pfd, tfd = -1; > > dp.dtdo_buf = insns; > > dp.dtdo_len = ARRAY_SIZE(insns); > > - fd = dt_bpf_prog_attach(ptype, atype, 0, btf_id, &dp, 0, NULL, 0); > > - /* If the program loads, we can use the attach type. */ > > - if (fd > 0) { > > - close(fd); > > - return 1; > > - } > > + pfd = dt_bpf_prog_attach(ptype, atype, 0, btf_id, &dp, 0, NULL, 0); > > + /* If the program load fails, we cannot iuse the attach type. */ > > + if (pfd < 0) > > + goto fail; > > + /* > > + * If the program loads, we still need to verify that probe can be > > + * opened as a raw tracepoint. Some kernels allow the program load > > + * but return -ENOTSUPP when you try to open the raw tracepoint. 
> > + */ > > + tfd = dt_bpf_raw_tracepoint_open(NULL, pfd); > > + if (tfd < 0) > > + goto fail; > > + > > + close(tfd); > > + close(pfd); > > + return 1; > > + > > +fail: > > /* Failed -> attach type not available to us */ > > + if (pfd >= 0) > > + close(pfd); > > + > > return 0; > > } > > diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c > > index 1489275a..00f9174c 100644 > > --- a/libdtrace/dt_prov_fbt.c > > +++ b/libdtrace/dt_prov_fbt.c > > @@ -74,6 +74,8 @@ dt_provimpl_t dt_rawfbt; > > static int populate(dtrace_hdl_t *dtp) > > { > > dt_fbt = BPF_HAS(dtp, BPF_FEAT_FENTRY) ? dt_fbt_fprobe : dt_fbt_kprobe; > > + dt_dprintf("fbt: Using %s implementation\n", > > + BPF_HAS(dtp, BPF_FEAT_FENTRY) ? "fentry/fexit" : "kprobe"); > > if (dt_provider_create(dtp, dt_fbt.name, &dt_fbt, &pattr, > > NULL) == NULL || From eugene.loh at oracle.com Mon Mar 17 21:00:39 2025 From: eugene.loh at oracle.com (Eugene Loh) Date: Mon, 17 Mar 2025 17:00:39 -0400 Subject: [DTrace-devel] [PATCH 6/8] fbt: performance improvements In-Reply-To: References: <20250307213441.9495-1-kris.van.hees@oracle.com> <20250307213441.9495-5-kris.van.hees@oracle.com> Message-ID: On 3/12/25 01:33, Kris Van Hees wrote: > No change should be needed to the lockstat provider. You uncovered a bug > in my patch - I'll fix it. Just curious why lockstat even used "_*" rather than "_slowpath". > Incidentally, I also just noticed that dt_modsym_mark_traceable(dtp); is being > done too early. We only really need that to be done once we get to looking at > function symbols. I'll move it - that way we avoid marking function traceable > for probes that cannot be FBT probes because of probe name or module name. > > I'll send out a v2 tomorrow with that fix. Okay, I think I see v2.? Just checking:? does it indeed have dt_modsym_mark_traceable() moved? 
From kris.van.hees at oracle.com Mon Mar 17 21:15:53 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Mon, 17 Mar 2025 17:15:53 -0400 Subject: [DTrace-devel] [PATCH 6/8] fbt: performance improvements In-Reply-To: References: <20250307213441.9495-1-kris.van.hees@oracle.com> <20250307213441.9495-5-kris.van.hees@oracle.com> Message-ID: On Mon, Mar 17, 2025 at 05:00:39PM -0400, Eugene Loh wrote: > On 3/12/25 01:33, Kris Van Hees wrote: > > > No change should be needed to the lockstat provider. You uncovered a bug > > in my patch - I'll fix it. > > Just curious why lockstat even used "_*" rather than "_slowpath". Not sure - maybe there was another variant at some point in time that is no longer there right now. > > Incidentally, I also just noticed that dt_modsym_mark_traceable(dtp); is being > > done too early. We only really need that to be done once we get to looking at > > function symbols. I'll move it - that way we avoid marking function traceable > > for probes that cannot be FBT probes because of probe name or module name. > > > > I'll send out a v2 tomorrow with that fix. > > Okay, I think I see v2. Just checking: does it indeed have > dt_modsym_mark_traceable() moved? No, that is being done in the consolidation patch. From eugene.loh at oracle.com Tue Mar 18 01:36:56 2025 From: eugene.loh at oracle.com (Eugene Loh) Date: Mon, 17 Mar 2025 21:36:56 -0400 Subject: [DTrace-devel] [PATCH 2/2] bpf: fix have_attach_type() detection In-Reply-To: References: <34f1beb7ed7b330dcc9031f23483f546.kris.van.hees@oracle.com> Message-ID: <8ec65b3f-4b6e-f6ef-fc3b-3a3117f581ab@oracle.com> On 3/17/25 16:20, Kris Van Hees wrote: > On Mon, Mar 17, 2025 at 04:16:01PM -0400, Eugene Loh wrote: > >> In what order will patches be applied? I vote for these two ahead of the >> 8-patch perf series. > I really prefer the opposite order, mainly because the performance improvements > really help with speeding up the testsuite runs.
Maybe it won't matter, but the question is what one is looking for when one patch set is in and the other is not. One choice is that one gets bad test results... but at least they'll be faster. The other choice is that the test results will still be as slow as before, but at least they'll be meaningful. I vote for the latter. (I guess the other question is what platform one is talking about...) Anyhow, I vote for correctness before speed, but maybe we'll only test all patches at once. From eugene.loh at oracle.com Tue Mar 18 01:58:29 2025 From: eugene.loh at oracle.com (Eugene Loh) Date: Mon, 17 Mar 2025 21:58:29 -0400 Subject: [DTrace-devel] [PATCH v2 6/8] fbt: performance improvements In-Reply-To: References: Message-ID: <16564042-dde6-d19b-ca57-783bbebcaf4a@oracle.com> Reviewed-by: Eugene Loh Also... One patch or another needs to update the dt_module.[c|h] copyright years. On 1/14/25 23:21, Kris Van Hees wrote: > Up until now, FBT probes were registered for every symbol that was > listed as traceable. Most tracing session do not use most or even s/session/sessions/ > any of these, and the process of registering them all was quite > slow. > > diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c > @@ -1044,6 +1045,83 @@ dt_kern_module_find_ctf(dtrace_hdl_t *dtp, dt_module_t *dmp) > } > } > > +#define PROBE_LIST TRACEFS "available_filter_functions" > + > +/* > + * Determine which kernel functions are traceable and mark them. > + */ > +void > +dt_modsym_mark_traceable(dtrace_hdl_t *dtp) > +{ > + FILE *f; > + char *buf = NULL; > + size_t len = 0; > + > + if (dt_symtab_traceable(dtp->dt_exec->dm_kernsyms)) > + return; > + > + f = fopen(PROBE_LIST, "r"); > + if (f == NULL) > + return; > + > + while (getline(&buf, &len, f) >= 0) { > + char *p; > + dt_symbol_t *sym = NULL; > + > + /* > + * Here buf is either "funcname\n" or "funcname [modname]\n". > + * The last line may not have a linefeed.
> + */ > + p = strchr(buf, '\n'); > + if (p) { > + *p = '\0'; > + if (p > buf && *(--p) == ']') > + *p = '\0'; > + } > + > + /* > + * Now buf is either "funcname" or "funcname [modname". If > + * there is no module name provided, we will use the default. > + */ Maybe that second sentence is now orphaned? > + p = strchr(buf, ' '); > + if (p) { > + *p++ = '\0'; > + if (*p == '[') > + p++; > + } > + > diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c > @@ -81,79 +71,166 @@ static int populate(dtrace_hdl_t *dtp) > if (prv == NULL) > return -1; /* errno already set */ > > - f = fopen(PROBE_LIST, "r"); > - if (f == NULL) > + return 0; > +} > + > +/* Create a probe (if it does not exist yet). */ > +static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) > +{ > + dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv); > + > + if (prv == NULL) > + return 0; > + if (dt_probe_lookup(dtp, pdp) != NULL) > return 0; > + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) > + return 1; > > - while (getline(&buf, &n, f) >= 0) { > - /* > - * Here buf is either "funcname\n" or "funcname [modname]\n". > - * The last line may not have a linefeed. > - */ > - p = strchr(buf, '\n'); > - if (p) { > - *p = '\0'; > - if (p > buf && *(--p) == ']') > - *p = '\0'; > + return 0; > +} > + > +/* > + * Try to provide probes for the given probe description. The caller ensures > + * that the provider name in probe desxcription (if any) is a match for this desxcription description > + * provider. When this is called, we already know that this provider matches > + * the provider component of the probe specification. 
> + */ From eugene.loh at oracle.com Tue Mar 18 02:06:40 2025 From: eugene.loh at oracle.com (Eugene Loh) Date: Mon, 17 Mar 2025 22:06:40 -0400 Subject: [DTrace-devel] [PATCH 7/8] rawfbt: performance improvements In-Reply-To: <20250307213441.9495-6-kris.van.hees@oracle.com> References: <20250307213441.9495-1-kris.van.hees@oracle.com> <20250307213441.9495-6-kris.van.hees@oracle.com> Message-ID: <37d5caec-24ee-e271-4da4-34c20eb266ea@oracle.com> Reviewed-by: Eugene Loh Honestly, I hardly looked at this one.? It looks much like the preceding patch, and any sins, if any, will presumably be uncovered in the next patch. On 3/7/25 16:34, Kris Van Hees via DTrace-devel wrote: > Signed-off-by: Kris Van Hees > --- > libdtrace/dt_prov_rawfbt.c | 223 +++++++++++++++++++++++++------------ > 1 file changed, 151 insertions(+), 72 deletions(-) > > diff --git a/libdtrace/dt_prov_rawfbt.c b/libdtrace/dt_prov_rawfbt.c > index 62f2f4f0..52152655 100644 > --- a/libdtrace/dt_prov_rawfbt.c > +++ b/libdtrace/dt_prov_rawfbt.c > @@ -44,10 +44,8 @@ > #include "dt_pt_regs.h" > > static const char prvname[] = "rawfbt"; > -static const char modname[] = "vmlinux"; > > #define KPROBE_EVENTS TRACEFS "kprobe_events" > -#define PROBE_LIST TRACEFS "available_filter_functions" > > #define FBT_GROUP_FMT GROUP_FMT "_%s" > #define FBT_GROUP_DATA GROUP_DATA, prp->desc->prb > @@ -61,98 +59,178 @@ static const dtrace_pattr_t pattr = { > }; > > /* > - * Scan the PROBE_LIST file and add entry and return probes for every function > - * that is listed. > + * Create the rawfbt provider. 
> */ > static int populate(dtrace_hdl_t *dtp) > { > dt_provider_t *prv; > - FILE *f; > - char *buf = NULL; > - size_t len = 0; > - size_t n = 0; > - dtrace_syminfo_t sip; > - dtrace_probedesc_t pd; > > prv = dt_provider_create(dtp, prvname, &dt_rawfbt, &pattr, NULL); > if (prv == NULL) > return -1; /* errno already set */ > > - f = fopen(PROBE_LIST, "r"); > - if (f == NULL) > + return 0; > +} > + > +/* Create a probe (if it does not exist yet). */ > +static int provide_probe(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) > +{ > + dt_provider_t *prv = dt_provider_lookup(dtp, pdp->prv); > + > + if (prv == NULL) > return 0; > + if (dt_probe_lookup(dtp, pdp) != NULL) > + return 0; > +#ifdef DEBUG_FBT > + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) { > + fprintf(stderr, "%s(..., PROVIDE %s:%s:%s:%s) - ...\n", __func__, pdp->prv, pdp->mod, pdp->fun, pdp->prb); > + return 1; > + } > +#else > + if (dt_tp_probe_insert(dtp, prv, pdp->prv, pdp->mod, pdp->fun, pdp->prb)) > + return 1; > +#endif > > - while (getline(&buf, &len, f) >= 0) { > - char *p; > - const char *mod = modname; > - dt_probe_t *prp; > + return 0; > +} > > - /* > - * Here buf is either "funcname\n" or "funcname [modname]\n". > - * The last line may not have a linefeed. > - */ > - p = strchr(buf, '\n'); > - if (p) { > - *p = '\0'; > - if (p > buf && *(--p) == ']') > - *p = '\0'; > +/* > + * Try to provide probes for the given probe description. The caller ensures > + * that the provider name in probe desxcription (if any) is a match for this > + * provider. When this is called, we already know that this provider matches > + * the provider component of the probe specification. 
> + */ > +#define FBT_ENTRY 1 > +#define FBT_RETURN 2 > + > +static int provide(dtrace_hdl_t *dtp, const dtrace_probedesc_t *pdp) > +{ > + int n = 0; > + int prb = 0; > + dt_module_t *dmp = NULL; > + dt_symbol_t *sym = NULL; > + dt_htab_next_t *it = NULL; > + dtrace_probedesc_t pd; > + > + dt_modsym_mark_traceable(dtp); > + > + /* > + * Nothing to do if a probe name is specified and cannot match 'entry' > + * or 'return'. > + */ > + if (dt_gmatch("entry", pdp->prb)) > + prb |= FBT_ENTRY; > + if (dt_gmatch("return", pdp->prb)) > + prb |= FBT_RETURN; > + if (prb == 0) > + return 0; > + > + /* > + * If we have an explicit module name, check it. If not found, we can > + * ignore this request. > + */ > + if (pdp->mod[0] != '\0' && strchr(pdp->mod, '*') == NULL) { > + dmp = dt_module_lookup_by_name(dtp, pdp->mod); > + if (dmp == NULL) > + return 0; > + } > + > + /* > + * If we have an explicit function name, we start with a basic symbol > + * name lookup. > + */ > + if (pdp->fun[0] != '\0' && strchr(pdp->fun, '*') == NULL) { > + /* If we have a module, use it. */ > + if (dmp != NULL) { > + sym = dt_module_symbol_by_name(dtp, dmp, pdp->fun); > + if (sym == NULL) > + return 0; > + if (!dt_symbol_traceable(sym)) > + return 0; > + > + pd.id = DTRACE_IDNONE; > + pd.prv = pdp->prv; > + pd.mod = dmp->dm_name; > + pd.fun = pdp->fun; > + > + if (prb & FBT_ENTRY) { > + pd.prb = "entry"; > + n += provide_probe(dtp, &pd); > + } > + if (prb & FBT_RETURN) { > + pd.prb = "return"; > + n += provide_probe(dtp, &pd); > + } > + > + return n; > } > > - /* > - * Now buf is either "funcname" or "funcname [modname". If > - * there is no module name provided, we will use the default. 
> - */ > - p = strchr(buf, ' '); > - if (p) { > - *p++ = '\0'; > - if (*p == '[') > - p++; > + sym = dt_symbol_by_name(dtp, pdp->fun); > + while (sym != NULL) { > + const char *mod = dt_symbol_module(sym)->dm_name; > + > + if (dt_symbol_traceable(sym) && > + dt_gmatch(mod, pdp->mod)) { > + pd.id = DTRACE_IDNONE; > + pd.prv = pdp->prv; > + pd.mod = mod; > + pd.fun = pdp->fun; > + > + if (prb & FBT_ENTRY) { > + pd.prb = "entry"; > + n += provide_probe(dtp, &pd); > + } > + if (prb & FBT_RETURN) { > + pd.prb = "return"; > + n += provide_probe(dtp, &pd); > + } > + > + } > + sym = dt_symbol_by_name_next(sym); > } > > -#define strstarts(var, x) (strncmp(var, x, strlen (x)) == 0) > - /* Weed out __ftrace_invalid_address___* entries. */ > - if (strstarts(buf, "__ftrace_invalid_address__") || > - strstarts(buf, "__probestub_") || > - strstarts(buf, "__traceiter_")) > + return n; > + } > + > + /* > + * No explicit function name. We need to go through all possible > + * symbol names and see if they match. > + */ > + while ((sym = dt_htab_next(dtp->dt_kernsyms, &it)) != NULL) { > + dt_module_t *smp; > + const char *fun; > + > + /* Ensure the symbol can be traced. */ > + if (!dt_symbol_traceable(sym)) > continue; > -#undef strstarts > > - /* > - * If we did not see a module name, perform a symbol lookup to > - * try to determine the module name. > - */ > - if (!p) { > - if (dtrace_lookup_by_name(dtp, DTRACE_OBJ_KMODS, buf, > - NULL, &sip) == 0) > - mod = sip.object; > - } else > - mod = p; > + /* Match the function name. */ > + fun = dt_symbol_name(sym); > + if (!dt_gmatch(fun, pdp->fun)) > + continue; > > - /* > - * Due to the lack of module names in > - * TRACEFS/available_filter_functions, there are some duplicate > - * function names. The kernel does not let us trace functions > - * that have duplicates, so we need to remove the existing one. 
> - */ > - pd.id = DTRACE_IDNONE; > - pd.prv = prvname; > - pd.mod = mod; > - pd.fun = buf; > - pd.prb = "entry"; > - prp = dt_probe_lookup(dtp, &pd); > - if (prp != NULL) { > - dt_probe_destroy(prp); > + /* Validate the module name. */ > + smp = dt_symbol_module(sym); > + if (dmp) { > + if (smp != dmp) > + continue; > + } else if (!dt_gmatch(smp->dm_name, pdp->mod)) > continue; > - } > > - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "entry")) > - n++; > - if (dt_tp_probe_insert(dtp, prv, prvname, mod, buf, "return")) > - n++; > - } > + pd.id = DTRACE_IDNONE; > + pd.prv = pdp->prv; > + pd.mod = smp->dm_name; > + pd.fun = fun; > > - free(buf); > - fclose(f); > + if (prb & FBT_ENTRY) { > + pd.prb = "entry"; > + n += provide_probe(dtp, &pd); > + } > + if (prb & FBT_RETURN) { > + pd.prb = "return"; > + n += provide_probe(dtp, &pd); > + } > + } > > return n; > } > @@ -306,6 +384,7 @@ dt_provimpl_t dt_rawfbt = { > .name = prvname, > .prog_type = BPF_PROG_TYPE_KPROBE, > .populate = &populate, > + .provide = &provide, > .load_prog = &dt_bpf_prog_load, > .trampoline = &trampoline, > .attach = &attach, From kris.van.hees at oracle.com Tue Mar 18 02:49:57 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Mon, 17 Mar 2025 22:49:57 -0400 Subject: [DTrace-devel] [PATCH 2/2] bpf: fix have_attach_type() detection In-Reply-To: <8ec65b3f-4b6e-f6ef-fc3b-3a3117f581ab@oracle.com> References: <34f1beb7ed7b330dcc9031f23483f546.kris.van.hees@oracle.com> <8ec65b3f-4b6e-f6ef-fc3b-3a3117f581ab@oracle.com> Message-ID: On Mon, Mar 17, 2025 at 09:36:56PM -0400, Eugene Loh wrote: > On 3/17/25 16:20, Kris Van Hees wrote: > > > On Mon, Mar 17, 2025 at 04:16:01PM -0400, Eugene Loh wrote: > > > > > In what order will patches be applied?? I vote for these two ahead of the > > > 8-patch perf series. > > I really prefer the opposite order, mainly because the performance improvements > > really help with speeding up the testsuite runs. 
> > Maybe it won't matter, but the question is what one is looking for when one > patch set is in and the other is not. One choice is that one gets bad test > results... but at least they'll be faster. The other choice is that the > test results will still be as slow as before, but at least they'll be > meaningful. I vote for the latter. (I guess the other question is what > platform one is talking about...) Anyhow, I vote for correctness before > speed, but maybe we'll only test all patches at once. There is not really an option here of having one patch set in and not the other, so it doesn't really matter. The perf series does not introduce any new regressions as far as I can see, so there is no downside to applying it first, yet there is a significant benefit in terms of speeding up testing. The attach-type patches work around a problem with a range of kernels, which is a problem that precedes the perf-series already, and that fix will definitely be in the tree right after the perf-series so the end result is good. If easier, feel free to consider them all together as a single series rather than two series. From eugene.loh at oracle.com Tue Mar 18 05:10:21 2025 From: eugene.loh at oracle.com (Eugene Loh) Date: Tue, 18 Mar 2025 01:10:21 -0400 Subject: [DTrace-devel] [PATCH 8/8] fbt, rawfbt: consolidate code to avoid duplication In-Reply-To: <20250307213441.9495-7-kris.van.hees@oracle.com> References: <20250307213441.9495-1-kris.van.hees@oracle.com> <20250307213441.9495-7-kris.van.hees@oracle.com> Message-ID: Reviewed-by: Eugene Loh A few minor notes... On 3/7/25 16:34, Kris Van Hees via DTrace-devel wrote: > diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c > @@ -6,17 +6,26 @@ > * > * The Function Boundary Tracing (FBT) provider for DTrace. > * > - * FBT probes are exposed by the kernel as kprobes. They are listed in the > - * TRACEFS/available_filter_functions file.
Some kprobes are associated with > - * a specific kernel module, while most are in the core kernel. > + * Kernnel functions can be traced through fentry/fexit probes (when available) s/Kernnel/Kernel/ > + * and kprobes. The FBT provider supports both implementations and will use > + * fentry/fexit probes if the kernel supports them, and fallback to kprobes > + * otherwise. The FBT provider does not support tracing synthetic functions > + * (i.e. compiler-generated functions with a . in their name). > + * > + * The rawfbt provider implements a variant of the FBT provider and always uses > + * kprobes. This provider allow tracing of synthetic function. s/allow/allows/ > * > * Mapping from event name to DTrace probe name: > * > * fbt:vmlinux::entry > * fbt:vmlinux::return > + * rawfbt:vmlinux::entry > + * rawfbt:vmlinux::return > * or > * [] fbt:::entry > * fbt:::return > + * rawfbt:::entry > + * rawfbt:::return > */ > #include > #include > @@ -514,9 +552,20 @@ static void kprobe_detach(dtrace_hdl_t *dtp, const dt_probe_t *prp) > if (fd == -1) > return; > > - dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA, > - prp->desc->fun); > + if (strcmp(prp->desc->prv, dt_rawfbt.name) == 0) { > + char *p; > + Do you want to make tpn a strdup() here before modifying it? Also, how about a comment mentioning the .->_ conversion, analogous to the comment that was in rawtp detach() or that is still in kprobe_attach()? 
> + for (p = tpn; *p; p++) { > + if (*p == '.') > + *p = '_'; > + } > + } > + > + dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA, tpn); > close(fd); > + > + if (tpn != prp->desc->fun) > + free(tpn); > } From eugene.loh at oracle.com Tue Mar 18 05:26:32 2025 From: eugene.loh at oracle.com (Eugene Loh) Date: Tue, 18 Mar 2025 01:26:32 -0400 Subject: [DTrace-devel] [PATCH 5/8] symtab: add support for 'traceable' flag In-Reply-To: <20250307213441.9495-4-kris.van.hees@oracle.com> References: <20250307213441.9495-1-kris.van.hees@oracle.com> <20250307213441.9495-4-kris.van.hees@oracle.com> Message-ID: Reviewed-by: Eugene Loh Note... Both files need updated copyright years? On 3/7/25 16:34, Kris Van Hees via DTrace-devel wrote: > Signed-off-by: Kris Van Hees > --- > libdtrace/dt_symtab.c | 53 +++++++++++++++++++++++++++++++++++++++---- > libdtrace/dt_symtab.h | 6 +++++ > 2 files changed, 55 insertions(+), 4 deletions(-) > > diff --git a/libdtrace/dt_symtab.c b/libdtrace/dt_symtab.c > index db63cc88..4e46f280 100644 > --- a/libdtrace/dt_symtab.c > +++ b/libdtrace/dt_symtab.c > @@ -23,9 +23,12 @@ > #include > #include > > -#define DT_ST_SORTED 0x01 /* Sorted, ready for searching. */ > -#define DT_ST_PACKED 0x02 /* Symbol table packed > +#define DT_ST_SORTED 0x01 /* Sorted, ready for searching. */ > +#define DT_ST_PACKED 0x02 /* Symbol table packed > * (necessarily sorted too) */ > +#define DT_ST_TRACEABLE 0x04 /* Symbols have traceable flag */ > + > +#define DT_STB_TRACE 8 /* traceable symbol */ > > struct dt_symbol { > dt_list_t dts_list; /* list forward/back pointers */ > @@ -275,6 +278,12 @@ dt_symbol_by_name(dtrace_hdl_t *dtp, const char *name) > return dt_htab_lookup(dtp->dt_kernsyms, &tmpl); > } > > +dt_symbol_t * > +dt_symbol_by_name_next(const dt_symbol_t *symbol) > +{ > + return symbol ? (dt_symbol_t *)symbol->dts_he.next : NULL; > +} > + Obviously, not a big deal, but this function seems to have nothing really to do with this patch.? 
Arguably, its introduction should be deferred to the next patch.? But, I'm fine with leaving it here. > /* Find a symbol in a given module. */ > dt_symbol_t * > dt_module_symbol_by_name(dtrace_hdl_t *dtp, dt_module_t *dmp, const char *name) > @@ -548,7 +557,7 @@ dt_symbol_name(const dt_symbol_t *symbol) > void > dt_symbol_to_elfsym64(dtrace_hdl_t *dtp, dt_symbol_t *symbol, Elf64_Sym *elf_symp) > { > - elf_symp->st_info = symbol->dts_info; > + elf_symp->st_info = symbol->dts_info & ~GELF_ST_INFO(DT_STB_TRACE, 0); > elf_symp->st_value = symbol->dts_addr; > elf_symp->st_size = symbol->dts_size; > elf_symp->st_shndx = 1; /* 'not SHN_UNDEF' is all we guarantee */ > @@ -557,7 +566,7 @@ dt_symbol_to_elfsym64(dtrace_hdl_t *dtp, dt_symbol_t *symbol, Elf64_Sym *elf_sym > void > dt_symbol_to_elfsym32(dtrace_hdl_t *dtp, dt_symbol_t *symbol, Elf32_Sym *elf_symp) > { > - elf_symp->st_info = symbol->dts_info; > + elf_symp->st_info = symbol->dts_info & ~GELF_ST_INFO(DT_STB_TRACE, 0); > elf_symp->st_value = symbol->dts_addr; > elf_symp->st_size = symbol->dts_size; > elf_symp->st_shndx = 1; /* 'not SHN_UNDEF' is all we guarantee */ > @@ -581,3 +590,39 @@ dt_symbol_module(dt_symbol_t *symbol) > { > return symbol->dts_dmp; > } > + > +/* > + * Mark the symtab annotated with traceable flags on symbols. > + */ > +void > +dt_symtab_set_traceable(dt_symtab_t *symtab) > +{ > + symtab->dtst_flags |= DT_ST_TRACEABLE; > +} > + > +/* > + * Return whether symbols have traceable flags. > + */ > +int > +dt_symtab_traceable(const dt_symtab_t *symtab) > +{ > + return symtab->dtst_flags & DT_ST_TRACEABLE; > +} > + > +/* > + * Mark a symbol as traceable. > + */ > +void > +dt_symbol_set_traceable(dt_symbol_t *symbol) > +{ > + symbol->dts_info |= GELF_ST_INFO(DT_STB_TRACE, 0); > +} > + > +/* > + * Return true if the symbol can be traced. 
> + */ > +int > +dt_symbol_traceable(const dt_symbol_t *symbol) > +{ > + return GELF_ST_BIND(symbol->dts_info) & DT_STB_TRACE; > +} Very minor, but these four functions are very similar and yet their comments seem to have spurious differences. I would think the comments would be boringly similar. Also, the first comment ("Mark the symtab annotated with traceable flags on symbols.") makes sense to me only if I read the code and think about the comment a lot. How about these comments instead? /* Mark a symtab as having traceable flags on symbols. */ /* Return whether a symtab has traceable flags on symbols. */ /* Mark a symbol as traceable. */ /* Return whether a symbol is traceable. */ Or even have "set" and "get" functions and reduce the number of comments altogether. I mean, one can explain what "traceable" means in each context, but then people can easily enough understand what the "set" and "get" variants do. From kris.van.hees at oracle.com Tue Mar 18 12:59:10 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Tue, 18 Mar 2025 08:59:10 -0400 Subject: [DTrace-devel] [PATCH 8/8] fbt, rawfbt: consolidate code to avoid duplication In-Reply-To: References: <20250307213441.9495-1-kris.van.hees@oracle.com> <20250307213441.9495-7-kris.van.hees@oracle.com> Message-ID: On Tue, Mar 18, 2025 at 01:10:21AM -0400, Eugene Loh via DTrace-devel wrote: > Reviewed-by: Eugene Loh > > A few minor notes... > > On 3/7/25 16:34, Kris Van Hees via DTrace-devel wrote: > > diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c > > @@ -6,17 +6,26 @@ > > * > > * The Function Boundary Tracing (FBT) provider for DTrace. > > * > > - * FBT probes are exposed by the kernel as kprobes. They are listed in the > > - * TRACEFS/available_filter_functions file. Some kprobes are associated with > > - * a specific kernel module, while most are in the core kernel.
> > + * Kernnel functions can be traced through fentry/fexit probes (when available) > s/Kernnel/Kernel/ > > + * and kprobes. The FBT provider supports both implementations and will use > > + * fentry/fexit probes if the kernel supports them, and fallback to kprobes > > + * otherwise. The FBT provider does not support tracing synthetic functions > > + * (i.e. compiler-generated functions with a . in their name). > > + * > > + * The rawfbt provider implements a variant of the FBT provider and always uses > > + * kprobes. This provider allow tracing of synthetic function. > s/allow/allows/ > > * > > * Mapping from event name to DTrace probe name: > > * > > * fbt:vmlinux::entry > > * fbt:vmlinux::return > > + * rawfbt:vmlinux::entry > > + * rawfbt:vmlinux::return > > * or > > * [] fbt:::entry > > * fbt:::return > > + * rawfbt:::entry > > + * rawfbt:::return > > */ > > #include > > #include > > @@ -514,9 +552,20 @@ static void kprobe_detach(dtrace_hdl_t *dtp, const dt_probe_t *prp) > > if (fd == -1) > > return; > > - dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA, > > - prp->desc->fun); > > + if (strcmp(prp->desc->prv, dt_rawfbt.name) == 0) { > > + char *p; > > + > Do you want to make tpn a strdup() here before modifying it? Ah yes indeed. It doesn't break anything as it is now because detach is done when things are shutting down, but correctness would demand doing a strdup() here. Thanks. > Also, how about a comment mentioning the .->_ conversion, analogous to the > comment that was in rawtp detach() or that is still in kprobe_attach()? Sure. 
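For illustration only, the strdup()-before-modify idea discussed above could look roughly like the sketch below. The helper name tp_name_dup() is hypothetical and is not part of the patch; the sketch only assumes that the '.' to '_' conversion has to be applied to a private copy so that the probe description's function name is left untouched.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): return a heap-allocated copy of
 * a function name in which every '.' has been replaced by '_'.  The input
 * string is never modified.  The caller owns the result and must free() it.
 * Returns NULL if the allocation fails.
 */
static char *
tp_name_dup(const char *fun)
{
	char	*tpn, *p;

	assert(fun != NULL);

	tpn = strdup(fun);
	if (tpn == NULL)
		return NULL;

	/* Apply the .->_ conversion to the private copy only. */
	for (p = tpn; *p; p++) {
		if (*p == '.')
			*p = '_';
	}

	return tpn;
}
```

With a helper along these lines, the detach path would always free the returned string, rather than comparing pointers to decide whether a copy was made.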
> > + for (p = tpn; *p; p++) { > > + if (*p == '.') > > + *p = '_'; > > + } > > + } > > + > > + dprintf(fd, "-:" FBT_GROUP_FMT "/%s\n", FBT_GROUP_DATA, tpn); > > close(fd); > > + > > + if (tpn != prp->desc->fun) > > + free(tpn); > > } > > _______________________________________________ > DTrace-devel mailing list > DTrace-devel at oss.oracle.com > https://oss.oracle.com/mailman/listinfo/dtrace-devel From noreply at github.com Tue Mar 18 18:00:38 2025 From: noreply at github.com (Kris Van Hees) Date: Tue, 18 Mar 2025 11:00:38 -0700 Subject: [DTrace-devel] [oracle/dtrace-utils] cfd279: proc: convert to use standard SDT provider impleme... Message-ID: Branch: refs/heads/devel Home: https://github.com/oracle/dtrace-utils Commit: cfd279acd47237ac58883724c3a8dc93aa4e362d https://github.com/oracle/dtrace-utils/commit/cfd279acd47237ac58883724c3a8dc93aa4e362d Author: Kris Van Hees Date: 2025-03-18 (Tue, 18 Mar 2025) Changed paths: M libdtrace/dt_prov_proc.c Log Message: ----------- proc: convert to use standard SDT provider implementation The proc provider was the first SDT-based provider implemented in DTrace based on BPF, and therefore handled the enabling of probes with custom code. When the other SDT-based providers (sched, ...) were implemented, a generic SDT-framework was developed. Update the proc provider to use the generic SDT-framework.
Signed-off-by: Kris Van Hees Reviewed-by: Eugene Loh Commit: b71bbf7a94acbcba96eac10a23f2d2864b6b03d6 https://github.com/oracle/dtrace-utils/commit/b71bbf7a94acbcba96eac10a23f2d2864b6b03d6 Author: Kris Van Hees Date: 2025-03-18 (Tue, 18 Mar 2025) Changed paths: M libdtrace/dt_prov_sched.c Log Message: ----------- sched: clean up unnecessary includes and functions Signed-off-by: Kris Van Hees Reviewed-by: Eugene Loh Commit: d9dac3fc40e1789e0d43f3e91d6a4cad2f287b68 https://github.com/oracle/dtrace-utils/commit/d9dac3fc40e1789e0d43f3e91d6a4cad2f287b68 Author: Kris Van Hees Date: 2025-03-18 (Tue, 18 Mar 2025) Changed paths: M libdtrace/dt_prov_rawfbt.c Log Message: ----------- rawfbt: perform lookup on true symbol names When encountering a . symbol, a symbol lookup was done for instead of . under the assumption that names with . in them were not listed in kallsyms. But that is not true. Signed-off-by: Kris Van Hees Reviewed-by: Eugene Loh Commit: c1b7d2016f7eb555a4f5d15eb5f6aa081b174643 https://github.com/oracle/dtrace-utils/commit/c1b7d2016f7eb555a4f5d15eb5f6aa081b174643 Author: Kris Van Hees Date: 2025-03-18 (Tue, 18 Mar 2025) Changed paths: M libdtrace/dt_module.c Log Message: ----------- ksyms: make symbol name filters less picky Some symbols were being filtered out even though they represent symbols that can actually be probed. Signed-off-by: Kris Van Hees Reviewed-by: Eugene Loh Commit: a53f33fb5acf287376bc527fd7eacf991110538c https://github.com/oracle/dtrace-utils/commit/a53f33fb5acf287376bc527fd7eacf991110538c Author: Kris Van Hees Date: 2025-03-18 (Tue, 18 Mar 2025) Changed paths: M libdtrace/dt_symtab.c M libdtrace/dt_symtab.h Log Message: ----------- symtab: add support for 'traceable' flag and add dt_symbol_by_name_next() The traceable flag can be set on symbols to indicate that the symbol can be traced using kernel facilities. The traceable flag can be set on a symtab to indicate that its symbols have traceable flags associated with them.
When the same symbol exists in multiple modules, dt_symbol_by_name() returns a reference to the first one. The dt_symbol_by_name_next() function is used to walk through the rest of the list (if any). Signed-off-by: Kris Van Hees Reviewed-by: Eugene Loh Commit: 87e199975769f57387ce9a0cbebb653574c578f4 https://github.com/oracle/dtrace-utils/commit/87e199975769f57387ce9a0cbebb653574c578f4 Author: Kris Van Hees Date: 2025-03-18 (Tue, 18 Mar 2025) Changed paths: M libdtrace/dt_module.c M libdtrace/dt_module.h M libdtrace/dt_prov_fbt.c Log Message: ----------- fbt: performance improvements Up until now, FBT probes were registered for every symbol that was listed as traceable. Most tracing sessions do not use most or even any of these, and the process of registering them all was quite slow. Going forward, FBT probes are registered on demand. If any FBT probes are to be registered, the first will incur the cost of reading the entire list of traceable symbols. Any further FBT probe registration will be able to be satisfied based on that initial processing. The performance improvement is therefore quite significant for tracing sessions that do not trigger any FBT probe registration, and if FBT probes are used, the improvement is still quite noticeable because only the probes that are actually needed get registered.
Signed-off-by: Kris Van Hees Reviewed-by: Eugene Loh Commit: 2f40aa3dbcf4c0697230d41dcfb0d64e4448b3ad https://github.com/oracle/dtrace-utils/commit/2f40aa3dbcf4c0697230d41dcfb0d64e4448b3ad Author: Kris Van Hees Date: 2025-03-18 (Tue, 18 Mar 2025) Changed paths: M libdtrace/dt_prov_rawfbt.c Log Message: ----------- rawfbt: performance improvements Signed-off-by: Kris Van Hees Reviewed-by: Eugene Loh Commit: 0b7c5a6327dc0f3a4f6e1bed401e17442a931d52 https://github.com/oracle/dtrace-utils/commit/0b7c5a6327dc0f3a4f6e1bed401e17442a931d52 Author: Kris Van Hees Date: 2025-03-18 (Tue, 18 Mar 2025) Changed paths: M libdtrace/Build M libdtrace/dt_prov_fbt.c R libdtrace/dt_prov_rawfbt.c Log Message: ----------- fbt, rawfbt: consolidate code to avoid duplication After optimizing both fbt and rawfbt providers, the resulting code has a significant amount of duplication. The rawfbt provider can now be defined in terms of the kprobe-based fbt provider functions. Signed-off-by: Kris Van Hees Reviewed-by: Eugene Loh Commit: 4bb6327a2de6611e91095b5e6177285f4bb00f0d https://github.com/oracle/dtrace-utils/commit/4bb6327a2de6611e91095b5e6177285f4bb00f0d Author: Kris Van Hees Date: 2025-03-18 (Tue, 18 Mar 2025) Changed paths: M libdtrace/dt_bpf.c M libdtrace/dt_bpf.h M libdtrace/dt_error.c M libdtrace/dt_prov_uprobe.c Log Message: ----------- error: report probe name on failed enabling error Signed-off-by: Kris Van Hees Reviewed-by: Eugene Loh Commit: ad68224b00187f7345284e92ba79b6b99540b8bb https://github.com/oracle/dtrace-utils/commit/ad68224b00187f7345284e92ba79b6b99540b8bb Author: Kris Van Hees Date: 2025-03-18 (Tue, 18 Mar 2025) Changed paths: M libdtrace/dt_bpf.c M libdtrace/dt_prov_fbt.c Log Message: ----------- bpf: fix have_attach_type() detection There are kernel versions that support the BPF_TRACE_FENTRY attach type at program load, but do not support opening the attachment point (a kernel symbol by BTF id) as a raw tracepoint. 
The cause is that the support for fentry as a raw tracepoint was initially only implemented on x86_64. We now test both program load *and* opening the raw tracepoint to know if BPF_TRACE_FENTRY as attach type is supported. Signed-off-by: Kris Van Hees Reviewed-by: Eugene Loh Compare: https://github.com/oracle/dtrace-utils/compare/39a5e0a8866b...ad68224b0018 To unsubscribe from these emails, change your notification settings at https://github.com/oracle/dtrace-utils/settings/notifications From kris.van.hees at oracle.com Tue Mar 18 18:17:58 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Tue, 18 Mar 2025 14:17:58 -0400 Subject: [DTrace-devel] [PATCH] tests: add test for buggy deduplicator In-Reply-To: <20250109164711.232013-1-nick.alcock@oracle.com> References: <20250109164711.232013-1-nick.alcock@oracle.com> Message-ID: This seems like a test that would belong in the kernel tree rather than in DTrace? It seems like a weird test to have included in DTrace because if it fails, we cannot do anything about it in DTrace anyway. I'd rather leave out tests that exercise things that are not DTrace itself. On Thu, Jan 09, 2025 at 04:47:10PM +0000, Nick Alcock wrote: > Some early prototype deduplicators dedupped types in one dict (more or less) > rather than putting conflicting types and module types into > sub-dictionaries. Fail if the running kernel is such a kernel.
> > Signed-off-by: Nick Alcock > --- > dtrace.spec | 2 +- > test/smoke/tst.ctf-intact.sh | 58 ++++++++++++++++++++++++++++++++++++ > 2 files changed, 59 insertions(+), 1 deletion(-) > create mode 100755 test/smoke/tst.ctf-intact.sh > > diff --git a/dtrace.spec b/dtrace.spec > index 902ad7d8bb980..cf960f14b55c7 100644 > --- a/dtrace.spec > +++ b/dtrace.spec > @@ -151,7 +151,7 @@ Requires: %{name}-devel = %{version}-%{release} perl gcc java > Requires: java-1.8.0-openjdk-devel perl-IO-Socket-IP xfsprogs > Requires: exportfs vim-minimal %{name}%{?_isa} = %{version}-%{release} > Requires: coreutils wireshark %{glibc32} > -Requires: perf time bc nfs-utils > +Requires: perf time bc nfs-utils binutils > Autoreq: 0 > Group: Development/System > > diff --git a/test/smoke/tst.ctf-intact.sh b/test/smoke/tst.ctf-intact.sh > new file mode 100755 > index 0000000000000..d737a2b162fcb > --- /dev/null > +++ b/test/smoke/tst.ctf-intact.sh > @@ -0,0 +1,58 @@ > +#!/bin/bash > +# > +# Oracle Linux DTrace. > +# Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. 2024 -> 2025 > +# Licensed under the Universal Permissive License v 1.0 as shown at > +# http://oss.oracle.com/licenses/upl. > +# > + > +# > +# This script verifies that the CTF, if present, is non-corrupt: in > +# particular, that it has at least one child with The rest of the comment is missing? > +# > + > +ctf="/lib/modules/$(uname -r)/kernel/vmlinux.ctfa" > + > +if [[ ! -f "$ctf" ]]; then > + echo "CTF not found in expected location of $ctf" >&2 > + exit 67 > +fi > + > +# If this is not an ELF file, turn it into one so objdump works. > +if ! [[ "$(file "$ctf")" =~ ELF ]]; then > + objcopy --add-section=.ctf="$ctf" /bin/true $tmpdir/ctf > + ctf=$tmpdir/ctf > +fi > + > +# Dump the CTF > +objdump --ctf --ctf-parent=shared_ctf "$ctf" 2>/dev/null | \ > + awk ' > +BEGIN { > + intypes=0; > +} > + > +/^ Strings:|^CTF archive member:/ { > + intypes = 0; > +} > +# Scan for each member, capture its name. 
> +/^CTF archive member: / { > + member=gensub (/CTF archive member: (.*):/,"\\1", "g"); > + next; > +} > +# See if any non-shared dicts have any types in. > +/^ Types:/ { > + if (member != "shared_ctf") { > + intypes=1; > + } > +} > +/^ 0x/ { > + if (intypes) { > + exit (0); > + } > +} > +END { > + if (!intypes) { > + printf ("No non-shared-dict types found: probably buggy deduplicator.\n"); > + exit (1); > + } > +}' > -- > 2.47.1.279.g84c5f4e78e > From eugene.loh at oracle.com Tue Mar 18 18:59:45 2025 From: eugene.loh at oracle.com (Eugene Loh) Date: Tue, 18 Mar 2025 14:59:45 -0400 Subject: [DTrace-devel] [PATCH] tests: add test for buggy deduplicator In-Reply-To: References: <20250109164711.232013-1-nick.alcock@oracle.com> Message-ID: <38e69a45-1a32-50c1-0eef-bcdf8cf10348@oracle.com> But if it is a dependency, it would be nice to know there is a problem. On 3/18/25 14:17, Kris Van Hees wrote: > This seems like a test that would belong in the kernel tree rather than in > DTrace? It seems like a weird tst to have included in DTrace because if it > fails, we cannot do anything about it in DTrace anyway. > > I'd rather leave out tests that exercise things that are not DTrace itself. > > On Thu, Jan 09, 2025 at 04:47:10PM +0000, Nick Alcock wrote: >> Some early prototype deduplicators dedupped types in one dict (more or less) >> rather than putting conflicting types and module types into >> sub-dictionaries. Fail if the running kernel is such a kernel. 
>> >> Signed-off-by: Nick Alcock >> --- >> dtrace.spec | 2 +- >> test/smoke/tst.ctf-intact.sh | 58 ++++++++++++++++++++++++++++++++++++ >> 2 files changed, 59 insertions(+), 1 deletion(-) >> create mode 100755 test/smoke/tst.ctf-intact.sh >> >> diff --git a/dtrace.spec b/dtrace.spec >> index 902ad7d8bb980..cf960f14b55c7 100644 >> --- a/dtrace.spec >> +++ b/dtrace.spec >> @@ -151,7 +151,7 @@ Requires: %{name}-devel = %{version}-%{release} perl gcc java >> Requires: java-1.8.0-openjdk-devel perl-IO-Socket-IP xfsprogs >> Requires: exportfs vim-minimal %{name}%{?_isa} = %{version}-%{release} >> Requires: coreutils wireshark %{glibc32} >> -Requires: perf time bc nfs-utils >> +Requires: perf time bc nfs-utils binutils >> Autoreq: 0 >> Group: Development/System >> >> diff --git a/test/smoke/tst.ctf-intact.sh b/test/smoke/tst.ctf-intact.sh >> new file mode 100755 >> index 0000000000000..d737a2b162fcb >> --- /dev/null >> +++ b/test/smoke/tst.ctf-intact.sh >> @@ -0,0 +1,58 @@ >> +#!/bin/bash >> +# >> +# Oracle Linux DTrace. >> +# Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. > 2024 -> 2025 > >> +# Licensed under the Universal Permissive License v 1.0 as shown at >> +# http://oss.oracle.com/licenses/upl. >> +# >> + >> +# >> +# This script verifies that the CTF, if present, is non-corrupt: in >> +# particular, that it has at least one child with > The rest of the comment is missing? > >> +# >> + >> +ctf="/lib/modules/$(uname -r)/kernel/vmlinux.ctfa" >> + >> +if [[ ! -f "$ctf" ]]; then >> + echo "CTF not found in expected location of $ctf" >&2 >> + exit 67 >> +fi >> + >> +# If this is not an ELF file, turn it into one so objdump works. >> +if ! 
[[ "$(file "$ctf")" =~ ELF ]]; then >> + objcopy --add-section=.ctf="$ctf" /bin/true $tmpdir/ctf >> + ctf=$tmpdir/ctf >> +fi >> + >> +# Dump the CTF >> +objdump --ctf --ctf-parent=shared_ctf "$ctf" 2>/dev/null | \ >> + awk ' >> +BEGIN { >> + intypes=0; >> +} >> + >> +/^ Strings:|^CTF archive member:/ { >> + intypes = 0; >> +} >> +# Scan for each member, capture its name. >> +/^CTF archive member: / { >> + member=gensub (/CTF archive member: (.*):/,"\\1", "g"); >> + next; >> +} >> +# See if any non-shared dicts have any types in. >> +/^ Types:/ { >> + if (member != "shared_ctf") { >> + intypes=1; >> + } >> +} >> +/^ 0x/ { >> + if (intypes) { >> + exit (0); >> + } >> +} >> +END { >> + if (!intypes) { >> + printf ("No non-shared-dict types found: probably buggy deduplicator.\n"); >> + exit (1); >> + } >> +}' >> -- >> 2.47.1.279.g84c5f4e78e >> From kris.van.hees at oracle.com Tue Mar 18 19:01:39 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Tue, 18 Mar 2025 15:01:39 -0400 Subject: [DTrace-devel] [PATCH] tests: add test for buggy deduplicator In-Reply-To: <38e69a45-1a32-50c1-0eef-bcdf8cf10348@oracle.com> References: <20250109164711.232013-1-nick.alcock@oracle.com> <38e69a45-1a32-50c1-0eef-bcdf8cf10348@oracle.com> Message-ID: On Tue, Mar 18, 2025 at 02:59:45PM -0400, Eugene Loh wrote: > But if it is a dependency, it would be nice to know there is a problem. I believe we would already know there is a problem because of other tests failing. > On 3/18/25 14:17, Kris Van Hees wrote: > > This seems like a test that would belong in the kernel tree rather than in > > DTrace? It seems like a weird tst to have included in DTrace because if it > > fails, we cannot do anything about it in DTrace anyway. > > > > I'd rather leave out tests that exercise things that are not DTrace itself. 
> > > > On Thu, Jan 09, 2025 at 04:47:10PM +0000, Nick Alcock wrote: > > > Some early prototype deduplicators dedupped types in one dict (more or less) > > > rather than putting conflicting types and module types into > > > sub-dictionaries. Fail if the running kernel is such a kernel. > > > > > > Signed-off-by: Nick Alcock > > > --- > > > dtrace.spec | 2 +- > > > test/smoke/tst.ctf-intact.sh | 58 ++++++++++++++++++++++++++++++++++++ > > > 2 files changed, 59 insertions(+), 1 deletion(-) > > > create mode 100755 test/smoke/tst.ctf-intact.sh > > > > > > diff --git a/dtrace.spec b/dtrace.spec > > > index 902ad7d8bb980..cf960f14b55c7 100644 > > > --- a/dtrace.spec > > > +++ b/dtrace.spec > > > @@ -151,7 +151,7 @@ Requires: %{name}-devel = %{version}-%{release} perl gcc java > > > Requires: java-1.8.0-openjdk-devel perl-IO-Socket-IP xfsprogs > > > Requires: exportfs vim-minimal %{name}%{?_isa} = %{version}-%{release} > > > Requires: coreutils wireshark %{glibc32} > > > -Requires: perf time bc nfs-utils > > > +Requires: perf time bc nfs-utils binutils > > > Autoreq: 0 > > > Group: Development/System > > > diff --git a/test/smoke/tst.ctf-intact.sh b/test/smoke/tst.ctf-intact.sh > > > new file mode 100755 > > > index 0000000000000..d737a2b162fcb > > > --- /dev/null > > > +++ b/test/smoke/tst.ctf-intact.sh > > > @@ -0,0 +1,58 @@ > > > +#!/bin/bash > > > +# > > > +# Oracle Linux DTrace. > > > +# Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. > > 2024 -> 2025 > > > > > +# Licensed under the Universal Permissive License v 1.0 as shown at > > > +# http://oss.oracle.com/licenses/upl. > > > +# > > > + > > > +# > > > +# This script verifies that the CTF, if present, is non-corrupt: in > > > +# particular, that it has at least one child with > > The rest of the comment is missing? > > > > > +# > > > + > > > +ctf="/lib/modules/$(uname -r)/kernel/vmlinux.ctfa" > > > + > > > +if [[ ! 
-f "$ctf" ]]; then > > > + echo "CTF not found in expected location of $ctf" >&2 > > > + exit 67 > > > +fi > > > + > > > +# If this is not an ELF file, turn it into one so objdump works. > > > +if ! [[ "$(file "$ctf")" =~ ELF ]]; then > > > + objcopy --add-section=.ctf="$ctf" /bin/true $tmpdir/ctf > > > + ctf=$tmpdir/ctf > > > +fi > > > + > > > +# Dump the CTF > > > +objdump --ctf --ctf-parent=shared_ctf "$ctf" 2>/dev/null | \ > > > + awk ' > > > +BEGIN { > > > + intypes=0; > > > +} > > > + > > > +/^ Strings:|^CTF archive member:/ { > > > + intypes = 0; > > > +} > > > +# Scan for each member, capture its name. > > > +/^CTF archive member: / { > > > + member=gensub (/CTF archive member: (.*):/,"\\1", "g"); > > > + next; > > > +} > > > +# See if any non-shared dicts have any types in. > > > +/^ Types:/ { > > > + if (member != "shared_ctf") { > > > + intypes=1; > > > + } > > > +} > > > +/^ 0x/ { > > > + if (intypes) { > > > + exit (0); > > > + } > > > +} > > > +END { > > > + if (!intypes) { > > > + printf ("No non-shared-dict types found: probably buggy deduplicator.\n"); > > > + exit (1); > > > + } > > > +}' > > > -- > > > 2.47.1.279.g84c5f4e78e > > > From kris.van.hees at oracle.com Tue Mar 18 19:18:27 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Tue, 18 Mar 2025 15:18:27 -0400 Subject: [DTrace-devel] [PATCH 4/4] test: Add test for predefined preprocessor definitions In-Reply-To: <20250208190622.23484-4-eugene.loh@oracle.com> References: <20250208190622.23484-1-eugene.loh@oracle.com> <20250208190622.23484-4-eugene.loh@oracle.com> Message-ID: On Sat, Feb 08, 2025 at 02:06:22PM -0500, eugene.loh at oracle.com wrote: > From: Eugene Loh > > Orabug: 28763074 > Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees ... and incidentally, should we add defines with ORCL instead of SUNW (but keep SUNW variants for backwards compatibility)? Or some other forms that do not include SUNW. Things like __DTRACE? 
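For reference, the version encoding that the patch below documents for the D version macro, (MM << 24 | mmm << 12 | uuu), can be sketched in C as follows. The function names are hypothetical and are not taken from libdtrace; the field widths (8 bits for major, 12 bits each for minor and micro) are an assumption implied by the shift amounts.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers (names are illustrative only): pack and unpack a
 * version number laid out as MM << 24 | mmm << 12 | uuu.  The field
 * widths are assumed from the shifts: 8 bits major, 12 bits minor,
 * 12 bits micro.
 */
static uint32_t
d_version_encode(unsigned int major, unsigned int minor, unsigned int micro)
{
	assert(major <= 0xff && minor <= 0xfff && micro <= 0xfff);

	return ((uint32_t)major << 24) | ((uint32_t)minor << 12) |
	       (uint32_t)micro;
}

static unsigned int
d_version_major(uint32_t v)
{
	return v >> 24;
}

static unsigned int
d_version_minor(uint32_t v)
{
	return (v >> 12) & 0xfff;
}

static unsigned int
d_version_micro(uint32_t v)
{
	return v & 0xfff;
}
```

Under these assumptions, a version such as 2.0.2 would be printed by the test's "%x" format as 2000002.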
> --- > COMMANDLINE-OPTIONS | 10 +- > test/unittest/preprocessor/tst.predefined.r | 1 + > test/unittest/preprocessor/tst.predefined.sh | 119 +++++++++++++++++++ > 3 files changed, 125 insertions(+), 5 deletions(-) > create mode 100644 test/unittest/preprocessor/tst.predefined.r > create mode 100755 test/unittest/preprocessor/tst.predefined.sh > > diff --git a/COMMANDLINE-OPTIONS b/COMMANDLINE-OPTIONS > index 40561af91..73be89b1f 100644 > --- a/COMMANDLINE-OPTIONS > +++ b/COMMANDLINE-OPTIONS > @@ -321,12 +321,12 @@ definitions are always specified and valid in all modes: > * __sparcv9 (on SPARC systems only when 64-bit programs are compiled) > * __i386 (on x86 systems only when 32-bit programs are compiled) > * __amd64 (on x86 systems only when 64-bit programs are compiled) > - * _`uname -s` (for example, __Linux) > + * __`uname -s` (for example, __Linux) > * __SUNW_D=1 > - * _SUNW_D_VERSION=0x_MMmmmuuu (where MM is the Major release value > - in hexadecimal, mmm is the Minor release value in hexadecimal, > - and uuu is the Micro release value in hexadecimal; see Chapter > - 41, Versioning for more information about DTrace versioning) > + * _SUNW_D_VERSION=(MM << 24 | mmm << 12 | uuu), where > + MM is the Major release value > + mmm is the Minor release value > + uuu is the Micro release value > > -Z > Permit probe descriptions that match zero probes. If the -Z option is > diff --git a/test/unittest/preprocessor/tst.predefined.r b/test/unittest/preprocessor/tst.predefined.r > new file mode 100644 > index 000000000..2e9ba477f > --- /dev/null > +++ b/test/unittest/preprocessor/tst.predefined.r > @@ -0,0 +1 @@ > +success > diff --git a/test/unittest/preprocessor/tst.predefined.sh b/test/unittest/preprocessor/tst.predefined.sh > new file mode 100755 > index 000000000..79caf17ac > --- /dev/null > +++ b/test/unittest/preprocessor/tst.predefined.sh > @@ -0,0 +1,119 @@ > +#!/bin/bash > +# > +# Oracle Linux DTrace. > +# Copyright (c) 2025, Oracle and/or its affiliates.
All rights reserved. > +# Licensed under the Universal Permissive License v 1.0 as shown at > +# http://oss.oracle.com/licenses/upl. > +# > +# Confirm preprocessor pre-definitions. > + > +dtrace=$1 > + > +DIRNAME=$tmpdir/predefined.$$.$RANDOM > +mkdir -p $DIRNAME > +cd $DIRNAME > + > +# Arg 1 is macro that we check is defined. > + > +function check_defined() { > + # Add to script: #ifdef is okay, else is ERROR. > + echo '#ifdef' $1 >> D.d > + echo 'printf("'$1' okay\n");' >> D.d > + echo '#else' >> D.d > + echo 'printf("ERROR! missing '$1'\n");' >> D.d > + echo '#endif' >> D.d > + > + # Add to check file: expect "okay" message. > + echo $1 okay >> chk.txt > +} > + > +# Arg 1 is macro whose value we check to be arg 2. > + > +function check_value() { > + # Add to script: print value. > + echo 'printf("'$1'=%x\n", '$1');' >> D.d > + > + # Add to check file: expected value. > + echo $1=$2 >> chk.txt > +} > + > +# Arg 1 is macro that we check is not defined. > + > +function check_undef() { > + # Add to script: #ifdef is ERROR, else is okay. > + echo '#ifdef' $1 >> D.d > + echo 'printf("ERROR! found '$1'\n");' >> D.d > + echo '#else' >> D.d > + echo 'printf("missing '$1' is okay\n");' >> D.d > + echo '#endif' >> D.d > + > + # Add to check file: expect "okay" message. > + echo missing $1 is okay >> chk.txt > +} > + > +# Construct version string (major, minor, micro). > + > +read MM mmm uuu <<< `dtrace -vV | awk '/^This is DTrace / { gsub("\\\.", " "); print $(NF-2), $(NF-1), $NF }'` > +vers=`printf "%x" $(($MM << 24 | $mmm << 12 | $uuu))` > + > +# Start setting up the D script. > + > +echo 'BEGIN {' > D.d > + > +# Check for the preprocessor definitions of COMMANDLINE-OPTIONS. 
> + > +check_defined __linux > +check_defined __unix > +check_defined __SVR4 > +if [ `uname -m` == x86_64 ]; then > +check_defined __amd64 > +else > +check_undef __amd64 > +fi > +check_defined __`uname -s` > +check_value __SUNW_D 1 > +check_value __SUNW_D_VERSION $vers > + > +# Confirm other preprocessor definitions. > + > +check_defined __SUNW_D_64 > + > +# Confirm that __GNUC__ is not present. > + > +check_undef __GNUC__ > + > +# Finish setting up the D script. > + > +echo 'exit(0); }' >> D.d > +echo >> chk.txt > + > +# Run the D script. > + > +$dtrace $dt_flags -qCs D.d -o out.txt > +if [ $? -ne 0 ]; then > + echo ERROR: DTrace failed > + echo "==== D.d" > + cat D.d > + echo "==== out.txt" > + cat out.txt > + exit 1 > +fi > + > +# Check. > + > +if ! diff -q chk.txt out.txt; then > + echo ERROR output disagrees > + echo === expect === > + cat chk.txt > + echo === actual === > + cat out.txt > + echo === diff === > + diff chk.txt out.txt > + exit 1 > +fi > + > +# Indicate success. > + > +echo success > + > +exit 0 > -- > 2.43.5 > From kris.van.hees at oracle.com Tue Mar 18 19:29:12 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Tue, 18 Mar 2025 15:29:12 -0400 Subject: [DTrace-devel] [PATCH v2 1/4] Rename _DTRACE_VERSION In-Reply-To: <20250227190154.23241-1-eugene.loh@oracle.com> References: <20250227190154.23241-1-eugene.loh@oracle.com> Message-ID: On Thu, Feb 27, 2025 at 02:01:54PM -0500, eugene.loh at oracle.com wrote: > From: Eugene Loh > > There are many DTrace version numbers (for version, API version, > package version, etc.). Meanwhile, _DTRACE_VERSION is not a > version number at all. It's a preprocessor macro in USDT .h header > files. Prior to commit e2fb0ecd9 > ("Ensure multiple passes through dtrace -G work."), it was perhaps > not even set. 
With that commit, it was always set to 1, with > the rationale: > > Also add an explicit define for _DTRACE__VERSION in the generated __ -> _ (will fix when I merge) > header file from 'dtrace -h' invocations. This seems silly, but > it is there to give people a skeleton to work with if they want to > pre-generate header files and select whether to actually compile > on the probes at a later time. > > Rename to _DTRACE_USE_USDT for better clarity. Define it only once > per file. Place the definition inside an #ifndef test so that a > developer could set the value without manually changing the file. > > Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees > --- > libdtrace/dt_program.c | 10 +++++++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > diff --git a/libdtrace/dt_program.c b/libdtrace/dt_program.c > index 23b91fb2e..cfb7a5fc3 100644 > --- a/libdtrace/dt_program.c > +++ b/libdtrace/dt_program.c > @@ -505,13 +505,12 @@ dt_header_provider(dtrace_hdl_t *dtp, dt_provider_t *pvp, FILE *out) > info.dthi_pfname = alloca(strlen(pvp->desc.dtvd_name) + 1 + i); > dt_header_fmt_func(info.dthi_pfname, pvp->desc.dtvd_name); > > - if (fprintf(out, "#define _DTRACE_VERSION 1\n\n" > - "#if _DTRACE_VERSION\n\n") < 0) > + if (fprintf(out, "#if _DTRACE_USE_USDT\n\n") < 0) > return dt_set_errno(dtp, errno); > > if (dt_idhash_iter(pvp->pv_probes, dt_header_probe, &info) != 0) > return -1; /* dt_errno is set for us */ > - if (fprintf(out, "\n\n") < 0) > + if (fprintf(out, "\n") < 0) > return dt_set_errno(dtp, errno); > if (dt_idhash_iter(pvp->pv_probes, dt_header_decl, &info) != 0) > return -1; /* dt_errno is set for us */ > @@ -560,6 +559,11 @@ dtrace_program_header(dtrace_hdl_t *dtp, FILE *out, const char *fname) > "#endif\n\n") < 0) > return -1; > > + if (fprintf(out, "#ifndef _DTRACE_USE_USDT\n" > + "# define _DTRACE_USE_USDT 1\n" > + "#endif\n\n") < 0) > + return -1; > + > while ((pvp = dt_htab_next(dtp->dt_provs, &it)) != NULL) { > if (dt_header_provider(dtp, pvp, 
out) != 0) { > dt_htab_next_destroy(it); > -- > 2.43.5 > From eugene.loh at oracle.com Tue Mar 18 20:35:08 2025 From: eugene.loh at oracle.com (Eugene Loh) Date: Tue, 18 Mar 2025 16:35:08 -0400 Subject: [DTrace-devel] [PATCH 4/4] test: Add test for predefined preprocessor definitions In-Reply-To: References: <20250208190622.23484-1-eugene.loh@oracle.com> <20250208190622.23484-4-eugene.loh@oracle.com> Message-ID: <44b61914-f25d-c89c-5d9e-085bbab2fdb6@oracle.com> On 3/18/25 15:18, Kris Van Hees wrote: > On Sat, Feb 08, 2025 at 02:06:22PM -0500, eugene.loh at oracle.com wrote: >> From: Eugene Loh >> >> Orabug: 28763074 >> Signed-off-by: Eugene Loh > Reviewed-by: Kris Van Hees > > ... and incidentally, should we add defines with ORCL instead of SUNW > (but keep SUNW variants for backwards compatibility)? Or some other > forms that do not include SUNW. Things like __DTRACE? Separate patch, right? And, we need to coordinate documentation. >> --- >> COMMANDLINE-OPTIONS | 10 +- >> test/unittest/preprocessor/tst.predefined.r | 1 + >> test/unittest/preprocessor/tst.predefined.sh | 119 +++++++++++++++++++ >> 3 files changed, 125 insertions(+), 5 deletions(-) >> create mode 100644 test/unittest/preprocessor/tst.predefined.r >> create mode 100755 test/unittest/preprocessor/tst.predefined.sh >> >> diff --git a/COMMANDLINE-OPTIONS b/COMMANDLINE-OPTIONS >> index 40561af91..73be89b1f 100644 >> --- a/COMMANDLINE-OPTIONS >> +++ b/COMMANDLINE-OPTIONS >> @@ -321,12 +321,12 @@ definitions are always specified and valid in all modes: >> * __sparcv9 (on SPARC
systems only when 64-bit programs are compiled) >> * __i386 (on x86 systems only when 32-bit programs are compiled) >> * __amd64 (on x86 systems only when 64-bit programs are compiled) >> - * _`uname -s` (for example, __Linux) >> + * __`uname -s` (for example, __Linux) >> * __SUNW_D=1 >> - * _SUNW_D_VERSION=0x_MMmmmuuu (where MM is the Major release value >> - in hexadecimal, mmm is the Minor release value in hexadecimal, >> - and uuu is the Micro release value in hexadecimal; see Chapter >> - 41, Versioning for more information about DTrace versioning) >> + * _SUNW_D_VERSION=(MM << 24 | mmm << 12 | uuu), where >> + MM is the Major release value >> + mmm is the Minor release value >> + uuu is the Micro release value >> >> -Z >> Permit probe descriptions that match zero probes. If the -Z option is >> diff --git a/test/unittest/preprocessor/tst.predefined.r b/test/unittest/preprocessor/tst.predefined.r >> new file mode 100644 >> index 000000000..2e9ba477f >> --- /dev/null >> +++ b/test/unittest/preprocessor/tst.predefined.r >> @@ -0,0 +1 @@ >> +success >> diff --git a/test/unittest/preprocessor/tst.predefined.sh b/test/unittest/preprocessor/tst.predefined.sh >> new file mode 100755 >> index 000000000..79caf17ac >> --- /dev/null >> +++ b/test/unittest/preprocessor/tst.predefined.sh >> @@ -0,0 +1,119 @@ >> +#!/bin/bash >> +# >> +# Oracle Linux DTrace. >> +# Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved. >> +# Licensed under the Universal Permissive License v 1.0 as shown at >> +# http://oss.oracle.com/licenses/upl. >> +# >> +# Confirm preprocessor pre-definitions. >> + >> +dtrace=$1 >> + >> +DIRNAME=$tmpdir/predefined.$$.$RANDOM >> +mkdir -p $DIRNAME >> +cd $DIRNAME >> + >> +# Arg 1 is macro that we check is defined. >> + >> +function check_defined() { >> + # Add to script: #ifdef is okay, else is ERROR. >> + echo '#ifdef' $1 >> D.d >> + echo 'printf("'$1' okay\n");' >> D.d >> + echo '#else' >> D.d >> + echo 'printf("ERROR!
missing '$1'\n");' >> D.d >> + echo '#endif' >> D.d >> + >> + # Add to check file: expect "okay" message. >> + echo $1 okay >> chk.txt >> +} >> + >> +# Arg 1 is macro whose value we check to be arg 2. >> + >> +function check_value() { >> + # Add to script: print value. >> + echo 'printf("'$1'=%x\n", '$1');' >> D.d >> + >> + # Add to check file: expected value. >> + echo $1=$2 >> chk.txt >> +} >> + >> +# Arg 1 is macro that we check is not defined. >> + >> +function check_undef() { >> + # Add to script: #ifdef is ERROR, else is okay. >> + echo '#ifdef' $1 >> D.d >> + echo 'printf("ERROR! found '$1'\n");' >> D.d >> + echo '#else' >> D.d >> + echo 'printf("missing '$1' is okay\n");' >> D.d >> + echo '#endif' >> D.d >> + >> + # Add to check file: expect "okay" message. >> + echo missing $1 is okay >> chk.txt >> +} >> + >> +# Construct version string (major, minor, micro). >> + >> +read MM mmm uuu <<< `dtrace -vV | awk '/^This is DTrace / { gsub("\\\.", " "); print $(NF-2), $(NF-1), $NF }'` >> +vers=`printf "%x" $(($MM << 24 | $mmm << 12 | $uuu))` >> + >> +# Start setting up the D script. >> + >> +echo 'BEGIN {' > D.d >> + >> +# Check for the preprocessor definitions of COMMANDLINE-OPTIONS. >> + >> +check_defined __linux >> +check_defined __unix >> +check_defined __SVR4 >> +if [ `uname -m` == x86_64 ]; then >> +check_defined __amd64 >> +else >> +check_undef __amd64 >> +fi >> +check_defined __`uname -s` >> +check_value __SUNW_D 1 >> +check_value __SUNW_D_VERSION $vers >> + >> +# Confirm other preprocessor definitions. >> + >> +check_defined __SUNW_D_64 >> + >> +# Confirm that __GNUC__ is not present. >> + >> +check_undef __GNUC__ >> + >> +# Finish setting up the D script. >> + >> +echo 'exit(0); }' >> D.d >> +echo >> chk.txt >> + >> +# Run the D script. >> + >> +$dtrace $dt_flags -qCs D.d -o out.txt >> +if [ $? -ne 0 ]; then >> + echo ERROR: DTrace failed >> + echo "==== D.d" >> + cat D.d >> + echo "==== out.txt" >> + cat out.txt >> + exit 1 >> +fi >> + >> +# Check. 
>> + >> +if ! diff -q chk.txt out.txt; then >> + echo ERROR output disagrees >> + echo === expect === >> + cat chk.txt >> + echo === actual === >> + cat out.txt >> + echo === diff === >> + diff chk.txt out.txt >> + exit 1 >> +fi >> + >> +# Indicate success. >> + >> +echo success >> + >> +exit 0 >> -- >> 2.43.5 >> From kris.van.hees at oracle.com Tue Mar 18 20:42:58 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Tue, 18 Mar 2025 16:42:58 -0400 Subject: [DTrace-devel] [PATCH 4/4] test: Add test for predefined preprocessor definitions In-Reply-To: <44b61914-f25d-c89c-5d9e-085bbab2fdb6@oracle.com> References: <20250208190622.23484-1-eugene.loh@oracle.com> <20250208190622.23484-4-eugene.loh@oracle.com> <44b61914-f25d-c89c-5d9e-085bbab2fdb6@oracle.com> Message-ID: On Tue, Mar 18, 2025 at 04:35:08PM -0400, Eugene Loh wrote: > On 3/18/25 15:18, Kris Van Hees wrote: > > > On Sat, Feb 08, 2025 at 02:06:22PM -0500, eugene.loh at oracle.com wrote: > > > From: Eugene Loh > > > > > > Orabug: 28763074 > > > Signed-off-by: Eugene Loh > > Reviewed-by: Kris Van Hees > > > > ... and incidentally, should we add defines with ORCL instead of SUNW > > (but keep SUNW variants for backwards compatibility)? Or some other > > forms that do not include SUNW. Things like __DTRACE? > > Separate patch, right? And, we need to coordinate documentation. Yes, definitely separate patch.
> > > --- > > > COMMANDLINE-OPTIONS | 10 +- > > > test/unittest/preprocessor/tst.predefined.r | 1 + > > > test/unittest/preprocessor/tst.predefined.sh | 119 +++++++++++++++++++ > > > 3 files changed, 125 insertions(+), 5 deletions(-) > > > create mode 100644 test/unittest/preprocessor/tst.predefined.r > > > create mode 100755 test/unittest/preprocessor/tst.predefined.sh > > > > > > diff --git a/COMMANDLINE-OPTIONS b/COMMANDLINE-OPTIONS > > > index 40561af91..73be89b1f 100644 > > > --- a/COMMANDLINE-OPTIONS > > > +++ b/COMMANDLINE-OPTIONS > > > @@ -321,12 +321,12 @@ definitions are always specified and valid in all modes: > > > * __sparcv9 (on SPARC systems only when 64-bit programs are compiled) > > > * __i386 (on x86 systems only when 32-bit programs are compiled) > > > * __amd64 (on x86 systems only when 64-bit programs are compiled) > > > - * _`uname -s` (for example, __Linux) > > > + * __`uname -s` (for example, __Linux) > > > * __SUNW_D=1 > > > - * _SUNW_D_VERSION=0x_MMmmmuuu (where MM is the Major release value > > > - in hexadecimal, mmm is the Minor release value in hexadecimal, > > > - and uuu is the Micro release value in hexadecimal; see Chapter > > > - 41, Versioning for more information about DTrace versioning) > > > + * _SUNW_D_VERSION=(MM << 24 | mmm << 12 | uuu), where > > > + MM is the Major release value > > > + mmm is the Minor release value > > > + uuu is the Micro release value > > > -Z > > > Permit probe descriptions that match zero probes.
If the -Z option is > > > diff --git a/test/unittest/preprocessor/tst.predefined.r b/test/unittest/preprocessor/tst.predefined.r > > > new file mode 100644 > > > index 000000000..2e9ba477f > > > --- /dev/null > > > +++ b/test/unittest/preprocessor/tst.predefined.r > > > @@ -0,0 +1 @@ > > > +success > > > diff --git a/test/unittest/preprocessor/tst.predefined.sh b/test/unittest/preprocessor/tst.predefined.sh > > > new file mode 100755 > > > index 000000000..79caf17ac > > > --- /dev/null > > > +++ b/test/unittest/preprocessor/tst.predefined.sh > > > @@ -0,0 +1,119 @@ > > > +#!/bin/bash > > > +# > > > +# Oracle Linux DTrace. > > > +# Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved. > > > +# Licensed under the Universal Permissive License v 1.0 as shown at > > > +# http://oss.oracle.com/licenses/upl. > > > +# > > > +# Confirm preprocessor pre-definitions. > > > + > > > +dtrace=$1 > > > + > > > +DIRNAME=$tmpdir/predefined.$$.$RANDOM > > > +mkdir -p $DIRNAME > > > +cd $DIRNAME > > > + > > > +# Arg 1 is macro that we check is defined. > > > + > > > +function check_defined() { > > > + # Add to script: #ifdef is okay, else is ERROR. > > > + echo '#ifdef' $1 >> D.d > > > + echo 'printf("'$1' okay\n");' >> D.d > > > + echo '#else' >> D.d > > > + echo 'printf("ERROR! missing '$1'\n");' >> D.d > > > + echo '#endif' >> D.d > > > + > > > + # Add to check file: expect "okay" message. > > > + echo $1 okay >> chk.txt > > > +} > > > + > > > +# Arg 1 is macro whose value we check to be arg 2. > > > + > > > +function check_value() { > > > + # Add to script: print value. > > > + echo 'printf("'$1'=%x\n", '$1');' >> D.d > > > + > > > + # Add to check file: expected value. > > > + echo $1=$2 >> chk.txt > > > +} > > > + > > > +# Arg 1 is macro that we check is not defined. > > > + > > > +function check_undef() { > > > + # Add to script: #ifdef is ERROR, else is okay. > > > + echo '#ifdef' $1 >> D.d > > > + echo 'printf("ERROR! 
found '$1'\n");' >> D.d > > > + echo '#else' >> D.d > > > + echo 'printf("missing '$1' is okay\n");' >> D.d > > > + echo '#endif' >> D.d > > > + > > > + # Add to check file: expect "okay" message. > > > + echo missing $1 is okay >> chk.txt > > > +} > > > + > > > +# Construct version string (major, minor, micro). > > > + > > > +read MM mmm uuu <<< `dtrace -vV | awk '/^This is DTrace / { gsub("\\\.", " "); print $(NF-2), $(NF-1), $NF }'` > > > +vers=`printf "%x" $(($MM << 24 | $mmm << 12 | $uuu))` > > > + > > > +# Start setting up the D script. > > > + > > > +echo 'BEGIN {' > D.d > > > + > > > +# Check for the preprocessor definitions of COMMANDLINE-OPTIONS. > > > + > > > +check_defined __linux > > > +check_defined __unix > > > +check_defined __SVR4 > > > +if [ `uname -m` == x86_64 ]; then > > > +check_defined __amd64 > > > +else > > > +check_undef __amd64 > > > +fi > > > +check_defined __`uname -s` > > > +check_value __SUNW_D 1 > > > +check_value __SUNW_D_VERSION $vers > > > + > > > +# Confirm other preprocessor definitions. > > > + > > > +check_defined __SUNW_D_64 > > > + > > > +# Confirm that __GNUC__ is not present. > > > + > > > +check_undef __GNUC__ > > > + > > > +# Finish setting up the D script. > > > + > > > +echo 'exit(0); }' >> D.d > > > +echo >> chk.txt > > > + > > > +# Run the D script. > > > + > > > +$dtrace $dt_flags -qCs D.d -o out.txt > > > +if [ $? -ne 0 ]; then > > > + echo ERROR: DTrace failed > > > + echo "==== D.d" > > > + cat D.d > > > + echo "==== out.txt" > > > + cat out.txt > > > + exit 1 > > > +fi > > > + > > > +# Check. > > > + > > > +if ! diff -q chk.txt out.txt; then > > > + echo ERROR output disagrees > > > + echo === expect === > > > + cat chk.txt > > > + echo === actual === > > > + cat out.txt > > > + echo === diff === > > > + diff chk.txt out.txt > > > + exit 1 > > > +fi > > > + > > > +# Indicate success. 
> > > + > > > +echo success > > > + > > > +exit 0 > > > -- > > > 2.43.5 > > > From eugene.loh at oracle.com Wed Mar 19 06:32:26 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Wed, 19 Mar 2025 02:32:26 -0400 Subject: [DTrace-devel] [PATCH] test: Account for pid:::entry ucaller being correct Message-ID: <20250319063230.28171-1-eugene.loh@oracle.com> From: Eugene Loh In commit f38bdf9ea ("test: Account for pid:::entry ustack() being correct") we accounted for x86-specific heuristics introduced in Linux 6.11 that dealt with pid:::entry uprobes firing so early in the function preamble that the frame pointer is not yet set and the caller is not (yet) correctly identified. Update a related test to account for the same effect with ucaller. Signed-off-by: Eugene Loh --- test/unittest/vars/tst.ucaller.r.p | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) create mode 100755 test/unittest/vars/tst.ucaller.r.p diff --git a/test/unittest/vars/tst.ucaller.r.p b/test/unittest/vars/tst.ucaller.r.p new file mode 100755 index 000000000..8e03f110d --- /dev/null +++ b/test/unittest/vars/tst.ucaller.r.p @@ -0,0 +1,28 @@ +#!/bin/sh + +# A pid entry probe places a uprobe on the first instruction of a function. +# Unfortunately, this is so early in the function preamble that the function +# frame pointer has not yet been established and the actual caller of the +# traced function is missed. +# +# In Linux 6.11, x86-specific heuristics are introduced to fix this problem. +# See commit cfa7f3d +# ("perf,x86: avoid missing caller address in stack traces captured in uprobe") +# for both a description of the problem and an explanation of the heuristics. +# +# Add post processing to these test results to allow for both cases: +# caller frame is missing or not missing. + +if [ $(uname -m) == "x86_64" ]; then + read MAJOR MINOR <<< `uname -r | grep -Eo '^[0-9]+\.[0-9]+' | tr '.' 
' '` + + if [ $MAJOR -ge 6 ]; then + if [ $MAJOR -gt 6 -o $MINOR -ge 11 ]; then + awk '{ sub("myfunc_w", "myfunc_v"); print; }' + exit 0 + fi + fi +fi + +# Otherwise, just pass the output through. +cat -- 2.43.5 From eugene.loh at oracle.com Wed Mar 19 06:32:27 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Wed, 19 Mar 2025 02:32:27 -0400 Subject: [DTrace-devel] [PATCH] Fix dt_bvar_probedesc() for late USDT processes In-Reply-To: <20250319063230.28171-1-eugene.loh@oracle.com> References: <20250319063230.28171-1-eugene.loh@oracle.com> Message-ID: <20250319063230.28171-2-eugene.loh@oracle.com> From: Eugene Loh With commit 8bd26415b ("bpf: separate bvar implementation into separate functions"), test/unittest/usdt/tst.nusdtprobes.sh started failing reproducibly on all platforms. In that patch, the get_bvar() function is factored into separate functions. It includes a change that looks basically like this: uint32_t key = mst->prid; if (key < ((uint64_t)&NPROBES)) { [...] } else { char *s = bpf_map_lookup_elem(&usdt_names, &key); switch (idx) { - case DIF_VAR_PROBENAME: s += DTRACE_FUNCNAMELEN; + case DIF_VAR_PROBEPROV: s += DTRACE_FUNCNAMELEN; - case DIF_VAR_PROBEFUNC: s += DTRACE_MODNAMELEN; + case DIF_VAR_PROBEMOD : s += DTRACE_MODNAMELEN; - case DIF_VAR_PROBEMOD : s += DTRACE_PROVNAMELEN; + case DIF_VAR_PROBEFUNC: s += DTRACE_PROVNAMELEN; - case DIF_VAR_PROBEPROV: + case DIF_VAR_PROBENAME: } return (uint64_t)s; } That is, for the case of key>=NPROBES (that is, for USDT probes that were added after the dtrace session was started), the meanings of prov, mod, func, and name were exchanged. Restore the correct meanings. 
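The fall-through arithmetic described in this commit message can be sketched in isolation. This is only a minimal sketch, assuming the description is stored as four fixed-width fields packed in the order prov, mod, func, name; the field widths, the enum values, and the function name desc_field() are hypothetical stand-ins and not the DTrace definitions.

```c
#include <assert.h>

/* Hypothetical field widths; the real DTRACE_*NAMELEN values may differ. */
enum { PROVLEN = 8, MODLEN = 8, FUNCLEN = 16, NAMELEN = 8 };

/* Hypothetical selectors standing in for the DIF_VAR_PROBE* constants. */
enum field { F_PROV, F_MOD, F_FUNC, F_NAME };

/*
 * Return a pointer to the requested field of a description packed as
 * prov | mod | func | name.  Each case deliberately falls through, so the
 * selector that needs the largest offset has to come first.
 */
static const char *desc_field(const char *s, enum field idx)
{
	assert(idx == F_PROV || idx == F_MOD || idx == F_FUNC || idx == F_NAME);

	switch (idx) {
	case F_NAME:
		s += FUNCLEN;	/* skip over func */
		/* fall through */
	case F_FUNC:
		s += MODLEN;	/* skip over mod */
		/* fall through */
	case F_MOD:
		s += PROVLEN;	/* skip over prov */
		/* fall through */
	case F_PROV:
		break;		/* prov sits at offset 0 */
	}
	return s;
}
```

If the case labels were listed in the opposite order, a request for the name would return the start of the buffer (the provider field) and a request for the provider would return the name field, which matches the kind of exchange the message describes.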
Signed-off-by: Eugene Loh --- bpf/get_bvar.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/bpf/get_bvar.c b/bpf/get_bvar.c index c760126da..fadb06c00 100644 --- a/bpf/get_bvar.c +++ b/bpf/get_bvar.c @@ -185,13 +185,13 @@ noinline uint64_t dt_bvar_probedesc(const dt_dctx_t *dctx, uint32_t idx) return (uint64_t)dctx->strtab; switch (idx) { - case DIF_VAR_PROBEPROV: + case DIF_VAR_PROBENAME: s += DTRACE_FUNCNAMELEN; - case DIF_VAR_PROBEMOD: - s += DTRACE_MODNAMELEN; case DIF_VAR_PROBEFUNC: + s += DTRACE_MODNAMELEN; + case DIF_VAR_PROBEMOD: s += DTRACE_PROVNAMELEN; - case DIF_VAR_PROBENAME: + case DIF_VAR_PROBEPROV: } return (uint64_t)s; -- 2.43.5 From eugene.loh at oracle.com Wed Mar 19 06:32:28 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Wed, 19 Mar 2025 02:32:28 -0400 Subject: [DTrace-devel] [PATCH] Copy fprobes entry args with BPF helper function In-Reply-To: <20250319063230.28171-1-eugene.loh@oracle.com> References: <20250319063230.28171-1-eugene.loh@oracle.com> Message-ID: <20250319063230.28171-3-eugene.loh@oracle.com> From: Eugene Loh With commit a6b626a89 ("Fix fprobe/kprobe selection"), fprobes were effectively turned on. Unfortunately, with this fix, some tests like test/unittest/stack/tst.stack_fbt.sh encountered problems on UEK7 since the BPF verifier would complain about the prototypes of some of the probe arguments. E.g., when loading arg3 in fprobe_trampoline() from fbt::vfs_write:entry from %r8+24, the BPF verifier complains func 'vfs_write' arg3 type INT is not a struct invalid bpf_context access off=24 size=8 We can bypass this problem by using a BPF helper function to copy the arguments onto the BPF stack and then load the arguments into mstate from there. There is also a BPF get_func_arg() helper function, but it is not introduced until 5.17 -- that is, it appears after UEK7. See Linux commit f92c1e1 ("bpf: Add get_func_[arg|ret|arg_cnt] helpers"). 
While the already mentioned test signals the problem and the fix, we also add an additional test that actually checks the correctness of the arguments. Signed-off-by: Eugene Loh --- libdtrace/dt_prov_fbt.c | 14 ++- test/unittest/fbtprovider/tst.entryargs2.r | 29 ++++++ test/unittest/fbtprovider/tst.entryargs2.sh | 105 ++++++++++++++++++++ 3 files changed, 147 insertions(+), 1 deletion(-) create mode 100644 test/unittest/fbtprovider/tst.entryargs2.r create mode 100755 test/unittest/fbtprovider/tst.entryargs2.sh diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c index 8aa53d643..50fa0d9dc 100644 --- a/libdtrace/dt_prov_fbt.c +++ b/libdtrace/dt_prov_fbt.c @@ -285,8 +285,20 @@ static int fprobe_trampoline(dt_pcb_t *pcb, uint_t exitlbl) if (strcmp(pcb->pcb_probe->desc->prb, "entry") == 0) { int i; + /* + * We want to copy entry args from %r8 to %r7 (plus offsets). + * Unfortunately, for fprobes, the BPF verifier can reject + * certain argument types. We work around this by copying + * the arguments onto the BPF stack and loading them from there. 
+ */ + emit(dlp, BPF_MOV_REG(BPF_REG_1, BPF_REG_FP)); + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, DT_TRAMP_SP_SLOT(prp->argc - 1))); + emit(dlp, BPF_MOV_IMM(BPF_REG_2, 8 * prp->argc)); + emit(dlp, BPF_MOV_REG(BPF_REG_3, BPF_REG_8)); + emit(dlp, BPF_CALL_HELPER(dtp->dt_bpfhelper[BPF_FUNC_probe_read_kernel])); + for (i = 0; i < prp->argc; i++) { - emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_8, i * 8)); + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_FP, DT_TRAMP_SP_SLOT(prp->argc - 1) + i * 8)); emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(i), BPF_REG_0)); } } else { diff --git a/test/unittest/fbtprovider/tst.entryargs2.r b/test/unittest/fbtprovider/tst.entryargs2.r new file mode 100644 index 000000000..efc4685f9 --- /dev/null +++ b/test/unittest/fbtprovider/tst.entryargs2.r @@ -0,0 +1,29 @@ +mode READ : no +mode WRITE : yes +mode LSEEK : yes +mode PREAD : yes +mode PWRITE : yes +mode WRITER : yes +mode CAN_READ : no +mode CAN_WRITE : yes +mode OPENED : yes +buf: ========================= +count: 8 +pos: 20 +abcdefghijklmnopqrst========CDEFGHIJKLMNOPQRSTUVWXYZ0123456789 + +mode READ : yes +mode WRITE : yes +mode LSEEK : yes +mode PREAD : yes +mode PWRITE : yes +mode WRITER : yes +mode CAN_READ : yes +mode CAN_WRITE : yes +mode OPENED : yes +buf: ========================= +count: 8 +pos: 20 +abcdefghijklmnopqrst========CDEFGHIJKLMNOPQRSTUVWXYZ0123456789 + +success diff --git a/test/unittest/fbtprovider/tst.entryargs2.sh b/test/unittest/fbtprovider/tst.entryargs2.sh new file mode 100755 index 000000000..f5b435f56 --- /dev/null +++ b/test/unittest/fbtprovider/tst.entryargs2.sh @@ -0,0 +1,105 @@ +#!/bin/bash +# +# Oracle Linux DTrace. +# Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved. +# Licensed under the Universal Permissive License v 1.0 as shown at +# http://oss.oracle.com/licenses/upl. +# +# Another test of entry args. +# + +dtrace=$1 +CC=${CC:-/usr/bin/gcc} + +# Set up test directory. 
+ +DIRNAME=$tmpdir/entryargs2.$$.$RANDOM +mkdir -p $DIRNAME +cd $DIRNAME + +# Make the trigger. + +cat << EOF > main.c +#include +#include // open() +#include // lseek(), write(), close() + +int main(int c, char **v) { + int fd = open("tmp.txt", c == 1 ? O_WRONLY : O_RDWR); + + if (fd == -1) + return 1; + + /* Move the offset, then write to the file. */ + /* (We will overwrite some "middle section" of the file with "========".) */ + lseek(fd, 20, SEEK_SET); + write(fd, "=========================", 8); + close(fd); + + return 0; +} +EOF + +# Build the trigger. FIXME do consistent with Sam's changes. + +$CC $test_cppflags $test_ldflags main.c +if [ $? -ne 0 ]; then + echo "failed to link final executable" >&2 + exit 1 +fi + +# Prepare the D script. + +cat << EOF > D.d +/* these definitions come from kernel header include/linux/fs.h */ +#define FMODE_READ (1 << 0) +#define FMODE_WRITE (1 << 1) +#define FMODE_LSEEK (1 << 2) +#define FMODE_PREAD (1 << 3) +#define FMODE_PWRITE (1 << 4) + +#define FMODE_WRITER (1 << 16) +#define FMODE_CAN_READ (1 << 17) +#define FMODE_CAN_WRITE (1 << 18) +#define FMODE_OPENED (1 << 19) + +fbt::vfs_write:entry +/pid == \$target/ +{ + mode = ((struct file *)arg0)->f_mode; + printf("mode READ : %s\n", mode & FMODE_READ ? "yes" : "no"); + printf("mode WRITE : %s\n", mode & FMODE_WRITE ? "yes" : "no"); + printf("mode LSEEK : %s\n", mode & FMODE_LSEEK ? "yes" : "no"); + printf("mode PREAD : %s\n", mode & FMODE_PREAD ? "yes" : "no"); + printf("mode PWRITE : %s\n", mode & FMODE_PWRITE ? "yes" : "no"); + printf("mode WRITER : %s\n", mode & FMODE_WRITER ? "yes" : "no"); + printf("mode CAN_READ : %s\n", mode & FMODE_CAN_READ ? "yes" : "no"); + printf("mode CAN_WRITE : %s\n", mode & FMODE_CAN_WRITE ? "yes" : "no"); + printf("mode OPENED : %s\n", mode & FMODE_OPENED ? 
"yes" : "no"); + + printf("buf: %s\n", stringof(copyinstr(arg1))); + printf("count: %d\n", arg2); + printf("pos: %d", *((loff_t *)arg3)); + exit(0); +} +EOF + +# Run the D script and trigger twice, once with O_WRONLY and then O_RDWR. + +for args in "" "dummy"; do + + # Prepare the file to be (over)written. + rm -f tmp.txt + echo abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 > tmp.txt + + # Run the D script and trigger. + $dtrace $dt_flags -c "./a.out $args" -Cqs D.d + + # Report the output file. + cat tmp.txt + echo + +done + +echo success +exit 0 -- 2.43.5 From eugene.loh at oracle.com Wed Mar 19 06:32:29 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Wed, 19 Mar 2025 02:32:29 -0400 Subject: [DTrace-devel] [PATCH] test: Expect USDT argmap to fail on ARM on older kernels In-Reply-To: <20250319063230.28171-1-eugene.loh@oracle.com> References: <20250319063230.28171-1-eugene.loh@oracle.com> Message-ID: <20250319063230.28171-4-eugene.loh@oracle.com> From: Eugene Loh Signed-off-by: Eugene Loh --- test/unittest/usdt/skip_arm_uek6.x | 25 +++++++++++++++++++ .../usdt/tst.argmap-typed-partial.aarch64.x | 1 + test/unittest/usdt/tst.argmap-typed.aarch64.x | 1 + .../tst.multiprov-dupprobe-fire.aarch64.x | 1 + .../tst.multiprov-dupprobe-shlibs.aarch64.x | 1 + .../usdt/tst.multiprovider-fire.aarch64.x | 1 + 6 files changed, 30 insertions(+) create mode 100755 test/unittest/usdt/skip_arm_uek6.x create mode 120000 test/unittest/usdt/tst.argmap-typed-partial.aarch64.x create mode 120000 test/unittest/usdt/tst.argmap-typed.aarch64.x create mode 120000 test/unittest/usdt/tst.multiprov-dupprobe-fire.aarch64.x create mode 120000 test/unittest/usdt/tst.multiprov-dupprobe-shlibs.aarch64.x create mode 120000 test/unittest/usdt/tst.multiprovider-fire.aarch64.x diff --git a/test/unittest/usdt/skip_arm_uek6.x b/test/unittest/usdt/skip_arm_uek6.x new file mode 100755 index 000000000..252cbebb5 --- /dev/null +++ b/test/unittest/usdt/skip_arm_uek6.x @@ 
-0,0 +1,25 @@ +#!/bin/bash +# Licensed under the Universal Permissive License v 1.0 as shown at +# http://oss.oracle.com/licenses/upl. +# +# @@skip: not run directly by test harness +# +# Tests that depend on USDT argument translation fail on ARM for UEK6. +# They're fine for UEK7. It is unclear in exactly which kernel they +# start working. + +if [[ `uname -m` != "aarch64" ]]; then + exit 0 +fi + +read MAJOR MINOR <<< `uname -r | grep -Eo '^[0-9]+\.[0-9]+' | tr '.' ' '` + +if [ $MAJOR -gt 5 ]; then + exit 0 +fi +if [ $MAJOR -eq 5 -a $MINOR -ge 10 ]; then + exit 0 +fi + +echo "USDT argmap not working on ARM on older kernels" +exit 1 diff --git a/test/unittest/usdt/tst.argmap-typed-partial.aarch64.x b/test/unittest/usdt/tst.argmap-typed-partial.aarch64.x new file mode 120000 index 000000000..8d462f98f --- /dev/null +++ b/test/unittest/usdt/tst.argmap-typed-partial.aarch64.x @@ -0,0 +1 @@ +skip_arm_uek6.x \ No newline at end of file diff --git a/test/unittest/usdt/tst.argmap-typed.aarch64.x b/test/unittest/usdt/tst.argmap-typed.aarch64.x new file mode 120000 index 000000000..8d462f98f --- /dev/null +++ b/test/unittest/usdt/tst.argmap-typed.aarch64.x @@ -0,0 +1 @@ +skip_arm_uek6.x \ No newline at end of file diff --git a/test/unittest/usdt/tst.multiprov-dupprobe-fire.aarch64.x b/test/unittest/usdt/tst.multiprov-dupprobe-fire.aarch64.x new file mode 120000 index 000000000..8d462f98f --- /dev/null +++ b/test/unittest/usdt/tst.multiprov-dupprobe-fire.aarch64.x @@ -0,0 +1 @@ +skip_arm_uek6.x \ No newline at end of file diff --git a/test/unittest/usdt/tst.multiprov-dupprobe-shlibs.aarch64.x b/test/unittest/usdt/tst.multiprov-dupprobe-shlibs.aarch64.x new file mode 120000 index 000000000..8d462f98f --- /dev/null +++ b/test/unittest/usdt/tst.multiprov-dupprobe-shlibs.aarch64.x @@ -0,0 +1 @@ +skip_arm_uek6.x \ No newline at end of file diff --git a/test/unittest/usdt/tst.multiprovider-fire.aarch64.x b/test/unittest/usdt/tst.multiprovider-fire.aarch64.x new file mode 120000 
index 000000000..8d462f98f --- /dev/null +++ b/test/unittest/usdt/tst.multiprovider-fire.aarch64.x @@ -0,0 +1 @@ +skip_arm_uek6.x \ No newline at end of file -- 2.43.5 From eugene.loh at oracle.com Wed Mar 19 06:32:30 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Wed, 19 Mar 2025 02:32:30 -0400 Subject: [DTrace-devel] [PATCH] Get execargs from user space In-Reply-To: <20250319063230.28171-1-eugene.loh@oracle.com> References: <20250319063230.28171-1-eugene.loh@oracle.com> Message-ID: <20250319063230.28171-5-eugene.loh@oracle.com> From: Eugene Loh Signed-off-by: Eugene Loh --- bpf/bvar_execargs.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/bpf/bvar_execargs.S b/bpf/bvar_execargs.S index 1c47cafb2..08844f15f 100644 --- a/bpf/bvar_execargs.S +++ b/bpf/bvar_execargs.S @@ -65,7 +65,7 @@ dt_bvar_execargs: mov %r1, %r9 mov %r2, %r8 mov %r3, %r7 - call BPF_FUNC_probe_read /* bpf_probe_read(&args, len + 1, arg_start) */ + call BPF_FUNC_probe_read_user /* bpf_probe_read(&args, len + 1, arg_start) */ jne %r0, 0, .Lerror /* loop over args and replace '\0' with ' ' */ -- 2.43.5 From kris.van.hees at oracle.com Wed Mar 19 14:40:34 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 19 Mar 2025 10:40:34 -0400 Subject: [DTrace-devel] [PATCH 1/2] Clarify how the usdt_prids key is stored on the BPF stack In-Reply-To: <20250220044350.14953-1-eugene.loh@oracle.com> References: <20250220044350.14953-1-eugene.loh@oracle.com> Message-ID: On Wed, Feb 19, 2025 at 11:43:49PM -0500, eugene.loh at oracle.com wrote: > > While one can access the BPF stack relative to %r9, the whole > point of DT_TRAMP_SP_SLOT(0) is to make trampoline code more > readable. So use it. 
> > Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees > --- > libdtrace/dt_prov_uprobe.c | 21 +++++++-------------- > 1 file changed, 7 insertions(+), 14 deletions(-) > > diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c > index 5d9f74244..f1323cc31 100644 > --- a/libdtrace/dt_prov_uprobe.c > +++ b/libdtrace/dt_prov_uprobe.c > @@ -1015,22 +1015,15 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl) > emit(dlp, BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, 32)); > > /* > - * Look up in the BPF 'usdt_prids' map. Space for the look-up key > - * will be used on the BPF stack: > - * > - * offset value > - * > - * -sizeof(usdt_prids_map_key_t) pid (in %r0) > - * > - * -sizeof(usdt_prids_map_key_t) + sizeof(pid_t) > - * == > - * -sizeof(dtrace_id_t) underlying-probe prid > + * Look up in the BPF 'usdt_prids' map. The key should fit into > + * trampoline stack slot 0. > */ > - emit(dlp, BPF_STORE(BPF_W, BPF_REG_9, (int)(-sizeof(usdt_prids_map_key_t)), BPF_REG_0)); > - emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_9, (int)(-sizeof(dtrace_id_t)), uprp->desc->id)); > + assert(sizeof(usdt_prids_map_key_t) <= DT_STK_SLOT_SZ); > + emit(dlp, BPF_STORE(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0), BPF_REG_0)); > + emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0) + sizeof(pid_t), uprp->desc->id)); > dt_cg_xsetx(dlp, usdt_prids, DT_LBL_NONE, BPF_REG_1, usdt_prids->di_id); > - emit(dlp, BPF_MOV_REG(BPF_REG_2, BPF_REG_9)); > - emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, (int)(-sizeof(usdt_prids_map_key_t)))); > + emit(dlp, BPF_MOV_REG(BPF_REG_2, BPF_REG_FP)); > + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, DT_TRAMP_SP_SLOT(0))); > emit(dlp, BPF_CALL_HELPER(BPF_FUNC_map_lookup_elem)); > emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, lbl_exit)); > > -- > 2.43.5 > > From kris.van.hees at oracle.com Wed Mar 19 14:48:41 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 19 Mar 2025 10:48:41 -0400 Subject: [DTrace-devel] [PATCH] Fix format specifier in 
dtprobed.c In-Reply-To: <20250220232730.25029-1-eugene.loh@oracle.com> References: <20250220232730.25029-1-eugene.loh@oracle.com> Message-ID: On Thu, Feb 20, 2025 at 06:27:30PM -0500, eugene.loh at oracle.com wrote: > > The format specifier is %i but nprobes is size_t. Some compilers > issue warnings. Change the format specifier to match the type. > > Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees > --- > dtprobed/dtprobed.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/dtprobed/dtprobed.c b/dtprobed/dtprobed.c > index 7857b3200..5f260f0a4 100644 > --- a/dtprobed/dtprobed.c > +++ b/dtprobed/dtprobed.c > @@ -787,7 +787,7 @@ process_dof(pid_t pid, int out, int in, dev_t dev, ino_t inum, dev_t exec_dev, > if (dof_stash_push_parsed(&accum, provider) < 0) > goto oom; > > - fuse_log(FUSE_LOG_DEBUG, "Parser read: provider %s, %i probes\n", > + fuse_log(FUSE_LOG_DEBUG, "Parser read: provider %s, %li probes\n", > provider->provider.name, provider->provider.nprobes); > > for (i = 0; i < provider->provider.nprobes; i++) { > -- > 2.43.5 > > From kris.van.hees at oracle.com Wed Mar 19 15:18:03 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 19 Mar 2025 11:18:03 -0400 Subject: [DTrace-devel] [PATCH 1/2] Clarify how the usdt_prids key is stored on the BPF stack In-Reply-To: References: <20250220044350.14953-1-eugene.loh@oracle.com> Message-ID: On Wed, Mar 19, 2025 at 10:40:34AM -0400, Kris Van Hees via DTrace-devel wrote: > On Wed, Feb 19, 2025 at 11:43:49PM -0500, eugene.loh at oracle.com wrote: > > > > While one can access the BPF stack relative to %r9, the whole > > point of DT_TRAMP_SP_SLOT(0) is to make trampoline code more > > readable. So use it. > > > > Signed-off-by: Eugene Loh > > Reviewed-by: Kris Van Hees Still applies but see below... 
> > --- > > libdtrace/dt_prov_uprobe.c | 21 +++++++-------------- > > 1 file changed, 7 insertions(+), 14 deletions(-) > > > > diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c > > index 5d9f74244..f1323cc31 100644 > > --- a/libdtrace/dt_prov_uprobe.c > > +++ b/libdtrace/dt_prov_uprobe.c > > @@ -1015,22 +1015,15 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl) > > emit(dlp, BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, 32)); > > > > /* > > - * Look up in the BPF 'usdt_prids' map. Space for the look-up key > > - * will be used on the BPF stack: > > - * > > - * offset value > > - * > > - * -sizeof(usdt_prids_map_key_t) pid (in %r0) > > - * > > - * -sizeof(usdt_prids_map_key_t) + sizeof(pid_t) > > - * == > > - * -sizeof(dtrace_id_t) underlying-probe prid > > + * Look up in the BPF 'usdt_prids' map. The key should fit into > > + * trampoline stack slot 0. > > */ > > - emit(dlp, BPF_STORE(BPF_W, BPF_REG_9, (int)(-sizeof(usdt_prids_map_key_t)), BPF_REG_0)); > > - emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_9, (int)(-sizeof(dtrace_id_t)), uprp->desc->id)); > > + assert(sizeof(usdt_prids_map_key_t) <= DT_STK_SLOT_SZ); > > + emit(dlp, BPF_STORE(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0), BPF_REG_0)); > > + emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0) + sizeof(pid_t), uprp->desc->id)); I get a compiler warning here: libdtrace/dt_prov_uprobe.c: In function ‘trampoline’: include/bpf_asm.h:119:24: warning: overflow in conversion from ‘long unsigned int’ to ‘short int’ changes value from ‘18446744073709551524’ to ‘-92’ [-Woverflow] 119 | .off = (ofs), \ | ^ libdtrace/dt_as.h:42:69: note: in definition of macro ‘emitle’ 42 | dt_irnode_t *dip = dt_cg_node_alloc((lbl), (instr)); \ | ^~~~~ libdtrace/dt_prov_uprobe.c:1013:9: note: in expansion of macro ‘emit’ 1013 | emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0) + sizeof(pid_t), uprp->desc->id)); | ^~~~ libdtrace/dt_prov_uprobe.c:1013:20: note: in expansion of macro
‘BPF_STORE_IMM’ 1013 | emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0) + sizeof(pid_t), uprp->desc->id)); You need a (int) cast for sizeof(pid_t) similar to the casts that were in the code before. I'll add it in as I merge. > > dt_cg_xsetx(dlp, usdt_prids, DT_LBL_NONE, BPF_REG_1, usdt_prids->di_id); > > - emit(dlp, BPF_MOV_REG(BPF_REG_2, BPF_REG_9)); > > - emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, (int)(-sizeof(usdt_prids_map_key_t)))); > > + emit(dlp, BPF_MOV_REG(BPF_REG_2, BPF_REG_FP)); > > + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, DT_TRAMP_SP_SLOT(0))); > > emit(dlp, BPF_CALL_HELPER(BPF_FUNC_map_lookup_elem)); > > emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, lbl_exit)); > > > > -- > > 2.43.5 > > > > > > _______________________________________________ > DTrace-devel mailing list > DTrace-devel at oss.oracle.com > https://oss.oracle.com/mailman/listinfo/dtrace-devel From kris.van.hees at oracle.com Wed Mar 19 15:23:23 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 19 Mar 2025 11:23:23 -0400 Subject: [DTrace-devel] [PATCH] test: Check tid value In-Reply-To: <20250224201922.12992-1-eugene.loh@oracle.com> References: <20250224201922.12992-1-eugene.loh@oracle.com> Message-ID: On Mon, Feb 24, 2025 at 03:19:22PM -0500, eugene.loh at oracle.com wrote: > > We were checking the built-in variable tid simply by testing > that we could print it and its value was not -1. > > Add a test that confirms the value is actually correct; > compare to C output of gettid(). > > In line with other similar tests, also check for the profile > provider. > > While we're at it, check the pid value and the pthread_t value > returned via pthread_create().
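The comparison described in this commit message depends on the C side reporting its own pid and tid. A minimal sketch of that side follows, assuming Linux and the raw gettid system call (the approach the trigger program takes when glibc is too old to provide gettid()); the helper names my_tid() and expect_line() are hypothetical and only illustrate the idea.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the calling thread, via the raw system call. */
static pid_t my_tid(void)
{
	return (pid_t)syscall(SYS_gettid);
}

/*
 * Format the line that the tracing side is expected to reproduce.
 * The wording mirrors the messages used in the test; "probe" is a label
 * such as "pid" or "profile".  Returns the snprintf() result.
 */
static int expect_line(char *buf, size_t len, const char *probe)
{
	return snprintf(buf, len, "%s probe expect pid %d tid %d\n",
			probe, (int)getpid(), (int)my_tid());
}
```

In the initial thread of a process the two values are equal on Linux; in a thread started with pthread_create() the tid differs from the pid, which is why comparing both values is a stronger check than testing that tid is not -1.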
> > Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees > --- > test/unittest/builtinvar/tst.tid_pid.r | 1 + > test/unittest/builtinvar/tst.tid_pid.sh | 126 ++++++++++++++++++++++++ > 2 files changed, 127 insertions(+) > create mode 100644 test/unittest/builtinvar/tst.tid_pid.r > create mode 100755 test/unittest/builtinvar/tst.tid_pid.sh > > diff --git a/test/unittest/builtinvar/tst.tid_pid.r b/test/unittest/builtinvar/tst.tid_pid.r > new file mode 100644 > index 000000000..2e9ba477f > --- /dev/null > +++ b/test/unittest/builtinvar/tst.tid_pid.r > @@ -0,0 +1 @@ > +success > diff --git a/test/unittest/builtinvar/tst.tid_pid.sh b/test/unittest/builtinvar/tst.tid_pid.sh > new file mode 100755 > index 000000000..7ff0227fe > --- /dev/null > +++ b/test/unittest/builtinvar/tst.tid_pid.sh > @@ -0,0 +1,126 @@ > +#!/bin/bash > +# > +# Oracle Linux DTrace. > +# Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved. > +# Licensed under the Universal Permissive License v 1.0 as shown at > +# http://oss.oracle.com/licenses/upl. > +# > + > +dtrace=$1 > +CC=/usr/bin/gcc > + > +DIRNAME="$tmpdir/builtinvar-tid_pid.$$.$RANDOM" > +mkdir -p $DIRNAME > +cd $DIRNAME > + > +# Create trigger program. > + > +cat << EOF > main.c > +#include > +#include > +#include > +#include > + > +/* Provide an implementation in case glibc is too old. */ > +pid_t gettid(void) > +{ > + return syscall(__NR_gettid); > +} > + > +static void * foo(void *arg) { > + int i = 0; > + > + /* > + * Each thread reports the pid and tid values it expects. > + * (Expect the same values for both the pid and profile probes.) > + */ > + printf("pid probe expect pid %d tid %d\n", getpid(), gettid()); > + printf("profile probe expect pid %d tid %d\n", getpid(), gettid()); > + fflush(stdout); > + > + /* Wait endlessly. DTrace will kill me when it is done. */ > + while (i < 2) > + i ^= 1; > + > + return 0; > +} > + > +int main(int c, char **v) { > + pthread_t mythr; > + > + /* Create a thread. 
*/ > + pthread_create(&mythr, NULL, &foo, NULL); > + > + /* Also report the pthread_t. */ > + printf("created pthread_t %lld\n\n", mythr); > + fflush(stdout); > + > + /* Wait endlessly. DTrace will kill me when it is done. */ > + pthread_join(mythr, NULL); > + > + return 0; > +} > +EOF > + > +# Compile the trigger program. > + > +$CC $test_cppflags main.c -lpthread > +if [ $? -ne 0 ]; then > + echo compilation failed > + exit 1 > +fi > + > +# Run DTrace. > + > +rm -f C.out D.out > +$dtrace $dt_flags -o D.out -c ./a.out -qn ' > +/* Report pid and tid from a pid-provider probe. */ > +pid$target:a.out:foo:entry > +{ > + self->mypid = pid; > + printf("pid probe expect pid %d tid %d\n", pid, tid); > +} > + > +/* Report pid and tid from a profile-provider probe. Look for the thread we created. */ > +profile:::profile-1s > +/self->mypid != 0/ > +{ > + printf("profile probe expect pid %d tid %d\n", pid, tid); > + exit(0); > +} > + > +/* While we are at it, check the pthread_t returned via pthread_create(). */ > +pid$target::pthread_create:entry > +{ > + self->thrid_p = (uintptr_t) arg0; > +} > +pid$target::pthread_create:return > +{ > + printf("created pthread_t %lld\n", *((long long *)copyin(self->thrid_p, sizeof(long long *)))); > + self->thrid_p = 0 > +}' |& sort > C.out > +if [ $? -ne 0 ]; then > + echo DTrace failed > + echo ==== C.out > + cat C.out > + echo ==== D.out > + cat D.out > + exit 1 > +fi > + > +# Compare the C and D output. > + > +sort D.out > D.out.sorted > +if ! 
diff -q C.out D.out.sorted ; then > + echo ERROR: mismatch > + echo ==== C.out > + cat C.out > + echo ==== D.out > + cat D.out.sorted > + echo ==== diff > + diff C.out D.out.sorted > + exit 1 > +fi > + > +echo success > +exit 0 > -- > 2.43.5 > > From eugene.loh at oracle.com Wed Mar 19 16:30:06 2025 From: eugene.loh at oracle.com (Eugene Loh) Date: Wed, 19 Mar 2025 12:30:06 -0400 Subject: [DTrace-devel] [PATCH 1/2] Clarify how the usdt_prids key is stored on the BPF stack In-Reply-To: References: <20250220044350.14953-1-eugene.loh@oracle.com> Message-ID: <94c60235-005e-ecc8-497d-3d388fa8819f@oracle.com> On 3/19/25 11:18, Kris Van Hees wrote: > On Wed, Mar 19, 2025 at 10:40:34AM -0400, Kris Van Hees via DTrace-devel wrote: >> On Wed, Feb 19, 2025 at 11:43:49PM -0500, eugene.loh at oracle.com wrote: >>> While one can access the BPF stack relative to %r9, the whole >>> point of DT_TRAMP_SP_SLOT(0) is to make trampoline code more >>> readable. So use it. >>> >>> Signed-off-by: Eugene Loh >> Reviewed-by: Kris Van Hees > Still applies but see below... > >>> --- >>> libdtrace/dt_prov_uprobe.c | 21 +++++++-------------- >>> 1 file changed, 7 insertions(+), 14 deletions(-) >>> >>> diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c >>> index 5d9f74244..f1323cc31 100644 >>> --- a/libdtrace/dt_prov_uprobe.c >>> +++ b/libdtrace/dt_prov_uprobe.c >>> @@ -1015,22 +1015,15 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl) >>> emit(dlp, BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, 32)); >>> >>> /* >>> - * Look up in the BPF 'usdt_prids' map. Space for the look-up key >>> - * will be used on the BPF stack: >>> - * >>> - * offset value >>> - * >>> - * -sizeof(usdt_prids_map_key_t) pid (in %r0) >>> - * >>> - * -sizeof(usdt_prids_map_key_t) + sizeof(pid_t) >>> - * == >>> - * -sizeof(dtrace_id_t) underlying-probe prid >>> + * Look up in the BPF 'usdt_prids' map. The key should fit into >>> + * trampoline stack slot 0. 
>>> */
>>> - emit(dlp, BPF_STORE(BPF_W, BPF_REG_9, (int)(-sizeof(usdt_prids_map_key_t)), BPF_REG_0));
>>> - emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_9, (int)(-sizeof(dtrace_id_t)), uprp->desc->id));
>>> + assert(sizeof(usdt_prids_map_key_t) <= DT_STK_SLOT_SZ);
>>> + emit(dlp, BPF_STORE(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0), BPF_REG_0));
>>> + emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0) + sizeof(pid_t), uprp->desc->id));
> I get a compiler warning here:
>
> libdtrace/dt_prov_uprobe.c: In function ‘trampoline’:
> include/bpf_asm.h:119:24: warning: overflow in conversion from ‘long unsigned int’ to ‘short int’ changes value from ‘18446744073709551524’ to ‘-92’ [-Woverflo]
> 119 | .off = (ofs), \
> | ^
> libdtrace/dt_as.h:42:69: note: in definition of macro ‘emitle’
> 42 | dt_irnode_t *dip = dt_cg_node_alloc((lbl), (instr)); \
> | ^~~~~
>
> libdtrace/dt_prov_uprobe.c:1013:9: note: in expansion of macro ‘emit’
> 1013 | emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0) + sizeof(pid_t), uprp->desc->id));
> | ^~~~
> libdtrace/dt_prov_uprobe.c:1013:20: note: in expansion of macro ‘BPF_STORE_IMM’
> 1013 | emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_FP, DT_TRAMP_SP_SLOT(0) + sizeof(pid_t), uprp->desc->id));
>
> You need a (int) cast for sizeof(pid_t) similar to the casts that were in
> the code before. I'll add it in as I merge.

Thanks. Might this correction already be in the 2/2 patch? (Not that that's the right place for it, but...)
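
For readers following the warning quoted in this message, here is a minimal C sketch of why the uncast sum overflows. It assumes a 64-bit Linux target and uses a hypothetical slot offset of -96 in place of DT_TRAMP_SP_SLOT(0); the real macro value is not shown in this thread, and -96 is merely consistent with the -92 in the warning. The macro and function names are illustrative and are not DTrace code.

```c
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical stand-in for DT_TRAMP_SP_SLOT(0); inferred from the warning, not from the source. */
#define SLOT0_OFFSET	(-96)

/*
 * Without a cast, the usual arithmetic conversions turn the negative int
 * into a size_t before the addition, so the sum is a huge unsigned value.
 */
static unsigned long long offset_uncast(void)
{
	return SLOT0_OFFSET + sizeof(pid_t);
}

/* With the (int) cast, the addition is done in signed int arithmetic. */
static int offset_cast(void)
{
	return SLOT0_OFFSET + (int)sizeof(pid_t);
}

/*
 * Narrow a value to the 16-bit signed width of a BPF instruction offset.
 * Converting an out-of-range value is implementation-defined in C; GCC
 * reduces it modulo 2^16, which is why the warning reports -92.
 */
static int16_t narrow_to_off(unsigned long long v)
{
	return (int16_t)v;
}
```

Under those assumptions, offset_uncast() yields 18446744073709551524 (2^64 - 92), the value named in the warning, and narrowing it gives -92. offset_cast() gives -92 directly, so nothing is left for the compiler to flag.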
>>> dt_cg_xsetx(dlp, usdt_prids, DT_LBL_NONE, BPF_REG_1, usdt_prids->di_id); >>> - emit(dlp, BPF_MOV_REG(BPF_REG_2, BPF_REG_9)); >>> - emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, (int)(-sizeof(usdt_prids_map_key_t)))); >>> + emit(dlp, BPF_MOV_REG(BPF_REG_2, BPF_REG_FP)); >>> + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, DT_TRAMP_SP_SLOT(0))); >>> emit(dlp, BPF_CALL_HELPER(BPF_FUNC_map_lookup_elem)); >>> emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, lbl_exit)); >>> >>> -- >>> 2.43.5 >>> >>> >> _______________________________________________ >> DTrace-devel mailing list >> DTrace-devel at oss.oracle.com >> https://oss.oracle.com/mailman/listinfo/dtrace-devel From eugene.loh at oracle.com Wed Mar 19 17:15:25 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Wed, 19 Mar 2025 13:15:25 -0400 Subject: [DTrace-devel] [PATCH v2] test: Make tests more resilient to different prid widths Message-ID: <20250319171525.1877-1-eugene.loh@oracle.com> From: Eugene Loh Various tests convert run-dependent values -- like PIDs and probe IDs -- to run-independent strings before checking against their .r results files. But the conversions could be remarkably sensitive to the width of probe IDs. E.g., some conversions assumed probe IDs were flush with the beginning of the line, but if they were narrower they were preceded by white space and were not detected. E.g., this happened in recent fbt work, where probe IDs for fbt probes became much smaller in value. Also, these conversions were being carried out by a hodgepodge of scripts -- sed, awk, and grep; some using run-independent strings like "NNN" or "XXXX" instead of more informative "PID" and "PRID" strings; some incorrectly using "PID" for PRIDs, etc. Replace these .r.p postprocessing scripts with a single script that is more resilient to PRID widths and is commented. 
Signed-off-by: Eugene Loh --- test/unittest/usdt/convert_PID_and_PRID.awk | 20 +++++++++++++++++ test/unittest/usdt/err.argmap-null.r | 2 +- test/unittest/usdt/err.argmap-null.r.p | 3 +-- test/unittest/usdt/tst.dlclose1.r | 8 +++---- test/unittest/usdt/tst.dlclose1.r.p | 13 +---------- test/unittest/usdt/tst.enable_pid.r | 22 +++++++++---------- test/unittest/usdt/tst.enable_pid.r.p | 8 +------ test/unittest/usdt/tst.exec-dof-replacement.r | 2 +- .../usdt/tst.exec-dof-replacement.r.p | 3 +-- .../usdt/tst.multiprov-dupprobe-fire.r.p | 3 +-- test/unittest/usdt/tst.multiprov-dupprobe.r.p | 6 +---- test/unittest/usdt/tst.multiprovider-fire.r.p | 3 +-- test/unittest/usdt/tst.multiprovider.r.p | 6 +---- 13 files changed, 44 insertions(+), 55 deletions(-) create mode 100755 test/unittest/usdt/convert_PID_and_PRID.awk mode change 100755 => 120000 test/unittest/usdt/err.argmap-null.r.p mode change 100755 => 120000 test/unittest/usdt/tst.dlclose1.r.p mode change 100755 => 120000 test/unittest/usdt/tst.enable_pid.r.p mode change 100755 => 120000 test/unittest/usdt/tst.exec-dof-replacement.r.p mode change 100755 => 120000 test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p mode change 100755 => 120000 test/unittest/usdt/tst.multiprov-dupprobe.r.p mode change 100755 => 120000 test/unittest/usdt/tst.multiprovider-fire.r.p mode change 100755 => 120000 test/unittest/usdt/tst.multiprovider.r.p diff --git a/test/unittest/usdt/convert_PID_and_PRID.awk b/test/unittest/usdt/convert_PID_and_PRID.awk new file mode 100755 index 000000000..1dbb31301 --- /dev/null +++ b/test/unittest/usdt/convert_PID_and_PRID.awk @@ -0,0 +1,20 @@ +#!/usr/bin/gawk -f + +# ignore the banner +/^ *ID *PROVIDER *MODULE *FUNCTION *NAME *$/ { next; } + +# process other lines +{ + # convert run-dependent PID values to "PID" + $0 = gensub("prov([abc]?)[0-9]+", "prov\\1PID", "g"); + sub("pid [0-9]+", "pid PID"); + + # convert run-dependent probe ID values to "PRID" + sub("^ *[0-9]+", "PRID"); + + # squash blanks + 
gsub(" +", " "); + + # print + print; +} diff --git a/test/unittest/usdt/err.argmap-null.r b/test/unittest/usdt/err.argmap-null.r index 215475e39..97b1850de 100644 --- a/test/unittest/usdt/err.argmap-null.r +++ b/test/unittest/usdt/err.argmap-null.r @@ -1,2 +1,2 @@ -- @@stderr -- -dtrace: failed to compile script test/unittest/usdt/err.argmap-null.d: line 24: index 0 is out of range for test_provXXXX:::place4 args[ ] +dtrace: failed to compile script test/unittest/usdt/err.argmap-null.d: line 24: index 0 is out of range for test_provPID:::place4 args[ ] diff --git a/test/unittest/usdt/err.argmap-null.r.p b/test/unittest/usdt/err.argmap-null.r.p deleted file mode 100755 index c575983ad..000000000 --- a/test/unittest/usdt/err.argmap-null.r.p +++ /dev/null @@ -1,2 +0,0 @@ -#!/bin/sed -rf -s,test_prov[0-9]*,test_provXXXX,g; s,^ *[0-9]+, XX,g; diff --git a/test/unittest/usdt/err.argmap-null.r.p b/test/unittest/usdt/err.argmap-null.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/err.argmap-null.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.dlclose1.r b/test/unittest/usdt/tst.dlclose1.r index 7873cb51f..70bb50d76 100644 --- a/test/unittest/usdt/tst.dlclose1.r +++ b/test/unittest/usdt/tst.dlclose1.r @@ -1,6 +1,4 @@ -started pid NNN - ID PROVIDER MODULE FUNCTION NAME -NNN test_provNNN livelib.so go go - ID PROVIDER MODULE FUNCTION NAME +started pid PID +PRID test_provPID livelib.so go go -- @@stderr -- -dtrace: failed to match test_provNNN:::: No probe matches description +dtrace: failed to match test_provPID:::: No probe matches description diff --git a/test/unittest/usdt/tst.dlclose1.r.p b/test/unittest/usdt/tst.dlclose1.r.p deleted file mode 100755 index 85725f3bb..000000000 --- a/test/unittest/usdt/tst.dlclose1.r.p +++ /dev/null @@ -1,12 +0,0 @@ -#!/usr/bin/gawk -f -{ - # ignore the specific probe ID or process ID - # (the script ensures the process ID is consistent) 
- gsub(/[0-9]+/, "NNN"); - - # ignore the numbers of spaces for alignment - # (they depend on the ID widths) - gsub(/ +/, " "); - - print; -} diff --git a/test/unittest/usdt/tst.dlclose1.r.p b/test/unittest/usdt/tst.dlclose1.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.dlclose1.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.enable_pid.r b/test/unittest/usdt/tst.enable_pid.r index 675fcdd6f..9241202d7 100644 --- a/test/unittest/usdt/tst.enable_pid.r +++ b/test/unittest/usdt/tst.enable_pid.r @@ -1,14 +1,14 @@ - FUNCTION:NAME - :tick-1s + FUNCTION:NAME + :tick-1s - FUNCTION:NAME - :tick-1s + FUNCTION:NAME + :tick-1s - FUNCTION:NAME - :tick-1s + FUNCTION:NAME + :tick-1s - FUNCTION:NAME - :tick-1s + FUNCTION:NAME + :tick-1s done ========== out 1 @@ -39,7 +39,7 @@ is not enabled === epoch === success -- @@stderr -- -dtrace: description 'test_provNNN:::go ' matched 1 probe -dtrace: description 'test_provNNN:::go ' matched 2 probes -dtrace: description 'test_provNNN:::go ' matched 2 probes +dtrace: description 'test_provPID:::go ' matched 1 probe +dtrace: description 'test_provPID:::go ' matched 2 probes +dtrace: description 'test_provPID:::go ' matched 2 probes dtrace: description 'test_prov*:::go ' matched 3 probes diff --git a/test/unittest/usdt/tst.enable_pid.r.p b/test/unittest/usdt/tst.enable_pid.r.p deleted file mode 100755 index baf9d2a90..000000000 --- a/test/unittest/usdt/tst.enable_pid.r.p +++ /dev/null @@ -1,7 +0,0 @@ -#!/usr/bin/awk -f -{ - # ignore the specific process ID - gsub(/test_prov[0-9]+/, "test_provNNN"); - - print; -} diff --git a/test/unittest/usdt/tst.enable_pid.r.p b/test/unittest/usdt/tst.enable_pid.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.enable_pid.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.exec-dof-replacement.r 
b/test/unittest/usdt/tst.exec-dof-replacement.r index 7547f85e5..226ab7c8a 100644 --- a/test/unittest/usdt/tst.exec-dof-replacement.r +++ b/test/unittest/usdt/tst.exec-dof-replacement.r @@ -1 +1 @@ -PID test_prov test2 main succeeded +PRID test_provPID test2 main succeeded diff --git a/test/unittest/usdt/tst.exec-dof-replacement.r.p b/test/unittest/usdt/tst.exec-dof-replacement.r.p deleted file mode 100755 index 1a5871f73..000000000 --- a/test/unittest/usdt/tst.exec-dof-replacement.r.p +++ /dev/null @@ -1,2 +0,0 @@ -#!/bin/sh -grep -v '^ *ID' | sed 's,^[0-9]*,PID,; s,prov[0-9]*,prov,g; s, *, ,g' diff --git a/test/unittest/usdt/tst.exec-dof-replacement.r.p b/test/unittest/usdt/tst.exec-dof-replacement.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.exec-dof-replacement.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p b/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p deleted file mode 100755 index bdbce0189..000000000 --- a/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p +++ /dev/null @@ -1,2 +0,0 @@ -#!/bin/sh -sed 's,prov\(.\)[0-9]*,prov\1PID,; s, *, ,g' diff --git a/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p b/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.multiprov-dupprobe.r.p b/test/unittest/usdt/tst.multiprov-dupprobe.r.p deleted file mode 100755 index 5d11db2d4..000000000 --- a/test/unittest/usdt/tst.multiprov-dupprobe.r.p +++ /dev/null @@ -1,5 +0,0 @@ -#!/bin/sh - -# Remove banner. -# Replace numerical values with generic PRID and PID labels. 
-grep -v '^ *ID' | sed 's,^[0-9][0-9]*,PRID,; s,prov\(.\)[0-9]*,prov\1PID,; s, *, ,g' diff --git a/test/unittest/usdt/tst.multiprov-dupprobe.r.p b/test/unittest/usdt/tst.multiprov-dupprobe.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.multiprov-dupprobe.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.multiprovider-fire.r.p b/test/unittest/usdt/tst.multiprovider-fire.r.p deleted file mode 100755 index bdbce0189..000000000 --- a/test/unittest/usdt/tst.multiprovider-fire.r.p +++ /dev/null @@ -1,2 +0,0 @@ -#!/bin/sh -sed 's,prov\(.\)[0-9]*,prov\1PID,; s, *, ,g' diff --git a/test/unittest/usdt/tst.multiprovider-fire.r.p b/test/unittest/usdt/tst.multiprovider-fire.r.p new file mode 120000 index 000000000..11a06e058 --- /dev/null +++ b/test/unittest/usdt/tst.multiprovider-fire.r.p @@ -0,0 +1 @@ +convert_PID_and_PRID.awk \ No newline at end of file diff --git a/test/unittest/usdt/tst.multiprovider.r.p b/test/unittest/usdt/tst.multiprovider.r.p deleted file mode 100755 index 5d11db2d4..000000000 --- a/test/unittest/usdt/tst.multiprovider.r.p +++ /dev/null @@ -1,5 +0,0 @@ -#!/bin/sh - -# Remove banner. -# Replace numerical values with generic PRID and PID labels. 
-grep -v '^ *ID' | sed 's,^[0-9][0-9]*,PRID,; s,prov\(.\)[0-9]*,prov\1PID,; s, *, ,g'
diff --git a/test/unittest/usdt/tst.multiprovider.r.p b/test/unittest/usdt/tst.multiprovider.r.p
new file mode 120000
index 000000000..11a06e058
--- /dev/null
+++ b/test/unittest/usdt/tst.multiprovider.r.p
@@ -0,0 +1 @@
+convert_PID_and_PRID.awk
\ No newline at end of file
--
2.43.5

From kris.van.hees at oracle.com Wed Mar 19 17:20:43 2025
From: kris.van.hees at oracle.com (Kris Van Hees)
Date: Wed, 19 Mar 2025 13:20:43 -0400
Subject: [DTrace-devel] [PATCH] dt_pid: pid grabs should be shortlived
In-Reply-To: <20250307144300.230034-1-nick.alcock@oracle.com>
References: <20250307144300.230034-1-nick.alcock@oracle.com>
Message-ID:

On Fri, Mar 07, 2025 at 02:43:00PM +0000, Nick Alcock wrote:
> If we use long-lived grabs for this, we are requiring that the process is
> ptraceable, and thus preventing pid tracing of system daemons, init,
> processes already being debugged or traced by others, etc.
>
> Signed-off-by: Nick Alcock

Reviewed-by: Kris Van Hees

... with minor indentation fixes to make it more similar to the other instance
already in the code.

> ---
> libdtrace/dt_pid.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libdtrace/dt_pid.c b/libdtrace/dt_pid.c
> index 76608f6904fee..4135c3ea656ec 100644
> --- a/libdtrace/dt_pid.c
> +++ b/libdtrace/dt_pid.c
> @@ -1243,7 +1243,8 @@ dt_pid_create_pid_probes(dtrace_probedesc_t *pdp, dtrace_hdl_t *dtp, dt_pcb_t *p
> return 0;
>
> /* Grab the process.
*/ > - if (dt_proc_grab_lock(dtp, pid, DTRACE_PROC_WAITING) < 0) { > + if (dt_proc_grab_lock(dtp, pid, DTRACE_PROC_WAITING > + | DTRACE_PROC_SHORTLIVED) < 0) { > dt_pid_error(dtp, pcb, NULL, D_PROC_GRAB, > "failed to grab process %d", (int)pid); > return -1; > > base-commit: 39a5e0a8866b38679619fa357bb3082bc245aada > -- > 2.48.1.283.g18c60a128c > > From eugene.loh at oracle.com Wed Mar 19 17:42:25 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Wed, 19 Mar 2025 13:42:25 -0400 Subject: [DTrace-devel] [PATCH] test: Allow duplicate lines for rawfbt synthetic tests Message-ID: <20250319174225.2203-1-eugene.loh@oracle.com> From: Eugene Loh It is possible for the probe in these tests to fire "concurrently" on multiple CPUs, leading to duplicate lines of output. Add post processing to reduce multiple lines of duplicate output. Signed-off-by: Eugene Loh --- test/unittest/providers/rawfbt/tst.synthetic-entry.r.p | 3 +++ test/unittest/providers/rawfbt/tst.synthetic-return.r.p | 3 +++ 2 files changed, 6 insertions(+) create mode 100755 test/unittest/providers/rawfbt/tst.synthetic-entry.r.p create mode 100755 test/unittest/providers/rawfbt/tst.synthetic-return.r.p diff --git a/test/unittest/providers/rawfbt/tst.synthetic-entry.r.p b/test/unittest/providers/rawfbt/tst.synthetic-entry.r.p new file mode 100755 index 000000000..18171f044 --- /dev/null +++ b/test/unittest/providers/rawfbt/tst.synthetic-entry.r.p @@ -0,0 +1,3 @@ +#!/bin/sh + +uniq diff --git a/test/unittest/providers/rawfbt/tst.synthetic-return.r.p b/test/unittest/providers/rawfbt/tst.synthetic-return.r.p new file mode 100755 index 000000000..18171f044 --- /dev/null +++ b/test/unittest/providers/rawfbt/tst.synthetic-return.r.p @@ -0,0 +1,3 @@ +#!/bin/sh + +uniq -- 2.43.5 From kris.van.hees at oracle.com Wed Mar 19 17:46:02 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 19 Mar 2025 13:46:02 -0400 Subject: [DTrace-devel] [PATCH v2] test: Make tests more resilient to different 
prid widths In-Reply-To: <20250319171525.1877-1-eugene.loh@oracle.com> References: <20250319171525.1877-1-eugene.loh@oracle.com> Message-ID: On Wed, Mar 19, 2025 at 01:15:25PM -0400, eugene.loh at oracle.com wrote: > From: Eugene Loh > > Various tests convert run-dependent values -- like PIDs and probe IDs > -- to run-independent strings before checking against their .r results > files. But the conversions could be remarkably sensitive to the width > of probe IDs. E.g., some conversions assumed probe IDs were flush with > the beginning of the line, but if they were narrower they were preceded > by white space and were not detected. E.g., this happened in recent fbt > work, where probe IDs for fbt probes became much smaller in value. > > Also, these conversions were being carried out by a hodgepodge of scripts > -- sed, awk, and grep; some using run-independent strings like "NNN" or > "XXXX" instead of more informative "PID" and "PRID" strings; some > incorrectly using "PID" for PRIDs, etc. > > Replace these .r.p postprocessing scripts with a single script that is > more resilient to PRID widths and is commented. 
> > Signed-off-by: Eugene Loh > --- > test/unittest/usdt/convert_PID_and_PRID.awk | 20 +++++++++++++++++ > test/unittest/usdt/err.argmap-null.r | 2 +- > test/unittest/usdt/err.argmap-null.r.p | 3 +-- > test/unittest/usdt/tst.dlclose1.r | 8 +++---- > test/unittest/usdt/tst.dlclose1.r.p | 13 +---------- > test/unittest/usdt/tst.enable_pid.r | 22 +++++++++---------- > test/unittest/usdt/tst.enable_pid.r.p | 8 +------ > test/unittest/usdt/tst.exec-dof-replacement.r | 2 +- > .../usdt/tst.exec-dof-replacement.r.p | 3 +-- > .../usdt/tst.multiprov-dupprobe-fire.r.p | 3 +-- > test/unittest/usdt/tst.multiprov-dupprobe.r.p | 6 +---- > test/unittest/usdt/tst.multiprovider-fire.r.p | 3 +-- > test/unittest/usdt/tst.multiprovider.r.p | 6 +---- > 13 files changed, 44 insertions(+), 55 deletions(-) > create mode 100755 test/unittest/usdt/convert_PID_and_PRID.awk > mode change 100755 => 120000 test/unittest/usdt/err.argmap-null.r.p > mode change 100755 => 120000 test/unittest/usdt/tst.dlclose1.r.p > mode change 100755 => 120000 test/unittest/usdt/tst.enable_pid.r.p > mode change 100755 => 120000 test/unittest/usdt/tst.exec-dof-replacement.r.p > mode change 100755 => 120000 test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p > mode change 100755 => 120000 test/unittest/usdt/tst.multiprov-dupprobe.r.p > mode change 100755 => 120000 test/unittest/usdt/tst.multiprovider-fire.r.p > mode change 100755 => 120000 test/unittest/usdt/tst.multiprovider.r.p > > diff --git a/test/unittest/usdt/convert_PID_and_PRID.awk b/test/unittest/usdt/convert_PID_and_PRID.awk > new file mode 100755 > index 000000000..1dbb31301 > --- /dev/null > +++ b/test/unittest/usdt/convert_PID_and_PRID.awk > @@ -0,0 +1,20 @@ > +#!/usr/bin/gawk -f > + > +# ignore the banner > +/^ *ID *PROVIDER *MODULE *FUNCTION *NAME *$/ { next; } I think that using / +/ for the whitespace instances might be better, or perhaps even /[ \t]+/ if we want to guard against future use of tabs. But at a minimum / +/ seems most prudent. 
> + > +# process other lines > +{ > + # convert run-dependent PID values to "PID" > + $0 = gensub("prov([abc]?)[0-9]+", "prov\\1PID", "g"); > + sub("pid [0-9]+", "pid PID"); First arg of each function is a regexp, so should be specified as /.../ rather than "...". > + > + # convert run-dependent probe ID values to "PRID" > + sub("^ *[0-9]+", "PRID"); Same. > + > + # squash blanks > + gsub(" +", " "); Same. > + > + # print > + print; > +} > diff --git a/test/unittest/usdt/err.argmap-null.r b/test/unittest/usdt/err.argmap-null.r > index 215475e39..97b1850de 100644 > --- a/test/unittest/usdt/err.argmap-null.r > +++ b/test/unittest/usdt/err.argmap-null.r > @@ -1,2 +1,2 @@ > -- @@stderr -- > -dtrace: failed to compile script test/unittest/usdt/err.argmap-null.d: line 24: index 0 is out of range for test_provXXXX:::place4 args[ ] > +dtrace: failed to compile script test/unittest/usdt/err.argmap-null.d: line 24: index 0 is out of range for test_provPID:::place4 args[ ] > diff --git a/test/unittest/usdt/err.argmap-null.r.p b/test/unittest/usdt/err.argmap-null.r.p > deleted file mode 100755 > index c575983ad..000000000 > --- a/test/unittest/usdt/err.argmap-null.r.p > +++ /dev/null > @@ -1,2 +0,0 @@ > -#!/bin/sed -rf > -s,test_prov[0-9]*,test_provXXXX,g; s,^ *[0-9]+, XX,g; > diff --git a/test/unittest/usdt/err.argmap-null.r.p b/test/unittest/usdt/err.argmap-null.r.p > new file mode 120000 > index 000000000..11a06e058 > --- /dev/null > +++ b/test/unittest/usdt/err.argmap-null.r.p > @@ -0,0 +1 @@ > +convert_PID_and_PRID.awk > \ No newline at end of file > diff --git a/test/unittest/usdt/tst.dlclose1.r b/test/unittest/usdt/tst.dlclose1.r > index 7873cb51f..70bb50d76 100644 > --- a/test/unittest/usdt/tst.dlclose1.r > +++ b/test/unittest/usdt/tst.dlclose1.r > @@ -1,6 +1,4 @@ > -started pid NNN > - ID PROVIDER MODULE FUNCTION NAME > -NNN test_provNNN livelib.so go go > - ID PROVIDER MODULE FUNCTION NAME > +started pid PID > +PRID test_provPID livelib.so go go > -- @@stderr -- > 
-dtrace: failed to match test_provNNN:::: No probe matches description > +dtrace: failed to match test_provPID:::: No probe matches description > diff --git a/test/unittest/usdt/tst.dlclose1.r.p b/test/unittest/usdt/tst.dlclose1.r.p > deleted file mode 100755 > index 85725f3bb..000000000 > --- a/test/unittest/usdt/tst.dlclose1.r.p > +++ /dev/null > @@ -1,12 +0,0 @@ > -#!/usr/bin/gawk -f > -{ > - # ignore the specific probe ID or process ID > - # (the script ensures the process ID is consistent) > - gsub(/[0-9]+/, "NNN"); > - > - # ignore the numbers of spaces for alignment > - # (they depend on the ID widths) > - gsub(/ +/, " "); > - > - print; > -} > diff --git a/test/unittest/usdt/tst.dlclose1.r.p b/test/unittest/usdt/tst.dlclose1.r.p > new file mode 120000 > index 000000000..11a06e058 > --- /dev/null > +++ b/test/unittest/usdt/tst.dlclose1.r.p > @@ -0,0 +1 @@ > +convert_PID_and_PRID.awk > \ No newline at end of file > diff --git a/test/unittest/usdt/tst.enable_pid.r b/test/unittest/usdt/tst.enable_pid.r > index 675fcdd6f..9241202d7 100644 > --- a/test/unittest/usdt/tst.enable_pid.r > +++ b/test/unittest/usdt/tst.enable_pid.r > @@ -1,14 +1,14 @@ > - FUNCTION:NAME > - :tick-1s > + FUNCTION:NAME > + :tick-1s > > - FUNCTION:NAME > - :tick-1s > + FUNCTION:NAME > + :tick-1s > > - FUNCTION:NAME > - :tick-1s > + FUNCTION:NAME > + :tick-1s > > - FUNCTION:NAME > - :tick-1s > + FUNCTION:NAME > + :tick-1s > > done > ========== out 1 > @@ -39,7 +39,7 @@ is not enabled > === epoch === > success > -- @@stderr -- > -dtrace: description 'test_provNNN:::go ' matched 1 probe > -dtrace: description 'test_provNNN:::go ' matched 2 probes > -dtrace: description 'test_provNNN:::go ' matched 2 probes > +dtrace: description 'test_provPID:::go ' matched 1 probe > +dtrace: description 'test_provPID:::go ' matched 2 probes > +dtrace: description 'test_provPID:::go ' matched 2 probes > dtrace: description 'test_prov*:::go ' matched 3 probes > diff --git 
a/test/unittest/usdt/tst.enable_pid.r.p b/test/unittest/usdt/tst.enable_pid.r.p > deleted file mode 100755 > index baf9d2a90..000000000 > --- a/test/unittest/usdt/tst.enable_pid.r.p > +++ /dev/null > @@ -1,7 +0,0 @@ > -#!/usr/bin/awk -f > -{ > - # ignore the specific process ID > - gsub(/test_prov[0-9]+/, "test_provNNN"); > - > - print; > -} > diff --git a/test/unittest/usdt/tst.enable_pid.r.p b/test/unittest/usdt/tst.enable_pid.r.p > new file mode 120000 > index 000000000..11a06e058 > --- /dev/null > +++ b/test/unittest/usdt/tst.enable_pid.r.p > @@ -0,0 +1 @@ > +convert_PID_and_PRID.awk > \ No newline at end of file > diff --git a/test/unittest/usdt/tst.exec-dof-replacement.r b/test/unittest/usdt/tst.exec-dof-replacement.r > index 7547f85e5..226ab7c8a 100644 > --- a/test/unittest/usdt/tst.exec-dof-replacement.r > +++ b/test/unittest/usdt/tst.exec-dof-replacement.r > @@ -1 +1 @@ > -PID test_prov test2 main succeeded > +PRID test_provPID test2 main succeeded > diff --git a/test/unittest/usdt/tst.exec-dof-replacement.r.p b/test/unittest/usdt/tst.exec-dof-replacement.r.p > deleted file mode 100755 > index 1a5871f73..000000000 > --- a/test/unittest/usdt/tst.exec-dof-replacement.r.p > +++ /dev/null > @@ -1,2 +0,0 @@ > -#!/bin/sh > -grep -v '^ *ID' | sed 's,^[0-9]*,PID,; s,prov[0-9]*,prov,g; s, *, ,g' > diff --git a/test/unittest/usdt/tst.exec-dof-replacement.r.p b/test/unittest/usdt/tst.exec-dof-replacement.r.p > new file mode 120000 > index 000000000..11a06e058 > --- /dev/null > +++ b/test/unittest/usdt/tst.exec-dof-replacement.r.p > @@ -0,0 +1 @@ > +convert_PID_and_PRID.awk > \ No newline at end of file > diff --git a/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p b/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p > deleted file mode 100755 > index bdbce0189..000000000 > --- a/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p > +++ /dev/null > @@ -1,2 +0,0 @@ > -#!/bin/sh > -sed 's,prov\(.\)[0-9]*,prov\1PID,; s, *, ,g' > diff --git 
a/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p b/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p > new file mode 120000 > index 000000000..11a06e058 > --- /dev/null > +++ b/test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p > @@ -0,0 +1 @@ > +convert_PID_and_PRID.awk > \ No newline at end of file > diff --git a/test/unittest/usdt/tst.multiprov-dupprobe.r.p b/test/unittest/usdt/tst.multiprov-dupprobe.r.p > deleted file mode 100755 > index 5d11db2d4..000000000 > --- a/test/unittest/usdt/tst.multiprov-dupprobe.r.p > +++ /dev/null > @@ -1,5 +0,0 @@ > -#!/bin/sh > - > -# Remove banner. > -# Replace numerical values with generic PRID and PID labels. > -grep -v '^ *ID' | sed 's,^[0-9][0-9]*,PRID,; s,prov\(.\)[0-9]*,prov\1PID,; s, *, ,g' > diff --git a/test/unittest/usdt/tst.multiprov-dupprobe.r.p b/test/unittest/usdt/tst.multiprov-dupprobe.r.p > new file mode 120000 > index 000000000..11a06e058 > --- /dev/null > +++ b/test/unittest/usdt/tst.multiprov-dupprobe.r.p > @@ -0,0 +1 @@ > +convert_PID_and_PRID.awk > \ No newline at end of file > diff --git a/test/unittest/usdt/tst.multiprovider-fire.r.p b/test/unittest/usdt/tst.multiprovider-fire.r.p > deleted file mode 100755 > index bdbce0189..000000000 > --- a/test/unittest/usdt/tst.multiprovider-fire.r.p > +++ /dev/null > @@ -1,2 +0,0 @@ > -#!/bin/sh > -sed 's,prov\(.\)[0-9]*,prov\1PID,; s, *, ,g' > diff --git a/test/unittest/usdt/tst.multiprovider-fire.r.p b/test/unittest/usdt/tst.multiprovider-fire.r.p > new file mode 120000 > index 000000000..11a06e058 > --- /dev/null > +++ b/test/unittest/usdt/tst.multiprovider-fire.r.p > @@ -0,0 +1 @@ > +convert_PID_and_PRID.awk > \ No newline at end of file > diff --git a/test/unittest/usdt/tst.multiprovider.r.p b/test/unittest/usdt/tst.multiprovider.r.p > deleted file mode 100755 > index 5d11db2d4..000000000 > --- a/test/unittest/usdt/tst.multiprovider.r.p > +++ /dev/null > @@ -1,5 +0,0 @@ > -#!/bin/sh > - > -# Remove banner. 
> -# Replace numerical values with generic PRID and PID labels. > -grep -v '^ *ID' | sed 's,^[0-9][0-9]*,PRID,; s,prov\(.\)[0-9]*,prov\1PID,; s, *, ,g' > diff --git a/test/unittest/usdt/tst.multiprovider.r.p b/test/unittest/usdt/tst.multiprovider.r.p > new file mode 120000 > index 000000000..11a06e058 > --- /dev/null > +++ b/test/unittest/usdt/tst.multiprovider.r.p > @@ -0,0 +1 @@ > +convert_PID_and_PRID.awk > \ No newline at end of file > -- > 2.43.5 > From kris.van.hees at oracle.com Wed Mar 19 17:47:05 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 19 Mar 2025 13:47:05 -0400 Subject: [DTrace-devel] [PATCH] test: Allow duplicate lines for rawfbt synthetic tests In-Reply-To: <20250319174225.2203-1-eugene.loh@oracle.com> References: <20250319174225.2203-1-eugene.loh@oracle.com> Message-ID: On Wed, Mar 19, 2025 at 01:42:25PM -0400, eugene.loh at oracle.com wrote: > From: Eugene Loh > > It is possible for the probe in these tests to fire "concurrently" > on multiple CPUs, leading to duplicate lines of output. Add > post processing to reduce multiple lines of duplicate output. 
> > Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees > --- > test/unittest/providers/rawfbt/tst.synthetic-entry.r.p | 3 +++ > test/unittest/providers/rawfbt/tst.synthetic-return.r.p | 3 +++ > 2 files changed, 6 insertions(+) > create mode 100755 test/unittest/providers/rawfbt/tst.synthetic-entry.r.p > create mode 100755 test/unittest/providers/rawfbt/tst.synthetic-return.r.p > > diff --git a/test/unittest/providers/rawfbt/tst.synthetic-entry.r.p b/test/unittest/providers/rawfbt/tst.synthetic-entry.r.p > new file mode 100755 > index 000000000..18171f044 > --- /dev/null > +++ b/test/unittest/providers/rawfbt/tst.synthetic-entry.r.p > @@ -0,0 +1,3 @@ > +#!/bin/sh > + > +uniq > diff --git a/test/unittest/providers/rawfbt/tst.synthetic-return.r.p b/test/unittest/providers/rawfbt/tst.synthetic-return.r.p > new file mode 100755 > index 000000000..18171f044 > --- /dev/null > +++ b/test/unittest/providers/rawfbt/tst.synthetic-return.r.p > @@ -0,0 +1,3 @@ > +#!/bin/sh > + > +uniq > -- > 2.43.5 > From kris.van.hees at oracle.com Wed Mar 19 18:53:17 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 19 Mar 2025 14:53:17 -0400 Subject: [DTrace-devel] [PATCH] test: Account for pid:::entry ucaller being correct In-Reply-To: <20250319063230.28171-1-eugene.loh@oracle.com> References: <20250319063230.28171-1-eugene.loh@oracle.com> Message-ID: On Wed, Mar 19, 2025 at 02:32:26AM -0400, eugene.loh at oracle.com wrote: > From: Eugene Loh > > In commit f38bdf9ea ("test: Account for pid:::entry ustack() being correct") > we accounted for x86-specific heuristics introduced in Linux 6.11 that > dealt with pid:::entry uprobes firing so early in the function preamble > that the frame pointer is not yet set and the caller is not (yet) > correctly identified. > > Update a related test to account for the same effect with ucaller. 
> > Signed-off-by: Eugene Loh LGTM Reviewed-by: Kris Van Hees > --- > test/unittest/vars/tst.ucaller.r.p | 28 ++++++++++++++++++++++++++++ > 1 file changed, 28 insertions(+) > create mode 100755 test/unittest/vars/tst.ucaller.r.p > > diff --git a/test/unittest/vars/tst.ucaller.r.p b/test/unittest/vars/tst.ucaller.r.p > new file mode 100755 > index 000000000..8e03f110d > --- /dev/null > +++ b/test/unittest/vars/tst.ucaller.r.p > @@ -0,0 +1,28 @@ > +#!/bin/sh > + > +# A pid entry probe places a uprobe on the first instruction of a function. > +# Unfortunately, this is so early in the function preamble that the function > +# frame pointer has not yet been established and the actual caller of the > +# traced function is missed. > +# > +# In Linux 6.11, x86-specific heuristics are introduced to fix this problem. > +# See commit cfa7f3d > +# ("perf,x86: avoid missing caller address in stack traces captured in uprobe") > +# for both a description of the problem and an explanation of the heuristics. > +# > +# Add post processing to these test results to allow for both cases: > +# caller frame is missing or not missing. > + > +if [ $(uname -m) == "x86_64" ]; then > + read MAJOR MINOR <<< `uname -r | grep -Eo '^[0-9]+\.[0-9]+' | tr '.' ' '` > + > + if [ $MAJOR -ge 6 ]; then > + if [ $MAJOR -gt 6 -o $MINOR -ge 11 ]; then > + awk '{ sub("myfunc_w", "myfunc_v"); print; }' > + exit 0 > + fi > + fi > +fi > + > +# Otherwise, just pass the output through. 
> +cat > -- > 2.43.5 > From kris.van.hees at oracle.com Wed Mar 19 18:54:18 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 19 Mar 2025 14:54:18 -0400 Subject: [DTrace-devel] [PATCH] Fix dt_bvar_probedesc() for late USDT processes In-Reply-To: <20250319063230.28171-2-eugene.loh@oracle.com> References: <20250319063230.28171-1-eugene.loh@oracle.com> <20250319063230.28171-2-eugene.loh@oracle.com> Message-ID: On Wed, Mar 19, 2025 at 02:32:27AM -0400, eugene.loh at oracle.com wrote: > From: Eugene Loh > > With commit 8bd26415b > ("bpf: separate bvar implementation into separate functions"), > test/unittest/usdt/tst.nusdtprobes.sh started failing reproducibly > on all platforms. > > In that patch, the get_bvar() function is factored into separate > functions. It includes a change that looks basically like this: > > uint32_t key = mst->prid; > > if (key < ((uint64_t)&NPROBES)) { > [...] > } else { > char *s = bpf_map_lookup_elem(&usdt_names, &key); > switch (idx) { > - case DIF_VAR_PROBENAME: s += DTRACE_FUNCNAMELEN; > + case DIF_VAR_PROBEPROV: s += DTRACE_FUNCNAMELEN; > - case DIF_VAR_PROBEFUNC: s += DTRACE_MODNAMELEN; > + case DIF_VAR_PROBEMOD : s += DTRACE_MODNAMELEN; > - case DIF_VAR_PROBEMOD : s += DTRACE_PROVNAMELEN; > + case DIF_VAR_PROBEFUNC: s += DTRACE_PROVNAMELEN; > - case DIF_VAR_PROBEPROV: > + case DIF_VAR_PROBENAME: > } > return (uint64_t)s; > } > > That is, for the case of key>=NPROBES (that is, for USDT probes that > were added after the dtrace session was started), the meanings of > prov, mod, func, and name were exchanged. > > Restore the correct meanings. > > Signed-off-by: Eugene Loh My fault. Thanks for the fix. 
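To make the ordering issue concrete: the switch relies on fall-through, so each case label also picks up every increment below it, and the label order has to mirror the order in which the strings are stored. The corrected ordering implies a prov, mod, func, name layout. A minimal C sketch under that assumption, using placeholder widths and selector names rather than the real DTRACE_*NAMELEN and DIF_VAR_* values:

```c
#include <stddef.h>

/* Placeholder field widths for illustration only; not the DTrace values. */
enum { PROVLEN = 4, MODLEN = 3, FUNCLEN = 5 };

/* Hypothetical selectors standing in for DIF_VAR_PROBE{PROV,MOD,FUNC,NAME}. */
enum field { F_PROV, F_MOD, F_FUNC, F_NAME };

/*
 * Byte offset of a field in a buffer laid out as prov | mod | func | name.
 * Every case deliberately falls through, so a later field accumulates the
 * widths of all fields stored before it.
 */
static size_t field_offset(enum field f)
{
	size_t off = 0;

	switch (f) {
	case F_NAME:
		off += FUNCLEN;
		/* fall through */
	case F_FUNC:
		off += MODLEN;
		/* fall through */
	case F_MOD:
		off += PROVLEN;
		/* fall through */
	case F_PROV:
		break;
	}
	return off;
}
```

With the labels in the reverse order, a request for the provider would be given the largest offset and a request for the name would be given offset 0, which is the exchange of meanings described in the commit message.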
Reviewed-by: Kris Van Hees > --- > bpf/get_bvar.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/bpf/get_bvar.c b/bpf/get_bvar.c > index c760126da..fadb06c00 100644 > --- a/bpf/get_bvar.c > +++ b/bpf/get_bvar.c > @@ -185,13 +185,13 @@ noinline uint64_t dt_bvar_probedesc(const dt_dctx_t *dctx, uint32_t idx) > return (uint64_t)dctx->strtab; > > switch (idx) { > - case DIF_VAR_PROBEPROV: > + case DIF_VAR_PROBENAME: > s += DTRACE_FUNCNAMELEN; > - case DIF_VAR_PROBEMOD: > - s += DTRACE_MODNAMELEN; > case DIF_VAR_PROBEFUNC: > + s += DTRACE_MODNAMELEN; > + case DIF_VAR_PROBEMOD: > s += DTRACE_PROVNAMELEN; > - case DIF_VAR_PROBENAME: > + case DIF_VAR_PROBEPROV: > } > > return (uint64_t)s; > -- > 2.43.5 > From kris.van.hees at oracle.com Wed Mar 19 18:58:55 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 19 Mar 2025 14:58:55 -0400 Subject: [DTrace-devel] [PATCH] Copy fprobes entry args with BPF helper function In-Reply-To: <20250319063230.28171-3-eugene.loh@oracle.com> References: <20250319063230.28171-1-eugene.loh@oracle.com> <20250319063230.28171-3-eugene.loh@oracle.com> Message-ID: On Wed, Mar 19, 2025 at 02:32:28AM -0400, eugene.loh at oracle.com wrote: > From: Eugene Loh > > With commit a6b626a89 ("Fix fprobe/kprobe selection"), fprobes were > effectively turned on. Unfortunately, with this fix, some tests like > test/unittest/stack/tst.stack_fbt.sh encountered problems on UEK7 > since the BPF verifier would complain about the prototypes of some of > the probe arguments. E.g., when loading arg3 in fprobe_trampoline() > from fbt::vfs_write:entry from %r8+24, the BPF verifier complains > > func 'vfs_write' arg3 type INT is not a struct > invalid bpf_context access off=24 size=8 > > We can bypass this problem by using a BPF helper function to copy the > arguments onto the BPF stack and then load the arguments into mstate > from there. 
> > There is also a BPF get_func_arg() helper function, but it is not > introduced until 5.17 -- that is, it appears after UEK7. See Linux > commit f92c1e1 ("bpf: Add get_func_[arg|ret|arg_cnt] helpers"). I'm OK with merging this (see r-b below) but I think that we should perhaps see a follow-up patch soon that implements using bpf_get_func_arg() if available, and otherwise use bpf_probe_read_kernel(). Not that it is (at this point) necessary but it might be a good idea to adjust to newer ways to access this data in case e.g. future kernels will hide access to the argument data behind that bpf_get_func_arg() helper. > While the already mentioned test signals the problem and the fix, we > also add an additional test that actually checks the correctness of > the arguments. > > Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees > --- > libdtrace/dt_prov_fbt.c | 14 ++- > test/unittest/fbtprovider/tst.entryargs2.r | 29 ++++++ > test/unittest/fbtprovider/tst.entryargs2.sh | 105 ++++++++++++++++++++ > 3 files changed, 147 insertions(+), 1 deletion(-) > create mode 100644 test/unittest/fbtprovider/tst.entryargs2.r > create mode 100755 test/unittest/fbtprovider/tst.entryargs2.sh > > diff --git a/libdtrace/dt_prov_fbt.c b/libdtrace/dt_prov_fbt.c > index 8aa53d643..50fa0d9dc 100644 > --- a/libdtrace/dt_prov_fbt.c > +++ b/libdtrace/dt_prov_fbt.c > @@ -285,8 +285,20 @@ static int fprobe_trampoline(dt_pcb_t *pcb, uint_t exitlbl) > if (strcmp(pcb->pcb_probe->desc->prb, "entry") == 0) { > int i; > > + /* > + * We want to copy entry args from %r8 to %r7 (plus offsets). > + * Unfortunately, for fprobes, the BPF verifier can reject > + * certain argument types. We work around this by copying > + * the arguments onto the BPF stack and loading them from there.
> + */ > + emit(dlp, BPF_MOV_REG(BPF_REG_1, BPF_REG_FP)); > + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, DT_TRAMP_SP_SLOT(prp->argc - 1))); > + emit(dlp, BPF_MOV_IMM(BPF_REG_2, 8 * prp->argc)); > + emit(dlp, BPF_MOV_REG(BPF_REG_3, BPF_REG_8)); > + emit(dlp, BPF_CALL_HELPER(dtp->dt_bpfhelper[BPF_FUNC_probe_read_kernel])); > + > for (i = 0; i < prp->argc; i++) { > - emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_8, i * 8)); > + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_FP, DT_TRAMP_SP_SLOT(prp->argc - 1) + i * 8)); > emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(i), BPF_REG_0)); > } > } else { > diff --git a/test/unittest/fbtprovider/tst.entryargs2.r b/test/unittest/fbtprovider/tst.entryargs2.r > new file mode 100644 > index 000000000..efc4685f9 > --- /dev/null > +++ b/test/unittest/fbtprovider/tst.entryargs2.r > @@ -0,0 +1,29 @@ > +mode READ : no > +mode WRITE : yes > +mode LSEEK : yes > +mode PREAD : yes > +mode PWRITE : yes > +mode WRITER : yes > +mode CAN_READ : no > +mode CAN_WRITE : yes > +mode OPENED : yes > +buf: ========================= > +count: 8 > +pos: 20 > +abcdefghijklmnopqrst========CDEFGHIJKLMNOPQRSTUVWXYZ0123456789 > + > +mode READ : yes > +mode WRITE : yes > +mode LSEEK : yes > +mode PREAD : yes > +mode PWRITE : yes > +mode WRITER : yes > +mode CAN_READ : yes > +mode CAN_WRITE : yes > +mode OPENED : yes > +buf: ========================= > +count: 8 > +pos: 20 > +abcdefghijklmnopqrst========CDEFGHIJKLMNOPQRSTUVWXYZ0123456789 > + > +success > diff --git a/test/unittest/fbtprovider/tst.entryargs2.sh b/test/unittest/fbtprovider/tst.entryargs2.sh > new file mode 100755 > index 000000000..f5b435f56 > --- /dev/null > +++ b/test/unittest/fbtprovider/tst.entryargs2.sh > @@ -0,0 +1,105 @@ > +#!/bin/bash > +# > +# Oracle Linux DTrace. > +# Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved. > +# Licensed under the Universal Permissive License v 1.0 as shown at > +# http://oss.oracle.com/licenses/upl. 
> +# > +# Another test of entry args. > +# > + > +dtrace=$1 > +CC=${CC:-/usr/bin/gcc} > + > +# Set up test directory. > + > +DIRNAME=$tmpdir/entryargs2.$$.$RANDOM > +mkdir -p $DIRNAME > +cd $DIRNAME > + > +# Make the trigger. > + > +cat << EOF > main.c > +#include > +#include // open() > +#include // lseek(), write(), close() > + > +int main(int c, char **v) { > + int fd = open("tmp.txt", c == 1 ? O_WRONLY : O_RDWR); > + > + if (fd == -1) > + return 1; > + > + /* Move the offset, then write to the file. */ > + /* (We will overwrite some "middle section" of the file with "========".) */ > + lseek(fd, 20, SEEK_SET); > + write(fd, "=========================", 8); > + close(fd); > + > + return 0; > +} > +EOF > + > +# Build the trigger. FIXME do consistent with Sam's changes. > + > +$CC $test_cppflags $test_ldflags main.c > +if [ $? -ne 0 ]; then > + echo "failed to link final executable" >&2 > + exit 1 > +fi > + > +# Prepare the D script. > + > +cat << EOF > D.d > +/* these definitions come from kernel header include/linux/fs.h */ > +#define FMODE_READ (1 << 0) > +#define FMODE_WRITE (1 << 1) > +#define FMODE_LSEEK (1 << 2) > +#define FMODE_PREAD (1 << 3) > +#define FMODE_PWRITE (1 << 4) > + > +#define FMODE_WRITER (1 << 16) > +#define FMODE_CAN_READ (1 << 17) > +#define FMODE_CAN_WRITE (1 << 18) > +#define FMODE_OPENED (1 << 19) > + > +fbt::vfs_write:entry > +/pid == \$target/ > +{ > + mode = ((struct file *)arg0)->f_mode; > + printf("mode READ : %s\n", mode & FMODE_READ ? "yes" : "no"); > + printf("mode WRITE : %s\n", mode & FMODE_WRITE ? "yes" : "no"); > + printf("mode LSEEK : %s\n", mode & FMODE_LSEEK ? "yes" : "no"); > + printf("mode PREAD : %s\n", mode & FMODE_PREAD ? "yes" : "no"); > + printf("mode PWRITE : %s\n", mode & FMODE_PWRITE ? "yes" : "no"); > + printf("mode WRITER : %s\n", mode & FMODE_WRITER ? "yes" : "no"); > + printf("mode CAN_READ : %s\n", mode & FMODE_CAN_READ ? "yes" : "no"); > + printf("mode CAN_WRITE : %s\n", mode & FMODE_CAN_WRITE ? 
"yes" : "no"); > + printf("mode OPENED : %s\n", mode & FMODE_OPENED ? "yes" : "no"); > + > + printf("buf: %s\n", stringof(copyinstr(arg1))); > + printf("count: %d\n", arg2); > + printf("pos: %d", *((loff_t *)arg3)); > + exit(0); > +} > +EOF > + > +# Run the D script and trigger twice, once with O_WRONLY and then O_RDWR. > + > +for args in "" "dummy"; do > + > + # Prepare the file to be (over)written. > + rm -f tmp.txt > + echo abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 > tmp.txt > + > + # Run the D script and trigger. > + $dtrace $dt_flags -c "./a.out $args" -Cqs D.d > + > + # Report the output file. > + cat tmp.txt > + echo > + > +done > + > +echo success > +exit 0 > -- > 2.43.5 > From kris.van.hees at oracle.com Wed Mar 19 19:04:55 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 19 Mar 2025 15:04:55 -0400 Subject: [DTrace-devel] [PATCH] test: Expect USDT argmap to fail on ARM on older kernels In-Reply-To: <20250319063230.28171-4-eugene.loh@oracle.com> References: <20250319063230.28171-1-eugene.loh@oracle.com> <20250319063230.28171-4-eugene.loh@oracle.com> Message-ID: I am holding off on this patch for the moment, just to get a closer look at the potential cause (and some info on how old the kernel needs to be for this to fail). I would expect arg access to be the problem rather than the mapping because the mapping is simply something that happens in BPF code. That should not be kernel-dependent. 
On Wed, Mar 19, 2025 at 02:32:29AM -0400, eugene.loh--- via DTrace-devel wrote: > From: Eugene Loh > > Signed-off-by: Eugene Loh > --- > test/unittest/usdt/skip_arm_uek6.x | 25 +++++++++++++++++++ > .../usdt/tst.argmap-typed-partial.aarch64.x | 1 + > test/unittest/usdt/tst.argmap-typed.aarch64.x | 1 + > .../tst.multiprov-dupprobe-fire.aarch64.x | 1 + > .../tst.multiprov-dupprobe-shlibs.aarch64.x | 1 + > .../usdt/tst.multiprovider-fire.aarch64.x | 1 + > 6 files changed, 30 insertions(+) > create mode 100755 test/unittest/usdt/skip_arm_uek6.x > create mode 120000 test/unittest/usdt/tst.argmap-typed-partial.aarch64.x > create mode 120000 test/unittest/usdt/tst.argmap-typed.aarch64.x > create mode 120000 test/unittest/usdt/tst.multiprov-dupprobe-fire.aarch64.x > create mode 120000 test/unittest/usdt/tst.multiprov-dupprobe-shlibs.aarch64.x > create mode 120000 test/unittest/usdt/tst.multiprovider-fire.aarch64.x > > diff --git a/test/unittest/usdt/skip_arm_uek6.x b/test/unittest/usdt/skip_arm_uek6.x > new file mode 100755 > index 000000000..252cbebb5 > --- /dev/null > +++ b/test/unittest/usdt/skip_arm_uek6.x > @@ -0,0 +1,25 @@ > +#!/bin/bash > +# Licensed under the Universal Permissive License v 1.0 as shown at > +# http://oss.oracle.com/licenses/upl. > +# > +# @@skip: not run directly by test harness > +# > +# Tests that depend on USDT argument translation fail on ARM for UEK6. > +# They're fine for UEK7. It is unclear in exactly which kernel they > +# start working. > + > +if [[ `uname -m` != "aarch64" ]]; then > + exit 0 > +fi > + > +read MAJOR MINOR <<< `uname -r | grep -Eo '^[0-9]+\.[0-9]+' | tr '.' 
' '` > + > +if [ $MAJOR -gt 5 ]; then > + exit 0 > +fi > +if [ $MAJOR -eq 5 -a $MINOR -ge 10 ]; then > + exit 0 > +fi > + > +echo "USDT argmap not working on ARM on older kernels" > +exit 1 > diff --git a/test/unittest/usdt/tst.argmap-typed-partial.aarch64.x b/test/unittest/usdt/tst.argmap-typed-partial.aarch64.x > new file mode 120000 > index 000000000..8d462f98f > --- /dev/null > +++ b/test/unittest/usdt/tst.argmap-typed-partial.aarch64.x > @@ -0,0 +1 @@ > +skip_arm_uek6.x > \ No newline at end of file > diff --git a/test/unittest/usdt/tst.argmap-typed.aarch64.x b/test/unittest/usdt/tst.argmap-typed.aarch64.x > new file mode 120000 > index 000000000..8d462f98f > --- /dev/null > +++ b/test/unittest/usdt/tst.argmap-typed.aarch64.x > @@ -0,0 +1 @@ > +skip_arm_uek6.x > \ No newline at end of file > diff --git a/test/unittest/usdt/tst.multiprov-dupprobe-fire.aarch64.x b/test/unittest/usdt/tst.multiprov-dupprobe-fire.aarch64.x > new file mode 120000 > index 000000000..8d462f98f > --- /dev/null > +++ b/test/unittest/usdt/tst.multiprov-dupprobe-fire.aarch64.x > @@ -0,0 +1 @@ > +skip_arm_uek6.x > \ No newline at end of file > diff --git a/test/unittest/usdt/tst.multiprov-dupprobe-shlibs.aarch64.x b/test/unittest/usdt/tst.multiprov-dupprobe-shlibs.aarch64.x > new file mode 120000 > index 000000000..8d462f98f > --- /dev/null > +++ b/test/unittest/usdt/tst.multiprov-dupprobe-shlibs.aarch64.x > @@ -0,0 +1 @@ > +skip_arm_uek6.x > \ No newline at end of file > diff --git a/test/unittest/usdt/tst.multiprovider-fire.aarch64.x b/test/unittest/usdt/tst.multiprovider-fire.aarch64.x > new file mode 120000 > index 000000000..8d462f98f > --- /dev/null > +++ b/test/unittest/usdt/tst.multiprovider-fire.aarch64.x > @@ -0,0 +1 @@ > +skip_arm_uek6.x > \ No newline at end of file > -- > 2.43.5 > > > _______________________________________________ > DTrace-devel mailing list > DTrace-devel at oss.oracle.com > https://oss.oracle.com/mailman/listinfo/dtrace-devel From kris.van.hees at 
oracle.com Wed Mar 19 19:07:17 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Wed, 19 Mar 2025 15:07:17 -0400 Subject: [DTrace-devel] [PATCH] Get execargs from user space In-Reply-To: <20250319063230.28171-5-eugene.loh@oracle.com> References: <20250319063230.28171-1-eugene.loh@oracle.com> <20250319063230.28171-5-eugene.loh@oracle.com> Message-ID: On Wed, Mar 19, 2025 at 02:32:30AM -0400, eugene.loh--- via DTrace-devel wrote: > From: Eugene Loh > > Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees I expect this is a failure on arm64 only? That would make sense since it is the only arch we currently work with for DTrace that has kernels where the specific probe_read_kernel/user separation is enforced. I'll fix the comment as indicated below while merging. > --- > bpf/bvar_execargs.S | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/bpf/bvar_execargs.S b/bpf/bvar_execargs.S > index 1c47cafb2..08844f15f 100644 > --- a/bpf/bvar_execargs.S > +++ b/bpf/bvar_execargs.S > @@ -65,7 +65,7 @@ dt_bvar_execargs: > mov %r1, %r9 > mov %r2, %r8 > mov %r3, %r7 > - call BPF_FUNC_probe_read /* bpf_probe_read(&args, len + 1, arg_start) */ > + call BPF_FUNC_probe_read_user /* bpf_probe_read(&args, len + 1, arg_start) */ comment should also mention bpf_probe_read_user > jne %r0, 0, .Lerror > > /* loop over args and replace '\0' with ' ' */ > -- > 2.43.5 > > > _______________________________________________ > DTrace-devel mailing list > DTrace-devel at oss.oracle.com > https://oss.oracle.com/mailman/listinfo/dtrace-devel From eugene.loh at oracle.com Wed Mar 19 19:19:01 2025 From: eugene.loh at oracle.com (Eugene Loh) Date: Wed, 19 Mar 2025 15:19:01 -0400 Subject: [DTrace-devel] [PATCH] Get execargs from user space In-Reply-To: References: <20250319063230.28171-1-eugene.loh@oracle.com> <20250319063230.28171-5-eugene.loh@oracle.com> Message-ID: <7d0f5417-ee02-815f-03e1-f2fe0a544bbd@oracle.com> On 3/19/25 15:07, Kris Van Hees wrote: > On Wed, Mar 19, 
2025 at 02:32:30AM -0400, eugene.loh--- via DTrace-devel wrote: >> From: Eugene Loh >> >> Signed-off-by: Eugene Loh > Reviewed-by: Kris Van Hees > > I expect this is a failure on arm64 only? Right. > That would make sense since it is > the only arch we currently work with for DTrace that has kernels where the > specific probe_read_kernel/user separation is enforced. > > I'll fix the comment as indicated below while merging. Thanks. >> --- >> bpf/bvar_execargs.S | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/bpf/bvar_execargs.S b/bpf/bvar_execargs.S >> index 1c47cafb2..08844f15f 100644 >> --- a/bpf/bvar_execargs.S >> +++ b/bpf/bvar_execargs.S >> @@ -65,7 +65,7 @@ dt_bvar_execargs: >> mov %r1, %r9 >> mov %r2, %r8 >> mov %r3, %r7 >> - call BPF_FUNC_probe_read /* bpf_probe_read(&args, len + 1, arg_start) */ >> + call BPF_FUNC_probe_read_user /* bpf_probe_read(&args, len + 1, arg_start) */ > comment should also mention bpf_probe_read_user > >> jne %r0, 0, .Lerror >> >> /* loop over args and replace '\0' with ' ' */ >> -- >> 2.43.5 >> >> >> _______________________________________________ >> DTrace-devel mailing list >> DTrace-devel at oss.oracle.com >> https://oss.oracle.com/mailman/listinfo/dtrace-devel From noreply at github.com Thu Mar 20 04:52:52 2025 From: noreply at github.com (euloh) Date: Wed, 19 Mar 2025 21:52:52 -0700 Subject: [DTrace-devel] [oracle/dtrace-utils] 93870f: test: Allow more variations in expected fbt kernel... Message-ID: Branch: refs/heads/devel Home: https://github.com/oracle/dtrace-utils Commit: 93870f7cf35269aea14c1353e3aa9b01c2776071 https://github.com/oracle/dtrace-utils/commit/93870f7cf35269aea14c1353e3aa9b01c2776071 Author: Eugene Loh Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: M test/unittest/stack/tst.stack_fbt.sh Log Message: ----------- test: Allow more variations in expected fbt kernel stacks This test checks the call stack upon entry to vfs_write(). 
Unfortunately, these checks require some maintenance since the call stack can vary -- slightly or greatly -- depending on processor or kernel. There is a competition between ease of test maintenance and strictness of correctness checks. Adapt post processing of output to allow new variations in stacks seen in UEK 8 (currently Linux 6.12). Orabug: 37459289 Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Commit: 219d6ac387f3c9cdfe24ae8d7ca67ee2f8285d1e https://github.com/oracle/dtrace-utils/commit/219d6ac387f3c9cdfe24ae8d7ca67ee2f8285d1e Author: Eugene Loh Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: M test/unittest/ustack/tst.ustack25_pid.r A test/unittest/ustack/tst.ustack25_pid.r.p Log Message: ----------- test: Account for pid:::entry ustack() being correct The pid:::entry uprobe fires so early in the function preamble that the frame pointer is not yet set and the caller is not (yet) correctly identified. In Linux 6.11, x86-specific heuristics address this problem. Post process results from this test to accommodate both cases -- missing caller and not missing caller. Orabug: 37459289 Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Commit: 6bd4b26026aad14255d83f57044b1c41d4bd698b https://github.com/oracle/dtrace-utils/commit/6bd4b26026aad14255d83f57044b1c41d4bd698b Author: Eugene Loh Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: M libdtrace/dt_cg.c Log Message: ----------- Use DT_TRAMP_SP_SLOT() for BPF stack scratch space in trampoline We might as well get this code right, even if this "fix" is arguably irrelevant for two reasons: *) The offset just so happens to be -96 before and after the change anyhow, just by coincidence. *) The fix is on a code path that is not currently in use. 
Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Commit: d5beecf9f1a9a3aa7064676841f6a738b2c7e348 https://github.com/oracle/dtrace-utils/commit/d5beecf9f1a9a3aa7064676841f6a738b2c7e348 Author: Eugene Loh Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: M libdtrace/dt_program.c Log Message: ----------- Rename _DTRACE_VERSION There are many DTrace version numbers (for version, API version, package version, etc.). Meanwhile, _DTRACE_VERSION is not a version number at all. It's a preprocessor macro in USDT .h header files. Prior to commit e2fb0ecd9 ("Ensure multiple passes through dtrace -G work."), it was perhaps not even set. With that commit, it was always set to 1, with the rationale: Also add an explicit define for _DTRACE_VERSION in the generated header file from 'dtrace -h' invocations. This seems silly, but it is there to give people a skeleton to work with if they want to pre-generate header files and select whether to actually compile on the probes at a later time. Rename to _DTRACE_USE_USDT for better clarity. Define it only once per file. Place the definition inside an #ifndef test so that a developer could set the value without manually changing the file. Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Commit: b35a2d35b0ec173e5a5b6ceb546eba3bd4cfa21e https://github.com/oracle/dtrace-utils/commit/b35a2d35b0ec173e5a5b6ceb546eba3bd4cfa21e Author: eugene.loh at oracle.com Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: M libdtrace/dt_prov_uprobe.c Log Message: ----------- Clarify how the usdt_prids key is stored on the BPF stack While one can access the BPF stack relative to %r9, the whole point of DT_TRAMP_SP_SLOT(0) is to make trampoline code more readable. So use it. 
Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Commit: 666abf31923976c8da43d038b5f2d7d7ba9a0ab0 https://github.com/oracle/dtrace-utils/commit/666abf31923976c8da43d038b5f2d7d7ba9a0ab0 Author: eugene.loh at oracle.com Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: M dtprobed/dtprobed.c Log Message: ----------- Fix format specifier in dtprobed.c The format specifier is %i but nprobes is size_t. Some compilers issue warnings. Change the format specifier to match the type. Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Commit: 6cb6817b7c75836d4526c08fc780deaae2050e1e https://github.com/oracle/dtrace-utils/commit/6cb6817b7c75836d4526c08fc780deaae2050e1e Author: eugene.loh at oracle.com Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: A test/unittest/builtinvar/tst.tid_pid.r A test/unittest/builtinvar/tst.tid_pid.sh Log Message: ----------- test: Check tid value We were checking the built-in variable tid simply by testing that we could print it and its value was not -1. Add a test that confirms the value is actually correct; compare to C output of gettid(). In line with other similar tests, also check for the profile provider. While we're at it, check the pid value and the pthread_t value returned via pthread_create(). Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Commit: 4aec5c9931ebe770b13b1dc9b716393ad91980fb https://github.com/oracle/dtrace-utils/commit/4aec5c9931ebe770b13b1dc9b716393ad91980fb Author: Nick Alcock Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: M libdtrace/dt_pid.c Log Message: ----------- dt_pid: pid grabs should be shortlived If we use long-lived grabs for this, we are requiring that the process is ptraceable, and thus preventing pid tracing of system daemons, init, processes already being debugged or traced by others, etc. 
Signed-off-by: Nick Alcock Reviewed-by: Kris Van Hees Commit: 6f398f2297aa7c903aec0d2e083add64e7644044 https://github.com/oracle/dtrace-utils/commit/6f398f2297aa7c903aec0d2e083add64e7644044 Author: eugene.loh at oracle.com Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: A test/unittest/usdt/convert_PID_and_PRID.awk M test/unittest/usdt/err.argmap-null.r R test/unittest/usdt/err.argmap-null.r.p A test/unittest/usdt/err.argmap-null.r.p M test/unittest/usdt/tst.dlclose1.r R test/unittest/usdt/tst.dlclose1.r.p A test/unittest/usdt/tst.dlclose1.r.p M test/unittest/usdt/tst.enable_pid.r R test/unittest/usdt/tst.enable_pid.r.p A test/unittest/usdt/tst.enable_pid.r.p M test/unittest/usdt/tst.exec-dof-replacement.r R test/unittest/usdt/tst.exec-dof-replacement.r.p A test/unittest/usdt/tst.exec-dof-replacement.r.p R test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p A test/unittest/usdt/tst.multiprov-dupprobe-fire.r.p R test/unittest/usdt/tst.multiprov-dupprobe.r.p A test/unittest/usdt/tst.multiprov-dupprobe.r.p R test/unittest/usdt/tst.multiprovider-fire.r.p A test/unittest/usdt/tst.multiprovider-fire.r.p R test/unittest/usdt/tst.multiprovider.r.p A test/unittest/usdt/tst.multiprovider.r.p Log Message: ----------- test: Make tests more resilient to different prid widths Various tests convert run-dependent values -- like PIDs and probe IDs -- to run-independent strings before checking against their .r results files. But the conversions could be remarkably sensitive to the width of probe IDs. E.g., some conversions assumed probe IDs were flush with the beginning of the line, but if they were narrower they were preceded by white space and were not detected. E.g., this happened in recent fbt work, where probe IDs for fbt probes became much smaller in value. 
Also, these conversions were being carried out by a hodgepodge of scripts -- sed, awk, and grep; some using run-independent strings like "NNN" or "XXXX" instead of more informative "PID" and "PRID" strings; some incorrectly using "PID" for PRIDs, etc. Replace these .r.p postprocessing scripts with a single script that is more resilient to PRID widths and is commented. Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Commit: 8ac281593130c58566e5bb40e9d22a9950644366 https://github.com/oracle/dtrace-utils/commit/8ac281593130c58566e5bb40e9d22a9950644366 Author: eugene.loh at oracle.com Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: A test/unittest/vars/tst.ucaller.r.p Log Message: ----------- test: Account for pid:::entry ucaller being correct In commit f38bdf9ea ("test: Account for pid:::entry ustack() being correct") we accounted for x86-specific heuristics introduced in Linux 6.11 that dealt with pid:::entry uprobes firing so early in the function preamble that the frame pointer is not yet set and the caller is not (yet) correctly identified. Update a related test to account for the same effect with ucaller. Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Commit: 5fa86bd5021d4a8d531a1f5fbe8c67954e29d51f https://github.com/oracle/dtrace-utils/commit/5fa86bd5021d4a8d531a1f5fbe8c67954e29d51f Author: eugene.loh at oracle.com Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: M bpf/get_bvar.c Log Message: ----------- Fix dt_bvar_probedesc() for late USDT processes With commit 8bd26415b ("bpf: separate bvar implementation into separate functions"), test/unittest/usdt/tst.nusdtprobes.sh started failing reproducibly on all platforms. In that patch, the get_bvar() function is factored into separate functions. It includes a change that looks basically like this: uint32_t key = mst->prid; if (key < ((uint64_t)&NPROBES)) { [...] 
} else { char *s = bpf_map_lookup_elem(&usdt_names, &key); switch (idx) { - case DIF_VAR_PROBENAME: s += DTRACE_FUNCNAMELEN; + case DIF_VAR_PROBEPROV: s += DTRACE_FUNCNAMELEN; - case DIF_VAR_PROBEFUNC: s += DTRACE_MODNAMELEN; + case DIF_VAR_PROBEMOD : s += DTRACE_MODNAMELEN; - case DIF_VAR_PROBEMOD : s += DTRACE_PROVNAMELEN; + case DIF_VAR_PROBEFUNC: s += DTRACE_PROVNAMELEN; - case DIF_VAR_PROBEPROV: + case DIF_VAR_PROBENAME: } return (uint64_t)s; } That is, for the case of key>=NPROBES (that is, for USDT probes that were added after the dtrace session was started), the meanings of prov, mod, func, and name were exchanged. Restore the correct meanings. Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Commit: da8559baa7685b42c19180bb8d0f485ac0f736e5 https://github.com/oracle/dtrace-utils/commit/da8559baa7685b42c19180bb8d0f485ac0f736e5 Author: eugene.loh at oracle.com Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: M libdtrace/dt_prov_fbt.c A test/unittest/fbtprovider/tst.entryargs2.r A test/unittest/fbtprovider/tst.entryargs2.sh Log Message: ----------- Copy fprobes entry args with BPF helper function With commit a6b626a89 ("Fix fprobe/kprobe selection"), fprobes were effectively turned on. Unfortunately, with this fix, some tests like test/unittest/stack/tst.stack_fbt.sh encountered problems on UEK7 since the BPF verifier would complain about the prototypes of some of the probe arguments. E.g., when loading arg3 in fprobe_trampoline() from fbt::vfs_write:entry from %r8+24, the BPF verifier complains func 'vfs_write' arg3 type INT is not a struct invalid bpf_context access off=24 size=8 We can bypass this problem by using a BPF helper function to copy the arguments onto the BPF stack and then load the arguments into mstate from there. There is also a BPF get_func_arg() helper function, but it is not introduced until 5.17 -- that is, it appears after UEK7. See Linux commit f92c1e1 ("bpf: Add get_func_[arg|ret|arg_cnt] helpers"). 
While the already mentioned test signals the problem and the fix, we also add an additional test that actually checks the correctness of the arguments. Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Commit: ef34498e93f8651e511d6d8531603002c9c3ed26 https://github.com/oracle/dtrace-utils/commit/ef34498e93f8651e511d6d8531603002c9c3ed26 Author: eugene.loh at oracle.com Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: M bpf/bvar_execargs.S Log Message: ----------- Get execargs from user space Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Commit: b12520b7da20c6f9d962a42444d41dfc27b6f577 https://github.com/oracle/dtrace-utils/commit/b12520b7da20c6f9d962a42444d41dfc27b6f577 Author: eugene.loh at oracle.com Date: 2025-03-19 (Wed, 19 Mar 2025) Changed paths: A test/unittest/providers/rawfbt/tst.synthetic-entry.r.p A test/unittest/providers/rawfbt/tst.synthetic-return.r.p Log Message: ----------- test: Allow duplicate lines for rawfbt synthetic tests It is possible for the probe in these tests to fire "concurrently" on multiple CPUs, leading to duplicate lines of output. Add post processing to reduce multiple lines of duplicate output. Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees Compare: https://github.com/oracle/dtrace-utils/compare/ad68224b0018...b12520b7da20 To unsubscribe from these emails, change your notification settings at https://github.com/oracle/dtrace-utils/settings/notifications From kris.van.hees at oracle.com Fri Mar 21 14:42:21 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Fri, 21 Mar 2025 10:42:21 -0400 Subject: [DTrace-devel] [PATCH 1/2] dlib: remove obsolete dt_dlib_add_probe_var() Message-ID: <834ff39378a30c154da54c67f4093dde.kris.van.hees@oracle.com> The dt_dlib_add_probe_var() function was added to allow for relocation processing filling in probe ids for dependent probes, but since the probe ids are known at code generation time, there is no need for this.
Signed-off-by: Kris Van Hees --- libdtrace/dt_cg.c | 3 +-- libdtrace/dt_dlibs.c | 16 ---------------- libdtrace/dt_prov_uprobe.c | 6 +----- 3 files changed, 2 insertions(+), 23 deletions(-) diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c index 8cc99246..e954173b 100644 --- a/libdtrace/dt_cg.c +++ b/libdtrace/dt_cg.c @@ -905,14 +905,13 @@ dt_cg_add_dependent(dtrace_hdl_t *dtp, dt_probe_t *prp, void *arg) { dt_pcb_t *pcb = dtp->dt_pcb; dt_irlist_t *dlp = &pcb->pcb_ir; - dt_ident_t *idp = dt_dlib_add_probe_var(pcb->pcb_hdl, prp); uint_t exitlbl = dt_irlist_label(dlp); int skip = 0; dt_cg_tramp_save_args(pcb); pcb->pcb_parent_probe = pcb->pcb_probe; pcb->pcb_probe = prp; - emite(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_7, DMST_PRID, prp->desc->id), idp); + emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_7, DMST_PRID, prp->desc->id)); if (prp->prov->impl->trampoline != NULL) skip = prp->prov->impl->trampoline(pcb, exitlbl); diff --git a/libdtrace/dt_dlibs.c b/libdtrace/dt_dlibs.c index 9ad4f5e7..9a2970c7 100644 --- a/libdtrace/dt_dlibs.c +++ b/libdtrace/dt_dlibs.c @@ -250,22 +250,6 @@ dt_dlib_add_var(dtrace_hdl_t *dtp, const char *name, uint_t id) return dt_dlib_add_sym_id(dtp, name, DT_IDENT_SCALAR, id); } -/* - * Add a BPF variable for a probe. - * The fully qualified probe name is tha variable name, and the probe ID is the - * value of the variable. - */ -dt_ident_t * -dt_dlib_add_probe_var(dtrace_hdl_t *dtp, const dt_probe_t *prp) -{ - char pn[DTRACE_FULLNAMELEN + 1]; - - snprintf(pn, DTRACE_FULLNAMELEN, "%s:%s:%s:%s", prp->desc->prv, - prp->desc->mod, prp->desc->fun, prp->desc->prb); - - return dt_dlib_add_var(dtp, pn, prp->desc->id); -} - /* * Return the DIFO for an external symbol. 
  */
diff --git a/libdtrace/dt_prov_uprobe.c b/libdtrace/dt_prov_uprobe.c
index 8dbd2aed..28762eb3 100644
--- a/libdtrace/dt_prov_uprobe.c
+++ b/libdtrace/dt_prov_uprobe.c
@@ -947,7 +947,6 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
 		const dt_probe_t	*prp = pop->probe;
 		uint_t			lbl_next = dt_irlist_label(dlp);
 		pid_t			pid;
-		dt_ident_t		*idp;

 		if (prp->prov->impl != &dt_pid)
 			continue;
@@ -955,9 +954,6 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
 		pid = dt_pid_get_pid(prp->desc, pcb->pcb_hdl, pcb, NULL);
 		assert(pid != -1);

-		idp = dt_dlib_add_probe_var(pcb->pcb_hdl, prp);
-		assert(idp != NULL);
-
 		/*
 		 * Populate probe arguments.
 		 */
@@ -971,7 +967,7 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
 		 * process, and emit a sequence of clauses for it when it does.
 		 */
 		emit(dlp, BPF_BRANCH_IMM(BPF_JNE, BPF_REG_6, pid, lbl_next));
-		emite(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_7, DMST_PRID, prp->desc->id), idp);
+		emit(dlp, BPF_STORE_IMM(BPF_W, BPF_REG_7, DMST_PRID, prp->desc->id));
 		dt_cg_tramp_call_clauses(pcb, prp, DT_ACTIVITY_ACTIVE);
 		emitl(dlp, lbl_next, BPF_NOP());
-- 
2.45.2

From kris.van.hees at oracle.com Fri Mar 21 14:43:59 2025
From: kris.van.hees at oracle.com (Kris Van Hees)
Date: Fri, 21 Mar 2025 10:43:59 -0400
Subject: [DTrace-devel] [PATCH 2/2] consume: avoid a bad prid causing a core dump
Message-ID: <9aa129d930bfce0b1a060389fc886e9f.kris.van.hees@oracle.com>

We were not guarding against prid being DTRACE_IDNONE (0), which would
cause a core dump if it was encountered in a trace record.
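
For readers following the thread, the guard can be illustrated with a small standalone sketch. It assumes, as the fix and the review reply suggest, that ID 0 is reserved to mean "no probe" and that valid IDs run from 1 up to (but not including) a next-ID counter. The names probe_table, next_probe_id, ID_NONE, and lookup_probe() are hypothetical stand-ins for illustration, not libdtrace interfaces.

```c
#include <stddef.h>

#define ID_NONE	0u	/* hypothetical stand-in for DTRACE_IDNONE */

struct probe {
	const char	*name;
};

static struct probe	p1 = { "first" };
static struct probe	p2 = { "second" };

/* Slot 0 is never populated because ID 0 means "no probe". */
static struct probe	*probe_table[] = { NULL, &p1, &p2 };
static unsigned int	next_probe_id = 3;	/* one past the last valid ID */

/*
 * Return the probe for the given ID, or NULL if the ID is out of range
 * or is the reserved "none" ID.  Without the second test, ID 0 would
 * pass the range check and the caller would dereference a NULL slot.
 */
static const struct probe *
lookup_probe(unsigned int id)
{
	if (id >= next_probe_id || id == ID_NONE)
		return NULL;

	return probe_table[id];
}
```

With this table, lookup_probe(1) and lookup_probe(2) return the two probes, while lookup_probe(0) and lookup_probe(3) are both rejected rather than indexing an empty or nonexistent slot.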
Signed-off-by: Kris Van Hees
---
 libdtrace/dt_consume.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libdtrace/dt_consume.c b/libdtrace/dt_consume.c
index 58a2ead9..ed52b88b 100644
--- a/libdtrace/dt_consume.c
+++ b/libdtrace/dt_consume.c
@@ -2191,7 +2191,7 @@ dt_consume_one_probe(dtrace_hdl_t *dtp, FILE *fp, char *data, uint32_t size,
 	pdat->dtpda_stid = stid;
 	pdat->dtpda_data = data;

-	if (prid >= dtp->dt_probe_id)
+	if (prid >= dtp->dt_probe_id || prid == DTRACE_IDNONE)
 		return dt_set_errno(dtp, EDT_BADID);
 	pdat->dtpda_pdesc = (dtrace_probedesc_t *)dtp->dt_probes[prid]->desc;
 	if (dt_stid_lookup(dtp, stid, &pdat->dtpda_ddesc) != 0)
-- 
2.45.2

From nick.alcock at oracle.com Fri Mar 21 15:04:55 2025
From: nick.alcock at oracle.com (Nick Alcock)
Date: Fri, 21 Mar 2025 15:04:55 +0000
Subject: [DTrace-devel] [PATCH 1/2] dlib: remove obsolete dt_dlib_add_probe_var()
In-Reply-To: <834ff39378a30c154da54c67f4093dde.kris.van.hees@oracle.com> (Kris Van Hees via DTrace-devel's message of "Fri, 21 Mar 2025 10:42:21 -0400")
References: <834ff39378a30c154da54c67f4093dde.kris.van.hees@oracle.com>
Message-ID: <87msde4e7c.fsf@esperi.org.uk>

On 21 Mar 2025, Kris Van Hees via DTrace-devel outgrape:

> The dt_dlib_add_probe_var() function was added to allow for relocation
> processing filling in probe ids for dependent probes, but since the
> probe ids are known at code generation time, there is no need for this.

I guess this is removing a relocation/fixup which was always the identity
transformation (prp->desc->id being reset to itself at reloc resolution
time). If so...
> Signed-off-by: Kris Van Hees

Reviewed-by: Nick Alcock

-- 
NULL && (void)

From nick.alcock at oracle.com Fri Mar 21 15:06:28 2025
From: nick.alcock at oracle.com (Nick Alcock)
Date: Fri, 21 Mar 2025 15:06:28 +0000
Subject: [DTrace-devel] [PATCH 2/2] consume: avoid a bad prid causing a core dump
In-Reply-To: <9aa129d930bfce0b1a060389fc886e9f.kris.van.hees@oracle.com> (Kris Van Hees via DTrace-devel's message of "Fri, 21 Mar 2025 10:43:59 -0400")
References: <9aa129d930bfce0b1a060389fc886e9f.kris.van.hees@oracle.com>
Message-ID: <87iko24e4r.fsf@esperi.org.uk>

On 21 Mar 2025, Kris Van Hees via DTrace-devel said:

> We were not guarding against prid being DTRACE_IDNONE (0), which would
> cause a core dump if it was encountered in a trace record.

... hm, I had no idea that the dt_probes array was 1-based, but I guess
given that the NONE id is 0 it kind of has to be.

> Signed-off-by: Kris Van Hees

Reviewed-by: Nick Alcock

From noreply at github.com Fri Mar 21 15:21:04 2025
From: noreply at github.com (Kris Van Hees)
Date: Fri, 21 Mar 2025 08:21:04 -0700
Subject: [DTrace-devel] [oracle/dtrace-utils] f6aa2c: dlib: remove obsolete dt_dlib_add_probe_var()
Message-ID:

  Branch: refs/heads/devel
  Home:   https://github.com/oracle/dtrace-utils
  Commit: f6aa2c36ea55b8b84cbbf5b195d07597c8ec89b5
      https://github.com/oracle/dtrace-utils/commit/f6aa2c36ea55b8b84cbbf5b195d07597c8ec89b5
  Author: Kris Van Hees
  Date:   2025-03-21 (Fri, 21 Mar 2025)

  Changed paths:
    M libdtrace/dt_cg.c
    M libdtrace/dt_dlibs.c
    M libdtrace/dt_prov_uprobe.c

  Log Message:
  -----------
  dlib: remove obsolete dt_dlib_add_probe_var()

The dt_dlib_add_probe_var() function was added to allow for relocation
processing filling in probe ids for dependent probes, but since the
probe ids are known at code generation time, there is no need for this.
Signed-off-by: Kris Van Hees
Reviewed-by: Nick Alcock

  Commit: 2335340016e46d5c8c9ca0b60c3da6ce869215a5
      https://github.com/oracle/dtrace-utils/commit/2335340016e46d5c8c9ca0b60c3da6ce869215a5
  Author: Kris Van Hees
  Date:   2025-03-21 (Fri, 21 Mar 2025)

  Changed paths:
    M libdtrace/dt_consume.c

  Log Message:
  -----------
  consume: avoid a bad prid causing a core dump

We were not guarding against prid being DTRACE_IDNONE (0), which would
cause a core dump if it was encountered in a trace record.

Signed-off-by: Kris Van Hees
Reviewed-by: Nick Alcock

Compare: https://github.com/oracle/dtrace-utils/compare/b12520b7da20...2335340016e4

To unsubscribe from these emails, change your notification settings at https://github.com/oracle/dtrace-utils/settings/notifications

From eugene.loh at oracle.com Tue Mar 25 22:25:19 2025
From: eugene.loh at oracle.com (eugene.loh at oracle.com)
Date: Tue, 25 Mar 2025 18:25:19 -0400
Subject: [DTrace-devel] [PATCH 2/4] test: Skip trace() of a 1-byte struct
In-Reply-To: <20250325222521.15224-1-eugene.loh@oracle.com>
References: <20250325222521.15224-1-eugene.loh@oracle.com>
Message-ID: <20250325222521.15224-2-eugene.loh@oracle.com>

From: Eugene Loh

With commit 3a551bfd ("trace: fix char-array handling"), this test
started to FAIL. Meanwhile, the behavior of trace() on a 1-byte struct
is poorly defined. Users wishing clear semantics should use print() or
other actions. Skip the test.

Signed-off-by: Eugene Loh
---
 .../actions/trace/tst.struct-1-byte.d | 24 +++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/test/unittest/actions/trace/tst.struct-1-byte.d b/test/unittest/actions/trace/tst.struct-1-byte.d
index 36de485f8..8bfcbdd53 100644
--- a/test/unittest/actions/trace/tst.struct-1-byte.d
+++ b/test/unittest/actions/trace/tst.struct-1-byte.d
@@ -5,6 +5,30 @@
  * http://oss.oracle.com/licenses/upl.
*/ +/* + * With Solaris or legacy DTrace on Linux, the script gives + * 52 + * That is, contents are dumped as an integer because the trace() + * argument is 1, 2, 4, or 8 bytes -- specifically, it is 1 byte. + * + * Before commit 3a551bfd ("trace: fix char-array handling"), we got + * 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef + * 0: 34 4 + * That is, contents were dumped as raw bytes. + * + * With that patch, we get: + * 4 + * That is, contents are dumped as the string "4" since the contents + * are a printable char followed optionally by NUL bytes. + * + * The truth is that the semantics of trace() are poorly defined. + * So we are hard-pressed to say what is correct. The test has + * little value. Skip it. + * + * Users who want type-aware printing can use the print() action. + */ +/* @@skip: poorly defined semantics */ + /* * ASSERTION: The trace() action prints a struct { int8_t } correctly. * -- 2.43.5 From eugene.loh at oracle.com Tue Mar 25 22:25:20 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Tue, 25 Mar 2025 18:25:20 -0400 Subject: [DTrace-devel] [PATCH 3/4] test: Update some char-array results files In-Reply-To: <20250325222521.15224-1-eugene.loh@oracle.com> References: <20250325222521.15224-1-eugene.loh@oracle.com> Message-ID: <20250325222521.15224-3-eugene.loh@oracle.com> From: Eugene Loh A few tests started failing with commit 3a551bfd ("trace: fix char-array handling"). Update the results files to reflect older behavior, whether on Solaris or with the legacy Linux DTrace implementation. Notice that test/unittest/funcs/substr/tst.substr-large-idx.d specifies strsize=256. The older behavior would result in 256 chars being shown. We add a results file that includes a 257th char. 
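
The trace() behaviors discussed in this series depend on how the consumer classifies a buffer: as a string when it holds printable characters followed only by NUL bytes, and as raw bytes otherwise. The following is a minimal sketch of such a classification under simplified assumptions. It is not the libdtrace implementation: looks_like_string() is a hypothetical helper, and treating an all-NUL buffer as raw bytes is an assumption made here for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if buf[0..n) looks like a NUL-padded printable string, that
 * is, one or more printable (or whitespace) characters followed by
 * nothing but NUL bytes.  Return 0 if it should be dumped as raw bytes.
 */
static int
looks_like_string(const unsigned char *buf, size_t n)
{
	size_t	i = 0;

	/* Scan the candidate string up to its terminating NUL, if any. */
	while (i < n && buf[i] != '\0') {
		if (!isprint(buf[i]) && !isspace(buf[i]))
			return 0;
		i++;
	}

	/* Assumption: an empty buffer or a leading NUL is treated as raw. */
	if (i == 0)
		return 0;

	/* Everything after the terminating NUL must be NUL padding. */
	for (; i < n; i++) {
		if (buf[i] != '\0')
			return 0;
	}

	return 1;
}
```

Under this rule, a 1-byte buffer holding '4' is classified as a string, which matches the output the test comment reports after commit 3a551bfd, and a string followed by leftover non-NUL bytes is classified as raw data.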
Signed-off-by: Eugene Loh
---
 .../funcs/substr/tst.substr-large-idx.r | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/test/unittest/funcs/substr/tst.substr-large-idx.r b/test/unittest/funcs/substr/tst.substr-large-idx.r
index 8b1378917..79fecbd32 100644
--- a/test/unittest/funcs/substr/tst.substr-large-idx.r
+++ b/test/unittest/funcs/substr/tst.substr-large-idx.r
@@ -1 +1,20 @@

+             0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
+         0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+        f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
+       100: 00                                               .
+ -- 2.43.5 From eugene.loh at oracle.com Tue Mar 25 22:25:21 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Tue, 25 Mar 2025 18:25:21 -0400 Subject: [DTrace-devel] [PATCH 4/4] Pad strings in the output buffer with NUL bytes after terminating byte In-Reply-To: <20250325222521.15224-1-eugene.loh@oracle.com> References: <20250325222521.15224-1-eugene.loh@oracle.com> Message-ID: <20250325222521.15224-4-eugene.loh@oracle.com> From: Eugene Loh The consumer checks if there are non-NUL bytes after the terminating NUL byte to decide whether to print the contents of the output buffer as a string or as raw bytes. So, for strings, make sure that the string is padded with NUL bytes. A robust test that shows the problem can be hard to devise. Some tests encounter the problem. This patch adds another test that might show the problem. Signed-off-by: Eugene Loh --- libdtrace/dt_cg.c | 46 ++++++++++++++++++-------- test/unittest/error/tst.trace_string.d | 26 +++++++++++++++ 2 files changed, 59 insertions(+), 13 deletions(-) create mode 100644 test/unittest/error/tst.trace_string.d diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c index 9b3592b9c..6dcf4cd3d 100644 --- a/libdtrace/dt_cg.c +++ b/libdtrace/dt_cg.c @@ -1596,6 +1596,22 @@ dt_cg_check_ptr_arg(dt_irlist_t *dlp, dt_regset_t *drp, dt_node_t *dnp, static void dt_cg_setx(dt_irlist_t *dlp, int reg, uint64_t x); +/* + * Store a pointer to the 'memory block of zeros' in reg. + */ +static void +dt_cg_zerosptr(int reg, dt_irlist_t *dlp, dt_regset_t *drp) +{ + dtrace_hdl_t *dtp = yypcb->pcb_hdl; + dt_ident_t *zero_off = dt_dlib_get_var(dtp, "ZERO_OFF"); + + dt_cg_access_dctx(reg, dlp, drp, DCTX_STRTAB); + emite(dlp, BPF_ALU64_IMM(BPF_ADD, reg, -1), zero_off); +} + +/* + * Store a value to the output buffer. 
+ */ static int dt_cg_store_val(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind, dt_pfargv_t *pfp, int arg) @@ -1676,6 +1692,7 @@ dt_cg_store_val(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind, goto ok; } else if (dt_node_is_string(dnp)) { + uint_t lbl_ok = dt_irlist_label(dlp); size_t strsize = dtp->dt_options[DTRACEOPT_STRSIZE]; if (!not_null) @@ -1702,6 +1719,22 @@ dt_cg_store_val(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind, dt_regset_xalloc(drp, BPF_REG_0); emit(dlp, BPF_CALL_HELPER(BPF_FUNC_probe_read_str)); dt_regset_free_args(drp); + + /* + * Pad the rest with zeroes, if necessary. + */ + emit(dlp, BPF_BRANCH_IMM(BPF_JGE, BPF_REG_0, strsize + 1, lbl_ok)); + if (dt_regset_xalloc_args(drp) == -1) + longjmp(yypcb->pcb_jmpbuf, EDT_NOREG); + emit(dlp, BPF_MOV_REG(BPF_REG_1, BPF_REG_9)); + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, off)); + emit(dlp, BPF_ALU64_REG(BPF_ADD, BPF_REG_1, BPF_REG_0)); + emit(dlp, BPF_MOV_IMM(BPF_REG_2, strsize + 1)); + emit(dlp, BPF_ALU64_REG(BPF_SUB, BPF_REG_2, BPF_REG_0)); + dt_cg_zerosptr(BPF_REG_3, dlp, drp); + emit(dlp, BPF_CALL_HELPER(dtp->dt_bpfhelper[BPF_FUNC_probe_read_kernel])); + dt_regset_free_args(drp); + emitl(dlp, lbl_ok, BPF_NOP()); dt_regset_free(drp, BPF_REG_0); TRACE_REGSET("store_val(): End "); @@ -3042,19 +3075,6 @@ dt_cg_pop_stack(int reg, dt_irlist_t *dlp, dt_regset_t *drp) dt_regset_free(drp, treg); } -/* - * Store a pointer to the 'memory block of zeros' in reg. - */ -static void -dt_cg_zerosptr(int reg, dt_irlist_t *dlp, dt_regset_t *drp) -{ - dtrace_hdl_t *dtp = yypcb->pcb_hdl; - dt_ident_t *zero_off = dt_dlib_get_var(dtp, "ZERO_OFF"); - - dt_cg_access_dctx(reg, dlp, drp, DCTX_STRTAB); - emite(dlp, BPF_ALU64_IMM(BPF_ADD, reg, -1), zero_off); -} - /* * Generate code to promote signed scalars (size < 64 bits) to native register * size (64 bits). 
diff --git a/test/unittest/error/tst.trace_string.d b/test/unittest/error/tst.trace_string.d new file mode 100644 index 000000000..4b06aef88 --- /dev/null +++ b/test/unittest/error/tst.trace_string.d @@ -0,0 +1,26 @@ +/* + * Oracle Linux DTrace. + * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved. + * Licensed under the Universal Permissive License v 1.0 as shown at + * http://oss.oracle.com/licenses/upl. + */ + +/* + * ASSERTION: Test ERROR probe firing. + * + * SECTION: dtrace Provider + */ + +#pragma D option quiet + +ERROR +{ + trace("Error fired"); + exit(0); +} + +BEGIN +{ + *(char *)NULL; + exit(1); +} -- 2.43.5 From eugene.loh at oracle.com Tue Mar 25 22:25:18 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Tue, 25 Mar 2025 18:25:18 -0400 Subject: [DTrace-devel] [PATCH 1/4] Remove orphaned dtrace_recdesc_t component dtrd_uarg Message-ID: <20250325222521.15224-1-eugene.loh@oracle.com> From: Eugene Loh Signed-off-by: Eugene Loh --- include/dtrace/metadesc.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/include/dtrace/metadesc.h b/include/dtrace/metadesc.h index 8a4add255..b0f789932 100644 --- a/include/dtrace/metadesc.h +++ b/include/dtrace/metadesc.h @@ -2,7 +2,7 @@ * Licensed under the Universal Permissive License v 1.0 as shown at * http://oss.oracle.com/licenses/upl. * - * Copyright (c) 2009, 2024, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2009, 2025, Oracle and/or its affiliates. All rights reserved. 
*/ /* @@ -39,7 +39,6 @@ typedef struct dtrace_recdesc { uint16_t dtrd_alignment; /* required alignment */ void *dtrd_format; /* format, if any */ uint64_t dtrd_arg; /* action argument */ - uint64_t dtrd_uarg; /* user argument */ } dtrace_recdesc_t; typedef struct dtrace_datadesc { -- 2.43.5 From eugene.loh at oracle.com Sun Mar 30 20:57:36 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Sun, 30 Mar 2025 16:57:36 -0400 Subject: [DTrace-devel] [PATCH] Fix arg3 for sched enqueue and dequeue probes Message-ID: <20250330205736.10970-1-eugene.loh@oracle.com> From: Eugene Loh For sched enqueue and dequeue probes, arg3 should be an int for enqueue and not exist for dequeue. In the code it was the other way around. Fix this and the associated tests. The trampoline was already set up to get this argument correct. Signed-off-by: Eugene Loh --- libdtrace/dt_prov_sched.c | 3 ++- test/unittest/sched/tst.lv-dequeue.r | 1 - test/unittest/sched/tst.lv-enqueue.r | 1 + 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/libdtrace/dt_prov_sched.c b/libdtrace/dt_prov_sched.c index 125d58917..3a218f3cb 100644 --- a/libdtrace/dt_prov_sched.c +++ b/libdtrace/dt_prov_sched.c @@ -46,10 +46,10 @@ static probe_arg_t probe_args[] = { { "dequeue", 0, { 0, 0, "struct task_struct *", "lwpsinfo_t *" } }, { "dequeue", 1, { 0, 0, "struct task_struct *", "psinfo_t *" } }, { "dequeue", 2, { 1, 0, "cpuinfo_t *", } }, - { "dequeue", 3, { 2, 0, "int", } }, { "enqueue", 0, { 0, 0, "struct task_struct *", "lwpsinfo_t *" } }, { "enqueue", 1, { 0, 0, "struct task_struct *", "psinfo_t *" } }, { "enqueue", 2, { 1, 0, "cpuinfo_t *", } }, + { "enqueue", 3, { 2, 0, "int", } }, { "off-cpu", 0, { 0, 0, "struct task_struct *", "lwpsinfo_t *" } }, { "off-cpu", 1, { 0, 0, "struct task_struct *", "psinfo_t *" } }, { "on-cpu", }, @@ -128,6 +128,7 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl) * associated with the runqueue. 
*/ emit(dlp, BPF_STORE_IMM(BPF_DW, BPF_REG_7, DMST_ARG(1), 0)); + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_7, DMST_ARG(2))); emit(dlp, BPF_ALU64_IMM(BPF_AND, BPF_REG_0, ENQUEUE_HEAD)); emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(2), BPF_REG_0)); diff --git a/test/unittest/sched/tst.lv-dequeue.r b/test/unittest/sched/tst.lv-dequeue.r index 7580a8abc..657667f34 100644 --- a/test/unittest/sched/tst.lv-dequeue.r +++ b/test/unittest/sched/tst.lv-dequeue.r @@ -14,5 +14,4 @@ PROBE sched vmlinux dequeue args[0]: lwpsinfo_t * args[1]: psinfo_t * args[2]: cpuinfo_t * - args[3]: int diff --git a/test/unittest/sched/tst.lv-enqueue.r b/test/unittest/sched/tst.lv-enqueue.r index edc5e2c4c..2f5efa30c 100644 --- a/test/unittest/sched/tst.lv-enqueue.r +++ b/test/unittest/sched/tst.lv-enqueue.r @@ -14,4 +14,5 @@ PROBE sched vmlinux enqueue args[0]: lwpsinfo_t * args[1]: psinfo_t * args[2]: cpuinfo_t * + args[3]: int -- 2.43.5 From kris.van.hees at oracle.com Sun Mar 30 21:13:21 2025 From: kris.van.hees at oracle.com (Kris Van Hees) Date: Sun, 30 Mar 2025 17:13:21 -0400 Subject: [DTrace-devel] [PATCH] Fix arg3 for sched enqueue and dequeue probes In-Reply-To: <20250330205736.10970-1-eugene.loh@oracle.com> References: <20250330205736.10970-1-eugene.loh@oracle.com> Message-ID: On Sun, Mar 30, 2025 at 04:57:36PM -0400, eugene.loh at oracle.com wrote: > From: Eugene Loh > > For sched enqueue and dequeue probes, arg3 should be an int > for enqueue and not exist for dequeue. In the code it was > the other way around. Fix this and the associated tests. > > The trampoline was already set up to get this argument correct. 
> > Signed-off-by: Eugene Loh Reviewed-by: Kris Van Hees > --- > libdtrace/dt_prov_sched.c | 3 ++- > test/unittest/sched/tst.lv-dequeue.r | 1 - > test/unittest/sched/tst.lv-enqueue.r | 1 + > 3 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/libdtrace/dt_prov_sched.c b/libdtrace/dt_prov_sched.c > index 125d58917..3a218f3cb 100644 > --- a/libdtrace/dt_prov_sched.c > +++ b/libdtrace/dt_prov_sched.c > @@ -46,10 +46,10 @@ static probe_arg_t probe_args[] = { > { "dequeue", 0, { 0, 0, "struct task_struct *", "lwpsinfo_t *" } }, > { "dequeue", 1, { 0, 0, "struct task_struct *", "psinfo_t *" } }, > { "dequeue", 2, { 1, 0, "cpuinfo_t *", } }, > - { "dequeue", 3, { 2, 0, "int", } }, > { "enqueue", 0, { 0, 0, "struct task_struct *", "lwpsinfo_t *" } }, > { "enqueue", 1, { 0, 0, "struct task_struct *", "psinfo_t *" } }, > { "enqueue", 2, { 1, 0, "cpuinfo_t *", } }, > + { "enqueue", 3, { 2, 0, "int", } }, > { "off-cpu", 0, { 0, 0, "struct task_struct *", "lwpsinfo_t *" } }, > { "off-cpu", 1, { 0, 0, "struct task_struct *", "psinfo_t *" } }, > { "on-cpu", }, > @@ -128,6 +128,7 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl) > * associated with the runqueue. 
> */ > emit(dlp, BPF_STORE_IMM(BPF_DW, BPF_REG_7, DMST_ARG(1), 0)); > + > emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_7, DMST_ARG(2))); > emit(dlp, BPF_ALU64_IMM(BPF_AND, BPF_REG_0, ENQUEUE_HEAD)); > emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(2), BPF_REG_0)); > diff --git a/test/unittest/sched/tst.lv-dequeue.r b/test/unittest/sched/tst.lv-dequeue.r > index 7580a8abc..657667f34 100644 > --- a/test/unittest/sched/tst.lv-dequeue.r > +++ b/test/unittest/sched/tst.lv-dequeue.r > @@ -14,5 +14,4 @@ PROBE sched vmlinux dequeue > args[0]: lwpsinfo_t * > args[1]: psinfo_t * > args[2]: cpuinfo_t * > - args[3]: int > > diff --git a/test/unittest/sched/tst.lv-enqueue.r b/test/unittest/sched/tst.lv-enqueue.r > index edc5e2c4c..2f5efa30c 100644 > --- a/test/unittest/sched/tst.lv-enqueue.r > +++ b/test/unittest/sched/tst.lv-enqueue.r > @@ -14,4 +14,5 @@ PROBE sched vmlinux enqueue > args[0]: lwpsinfo_t * > args[1]: psinfo_t * > args[2]: cpuinfo_t * > + args[3]: int > > -- > 2.43.5 > From eugene.loh at oracle.com Mon Mar 31 21:45:00 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Mon, 31 Mar 2025 17:45:00 -0400 Subject: [DTrace-devel] [PATCH 1/2] Add a cpuinfos BPF map Message-ID: <20250331214501.24126-1-eugene.loh@oracle.com> From: Eugene Loh The cpuinfo BPF map is a per-CPU map that has CPU information on each CPU for that CPU. Add a cpuinfos BPF map that allows any CPU to access information for any other CPU. For now, we retain the older per-CPU map. If desired, a future patch can migrate existing uses of the per-CPU map to the new map, decommissioning the old one. This would include map set up: *) libdtrace/dt_dlibs.c: DT_BPF_SYMBOL(cpuinfo, DT_IDENT_PTR), *) libdtrace/dt_impl.h: int dt_cpumap_fd; *) libdtrace/dt_bpf.c: dtp->dt_cpumap_fd = ... 
libdtrace/dt_bpf.c: CREATE_MAP(cpuinfo) and map use: *) bpf/get_agg.c *) bpf/get_bvar.c *) libdtrace/dt_cg.c *) libdtrace/dt_prov_lockstat.c Signed-off-by: Eugene Loh --- libdtrace/dt_bpf.c | 13 +++++++++++++ libdtrace/dt_dlibs.c | 1 + libdtrace/dt_impl.h | 1 + 3 files changed, 15 insertions(+) diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c index 6d42a96c7..8da51d6b9 100644 --- a/libdtrace/dt_bpf.c +++ b/libdtrace/dt_bpf.c @@ -786,7 +786,20 @@ gmap_create_cpuinfo(dtrace_hdl_t *dtp) if (dtp->dt_cpumap_fd == -1) return -1; + dtp->dt_cpusmap_fd = create_gmap(dtp, "cpuinfos", + BPF_MAP_TYPE_HASH, + sizeof(uint32_t), + sizeof(dt_bpf_cpuinfo_t), ncpus); + if (dtp->dt_cpusmap_fd == -1) + return -1; + rc = dt_bpf_map_update(dtp->dt_cpumap_fd, &key, data); + + for (i = 0, ci = &conf->cpus[0]; i < ncpus && rc != -1; i++, ci++) { + key = ci->cpu_id; + rc = dt_bpf_map_update(dtp->dt_cpusmap_fd, &key, &data[ci->cpu_id]); + } + dt_free(dtp, data); if (rc == -1) return dt_bpf_error(dtp, diff --git a/libdtrace/dt_dlibs.c b/libdtrace/dt_dlibs.c index 21df22a8a..0f19f3566 100644 --- a/libdtrace/dt_dlibs.c +++ b/libdtrace/dt_dlibs.c @@ -61,6 +61,7 @@ static const dt_ident_t dt_bpf_symbols[] = { DT_BPF_SYMBOL(agggen, DT_IDENT_PTR), DT_BPF_SYMBOL(buffers, DT_IDENT_PTR), DT_BPF_SYMBOL(cpuinfo, DT_IDENT_PTR), + DT_BPF_SYMBOL(cpuinfos, DT_IDENT_PTR), DT_BPF_SYMBOL(dvars, DT_IDENT_PTR), DT_BPF_SYMBOL(gvars, DT_IDENT_PTR), DT_BPF_SYMBOL(lvars, DT_IDENT_PTR), diff --git a/libdtrace/dt_impl.h b/libdtrace/dt_impl.h index 68fb8ec53..a5e42801c 100644 --- a/libdtrace/dt_impl.h +++ b/libdtrace/dt_impl.h @@ -390,6 +390,7 @@ struct dtrace_hdl { int dt_aggmap_fd; /* file descriptor for the 'aggs' BPF map */ int dt_genmap_fd; /* file descriptor for the 'agggen' BPF map */ int dt_cpumap_fd; /* file descriptor for the 'cpuinfo' BPF map */ + int dt_cpusmap_fd; /* file descriptor for the 'cpuinfos' BPF map */ int dt_usdt_pridsmap_fd; /* file descriptor for the 'usdt_prids' BPF map */ int 
dt_usdt_namesmap_fd; /* file descriptor for the 'usdt_names' BPF map */ dtrace_handle_err_f *dt_errhdlr; /* error handler, if any */ -- 2.43.5 From eugene.loh at oracle.com Mon Mar 31 21:45:01 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Mon, 31 Mar 2025 17:45:01 -0400 Subject: [DTrace-devel] [PATCH 2/2] Clean up sched provider trampoline FIXMEs In-Reply-To: <20250331214501.24126-1-eugene.loh@oracle.com> References: <20250331214501.24126-1-eugene.loh@oracle.com> Message-ID: <20250331214501.24126-2-eugene.loh@oracle.com> From: Eugene Loh The sched provider trampoline for enqueue and dequeue probes had pending FIXMEs for providing a cpuinfo_t* for the cpu associated with the run queue. Implement the missing code. Signed-off-by: Eugene Loh --- libdtrace/dt_prov_sched.c | 74 +++++++++++++++++++++++++------ test/unittest/sched/tst.enqueue.d | 1 - 2 files changed, 60 insertions(+), 15 deletions(-) diff --git a/libdtrace/dt_prov_sched.c b/libdtrace/dt_prov_sched.c index 3a218f3cb..d626b27be 100644 --- a/libdtrace/dt_prov_sched.c +++ b/libdtrace/dt_prov_sched.c @@ -84,6 +84,40 @@ static int populate(dtrace_hdl_t *dtp) probe_args, probes); } +/* + * Get a pointer to the cpuinfo_t structure for the CPU associated + * with the runqueue that is in arg0. + * + * Clobbers %r1 through %r5 + * Stores pointer to cpuinfo_t struct in %r0 + */ +static void get_cpuinfo(dtrace_hdl_t *dtp, dt_irlist_t *dlp, uint_t exitlbl) +{ + dt_ident_t *idp = dt_dlib_get_map(dtp, "cpuinfos"); + + assert(idp != NULL); + + /* Put the runqueue pointer from mst->arg0 into %r3. */ + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_3, BPF_REG_7, DMST_ARG(0))); + + /* Turn it into a pointer to its cpu member. 
*/ + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, dt_cg_ctf_offsetof("struct rq", "cpu", NULL, 1))); + + /* Call bpf_probe_read_kernel(%fp + DT_TRAMP_SP_SLOT[0], sizeof(int), %r3) */ + emit(dlp, BPF_MOV_IMM(BPF_REG_2, (int) sizeof(int))); + emit(dlp, BPF_MOV_REG(BPF_REG_1, BPF_REG_FP)); + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, DT_TRAMP_SP_SLOT(0))); + emit(dlp, BPF_CALL_HELPER(BPF_FUNC_probe_read_kernel)); + emit(dlp, BPF_BRANCH_IMM(BPF_JNE, BPF_REG_0, 0, exitlbl)); + + /* Now look up the corresponding cpuinfo_t. */ + dt_cg_xsetx(dlp, idp, DT_LBL_NONE, BPF_REG_1, idp->di_id); + emit(dlp, BPF_MOV_REG(BPF_REG_2, BPF_REG_FP)); + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, DT_TRAMP_SP_SLOT(0))); + emit(dlp, BPF_CALL_HELPER(BPF_FUNC_map_lookup_elem)); + emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, exitlbl)); +} + /* * Generate a BPF trampoline for a SDT probe. * @@ -98,18 +132,39 @@ static int populate(dtrace_hdl_t *dtp) */ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl) { + dtrace_hdl_t *dtp = pcb->pcb_hdl; dt_irlist_t *dlp = &pcb->pcb_ir; dt_probe_t *prp = pcb->pcb_probe; if (strcmp(prp->desc->prb, "dequeue") == 0) { - emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_7, DMST_ARG(1))); - emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_0)); /* - * FIXME: arg1 should be a pointer to cpuinfo_t for the CPU - * associated with the runqueue. + * Get the runqueue from arg0 and place a cpuinfo_t* into %r0. + */ + get_cpuinfo(dtp, dlp, exitlbl); + + /* + * Copy arg1 into arg0. */ - emit(dlp, BPF_STORE_IMM(BPF_DW, BPF_REG_7, DMST_ARG(1), 0)); + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_3, BPF_REG_7, DMST_ARG(1))); + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_3)); + + /* Store the cpuinfo_t* in %r0 into arg1. */ + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(1), BPF_REG_0)); } else if (strcmp(prp->desc->prb, "enqueue") == 0) { + /* + * Get the runqueue from arg0 and place a cpuinfo_t* into %r0. 
+ */ + get_cpuinfo(dtp, dlp, exitlbl); + + /* + * Copy arg1 into arg0. + */ + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_3, BPF_REG_7, DMST_ARG(1))); + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_3)); + + /* Store the cpuinfo_t* in %r0 into arg1. */ + emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(1), BPF_REG_0)); + /* * This is ugly but necessary... enqueue_task() takes a flags argument and the * ENQUEUE_HEAD flag is used to indicate that the task is to be placed at the @@ -120,15 +175,6 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl) * outside the kernel source tree. */ #define ENQUEUE_HEAD 0x10 - - emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_7, DMST_ARG(1))); - emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_0)); - /* - * FIXME: arg1 should be a pointer to cpuinfo_t for the CPU - * associated with the runqueue. - */ - emit(dlp, BPF_STORE_IMM(BPF_DW, BPF_REG_7, DMST_ARG(1), 0)); - emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_7, DMST_ARG(2))); emit(dlp, BPF_ALU64_IMM(BPF_AND, BPF_REG_0, ENQUEUE_HEAD)); emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(2), BPF_REG_0)); diff --git a/test/unittest/sched/tst.enqueue.d b/test/unittest/sched/tst.enqueue.d index f445ac843..28dcace8c 100644 --- a/test/unittest/sched/tst.enqueue.d +++ b/test/unittest/sched/tst.enqueue.d @@ -4,7 +4,6 @@ * Licensed under the Universal Permissive License v 1.0 as shown at * http://oss.oracle.com/licenses/upl. */ -/* @@xfail: dtv2 */ #pragma D option switchrate=100hz #pragma D option destructive -- 2.43.5 From eugene.loh at oracle.com Mon Mar 31 21:46:40 2025 From: eugene.loh at oracle.com (eugene.loh at oracle.com) Date: Mon, 31 Mar 2025 17:46:40 -0400 Subject: [DTrace-devel] [PATCH v2 2/2] Clean up sched provider trampoline FIXMEs Message-ID: <20250331214640.24230-1-eugene.loh@oracle.com> From: Eugene Loh The sched provider trampoline for enqueue and dequeue probes had pending FIXMEs for providing a cpuinfo_t* for the cpu associated with the run queue. 
Implement the missing code. Signed-off-by: Eugene Loh --- libdtrace/dt_prov_sched.c | 74 +++++++++++++++++++++++++------ test/unittest/sched/tst.enqueue.d | 1 - 2 files changed, 60 insertions(+), 15 deletions(-) diff --git a/libdtrace/dt_prov_sched.c b/libdtrace/dt_prov_sched.c index 3a218f3cb..8b9bf4a70 100644 --- a/libdtrace/dt_prov_sched.c +++ b/libdtrace/dt_prov_sched.c @@ -84,6 +84,40 @@ static int populate(dtrace_hdl_t *dtp) probe_args, probes); } +/* + * Get a pointer to the cpuinfo_t structure for the CPU associated + * with the runqueue that is in arg0. + * + * Clobbers %r1 through %r5 + * Stores pointer to cpuinfo_t struct in %r0 + */ +static void get_cpuinfo(dtrace_hdl_t *dtp, dt_irlist_t *dlp, uint_t exitlbl) +{ + dt_ident_t *idp = dt_dlib_get_map(dtp, "cpuinfos"); + + assert(idp != NULL); + + /* Put the runqueue pointer from mst->arg0 into %r3. */ + emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_3, BPF_REG_7, DMST_ARG(0))); + + /* Turn it into a pointer to its cpu member. */ + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, dt_cg_ctf_offsetof("struct rq", "cpu", NULL, 1))); + + /* Call bpf_probe_read_kernel(%fp + DT_TRAMP_SP_SLOT[0], sizeof(int), %r3) */ + emit(dlp, BPF_MOV_IMM(BPF_REG_2, (int) sizeof(int))); + emit(dlp, BPF_MOV_REG(BPF_REG_1, BPF_REG_FP)); + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, DT_TRAMP_SP_SLOT(0))); + emit(dlp, BPF_CALL_HELPER(BPF_FUNC_probe_read_kernel)); + emit(dlp, BPF_BRANCH_IMM(BPF_JNE, BPF_REG_0, 0, exitlbl)); + + /* Now look up the corresponding cpuinfo_t. */ + dt_cg_xsetx(dlp, idp, DT_LBL_NONE, BPF_REG_1, idp->di_id); + emit(dlp, BPF_MOV_REG(BPF_REG_2, BPF_REG_FP)); + emit(dlp, BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, DT_TRAMP_SP_SLOT(0))); + emit(dlp, BPF_CALL_HELPER(BPF_FUNC_map_lookup_elem)); + emit(dlp, BPF_BRANCH_IMM(BPF_JEQ, BPF_REG_0, 0, exitlbl)); +} + /* * Generate a BPF trampoline for a SDT probe. 
  *
@@ -98,18 +132,39 @@ static int populate(dtrace_hdl_t *dtp)
  */
 static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
 {
+	dtrace_hdl_t	*dtp = pcb->pcb_hdl;
 	dt_irlist_t	*dlp = &pcb->pcb_ir;
 	dt_probe_t	*prp = pcb->pcb_probe;
 
 	if (strcmp(prp->desc->prb, "dequeue") == 0) {
-		emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_7, DMST_ARG(1)));
-		emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_0));
 		/*
-		 * FIXME: arg1 should be a pointer to cpuinfo_t for the CPU
-		 * associated with the runqueue.
+		 * Get the runqueue from arg0 and place its cpuinfo_t* into %r0.
+		 */
+		get_cpuinfo(dtp, dlp, exitlbl);
+
+		/*
+		 * Copy arg1 into arg0.
 		 */
-		emit(dlp, BPF_STORE_IMM(BPF_DW, BPF_REG_7, DMST_ARG(1), 0));
+		emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_3, BPF_REG_7, DMST_ARG(1)));
+		emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_3));
+
+		/* Store the cpuinfo_t* in %r0 into arg1. */
+		emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(1), BPF_REG_0));
 	} else if (strcmp(prp->desc->prb, "enqueue") == 0) {
+		/*
+		 * Get the runqueue from arg0 and place its cpuinfo_t* into %r0.
+		 */
+		get_cpuinfo(dtp, dlp, exitlbl);
+
+		/*
+		 * Copy arg1 into arg0.
+		 */
+		emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_3, BPF_REG_7, DMST_ARG(1)));
+		emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_3));
+
+		/* Store the cpuinfo_t* in %r0 into arg1. */
+		emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(1), BPF_REG_0));
+
 /*
  * This is ugly but necessary... enqueue_task() takes a flags argument and the
  * ENQUEUE_HEAD flag is used to indicate that the task is to be placed at the
@@ -120,15 +175,6 @@ static int trampoline(dt_pcb_t *pcb, uint_t exitlbl)
  * outside the kernel source tree.
  */
 #define ENQUEUE_HEAD	0x10
-
-		emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_7, DMST_ARG(1)));
-		emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(0), BPF_REG_0));
-		/*
-		 * FIXME: arg1 should be a pointer to cpuinfo_t for the CPU
-		 * associated with the runqueue.
-		 */
-		emit(dlp, BPF_STORE_IMM(BPF_DW, BPF_REG_7, DMST_ARG(1), 0));
-
 		emit(dlp, BPF_LOAD(BPF_DW, BPF_REG_0, BPF_REG_7, DMST_ARG(2)));
 		emit(dlp, BPF_ALU64_IMM(BPF_AND, BPF_REG_0, ENQUEUE_HEAD));
 		emit(dlp, BPF_STORE(BPF_DW, BPF_REG_7, DMST_ARG(2), BPF_REG_0));
diff --git a/test/unittest/sched/tst.enqueue.d b/test/unittest/sched/tst.enqueue.d
index f445ac843..28dcace8c 100644
--- a/test/unittest/sched/tst.enqueue.d
+++ b/test/unittest/sched/tst.enqueue.d
@@ -4,7 +4,6 @@
  * Licensed under the Universal Permissive License v 1.0 as shown at
  * http://oss.oracle.com/licenses/upl.
  */
-/* @@xfail: dtv2 */
 
 #pragma D option switchrate=100hz
 #pragma D option destructive
-- 
2.43.5
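
For readers who find the emitted BPF hard to follow, here is a rough plain-C model of what the enqueue branch of the trampoline appears to do at run time: read the cpu member of the runqueue in arg0, look up the matching cpuinfo_t, then shuffle the arguments. This is only a sketch under stated assumptions. The names rq_model, cpuinfo_model_t, cpuinfos_model, lookup_cpuinfo, and remap_enqueue_args are hypothetical stand-ins and are not part of libdtrace. An array stands in for the "cpuinfos" BPF map, a NULL check stands in for a failing bpf_probe_read_kernel(), and returning -1 stands in for the jump to exitlbl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NCPUS_MODEL	4	/* hypothetical size of the modeled map */
#define ENQUEUE_HEAD	0x10	/* same value as in the patch */

/* Hypothetical stand-in for the kernel's struct rq; only cpu matters. */
struct rq_model {
	int	nr_running;
	int	cpu;
};

/* Hypothetical stand-in for cpuinfo_t. */
typedef struct {
	int	cpu_id;
} cpuinfo_model_t;

/* Array standing in for the "cpuinfos" BPF map, keyed by CPU number. */
static cpuinfo_model_t cpuinfos_model[NCPUS_MODEL] = {
	{ 0 }, { 1 }, { 2 }, { 3 }
};

/*
 * Model of get_cpuinfo(): fetch rq->cpu, then look it up in the map.
 * Returns NULL where the BPF code would branch to exitlbl.
 */
static cpuinfo_model_t *lookup_cpuinfo(const struct rq_model *rq)
{
	int	cpu;

	if (rq == NULL)			/* models a failed probe read */
		return NULL;

	cpu = rq->cpu;
	if (cpu < 0 || cpu >= NCPUS_MODEL)	/* models a failed map lookup */
		return NULL;

	return &cpuinfos_model[cpu];
}

/*
 * Model of the enqueue branch: arg0 = old arg1, arg1 = cpuinfo pointer,
 * arg2 = arg2 & ENQUEUE_HEAD.  Returns 0 on success, -1 for "exit".
 * The args are left untouched on failure.
 */
static int remap_enqueue_args(uint64_t arg[3])
{
	cpuinfo_model_t	*ci;

	assert(arg != NULL);

	ci = lookup_cpuinfo((const struct rq_model *)(uintptr_t)arg[0]);
	if (ci == NULL)
		return -1;

	arg[0] = arg[1];
	arg[1] = (uint64_t)(uintptr_t)ci;
	arg[2] &= ENQUEUE_HEAD;

	return 0;
}
```

The ordering mirrors the patch: the cpuinfo lookup happens while arg0 still holds the runqueue pointer, and only afterwards is arg1 copied over arg0. The dequeue branch would be the same minus the arg2 masking.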