[DTrace-devel] [PATCH 7/7] module: support kallsyms with modules.builtin.ranges

Nick Alcock nick.alcock at oracle.com
Fri Nov 3 16:22:16 UTC 2023


On 31 Oct 2023, Kris Van Hees via DTrace-devel uttered the following:

> Extend the support for different forms of symbol-to-address information
> with nodule association.  We now support three different variaties:
>  - kallmodsyms
>  - kallsyms with modules.builtin.ranges
>  - kallsyms w/o modules.builtin.ranges

Thorough! :)

> The modules.builtin.ranges file contains data about address ranges that
> contain symbols for one or more built-in modules.  Since the load
> address of each kernel section is not known at kernel buildtime, this
> file stores offset ranges relative to the load address of the section.
>
> The file will look like this:
>
> .text 00000000-00000000 = _text
> .text 0000baf0-0000cb10 amd_uncore
> .text 0009bd10-0009c8e0 iosf_mbi
> ...
> .text 008e6660-008e9630 snd_soc_wcd_mbhc
> .text 008e9630-008ea610 snd_soc_wcd9335 snd_soc_wcd934x snd_soc_wcd938x
> .text 008ea610-008ea780 snd_soc_wcd9335
> ...
> .data 00000000-00000000 = _sdata
> .data 0000f020-0000f680 amd_uncore
>
> For each ELF section, it lists the offset of the first symbol.  This can
> be used to deteermine the base address of the section at runtime.
>
> Next, it lists (in strict ascending order) offset ranges in that section
> that cover the symbols of one or more builtin modules.  Multiple ranges
> can apply to a single module, and ranges can be shared between modules.
>
> When processing /proc/kallsyms with modules.builtin.ranges data, symbols
> are annotated (as applicable) with a built-in module name.  If a symbol
> belongs to multiple modules, multiple copies of the symbol will be added
> with the same address but distinct module names.  This means that the
> symbol will be found with modname`symname lookups, but reverse lookup on
> the address will always give the first one found.  This is consistent
> with existing behaviour in DTrace when two symbols refer to the same
> address.

That's much better than what we were doing before, so good!

> Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>

Really quite hard to review due to a lot of code change. I respan the
diff below with -c diff.algorithm=histogram, which while a bit tangled
is much clearer.

A couple of tiny typo niggles, but other than that

Reviewed-by: Nick Alcock <nick.alcock at oracle.com>

and all very nice!

> diff --git a/libdtrace/dt_module.c b/libdtrace/dt_module.c
> index 5d67bdc6a24d5..af916ecdc9646 100644
> --- a/libdtrace/dt_module.c
> +++ b/libdtrace/dt_module.c
> @@ -1010,10 +1010,23 @@ dt_kern_module_find_ctf(dtrace_hdl_t *dtp, dt_module_t *dmp)
>  	}
>  }
>  
> +/*
> + * Symbol data can be collected in three ways:
> + *  - kallmodsyms
> + *  - kallsyms with modules.builtin.ranges
> + *  - kallsyms w/o modules.builtin.ranges
> + *
> + * The processing state uses a kind member to identify the primary source
> + * (kallsyms vs kallmodsyms), and the rfp member is used to determine whether
> + * a mdules.builtin.ranges file is available.
> + */
> +#define DT_MODSYM_KALLSYMS	1
> +#define DT_MODSYM_KALLMODSYMS	2

mdules!

>  /*
>   * We will use kernel_flag to track which symbols we are reading.
>   *
> - * /proc/kallmodsyms starts with kernel (and built-in-module) symbols.
> + * /proc/kall[mod]syms starts with kernel (and built-in-module) symbols.
>   *
>   * The last kernel address is expected to have the name "_end",
>   * but there might also be a symbol "__brk_limit" with that address.
> @@ -1031,123 +1044,123 @@ dt_kern_module_find_ctf(dtrace_hdl_t *dtp, dt_module_t *dmp)
>  #define KERNEL_FLAG_KERNEL_END 1
>  #define KERNEL_FLAG_LOADABLE 2
>  #define KERNEL_FLAG_INIT_SCRATCH 4
> -/*
> - * Update our module cache.  For each line, create or
> - * populate the dt_module_t for this module (if necessary), extend its address
> - * ranges as needed, and add the symbol in this line to the module's kernel
> - * symbol table.
> - *
> - * If we return non-NULL, we might have a changing file, probably due
> - * to module unloading during read.  Perhaps this case should trigger a retry.
> - */
> -static int
> -dt_modsym_update(dtrace_hdl_t *dtp, const char *line, int flag)
> +
> +typedef struct dt_kasstate {

Oh reifying this as a structure rather than a giant pile of locals makes
everything so much clearer! ACK.

> +static dt_kasstate_t *
> +dt_kasstate_new(dtrace_hdl_t *dtp, int kind)
>  {
> -	static uint_t kernel_flag = 0;
> -	static dt_module_t *last_dmp = NULL;
> -	static int last_sym_text = -1;
> +	char		*fn;
> +	dt_kasstate_t	*state;
>  
> -	GElf_Addr sym_addr;
> -	long long unsigned sym_size = 1;
> -	char sym_type;
> -	int sym_text;
> -	dt_module_t *dmp;
> -	dtrace_addr_range_t *range = NULL;
> -	char sym_name[KSYM_NAME_MAX];
> -	char mod_name[PATH_MAX] = "vmlinux]";	/* note trailing ] */
> -	int skip = 0;
> +	state = dt_zalloc(dtp, sizeof(dt_kasstate_t));
> +	state->kind = kind;
> +	state->kernel_flag = 0;
> +	state->last_dmp = NULL;
> +	state->last_sym_text = -1;
> +
> +	if (kind == DT_MODSYM_KALLMODSYMS)
> +		return state;
> +
> +	if (asprintf(&fn, "%s/modules.builtin.ranges",
> +		     dtp->dt_module_path) == -1)
> +		return state;
> +
> +	state->rfp = fopen(fn, "r");
> +	free(fn);
> +
> +	if (state->rfp == NULL) {
> +		/* TODO: waiting on a warning infrastructure */

... we really should do something about that. A dt_warning() that emits
to stderr unless explicitly turned off with a new option shouldn't be
that hard!

>  	/*
> -	 * Read symbol.
> +	 * If we are looking for an anchor symbol, see if the current symbol is
> +	 * the one we want.  If so, update the section base address and clear
> +	 * the symbol so we don't look for it again.  If not, we are done
> +	 * because the current address cannot be part of the last module we saw
> +	 * once we are looking for the next section anchor.

Ack. If the next symbol does come from the same module (which it might,
by chance, though it's unlikely), the name will recur.

> +	/*
> +	 * If the given address is beyond the end of the current builtin module
> +	 * more data needs to be read from modules.builtin.ranges until a range
> +	 * is found that either contains the address or goes beyond.
> +	 */
> +	while (addr > state->mod_eaddr) {
> +		if (getline(&line, &line_n, state->rfp) == -1)
> +			break;
> +
> +		if (sscanf(line, "%*s %lx-%lx %[A-Za-z0-9_ ]",
> +			   &state->mod_saddr, &state->mod_eaddr,
> +			   state->mod_name) == 3) {
> +			state->mod_saddr += state->sect_base;
> +			state->mod_eaddr += state->sect_base;
> +			state->mod_name[strlen(state->mod_name)] = '\0';

Technically this is a security hole (nothing technically limits module
names to PATH_MAX, though in practice they will not be larger than the
page size: on Linux, pathnames can be up to a page in length, regardless
of the basically fictional value of PATH_MAX), but it was before.  We
can change that to use %ms in a later patch :)

> +		} else {
> +			if (sscanf(line, "%*s %lx-%*x = %[A-Za-z0-9_]",

(ditto)

> +typedef struct dt_kallsym {
> +	GElf_Addr	addr;
> +	size_t		size;
> +	char		type;
> +	char		name[KSYM_NAME_MAX];
> +	char		mod[PATH_MAX];
> +} dt_kallsym_t;

(ditto).

Looks like these are only used temporarily, and not stored in large
numbers, so the space used by the mod field is not troublesome. (If it
was stored that way, the space wastage would be horrific! but it's not.)

> +	if (strcmp(sym.mod, "bpf") == 0)
> +		return 0;

I know this isn't new, but I've never known why it was there...

> +	/*
> +	 * Special case: rename the 'ctf' module to 'shared_ctf': the
> +	 * parent-name lookup code presumes that names that appear in CTF's
> +	 * parent section are the names of modules, but the ctf module's CTF
> +	 * section is special-cased to contain the contents of the shared_ctf
> +	 * repository, not ctf.ko's types.
> +	 */
> +	if (strcmp(sym.mod, "ctf") == 0)
> +		strcpy(sym.mod, "shared_ctf");

Side note: this module no longer exists. I should have taken this out
when support for modules in the core kernel in a fake ctf.ko module was
removed. One for a later commit, I think.

> +	/*
> +	 * It is possible that the symbol belongs to more than one module.
> +	 * Loop over all modules in sym.mod and add the symbol to each.
> +	 */

The other case we should deal with (again, in a later commit) is a
symbol appearing repeatedly with multiple addresses, possibly in
different subsets of modules each time.

> +	for (modname = strtok(sym.mod, " "); modname;
> +	     modname = strtok(NULL, " ")) {
> +		dt_module_t	*dmp;
> +		int		err;
> +
> +		dmp = dt_module_lookup_by_name(dtp, modname);
> +		if (dmp == NULL) {
> +			dmp = dt_module_create(dtp, modname);
> +			if (dmp == NULL)
> +				return EDT_NOMEM;
> +
> +			err = dt_kern_module_init(dtp, dmp);
> +			if (err != 0)
> +				return err;
> +		}
> +
> +		err = dt_modsym_addsym(dtp, dmp, &sym, state);
> +		if (err != 0)
> +			return err;

Very nice, compared to how much work it is to do this efficiently
in-kernel :)



More information about the DTrace-devel mailing list