[DTrace-devel] [PATCH 4/5] module: script to generate offset ranges for builtin modules

Tue Oct 31 14:55:40 UTC 2023

On 30 Oct 2023, Kris Van Hees via DTrace-devel told this:

> The offset range data for builtin modules is generated using:
>  - modules.builtin.objs: associates object files with module names
>  - vmlinux.o: provides load order of sections and offset of first member per
>     section
>  - vmlinux.o.map: provides offset of object file content per section
>
> The generated data will look like:
>
> .text 00000000-00000000 = _text
> .text 0000baf0-0000cb10 amd_uncore
> .text 0009bd10-0009c8e0 iosf_mbi
> ...
> .text 008e6660-008e9630 snd_soc_wcd_mbhc
> .text 008e9630-008ea610 snd_soc_wcd9335 snd_soc_wcd934x snd_soc_wcd938x
> .text 008ea610-008ea780 snd_soc_wcd9335
> ...
> .data 00000000-00000000 = _sdata
> .data 0000f020-0000f680 amd_uncore
>
> For each ELF section, it lists the offset of the first symbol.  This can
> be used to deteermine the base address of the section at runtime.

Typo: "deteermine" should be "determine".

> Next, it lists (in strict ascending order) offset ranges in that section
> that cover the symbols of one or more builtin modules.  Multiple ranges
> can apply to a single module, and ranges can be shared between modules.

Looks good!
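
To check my reading of the format, here is a minimal Python sketch of how
a consumer might parse these lines and map a section offset back to module
names. It is not part of the patch: parse_ranges() and modules_at() are
hypothetical names, and the sketch assumes each range is half-open (start
inclusive, end exclusive), which the adjacent 008e9630 boundaries above
suggest but the description does not state outright.

```python
import bisect

# Sample lines copied from the description above (the "..." lines dropped).
SAMPLE = """\
.text 00000000-00000000 = _text
.text 0000baf0-0000cb10 amd_uncore
.text 0009bd10-0009c8e0 iosf_mbi
.text 008e6660-008e9630 snd_soc_wcd_mbhc
.text 008e9630-008ea610 snd_soc_wcd9335 snd_soc_wcd934x snd_soc_wcd938x
.text 008ea610-008ea780 snd_soc_wcd9335
.data 00000000-00000000 = _sdata
.data 0000f020-0000f680 amd_uncore
"""


def parse_ranges(text):
    """Return (anchors, ranges) keyed by section name."""
    anchors = {}  # section -> (anchor symbol, offset of that symbol)
    ranges = {}   # section -> [(start, end, [module names])], in file order
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or elided line
        section = fields[0]
        start, end = (int(part, 16) for part in fields[1].split("-"))
        if fields[2] == "=":
            anchors[section] = (fields[3], start)
        else:
            ranges.setdefault(section, []).append((start, end, fields[2:]))
    return anchors, ranges


def modules_at(ranges, section, offset):
    """Modules owning `offset` in `section` (assumes ascending, half-open ranges)."""
    entries = ranges.get(section, [])
    starts = [start for start, _end, _mods in entries]
    index = bisect.bisect_right(starts, offset) - 1
    if index >= 0 and entries[index][0] <= offset < entries[index][1]:
        return entries[index][2]
    return []


anchors, ranges = parse_ranges(SAMPLE)
print(anchors[".text"])
print(modules_at(ranges, ".text", 0x8e9700))
```

The binary search relies on the "strict ascending order" guarantee; if that
ever stopped holding, the ranges would need sorting first.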

> Signed-off-by: Kris Van Hees <kris.van.hees at oracle.com>

Reviewed-by: Nick Alcock <nick.alcock at oracle.com>

I'm just going to *assume* this works.

> ---
>  scripts/generate_builtin_ranges.awk | 149 ++++++++++++++++++++++++++++

All that C code replaced with one simple awk script! I am humbled :)

>  1 file changed, 149 insertions(+)
>  create mode 100755 scripts/generate_builtin_ranges.awk
>
> diff --git a/scripts/generate_builtin_ranges.awk b/scripts/generate_builtin_ranges.awk
> new file mode 100755
> index 000000000000..d5d668c97bd7
> --- /dev/null
> +++ b/scripts/generate_builtin_ranges.awk
> @@ -0,0 +1,149 @@
> +#!/usr/bin/gawk -f
> +
> +FNR == 1 {
> +	FC++;
> +}

... I spent some time looking for where you switch from one file to the
next, until I noticed that this FNR == 1 check tracks it for you: awk
resets FNR at the start of each input file, so no explicit file-switching
code is needed.
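
For readers more used to Python than awk, the same per-file counter idiom
can be sketched with the standard fileinput module, where isfirstline()
plays the role of FNR == 1. This is only an illustration (lines_per_file()
is a made-up name). One property it shares with the awk version is that an
empty input file never yields a first line, so the counter would not
advance for it.

```python
import fileinput
import os
import tempfile


def lines_per_file(paths):
    """Count lines per input file, numbering files 1, 2, ... as FC does."""
    file_count = 0
    counts = {}
    with fileinput.input(files=paths) as stream:
        for _line in stream:
            if stream.isfirstline():  # analogue of awk's FNR == 1
                file_count += 1
            counts[file_count] = counts.get(file_count, 0) + 1
    return counts


# Demo with two throwaway files standing in for the real inputs.
with tempfile.TemporaryDirectory() as tmp:
    demo_paths = []
    for name, text in (("first", "a\nb\n"), ("second", "c\n")):
        path = os.path.join(tmp, name)
        with open(path, "w") as handle:
            handle.write(text)
        demo_paths.append(path)
    demo_counts = lines_per_file(demo_paths)

print(demo_counts)
```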

> +# (1) Build a mapping to associate object files with built-in module names.
> +#
> +# The first file argument is used as input (modules.builtin.objs).
> +#
> +FC == 1 {
> +	sub(/:/, "");
> +	mod = $1;
> +	sub(/([^/]*\/)+/, "", mod);
> +	sub(/\.o$/, "", mod);
> +	gsub(/-/, "_", mod);

Silly question: does this work with clang's mapfile format? I guess not,
but then I never tried to get the old kallmodsyms's approach to work
with that either. (I presume it *does* work with the CONFIG_X86_IBT
approach of linking vmlinux.o, stuffing that into a vmlinux.a and then
linking that into vmlinux, because that's the default these days. This
is where the section-relative offset stuff comes in handy, after all.)
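
Going back to the name handling in block (1) for a moment: as I read it,
the four substitutions amount to the following Python sketch. The function
name and the example line are hypothetical (I am assuming a
"path/to/module.o: member objects..." layout for modules.builtin.objs),
so treat this as my understanding rather than a spec.

```python
import re


def builtin_module_name(line):
    """Mirror the awk steps: drop ':', strip directories, strip '.o', '-' -> '_'."""
    target = line.replace(":", "", 1).split()[0]       # sub(/:/, ""); mod = $1
    name = re.sub(r"([^/]*/)+", "", target, count=1)   # strip leading directories
    name = re.sub(r"\.o$", "", name)                   # strip the .o suffix
    return name.replace("-", "_")                      # gsub(/-/, "_", mod)


# Hypothetical input line, not taken from a real build.
EXAMPLE_LINE = "drivers/example/foo-bar.o: drivers/example/foo.o drivers/example/bar.o"
print(builtin_module_name(EXAMPLE_LINE))
```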

> +# (2) Determine the load address for each section.
> +#
> +# The second file argument is used as input (vmlinux.map).
> +# Since some AWK implementations cannot handle large integers, we strip of the
> +# first 4 hex digits from the address.  This is safe because the kernel space
> +# is not large enough for addresses to extend into those digits.

... I suppose that's less awful than having a dedicated C program to do
the arithmetic. These are all zero-relative .o file addresses anyway, so
we don't need to worry about platforms like SPARC64 where the kernel is
stored in high memory and all the addresses are way up there. (I don't
think any other platform is likely to do this, given how many interfaces
it silently broke when applied to things like /proc/*/mem. Minor things
like pread().)

For extra safety you could always strip off the first four digits and
then stuff them back on afterwards -- that would be safe unless the
kernel was so big that it moved from one such range to another (and you
could easily check that by stripping off the first four digits from the
first and last such line and checking that they were identical). But I'm
probably overdesigning this as usual :)
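
In case it helps, this is the shape of what I mean, sketched in Python
purely to show the logic (Python integers are arbitrary precision, so it
has no need for the trick itself). All names and addresses here are made
up. The remaining 12 hex digits are 48 bits, which fits comfortably in the
doubles that awk implementations typically use for numbers.

```python
def split_address(hex_addr, prefix_digits=4):
    """Split '0x' + 16 hex digits into (leading digits as text, remainder as int)."""
    digits = hex_addr.lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    return digits[:prefix_digits], int(digits[prefix_digits:], 16)


def common_prefix(hex_addrs):
    """Return the shared leading digits, or fail if the addresses straddle two ranges."""
    prefixes = {split_address(addr)[0] for addr in hex_addrs}
    if len(prefixes) != 1:
        raise ValueError("addresses span more than one prefix: %s" % sorted(prefixes))
    return prefixes.pop()


def rejoin(prefix, low, low_digits=12):
    """Stick the stripped digits back in front of the zero-padded remainder."""
    return f"{prefix}{low:0{low_digits}x}"


# Hypothetical first and last addresses from a map file.
FIRST = "0xffffffff81000000"
LAST = "0xffffffff83000000"

prefix = common_prefix([FIRST, LAST])
low = split_address(FIRST)[1]
print(rejoin(prefix, low + 0x1000))
```

Checking only the first and last line is enough as long as the map is
sorted by address, which is the case I had in mind.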

> +# (3) We need to determine the base address of the section so that ranges can
> +# be expressed based on offsets from the base address.  This accomodates the
> +# kernel sections getting loaded at different addresses than what is recorded
> +# in vmlinux.map.
> +#
> +# At runtime, we will need to determine the base address of each section we are
> +# interested in.  We do that by recording the offset of the first symbol in the
> +# section.  Once we know the address of this symbol in the running kernel, we
> +# can calculate the base address of the section.
> +#
> +# If possible, we use an explicit anchor symbol (sym = .) listed at the base
> +# address (offset 0).
> +#
> +# If there is no such symbol, we record the first symbol in the section along
> +# with its offset.
> +#
> +# We also determine the offset of the first member in the section in case the
> +# final linking inserts some content between the start of the section and the
> +# first member.  I.e. in that case, vmlinux.map will list the first member at
> +# a non-zero offset whereas vmlinux.o.map will list it at offset 0.  We record
> +# the addend so we can apply it when processing vmlinux.o.map (next).

*Nice* comments. And likely more robust than what I was doing (I don't
think I ever considered the last case).

The code seems to match it, and that seems to make sense to me :)
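
For my own benefit, the arithmetic those comments describe boils down to
something like the Python sketch below. The function names and every
address in it are invented for illustration. The only things taken from
the patch are the ideas of an anchor symbol with a recorded offset and an
addend for content inserted ahead of the first member.

```python
def section_base(runtime_anchor_addr, anchor_offset):
    """Runtime base of a section, from the anchor symbol's runtime address."""
    return runtime_anchor_addr - anchor_offset


def first_member_addend(linked_section_addr, linked_first_member_addr):
    """Bytes the final link put between the section start and its first member."""
    return linked_first_member_addr - linked_section_addr


def section_offset(obj_map_offset, addend):
    """Convert an offset seen in vmlinux.o.map into a final section offset."""
    return obj_map_offset + addend


# Invented numbers: section start and first member as listed in vmlinux.map,
# and the address the anchor symbol ends up at in the running kernel.
LINKED_TEXT = 0xFFFFFFFF81000000
LINKED_FIRST_MEMBER = 0xFFFFFFFF81000040
RUNTIME_TEXT = 0xFFFFFFFF9A000000

addend = first_member_addend(LINKED_TEXT, LINKED_FIRST_MEMBER)
base = section_base(RUNTIME_TEXT, 0)  # anchor recorded at offset 0
runtime_addr = base + section_offset(0xBAF0, addend)
print(hex(addend), hex(runtime_addr))
```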

-- 
NULL && (void)