[DTrace-devel] changes needed for PIE executables

Kris Van Hees kris.van.hees at oracle.com
Fri Nov 22 11:43:58 PST 2019


On Thu, Nov 21, 2019 at 07:23:08PM -0500, Kris Van Hees wrote:
> On Thu, Nov 21, 2019 at 11:19:45PM +0000, Nick Alcock wrote:
> > On 21 Nov 2019, David Mc Lean verbalised:
> > 
> > > The question now is how do we adapt to this new file format when running
> > > our USDT?
> > 
> > This led me to do a bunch of digging around in code I have never really
> > had reason to look at before in libdtrace/dt_link.c, in particular
> > prepare_elf64() (and, I suppose, prepare_elf32(), though that is more or
> > less legacy now). So this is mostly speculation on my part, but at least
> > vaguely semi-half-informed speculation, honest. However, experimentation
> > is needed to see how much of what I wrote below is accidental fiction.
> > 
> > When a final link is done with dtrace -G, dump_elf64() is eventually
> > called. This calls prepare_elf64() to compute relocations and then
> > writes them out in a normal SHT_RELA reloc+addend relocation table.
> > prepare_elf64() derives the relocations to each usdt probe from the
> > relocs in the DOF (DOF section DOF_SECT_URELHDR), and translates them
> > into suitable relocs so that ld.so will fix up the DOF for us when the
> > program is loaded. (The only alternative to that would be doing the same
> > work ld.so is doing to fix up relocs *ourselves*, which is obviously
> > pointless if ld.so can do it.)
> 
> Hm, no, the point at which we are creating the relocations is when we generate
> the provider .o ELF object that holds the DOF that describes the probes in the
> executable (or shared library).  So, the relocations that are created in
> prepare_elf64() are based on relocations to (fake) probe functions (with a very
> specific name that we can recognize).  Those relocations identify the probe
> locations in the code.  The DOF contains information about the offset of probes
> in each function that contains probes, but since we won't know the address (or
> offset within a section/segment) until the final linking of the executable we
> need to generate relocations for references to the functions for which we have
> probes.  The relocations point to fields in the DOF, and exist so that the
> linker will fill in the correct information for those functions (symbols) in
> the DOF.
> > 
> > Right now, on x86_64, we are writing out relocs of type
> > R_X86_64_GLOB_DAT:
> > 
> >                 for (j = 0; j < nrel; j++) {
> > #if defined(__i386) || defined(__amd64)
> >                         rel->r_offset = s->dofs_offset +
> >                             dofr[j].dofr_offset;
> >                         rel->r_info = ELF64_R_INFO(count + dep->de_global,
> >                             R_X86_64_GLOB_DAT);
> > [...]
> > 
> > A GLOB_DAT simply returns the symbol value as a word64, which is why we
> > set it here by adding the offset of the DOF section to the offset
> > recorded in the DOF itself.
> 
> I think you are confusing offset and value.  Offset refers to the location
> in the DOF section where this relocation needs to fill in a value.  The value
> will be the final address of the symbol (based on the GLOB_DAT reloc type).
> 
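To make the offset/value distinction concrete, here is a minimal sketch of what
applying a GLOB_DAT-style relocation amounts to.  The function name and the
flat byte-buffer view of the DOF section are hypothetical and only for
illustration; this is not the code in dt_link.c or in any linker.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: patch a 64-bit word in a section image.
 *
 *   offset    - where inside the section the word lives (the r_offset role)
 *   sym_value - what gets stored there (the resolved symbol address)
 *
 * Returns 0 on success, -1 if the 8-byte store would not fit in the section.
 */
static int demo_apply_glob_dat(unsigned char *sect, size_t sect_len,
                               uint64_t offset, uint64_t sym_value)
{
    if (offset > sect_len || sect_len - offset < sizeof(sym_value))
        return -1;

    /* The offset selects the location; the value is the symbol address. */
    memcpy(sect + offset, &sym_value, sizeof(sym_value));
    return 0;
}
```

The offset never ends up in the patched word; it only says where the value
goes.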
> > This works for executables, but breaks completely for PIE executables
> > (and I can't imagine it works too well for shared libraries, either).
> > The problem is that this is an *absolute value*, but we want a
> > section-relative value which ld.so can then adjust by the offset of the
> > section.
> 
> We want a section-relative value that the DTrace kernel code can then adjust
> based on the load address of the section.
> 
> > So we probably want to use R_X86_64_PC64, which also takes a word64 but
> > subtracts the section offset from it (and adds an addend, which in this
> > case we already hardwire to zero so we can ignore it).

I did a little experiment: I compiled a small test executable (PIE) from two
source files so I could have a call to an externally defined (global) symbol.
That call ends up being encoded as an R_X86_64_PLT32 relocation.  I do not know
(yet) whether we can use that here, but this is how I tend to start looking at
this kind of problem (it is what I did for the aarch64 port: I looked for
inspiration in how the various relocation types get used).

One of the pitfalls to avoid is that the relocation refers to the DOF section
in terms of what needs to be patched, but references a symbol in the .text
section.  What we need is the offset of the symbol *relative to the .text
section start*.
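
As a rough sketch of the arithmetic involved (the function names and the
addresses in the usage note are made up for illustration, not taken from
DTrace): the value we want recorded is the symbol address minus the start of
.text, and the final address is recovered later by adding the address at which
the code actually got loaded.

```c
#include <stdint.h>

/* Offset of a symbol from the start of the section that contains it. */
static uint64_t demo_section_relative(uint64_t sym_addr, uint64_t sect_start)
{
    return sym_addr - sect_start;
}

/* Run-time address once the actual load address of that section is known. */
static uint64_t demo_runtime_addr(uint64_t load_addr, uint64_t rel_off)
{
    return load_addr + rel_off;
}
```

For example, with a hypothetical link-time .text start of 0x1040 and a symbol
at 0x1139, the section-relative value is 0xf9; if .text ends up loaded at
0x555555555040, the symbol is at 0x555555555139.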

> > The resulting value will then be the value relative to the start of the
> > section, rather than the value relative to the start of the program.
> > We can then fairly easily modify the kernel-side code to add the section
> > base address to that (well, the segment base address, but for DOF I'm
> > fairly sure we always put each DOF section in its own unique segment:
> > please tell me if I'm wrong...)
> 
> Nothing needs to be modified on the kernel side because we already have code
> that handles relocating DOF information based on a load address.  We just need
> to ensure that we simply always pass in the load address of the code segment
> along with the DOF.
> 
> > This is of course a semantic change to DOF, so this should probably bump
> > the DOF version so the kernel code can distinguish old from new
> > semantics and determine whether to add the base address or not.
> 
> No change in semantics as far as DOF is concerned because it is already designed
> to support this.
> 
> > (Other supported architectures will no doubt need similar changes. Any
> > arch supporting shared libraries should already have a suitable reloc
> > type.)
> 
> We'll need to investigate that but yes, it is likely that some changes may be
> needed in how the DOF relocations are generated for each arch.
> 
> 	Kris
> 
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel


