[DTrace-devel] [PATCH] No uprobes on ARM autiasp instructions

Eugene Loh eugene.loh at oracle.com
Mon Aug 11 20:52:43 UTC 2025


On 8/11/25 13:54, Kris Van Hees wrote:

> On Tue, Jun 10, 2025 at 05:10:42PM -0400, eugene.loh at oracle.com wrote:
>> From: Eugene Loh <eugene.loh at oracle.com>
>>
>> New compilers emit autiasp instructions much more liberally.
>> A test like test/unittest/pid/tst.entry_off0.sh, which tries
>> to put a probe on each instruction, may fail.
>>
>> Signed-off-by: Eugene Loh <eugene.loh at oracle.com>
>> ---
>>   libdtrace/dt_pid.c | 23 +++++++++++++++++------
>>   1 file changed, 17 insertions(+), 6 deletions(-)
>>
>> diff --git a/libdtrace/dt_pid.c b/libdtrace/dt_pid.c
>> index e2d4e540d..833e9b647 100644
>> --- a/libdtrace/dt_pid.c
>> +++ b/libdtrace/dt_pid.c
>> @@ -279,12 +279,17 @@ dt_pid_per_sym(dt_pid_probe_t *pp, const GElf_Sym *symp, const char *func)
>>   
>>   		nmatches++;
>>   	} else if (glob) {
>> -#if defined(__amd64)
>>   		/*
>> -		 * We need to step through the instructions to find their
>> -		 * offsets.  This is difficult on x86, which has variable
>> -		 * instruction lengths.  We invoke the disassembler in
>> -		 * libopcodes.
>> +		 * We need the instructions for two reasons:
>> +		 * = On x86, instructions have varying lengths.  So,
>> +		 *   to step through the instructions, we need to
>> +		 *   disassemble them to know what they are.
>> +		 *   We invoke the disassembler in libopcodes.
>> +		 *   (On ARM, we step through 4 bytes at a time.)
>> +		 * = On both x86 and arm, we want to skip certain
>> +		 *   instructions.  So, again, we need to know what they are.
>> +		 */
>> +		/*
>>   		 *
>>   		 * We look for the Elf pointer.  It is already stored in
>>   		 * file_elf in file_info_t, but getting it back over here
>> @@ -298,7 +303,6 @@ dt_pid_per_sym(dt_pid_probe_t *pp, const GElf_Sym *symp, const char *func)
>>   		GElf_Shdr shdr;
>>   		Elf_Data *data;
>>   		size_t shstrndx, off;
>> -		disassembler_ftype disasm;
>>   
>>   		/* Set things up. */
>>   		fd = open(pp->dpp_fname, O_RDONLY);
>> @@ -344,12 +348,14 @@ dt_pid_per_sym(dt_pid_probe_t *pp, const GElf_Sym *symp, const char *func)
>>   		/* Get the instructions. */
>>   		data = elf_getdata(scn, NULL);
>>   
>> +#if defined(__amd64)
>>   		/*
>>   		 * "Disassemble" instructions just to get the offsets.
>>   		 *
>>   		 * Unfortunately, libopcodes's disassembler() has a different
>>   		 * interface in binutils versions before 2.29.
>>   		 */
>> +		disassembler_ftype disasm;
>>   #if defined(HAVE_DIS1) == defined(HAVE_DIS4)
>>   #error expect disassembler() to have 1 or else 4 arguments
>>   #endif
>> @@ -390,6 +396,11 @@ dt_pid_per_sym(dt_pid_probe_t *pp, const GElf_Sym *symp, const char *func)
>>   			/* Newer kernels do not allow uprobes on "hlt" instructions. */
>>   			if ((unsigned int)disasm_info.buffer[off] == 0xf4)
>>   				continue;
>> +#else
>> +			/* On ARM, we cannot place uprobes on "autiasp" instructions. */
>> +			if (*((unsigned int *)(data->d_buf + (sym.st_value + off - shdr.sh_addr)))
>> +			    == 0xd50323bf)
> Are there symbolic names we can use here?  From an include file concerning
> opcodes or (worst case) define one ourselves.  From the comment, I can assume
> that the 32-bit hex value you give must be that instruction.  But is it an
> actual 4-byte instruction without any values that can be set for different
> uses, etc?  Perhaps a define and a comment explaining the value might be
> useful here.
>
> And perhaps do the same for the 'hlt' x86 instruction mentioned above it?

Quite frankly, I'm not familiar with these instructions, and the 
situation is getting worse.  There is another x86 instruction prefix 
that causes a problem and needs to be added -- some "multi-byte nop 0x66 
0x66" thing.
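
For reference, the named-constant version being asked about might look
roughly like the sketch below.  The macro and function names are
hypothetical (they are not from any existing header), and only the two
encodings already in the patch are covered.  To the best of my
understanding, autiasp is an alias of HINT #29 and has no operand
fields, so a full 32-bit compare is enough; hlt is the single byte 0xf4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, not from any existing header. */
#define DT_X86_INSN_HLT		0xf4		/* x86 "hlt": one-byte opcode */
#define DT_ARM64_INSN_AUTIASP	0xd50323bfU	/* AArch64 "autiasp": HINT #29, no operand fields */

/* Return 1 if the x86 instruction starting at buf[off] is "hlt". */
static int
insn_is_x86_hlt(const unsigned char *buf, size_t off)
{
	assert(buf != NULL);
	return buf[off] == DT_X86_INSN_HLT;
}

/*
 * Return 1 if the 4-byte AArch64 instruction at buf[off] is "autiasp".
 * AArch64 instructions are stored little-endian, so assemble the word
 * byte by byte rather than casting a possibly unaligned pointer.
 */
static int
insn_is_arm64_autiasp(const unsigned char *buf, size_t off)
{
	uint32_t insn;

	assert(buf != NULL);
	insn = (uint32_t)buf[off] |
	       (uint32_t)buf[off + 1] << 8 |
	       (uint32_t)buf[off + 2] << 16 |
	       (uint32_t)buf[off + 3] << 24;
	return insn == DT_ARM64_INSN_AUTIASP;
}
```

Building the word from bytes also avoids the pointer arithmetic on
d_buf and gives the same result regardless of host byte order.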

I'm tempted to go another route.  Instead of building a list of hlt, 
autiasp, etc., just exclude the ones that don't work.  That is, if I ask 
for pid$pid:$mod:$fun:*, silently ignore the offsets that don't work.

If I remember correctly, a challenge with this approach is that you 
don't know until later that a particular offset will be a problem. Each 
offset would entail an extra runtime check.  That seems unfortunate (and 
unnecessary, if one could know in advance which offsets are okay), but 
maybe it's not such a big deal if you have a lot of probes anyhow.
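
The "try it and skip what fails" idea could be sketched as below.  This
is only an illustration: try_probe_fn, probe_all_offsets(), and
fake_try_probe() are hypothetical names, and the callback stands in for
whatever step actually discovers that an offset is rejected.  It is not
how dt_pid.c is structured today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns 0 if a probe can be placed at off. */
typedef int (*try_probe_fn)(size_t off, void *arg);

/*
 * Try every candidate offset.  Offsets that are rejected are skipped
 * silently; accepted ones are copied to ok[].  Returns how many were
 * accepted.  ok[] must have room for n entries.
 */
static size_t
probe_all_offsets(const size_t *offs, size_t n, try_probe_fn try_probe,
		  void *arg, size_t *ok)
{
	size_t i, nok = 0;

	assert(offs != NULL && try_probe != NULL && ok != NULL);
	for (i = 0; i < n; i++) {
		if (try_probe(offs[i], arg) != 0)
			continue;	/* rejected: skip, do not report */
		ok[nok++] = offs[i];
	}
	return nok;
}

/* Stand-in for a real attempt: rejects the single offset in *arg. */
static int
fake_try_probe(size_t off, void *arg)
{
	return off == *(const size_t *)arg ? -1 : 0;
}
```

The cost is one extra attempt per offset, which is the runtime check
mentioned above; the benefit is that no per-instruction knowledge is
needed.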

Anyhow, I don't know these instruction sets.  It might be nice to have a 
solution that does not require specific knowledge of instruction sets 
(and other info).

>> +				continue;
>>   #endif
>>   
>>   			snprintf(offstr, sizeof(offstr), "%lx", off);
>> -- 
>> 2.43.5
>>


