[DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach

Alan Maguire alan.maguire at oracle.com
Tue Jul 8 19:04:41 UTC 2025


On 08/07/2025 18:30, Kris Van Hees wrote:
> On Tue, Jul 08, 2025 at 06:19:25PM +0100, Alan Maguire wrote:
>> On 08/07/2025 02:34, Kris Van Hees wrote:
>>> On Mon, Jul 07, 2025 at 10:51:10PM +0100, Alan Maguire wrote:
>>>> On 07/07/2025 20:55, Kris Van Hees wrote:
>>>>> On Mon, Jul 07, 2025 at 07:14:35PM +0100, Alan Maguire wrote:
>>>>>> On 07/07/2025 17:53, Kris Van Hees wrote:
>>>>>>> On Mon, Jul 07, 2025 at 05:32:19PM +0100, Alan Maguire wrote:
>>>>>>>> On 03/07/2025 23:36, Kris Van Hees wrote:
>>>>>>>>> On Thu, Jul 03, 2025 at 04:59:44PM -0400, Kris Van Hees wrote:
>>>>>>>>>> On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote:
>>>>>>>>>>> On 03/07/2025 20:03, Kris Van Hees wrote:
>>>>>>>>>>>> On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote:
>>>>>>>>>>>>> On 03/07/2025 19:26, Kris Van Hees wrote:
>>>>>>>>>>>>>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote:
>>>>>>>>>>>>>>> On 03/07/2025 18:06, Eugene Loh wrote:
>>>>>>>>>>>>>>>> On 7/3/25 12:59, Alan Maguire wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 03/07/2025 17:43, Eugene Loh wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the
>>>>>>>>>>>>>>>>>> patch 3/4 feedback).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip
>>>>>>>>>>>>>>>>> send probes?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     dtrace: failed to compile script /dev/stdin:
>>>>>>>>>>>>>>>>     ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of
>>>>>>>>>>>>>>>> inet_ntoa arg#1 (ipaddr_t *):
>>>>>>>>>>>>>>>>     Unknown type name
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ah, sorry yep I have a fix for that one in the next round. Basically we
>>>>>>>>>>>>>>> need to add it to the core set of typedefs and add a type for a pointer
>>>>>>>>>>>>>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Why can't we rely on the pragma?  That is how e.g. the ip provider manages
>>>>>>>>>>>>>> this I believe?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Unfortunately the #pragma include doesn't do enough; it just defines a
>>>>>>>>>>>>> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is
>>>>>>>>>>>>> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t
>>>>>>>>>>>>> typedef to net.d and doing the pointer lookup/addition but that doesn't
>>>>>>>>>>>>> work either. Seems we need the core typedef + pointer addition or we hit
>>>>>>>>>>>>> this failure.
>>>>>>>>>>>>
>>>>>>>>>>>> Actually, if you move 'typedef __be32          ipaddr_t;' from ip.d to net.d,
>>>>>>>>>>>> you should be set.  That is what I did in my preliminary tcp provider impl.
>>>>>>>>>>>> I do believe that works.  Either way, we use inet_ntoa() in the ip.d
>>>>>>>>>>>> translators and that works with that typedef in the file, so this really ought
>>>>>>>>>>>> to work.
>>>>>>>>>>
>>>>>>>>>>> Yep, I tried that in the v2 patch series; Eugene hit the undefined error
>>>>>>>>>>> in one test and I now hit it consistently for all tcp/ip tests
>>>>>>>>>>> unfortunately with "typedef __be32 ipaddr_t;" in net.d.
>>>>>>>>>>>
>>>>>>>>>>> My assumption (probably wrong) is that the include of the library does
>>>>>>>>>>> happen but nothing triggers the pointer type generation for "ipaddr_t *"
>>>>>>>>>>> in the CTF dict. If there was a way to force that type generation at the
>>>>>>>>>>> .d file level that would be great, not sure I see a way currently tho.
>>>>>>>>>>
>>>>>>>>>> Well, like I said, it does work for ip.d so I don't see why this would be
>>>>>>>>>> any different.  I'll have a look and see if I can figure something out.
>>>>>>>>>
>>>>>>>>> Looking into this more, I think the problem is simply that you did not sync
>>>>>>>>> all the dlibs for the various kernel versions with the updated ip.d, net.d, and
>>>>>>>>> tcp.d files.  So, if the kernel on the OL8 instance you test on does not have
>>>>>>>>> your change, it will fail.
>>>>>>>>>
>>>>>>>>
>>>>>>>> No, don't think that's it; the .d files that matched the kernel I tested
>>>>>>>> on (6.10) were synced; the use of the 6.10 .d files was visible in the
>>>>>>>> error message. The problem appears to be around the fact that tcp.d uses
>>>>>>>> the ipaddr_t * in inet_ntoa(), but unlike ip.d (which uses ipaddr_t in
>>>>>>>> translated types) it does not have any other mention of ipaddr_t.
>>>>>>>> Adding an explicit cast in tcp.d to the argument to inet_ntoa() to
>>>>>>>> ipaddr_t * resolves the issue without having to add ipaddr_t to the core
>>>>>>>> type list.
>>>>>>>
>>>>>>> Can you reproduce this at will?  Can you give me specifics on OL version,
>>>>>>> kernel version, etc?  I'd like to be able to reproduce what you see, because
>>>>>>> so far, all I tried actually works once the ipaddr_t typedef is in net.d.
>>>>>>>
>>>>>>
>>>>>> Yep, it's 100% reproducible for me on an upstream (bpf-next 6.15) kernel
>>>>>> + OL9. Moving ipaddr_t to net.d works for ip.d but not tcp.d in that
>>>>>> environment. The extra casts for the inet_ntoa() parameters that I
>>>>>> mention above are needed in tcp.d to get things to work properly for me.
>>>>>>
>>>>>> I pushed a branch to
>>>>>>
>>>>>> https://github.com/alan-maguire/dtrace-utils/tree/remote-tcp-v3-wip-broken
>>>>>>
>>>>>> that illustrates the failure.
>>>>>>
>>>>>> Relative to devel, it consists of 6 commits
>>>>>>
>>>>>> 1: the v2 of the remote IP address change (ensuring the remote address
>>>>>> tests won't fail);
>>>>>> 2-4: a few prep patches for the tcp provider; and
>>>>>> 5: the tcp provider patch (in a v3 work-in-progress form); and finally
>>>>>> 6: the top-level commit then removes the casts I added to tcp.d in the
>>>>>> previous "tcp: new provider" commit. With that change in place on my
>>>>>> system, the previously-passing IP tests start failing.
>>>>>>
>>>>>> If I "git reset --hard HEAD~1" on that branch (reestablishing those
>>>>>> ipaddr_t * casts) and rebuild, the failures go away for me.
>>>>>
>>>>> I tested your tree on Debian with the 6.15 kernel, and this is the result:
>>>>>
>>>>> $ uname -a
>>>>> Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul  7 15:19:59 EDT 2025 x86_64 GNU/Linux
>>>>> $ cat test/log/current/runtest.sum 
>>>>> dtrace: Oracle D 2.0
>>>>> This is DTrace 2.0.1
>>>>> dtrace(1) version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc
>>>>> libdtrace version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc
>>>>> Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul  7 15:19:59 EDT 2025 x86_64 GNU/Linux
>>>>> testsuite version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc
>>>>>
>>>>> test/unittest/tcp/tst.ipv4localtcp.sh: PASS.
>>>>> test/unittest/tcp/tst.ipv4localtcpstate.sh: PASS.
>>>>> test/unittest/tcp/tst.ipv4remotetcp.sh: PASS.
>>>>> test/unittest/tcp/tst.ipv4remotetcpstate.sh: PASS.
>>>>> test/unittest/tcp/tst.ipv6localtcp.sh: PASS.
>>>>> test/unittest/tcp/tst.ipv6localtcpstate.sh: PASS.
>>>>> 6 cases (6 PASS, 0 FAIL, 0 XPASS, 0 XFAIL, 0 SKIP)
>>>>>
>>>>> I will try to get 6.15 on an OL9 instance and try there, but either way, I
>>>>> have a feeling there is a binutils (libctf) discrepancy somewhere?  What
>>>>
>>>> could be; see below..
>>>>
>>>>> version of binutils is installed on your system (nm -V)?
>>>>
>>>> $ nm -V
>>>> GNU nm version 2.35.2-42.0.1.el9
>>>> Copyright (C) 2020 Free Software Foundation, Inc.
>>>> This program is free software; you may redistribute it under the terms of
>>>> the GNU General Public License version 3 or (at your option) any later
>>>> version.
>>>> This program has absolutely no warranty.
>>>>
>>>> Let me know if you need any more info. Thanks!
>>>>
>>>> Alan
>>>
>>>
>>> Tried it on OL9 with 6.15.4 kernel, and aside from some probes not firing,
>>> the tests work.
>>>
>>> $ nm -V
>>> GNU nm version 2.35.2-63.0.1.el9
>>> Copyright (C) 2020 Free Software Foundation, Inc.
>>> This program is free software; you may redistribute it under the terms of
>>> the GNU General Public License version 3 or (at your option) any later version.
>>> This program has absolutely no warranty.
>>>
>>> So I think you need to yum update your system?
>>
>> I think I may have found another clue to why it's happening. I tried on
>> a gcc-toolset-14 -built system, with
>>
>> $ nm -V
>> GNU nm version 2.41-3.el9
>> Copyright (C) 2023 Free Software Foundation, Inc.
>> This program is free software; you may redistribute it under the terms of
>> the GNU General Public License version 3 or (at your option) any later
>> version.
>> This program has absolutely no warranty.
>>
>>
>> Now I can run the following fine:
>>
>> # build/dtrace -n 'ip:::send /args[4]->ipv4_protocol == IPPROTO_TCP/ {
>> @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count(); } END {
>> printa(@c); }'
>> dtrace: description 'ip:::send ' matched 2 probes
>>
>> However, if I add a syslibdir path - as the tests do when they execute -
>> I see
>>
>> $ build/dtrace -xsyslibdir=$(pwd)/build/dlibs -n 'ip:::send
>> /args[4]->ipv4_protocol == IPPROTO_TCP/ { @c[args[2]->ip_saddr,
>> args[4]->ipv4_protocol] = count(); } END { printa(@c); }'
>> dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol ==
>> IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count();
>> } END { printa(@c); }:
>> "/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to
>> resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name
>>
>> Using -xdebug I see far fewer types added in the error case; i.e. only
>> the following are added when processing .d files:
>>
>> libdtrace DEBUG 1751994411: typedef conninfo_t added as id 2147483678
>> libdtrace DEBUG 1751994411: typedef netstackid_t added as id 2147483679
>> libdtrace DEBUG 1751994411: typedef ipaddr_t added as id 2147483683
>> libdtrace DEBUG 1751994411: typedef in6_addr_t added as id 2147483695
>> libdtrace DEBUG 1751994411: typedef pktinfo_t added as id 2147483697
>> libdtrace DEBUG 1751994411: typedef csinfo_t added as id 2147483699
>> libdtrace DEBUG 1751994411: typedef tcpinfo_t added as id 2147483701
>> libdtrace DEBUG 1751994411: typedef tcpsinfo_t added as id 2147483703
>> libdtrace DEBUG 1751994411: typedef tcplsinfo_t added as id 2147483705
>>
>>
>> versus the good case:
>>
>> libdtrace DEBUG 1751994399: typedef processorid_t added as id 2147483677
>> libdtrace DEBUG 1751994399: typedef psetid_t added as id 2147483678
>> libdtrace DEBUG 1751994399: typedef chipid_t added as id 2147483679
>> libdtrace DEBUG 1751994399: typedef lgrp_id_t added as id 2147483680
>> libdtrace DEBUG 1751994399: typedef cpuinfo_t added as id 2147483682
>> libdtrace DEBUG 1751994399: typedef cpuinfo_t_p added as id 2147483684
>> libdtrace DEBUG 1751994399: typedef time_t added as id 2147483688
>> libdtrace DEBUG 1751994399: typedef timestruc_t added as id 2147483690
>> libdtrace DEBUG 1751994399: typedef lwpsinfo_t added as id 2147483695
>> libdtrace DEBUG 1751994399: typedef taskid_t added as id 2147483696
>> libdtrace DEBUG 1751994399: typedef dprojid_t added as id 2147483697
>> libdtrace DEBUG 1751994399: typedef poolid_t added as id 2147483698
>> libdtrace DEBUG 1751994399: typedef zoneid_t added as id 2147483699
>> libdtrace DEBUG 1751994399: typedef psinfo_t added as id 2147490324
>> libdtrace DEBUG 1751994399: typedef conninfo_t added as id 2147490329
>> libdtrace DEBUG 1751994399: typedef netstackid_t added as id 2147490330
>> libdtrace DEBUG 1751994399: typedef ipaddr_t added as id 2147490331
>> libdtrace DEBUG 1751994399: typedef in6_addr_t added as id 2147490332
>> libdtrace DEBUG 1751994399: typedef pktinfo_t added as id 2147490334
>> libdtrace DEBUG 1751994399: typedef csinfo_t added as id 2147490336
>> libdtrace DEBUG 1751994399: skipping library
>> /usr/lib64/dtrace/6.10/udp.d: "/usr/lib64/dtrace/6.10/udp.d", line 10:
>> program requires provider udp
>> libdtrace DEBUG 1751994399: typedef tcpinfo_t added as id 2147490338
>> libdtrace DEBUG 1751994399: typedef tcpsinfo_t added as id 2147490340
>> libdtrace DEBUG 1751994399: typedef tcplsinfo_t added as id 2147490342
>> libdtrace DEBUG 1751994399: typedef ipinfo_t added as id 2147490350
>> libdtrace DEBUG 1751994399: typedef ifinfo_t added as id 2147490352
>> libdtrace DEBUG 1751994399: typedef ipv4info_t added as id 2147490361
>> libdtrace DEBUG 1751994399: typedef ipv6info_t added as id 2147490369
>> libdtrace DEBUG 1751994399: typedef void_ip_t added as id 2147490370
>> libdtrace DEBUG 1751994399: typedef __dtrace_tcp_void_ip_t added as id
>> 2147490371
>> libdtrace DEBUG 1751994399: typedef caddr_t added as id 2147490380
>> libdtrace DEBUG 1751994399: typedef bufinfo_t added as id 2147490382
>> libdtrace DEBUG 1751994399: typedef devinfo_t added as id 2147490385
>>
>> So it looks like sched.d wasn't processed for example, but weirdly in
>> the failing case net.d (containing the typedef ipaddr_t) and tcp.d were.
>>
>> The actual error comes later though, after processing kernel/module BTF:
>>
>> dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol ==
>> IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count();
>> } END { printa(@c); }:
>> "/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to
>> resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name
>>
>> And so I checked for differences in the build/dlibs files versus what is
>> installed, and found none. Maybe the above might help reproduce this at
>> least. Thanks!
> 
> I'll have a look, but when you are using a locally built dtrace, you should use
> ./build/run-dtrace so that the correct paths are set up for libdtrace.so and
> the dlibs to be found.  Otherwise, you end up using the locally built frontend
> (dtrace) with the installed libdtrace.so and dlibs.  And even when passing
> the -xsyslibdir, you still end up using the installed libdtrace.so, so your
> testing is not based on the locally built dtrace.
> 
> 	Kris

Thanks; I tried with build/run-dtrace and got the same result. However, by
adding some debug logging I think I've discovered the root cause: the order
in which the .d files are sorted seems to differ between the build/dlibs and
/usr/lib64/dtrace cases, and the problem is that tcp.d actually
implicitly relies on ip.d for ipinfo_t. We get lucky in the sort order for
/usr/lib64/dtrace, and because an ipaddr_t * gets added during ip.d
processing, by the time we look up "ipaddr_t *" in tcp.d it's already in
the D CTF dict. I _think_ the ipaddr_t * gets added as a side effect of
the fact that there are fields of type ipaddr_t in the translated
ipv4info_t in ip.d.

However, in the problematic case with build/run-dtrace, net.d is still
loaded first, and then tcp.d is loaded immediately after without an
intervening load of ip.d. As a result we have no "ipaddr_t *", hence

dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol ==
IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count();
} END { printa(@c); }:
"/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to
resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name
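
As an aside, one quick way to see which dependencies each library declares
is to grep for the pragmas. This is only a sketch: list_dlib_deps is a
hypothetical helper name, not something in the tree, and the directory
argument is whichever dlibs directory is actually being used.

```shell
# Hypothetical helper: print every "#pragma D depends_on" line found in
# the .d libraries of the given directory, prefixed with the file name.
list_dlib_deps() {
    dir="$1"
    # -H forces the file name prefix even when only one file matches.
    grep -H '^#pragma D depends_on' "$dir"/*.d
}
```

For example, "list_dlib_deps build/dlibs/6.10" would show whether tcp.d
names ip.d as a library dependency or only depends on the ip provider.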

To fix this I think the right answer is to change the dependency tcp.d
has on ip.d, from

#pragma D depends_on provider ip

to

#pragma D depends_on library ip.d

This is needed for other reasons (ipinfo_t declaration for example), but
with that change the problem is resolved.
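
To illustrate why the library dependency pins the order, here is a rough
model using tsort(1). It is not libdtrace's actual sorting code; the edges
are assumptions for illustration (in particular that both ip.d and tcp.d
depend on net.d), and the function names are made up.

```shell
# Each input line is a "prerequisite dependent" pair, as tsort expects.

# With "depends_on library ip.d" in tcp.d there is an ip.d -> tcp.d edge,
# so ip.d must be ordered before tcp.d.
order_with_library_dep() {
    printf '%s\n' 'net.d ip.d' 'net.d tcp.d' 'ip.d tcp.d' | tsort
}

# With only a provider dependency there is no such edge, so ip.d and
# tcp.d may come out in either order after net.d.
order_without_library_dep() {
    printf '%s\n' 'net.d ip.d' 'net.d tcp.d' | tsort
}
```

In the first case the only valid order is net.d, ip.d, tcp.d; in the second
the relative order of ip.d and tcp.d is left to chance, which matches the
"we get lucky" behaviour described above.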

Alan


