[DTrace-devel] [PATCH] test/utils: add more reliable "get remote address" approach
Kris Van Hees
kris.van.hees at oracle.com
Tue Jul 8 17:30:44 UTC 2025
On Tue, Jul 08, 2025 at 06:19:25PM +0100, Alan Maguire wrote:
> On 08/07/2025 02:34, Kris Van Hees wrote:
> > On Mon, Jul 07, 2025 at 10:51:10PM +0100, Alan Maguire wrote:
> >> On 07/07/2025 20:55, Kris Van Hees wrote:
> >>> On Mon, Jul 07, 2025 at 07:14:35PM +0100, Alan Maguire wrote:
> >>>> On 07/07/2025 17:53, Kris Van Hees wrote:
> >>>>> On Mon, Jul 07, 2025 at 05:32:19PM +0100, Alan Maguire wrote:
> >>>>>> On 03/07/2025 23:36, Kris Van Hees wrote:
> >>>>>>> On Thu, Jul 03, 2025 at 04:59:44PM -0400, Kris Van Hees wrote:
> >>>>>>>> On Thu, Jul 03, 2025 at 09:23:46PM +0100, Alan Maguire wrote:
> >>>>>>>>> On 03/07/2025 20:03, Kris Van Hees wrote:
> >>>>>>>>>> On Thu, Jul 03, 2025 at 07:41:41PM +0100, Alan Maguire wrote:
> >>>>>>>>>>> On 03/07/2025 19:26, Kris Van Hees wrote:
> >>>>>>>>>>>> On Thu, Jul 03, 2025 at 07:02:57PM +0100, Alan Maguire wrote:
> >>>>>>>>>>>>> On 03/07/2025 18:06, Eugene Loh wrote:
> >>>>>>>>>>>>>> On 7/3/25 12:59, Alan Maguire wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 03/07/2025 17:43, Eugene Loh wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I tested and it looks good (modulo the OL8 UEK6 issue mentioned in the
> >>>>>>>>>>>>>>>> patch 3/4 feedback).
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Sorry I couldn't find that issue; is this the 5.15 problem with the ip
> >>>>>>>>>>>>>>> send probes?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> dtrace: failed to compile script /dev/stdin:
> >>>>>>>>>>>>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of
> >>>>>>>>>>>>>> inet_ntoa arg#1 (ipaddr_t *):
> >>>>>>>>>>>>>> Unknown type name
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Ah, sorry yep I have a fix for that one in the next round. Basically we
> >>>>>>>>>>>>> need to add it to the core set of typedefs and add a type for a pointer
> >>>>>>>>>>>>> to ipaddr_t; we can't rely on the #pragma to include net.d unfortunately.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Why can't we rely on the pragma? That is how e.g. the ip provider manages
> >>>>>>>>>>>> this I believe?
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Unfortunately the #pragma include doesn't do enough; it just defines a
> >>>>>>>>>>> type for ipaddr_t , not a type for a _pointer_ to an ipaddr_t , which is
> >>>>>>>>>>> what we need as a parameter to inet_ntoa(). I tried adding the ipaddr_t
> >>>>>>>>>>> typedef to net.d and doing the pointer lookup/addition but that doesn't
> >>>>>>>>>>> work either. Seems we need the core typedef + pointer addition or we hit
> >>>>>>>>>>> this failure.
> >>>>>>>>>>
> >>>>>>>>>> Actually, if you move 'typedef __be32 ipaddr_t;' from ip.d to net.d,
> >>>>>>>>>> you should be set. That is what I did in my preliminary tcp provider impl.
> >>>>>>>>>> I do believe that works. Either way, we use inet_ntoa() in the ip.d
> >>>>>>>>>> translators and that works with that typedef in the file, so this really ought
> >>>>>>>>>> to work.
> >>>>>>>>
> >>>>>>>>> Yep, I tried that in the v2 patch series; Eugene hit the undefined error
> >>>>>>>>> in one test and I now hit it consistently for all tcp/ip tests
> >>>>>>>>> unfortunately with "typedef __be32 ipaddr_t;" in net.d.
> >>>>>>>>>
> >>>>>>>>> My assumption (probably wrong) is that the include of the library does
> >>>>>>>>> happen but nothing triggers the pointer type generation for "ipaddr_t *"
> >>>>>>>>> in the CTF dict. If there was a way to force that type generation at the
> >>>>>>>>> .d file level that would be great, not sure I see a way currently tho.
> >>>>>>>>
> >>>>>>>> Well, like I said, it does work for ip.d so I don't see why this would be
> >>>>>>>> any different. I'll have a look and see if I can figure something out.
> >>>>>>>
> >>>>>>> Looking into this more, I think the problem is simply that you did not sync
> >>>>>>> all the dlibs for the various kernel versions with the updated ip.d, net.d, and
> >>>>>>> tcp.d files. So, if the kernel on the OL8 instance you test on does not have
> >>>>>>> your change, it will fail.
> >>>>>>>
> >>>>>>
> >>>>>> No, don't think that's it; the .d files that matched the kernel I tested
> >>>>>> on (6.10) were synced; the use of the 6.10 .d files was visible in the
> >>>>>> error message. The problem appears to be around the fact that tcp.d uses
> >>>>>> the ipaddr_t * in inet_ntoa(), but unlike ip.d (which uses ipaddr_t in
> >>>>>> translated types) it does not have any other mention of ipaddr_t.
> >>>>>> Adding an explicit cast in tcp.d to the argument to inet_ntoa() to
> >>>>>> ipaddr_t * resolves the issue without having to add ipaddr_t to the core
> >>>>>> type list.
> >>>>>
> >>>>> Can you reproduce this at will? Can you give me specifics on OL version,
> >>>>> kernel version, etc? I'd like to be able to reproduce what you see, because
> >>>>> so far, all I tried actually works once the ipaddr_t typedef is in net.d.
> >>>>>
> >>>>
> >>>> Yep, it's 100% reproducible for me on an upstream (bpf-next 6.15) kernel
> >>>> + OL9. Moving ipaddr_t to net.d works for ip.d but not tcp.d in that
> >>>> environment. The extra casts for the inet_ntoa() parameters that I
> >>>> mention above are needed in tcp.d to get things to work properly for me.
> >>>>
> >>>> I pushed a branch to
> >>>>
> >>>> https://github.com/alan-maguire/dtrace-utils/tree/remote-tcp-v3-wip-broken
> >>>>
> >>>> that illustrates the failure.
> >>>>
> >>>> Relative to devel, it consists of 6 commits
> >>>>
> >>>> 1: the v2 of the remote IP address change (ensuring the remote address
> >>>> tests won't fail);
> >>>> 2-4: a few prep patches for the tcp provider; and
> >>>> 5: the tcp provider patch (in a v3 work-in-progress form); and finally
> >>>> 6: the top-level commit then removes the casts I added to tcp.d in the
> >>>> previous "tcp: new provider" commit. With that change in place on my
> >>>> system, the previously-passing IP tests start failing.
> >>>>
> >>>> If I "git reset --hard HEAD~1" on that branch (reestablishing those
> >>>> ipaddr_t * casts) and rebuild, the failures go away for me.
> >>>
> >>> I tested your tree on Debian with the 6.15 kernel, and this is the result:
> >>>
> >>> $ uname -a
> >>> Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux
> >>> $ cat test/log/current/runtest.sum
> >>> dtrace: Oracle D 2.0
> >>> This is DTrace 2.0.1
> >>> dtrace(1) version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc
> >>> libdtrace version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc
> >>> Linux kvh-deb-bpf3 6.15.0 #1 SMP PREEMPT_DYNAMIC Mon Jul 7 15:19:59 EDT 2025 x86_64 GNU/Linux
> >>> testsuite version-control ID: cf3219c3069ac51c6f03f7a6dcb50958213466fc
> >>>
> >>> test/unittest/tcp/tst.ipv4localtcp.sh: PASS.
> >>> test/unittest/tcp/tst.ipv4localtcpstate.sh: PASS.
> >>> test/unittest/tcp/tst.ipv4remotetcp.sh: PASS.
> >>> test/unittest/tcp/tst.ipv4remotetcpstate.sh: PASS.
> >>> test/unittest/tcp/tst.ipv6localtcp.sh: PASS.
> >>> test/unittest/tcp/tst.ipv6localtcpstate.sh: PASS.
> >>> 6 cases (6 PASS, 0 FAIL, 0 XPASS, 0 XFAIL, 0 SKIP)
> >>>
> >>> I will try to get 6.15 on an OL9 instance and try there, but either way, I
> >>> have a feeling there is a binutils (libctf) discrepancy somewhere? What
> >>
> >> could be; see below..
> >>
> >>> version of binutils is installed on your system (nm -V)?
> >>
> >> $ nm -V
> >> GNU nm version 2.35.2-42.0.1.el9
> >> Copyright (C) 2020 Free Software Foundation, Inc.
> >> This program is free software; you may redistribute it under the terms of
> >> the GNU General Public License version 3 or (at your option) any later
> >> version.
> >> This program has absolutely no warranty.
> >>
> >> Let me know if you need any more info. Thanks!
> >>
> >> Alan
> >
> >
> > Tried it on OL9 with 6.15.4 kernel, and aside from some probes not firing,
> > the tests work.
> >
> > $ nm -V
> > GNU nm version 2.35.2-63.0.1.el9
> > Copyright (C) 2020 Free Software Foundation, Inc.
> > This program is free software; you may redistribute it under the terms of
> > the GNU General Public License version 3 or (at your option) any later version.
> > This program has absolutely no warranty.
> >
> > So I think you need to yum update your system?
>
> I think I may have found another clue to why it's happening. I tried on
> a gcc-toolset-14 -built system, with
>
> $ nm -V
> GNU nm version 2.41-3.el9
> Copyright (C) 2023 Free Software Foundation, Inc.
> This program is free software; you may redistribute it under the terms of
> the GNU General Public License version 3 or (at your option) any later
> version.
> This program has absolutely no warranty.
>
>
> Now I can run the following fine:
>
> # build/dtrace -n 'ip:::send /args[4]->ipv4_protocol == IPPROTO_TCP/ {
> @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count(); } END {
> printa(@c); }'
> dtrace: description 'ip:::send ' matched 2 probes
>
> However, if I add a syslibdir path - as the tests do when they execute -
> I see
>
> $ build/dtrace -xsyslibdir=$(pwd)/build/dlibs -n 'ip:::send
> /args[4]->ipv4_protocol == IPPROTO_TCP/ { @c[args[2]->ip_saddr,
> args[4]->ipv4_protocol] = count(); } END { printa(@c); }'
> dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol ==
> IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count();
> } END { printa(@c); }:
> "/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to
> resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name
>
> Using -xdebug I see far fewer types added in the error case; i.e. only
> the following are added when processing .d files:
>
> libdtrace DEBUG 1751994411: typedef conninfo_t added as id 2147483678
> libdtrace DEBUG 1751994411: typedef netstackid_t added as id 2147483679
> libdtrace DEBUG 1751994411: typedef ipaddr_t added as id 2147483683
> libdtrace DEBUG 1751994411: typedef in6_addr_t added as id 2147483695
> libdtrace DEBUG 1751994411: typedef pktinfo_t added as id 2147483697
> libdtrace DEBUG 1751994411: typedef csinfo_t added as id 2147483699
> libdtrace DEBUG 1751994411: typedef tcpinfo_t added as id 2147483701
> libdtrace DEBUG 1751994411: typedef tcpsinfo_t added as id 2147483703
> libdtrace DEBUG 1751994411: typedef tcplsinfo_t added as id 2147483705
>
>
> versus the good case:
>
> libdtrace DEBUG 1751994399: typedef processorid_t added as id 2147483677
> libdtrace DEBUG 1751994399: typedef psetid_t added as id 2147483678
> libdtrace DEBUG 1751994399: typedef chipid_t added as id 2147483679
> libdtrace DEBUG 1751994399: typedef lgrp_id_t added as id 2147483680
> libdtrace DEBUG 1751994399: typedef cpuinfo_t added as id 2147483682
> libdtrace DEBUG 1751994399: typedef cpuinfo_t_p added as id 2147483684
> libdtrace DEBUG 1751994399: typedef time_t added as id 2147483688
> libdtrace DEBUG 1751994399: typedef timestruc_t added as id 2147483690
> libdtrace DEBUG 1751994399: typedef lwpsinfo_t added as id 2147483695
> libdtrace DEBUG 1751994399: typedef taskid_t added as id 2147483696
> libdtrace DEBUG 1751994399: typedef dprojid_t added as id 2147483697
> libdtrace DEBUG 1751994399: typedef poolid_t added as id 2147483698
> libdtrace DEBUG 1751994399: typedef zoneid_t added as id 2147483699
> libdtrace DEBUG 1751994399: typedef psinfo_t added as id 2147490324
> libdtrace DEBUG 1751994399: typedef conninfo_t added as id 2147490329
> libdtrace DEBUG 1751994399: typedef netstackid_t added as id 2147490330
> libdtrace DEBUG 1751994399: typedef ipaddr_t added as id 2147490331
> libdtrace DEBUG 1751994399: typedef in6_addr_t added as id 2147490332
> libdtrace DEBUG 1751994399: typedef pktinfo_t added as id 2147490334
> libdtrace DEBUG 1751994399: typedef csinfo_t added as id 2147490336
> libdtrace DEBUG 1751994399: skipping library /usr/lib64/dtrace/6.10/udp.d: "/usr/lib64/dtrace/6.10/udp.d", line 10: program requires provider udp
> libdtrace DEBUG 1751994399: typedef tcpinfo_t added as id 2147490338
> libdtrace DEBUG 1751994399: typedef tcpsinfo_t added as id 2147490340
> libdtrace DEBUG 1751994399: typedef tcplsinfo_t added as id 2147490342
> libdtrace DEBUG 1751994399: typedef ipinfo_t added as id 2147490350
> libdtrace DEBUG 1751994399: typedef ifinfo_t added as id 2147490352
> libdtrace DEBUG 1751994399: typedef ipv4info_t added as id 2147490361
> libdtrace DEBUG 1751994399: typedef ipv6info_t added as id 2147490369
> libdtrace DEBUG 1751994399: typedef void_ip_t added as id 2147490370
> libdtrace DEBUG 1751994399: typedef __dtrace_tcp_void_ip_t added as id 2147490371
> libdtrace DEBUG 1751994399: typedef caddr_t added as id 2147490380
> libdtrace DEBUG 1751994399: typedef bufinfo_t added as id 2147490382
> libdtrace DEBUG 1751994399: typedef devinfo_t added as id 2147490385
>
> So it looks like sched.d wasn't processed for example, but weirdly in
> the failing case net.d (containing the typedef ipaddr_t) and tcp.d were.
>
> The actual error comes later though, after processing kernel/module BTF:
>
> dtrace: invalid probe specifier ip:::send /args[4]->ipv4_protocol ==
> IPPROTO_TCP/ { @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count();
> } END { printa(@c); }:
> "/home/opc/src/dtrace-utils/build/dlibs/6.10/tcp.d", line 183: failed to
> resolve type of inet_ntoa arg#1 (ipaddr_t *): Unknown type name
>
> And so I checked for differences in the build/dlibs files versus what is
> installed, and found none. Maybe the above might help reproduce this at
> least. Thanks!
I'll have a look, but when you are using a locally built dtrace, you should use
./build/run-dtrace so that the correct paths are set up for libdtrace.so and
the dlibs to be found. Otherwise, you end up using the locally built frontend
(dtrace) with the installed libdtrace.so and dlibs. Even when you pass
-xsyslibdir, you still end up using the installed libdtrace.so, so your
testing is not based on the locally built dtrace.
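
As a rough sketch of the point above, a small shell helper could pick the
build-tree wrapper whenever it is present. The function name dtrace_cmd is
hypothetical and not part of dtrace-utils; the only thing taken from the tree
is the build/run-dtrace path mentioned above. It falls back to whatever
"dtrace" is on PATH when no wrapper is found.

```shell
# Hypothetical helper (not part of dtrace-utils): decide how to invoke
# dtrace for a given source tree.  Prefers the build-tree wrapper so that
# the locally built libdtrace.so and dlibs are the ones that get used.
dtrace_cmd() {
    top=${1:-.}                      # top of the source tree, default "."
    if [ -x "$top/build/run-dtrace" ]; then
        printf '%s\n' "$top/build/run-dtrace"
    else
        printf '%s\n' dtrace         # fall back to the installed dtrace
    fi
}

# Example use (needs root and a built tree, so it is left commented out):
# "$(dtrace_cmd .)" -n 'ip:::send /args[4]->ipv4_protocol == IPPROTO_TCP/ {
#     @c[args[2]->ip_saddr, args[4]->ipv4_protocol] = count(); }
#     END { printa(@c); }'
```

With the wrapper in place there should be no need to pass -xsyslibdir by hand
for this kind of quick check.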
Kris