[DTrace-devel] [PATCH v2 3/4] dtrace: add tcp provider
Alan Maguire
alan.maguire at oracle.com
Thu Jul 3 15:03:05 UTC 2025
On 03/07/2025 01:02, Eugene Loh wrote:
> On 7/2/25 11:06, Alan Maguire wrote:
>
>> On 02/07/2025 00:16, Eugene Loh wrote:
>>> On most VMs,
>>> test/unittest/tcp/tst.ipv4remotetcp.sh
>>> test/unittest/tcp/tst.ipv4remotetcpstate.sh
>>> xfail due to missing remote. Are we okay with "shrugging our shoulders"
>>> like that?
>> Yeah, I don't think the remote test is robust enough. Specifically in
>> OCI it seems to always fail. I'd suggest we replace it with creating a
>> network namespace with IP addresses configured on top of veths to
>> simulate the remote case, the codepaths will be the same. I've done this
>> in other test suites and it works well.
>
> Sounds great (if "we" is "you", haha).
>
I had a go; see
https://lore.kernel.org/dtrace/20250703113345.1273604-1-alan.maguire@oracle.com/
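
The general shape of such a setup is roughly as follows. This is a
sketch rather than what the linked patch does: the namespace name,
interface names and addresses are made up for illustration, and the
DRY_RUN switch is a hypothetical convenience so the commands can be
inspected without root.

```shell
#!/bin/sh
# Sketch only: names and addresses below are illustrative assumptions.
# DRY_RUN=1 (the default) prints the commands; DRY_RUN=0 runs them (needs root).
NS=${NS:-dtrace-test-ns}
HOST_IF=veth-host
NS_IF=veth-ns
HOST_ADDR=192.168.100.1
NS_ADDR=192.168.100.2

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

netns_setup() {
    run ip netns add "$NS"
    # A veth pair acts like a virtual cable: one end stays in the host
    # namespace, the other end is moved into the new namespace.
    run ip link add "$HOST_IF" type veth peer name "$NS_IF"
    run ip link set "$NS_IF" netns "$NS"
    run ip addr add "$HOST_ADDR/24" dev "$HOST_IF"
    run ip link set "$HOST_IF" up
    run ip netns exec "$NS" ip addr add "$NS_ADDR/24" dev "$NS_IF"
    run ip netns exec "$NS" ip link set "$NS_IF" up
    run ip netns exec "$NS" ip link set lo up
}

netns_cleanup() {
    # Deleting the namespace also destroys the veth pair.
    run ip netns del "$NS"
}
```

With that in place, the "remote" address for a test would be $NS_ADDR,
and the server side would be started under "ip netns exec".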
>>> Meanwhile, my one non-OCI VM ran those tests. The first test passes.
>>> The second one consistently reports
>>> -tcp:::state-change to time-wait - yes
>>> +tcp:::state-change to time-wait - no
>> I hit some of these failures during development; adding the
>> fbt::tcp_time_wait:entry probe helped. Is that inlined or something
>> perhaps (grep tcp_time_wait /proc/kallsyms)?
>
> On the VM in question:
>
> # grep -w tcp_time_wait /proc/kallsyms
> ffffffff92ad25b0 T tcp_time_wait
> # dtrace -lP fbt |& grep tcp_time_wait
> 49373 fbt vmlinux tcp_time_wait return
> 49372 fbt vmlinux tcp_time_wait entry
> # dtrace -lP rawfbt |& grep tcp_time_wait
> 51079 rawfbt vmlinux tcp_time_wait return
> 51078 rawfbt vmlinux tcp_time_wait entry
>
I'm not sure if it's related, but in testing the IP provider with the
net namespace stuff I saw some weird behaviour with the IP sdt probes
that had multiple underlying probe definitions. If we had a program with
ip:::send and ip:::receive, we were often left one underlying probe
short (i.e. no BPF prog created/attached) for whichever probe came first
in the program. So if I traced ip:::send then ip:::receive, the
ip6_finish_output send probe was missing and the test failed. Reversing
the order seemed to transfer the problem to the receive probe. So maybe
there's a general bug around synthetic probes that's biting us here? Not
sure, but I'll investigate further.
>>> and occasionally reports stuff like
>>> dtrace: error in dt_clause_2 for probe ID 4976 (tcp:vmlinux::send):
>>> invalid address (0x1fc0c0000000000) at BPF pc 287
>>> dtrace: error in dt_clause_2 for probe ID 4976 (tcp:vmlinux::send):
>>> invalid address (0x225b80000000000) at BPF pc 287
>>>
>> ah, ok there must be a null deref somewhere. Haven't seen this before;
>> what kernel version/arch is this?
>
> 5.15.0-300.161.13.el9uek.x86_64
>
> FWIW, I can comment out all probes in tcp other than:
>
> { "send", DTRACE_PROBESPEC_NAME,
> "rawfbt::ip_send_unicast_reply:entry" },
>
> Then I run
>
> dtrace -c "$testdir/client.ip.pl tcp $dest $tcpport" -qn 'tcp:::send /
> args[2]->ip_saddr == "'$source'"/ { tcpsend++; }'
>
> The disassembly shows that I look up args[2] using dt_bvar_args()
> (including checking for a fault). Then we try to dereference
> args[2]->ip_saddr. We first check the pointer is non-NULL. Then we call
> dt_cg_load_scalar() to bpf_probe_read() from the desired location. This
> call is problematic.
>
Great, thanks for narrowing this down!
>>> The non-remote tests fail on OL8 UEK6 (x86 and arm).
>>> dtrace: failed to compile script /dev/stdin:
>>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of
>>> inet_ntoa arg#1 (ipaddr_t *):
>>> Unknown type name
>>>
>> This is a weird failure; I see it on some systems but not on others.
>> In tcp.d we have
>>
>> #pragma D depends_on library net.d
>>
>> which contains the typedef for ipaddr_t ; it seems that's not enough to
>> pull in the typedef reliably. I suspect there is a timing element
>> involved here in when the net.d library is included. Perhaps there is a
>> better way to define ipaddr_t ; would using a builtin typedef in
>> _dtrace_typedefs_32/64 work better perhaps?
>
> Don't know.
>
I'll dig into this further. If anyone has hints here it would be great.
>>> The probe names are
>>> tcp:ip:*:* Solaris
>>> tcp:vmlinux:*:* DTv1
>>> tcp:vmlinux::* with this patch (that is, no more function)
>>> I guess precedents have already been set for other SDT providers; so,
>>> okay. Just noting for my own sake.
>>> Meanwhile, the typed args[] have changed in number and type from
>>> Solaris to DTv1 to this patch. Does that merit discussion?
>> Hmm, that's not intentional (aside from the additional INBOUND/OUTBOUND
>> etc which we use to help inform translation).
>
> Worth mentioning somewhere?
>
I guess though I hadn't really considered the fact that the argN values
become args[] values unless we intervene.
>> Do you see other changes aside from them? Thanks!
>
> This is what I have for typed args[] for tcp probes.
>
> The typed probe arguments for probes
> accept-[refused|established]
> connect-[refused|established|request]
> receive
> are the same as for send.
>
> The typed probe arguments for state-change may be different.
>
> So, the typed probe arguments are (fixed-width font; "-" means no
> such argument):
>
> send:
>           Solaris         DTv1        DTv2
> args[0]   pktinfo_t *     (unknown)   pktinfo_t *
> args[1]   csinfo_t *      (unknown)   csinfo_t *
> args[2]   ipinfo_t *      (unknown)   ipinfo_t *
> args[3]   tcpsinfo_t *    (unknown)   tcpsinfo_t *
> args[4]   tcpinfo_t *     (unknown)   tcpinfo_t *
> args[5]   -               (unknown)   int
> args[6]   -               int         tcplsinfo_t *
> args[7]   -               int         int
>
> state-change:
>           Solaris         DTv1        DTv2
> args[0]   void            (unknown)   void *
> args[1]   csinfo_t *      (unknown)   csinfo_t *
> args[2]   void            (unknown)   void *
> args[3]   tcpsinfo_t *    (unknown)   tcpsinfo_t *
> args[4]   void            (unknown)   void *
> args[5]   tcplsinfo_t *   (unknown)   void *
> args[6]   -               int         tcplsinfo_t *
> args[7]   -               int         int
> Here, "DTv1" refers to legacy DTrace on Linux. I guess we can ignore
> that. By "DTv2" I mean your patch. For state-change, Solaris calls
> some things "void" (not "void *") and tcplsinfo_t* moves from args[5] to
> args[6].
That latter one definitely needs fixing; I think in the other cases it's
just that we need to fix up the provider description as the fields
aren't set for Linux either.
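
For comparing against what a given build actually reports, something
like the following may help. It is a sketch: it assumes the verbose
probe listing (dtrace -lv) prints the argument types for each probe, and
the DTRACE override variable is a hypothetical convenience for pointing
at an uninstalled build.

```shell
#!/bin/sh
# Sketch only: DTRACE is an assumed override for the dtrace binary path.
DTRACE=${DTRACE:-dtrace}

list_tcp_args() {
    for probe in send receive state-change; do
        echo "== tcp:::$probe =="
        # -l lists matching probes; -v adds stability and argument details.
        "$DTRACE" -lvn "tcp:::$probe"
    done
}
```

Calling list_tcp_args as root then gives one listing per probe to check
against the table above.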