[DTrace-devel] [PATCH v2 3/4] dtrace: add tcp provider
Eugene Loh
eugene.loh at oracle.com
Thu Jul 3 00:02:00 UTC 2025
On 7/2/25 11:06, Alan Maguire wrote:
> On 02/07/2025 00:16, Eugene Loh wrote:
>> On most VMs,
>> test/unittest/tcp/tst.ipv4remotetcp.sh
>> test/unittest/tcp/tst.ipv4remotetcpstate.sh
>> xfail due to missing remote. Are we okay with "shrugging our shoulders"
>> like that?
> Yeah, I don't think the remote test is robust enough. Specifically in
> OCI it seems to always fail. I'd suggest we replace it with creating a
> network namespace with IP addresses configured on top of veths to
> simulate the remote case, the codepaths will be the same. I've done this
> in other test suites and it works well.
Sounds great (if "we" is "you", haha).
>> Meanwhile, my one non-OCI VM ran those tests. The first test passes.
>> The second one consistently reports
>> -tcp:::state-change to time-wait - yes
>> +tcp:::state-change to time-wait - no
> I hit some of these failure during development; adding the
> fbt::tcp_time_wait:entry probe helped. Is that inlined or something
> perhaps (grep tcp_time_wait /proc/kallsyms)?
On the VM in question:
# grep -w tcp_time_wait /proc/kallsyms
ffffffff92ad25b0 T tcp_time_wait
# dtrace -lP fbt |& grep tcp_time_wait
49373 fbt vmlinux tcp_time_wait return
49372 fbt vmlinux tcp_time_wait entry
# dtrace -lP rawfbt |& grep tcp_time_wait
51079 rawfbt vmlinux tcp_time_wait return
51078 rawfbt vmlinux tcp_time_wait entry
>> and occasionally reports stuff like
>> dtrace: error in dt_clause_2 for probe ID 4976 (tcp:vmlinux::send):
>> invalid address (0x1fc0c0000000000) at BPF pc 287
>> dtrace: error in dt_clause_2 for probe ID 4976 (tcp:vmlinux::send):
>> invalid address (0x225b80000000000) at BPF pc 287
>>
> ah, ok there must be a null deref somewhere. Haven't seen this before;
> what kernel version/arch is this?
5.15.0-300.161.13.el9uek.x86_64
FWIW, I can comment out all probes in tcp other than:
{ "send", DTRACE_PROBESPEC_NAME,
"rawfbt::ip_send_unicast_reply:entry" },
Then I run
dtrace -c "$testdir/client.ip.pl tcp $dest $tcpport" -qn 'tcp:::send
/args[2]->ip_saddr == "'$source'"/ { tcpsend++; }'
The disassembly shows that I look up args[2] using dt_bvar_args()
(including checking for a fault). Then we try to dereference
args[2]->ip_saddr. We first check the pointer is non NULL. Then we
call dt_cg_load_scalar() to bpf_probe_read() from the desired location.
This call is problematic.
>> The non-remote tests fail on OL8 UEK6 (x86 and arm).
>> dtrace: failed to compile script /dev/stdin:
>> ".../build/dlibs/5.2/tcp.d", line 177: failed to resolve type of
>> inet_ntoa arg#1 (ipaddr_t *):
>> Unknown type name
>>
> This is a weird failure; I see it on some systems but not on others.
> In tcp.d we have
>
> #pragma D depends_on library net.d
>
> which contains the typedef for ipaddr_t ; it seems that's not enough to
> pull in the typedef reliably. I suspect there is a timing element
> involved here in when the net.d library is included. Perhaps there is a
> better way to define ipaddr_t ; would using a builtin typedef in
> _dtrace_typedefs_32/64 work better perhaps?
Don't know.
>> The probe names are
>> tcp:ip:*:* Solaris
>> tcp:vmlinux:*:* DTv1
>> tcp:vmlinux::* with this patch (that is, no more function)
>> I guess precedents have already been set for other SDT providers; so,
>> okay. Just noting for my own sake.
>> Meanwhile, the typed args[] have changed in number and type from Solaris> to DTv1 to this patch. Does that merit discussion?
> Hmm, that's not intentional (aside from the additional INBOUND/OUTBOUND
> etc which we use to help inform translation).
Worth mentioning somewhere?
> Do you see other changes aside from them? Thanks!
This is what I have for typed args[] for tcp probes.
The typed probe arguments for probes
accept-[refused|established]
connect-[refused|established|request]
receive
are the same as for send.
The typed probe arguments for state-change may be different.
So, the typed probe arguments are (wide screen, fixed-width font):
args[0]: args[1]: args[2]: args[3]: args[4]:
args[5]: args[6]: args[7]:
send Solaris pktinfo_t * csinfo_t * ipinfo_t *
tcpsinfo_t * tcpinfo_t *
send DTv1 (unknown) (unknown) (unknown)
(unknown) (unknown) (unknown) int int
send DTv2 pktinfo_t * csinfo_t * ipinfo_t *
tcpsinfo_t * tcpinfo_t * int tcplsinfo_t * int
state-change Solaris void csinfo_t * void
tcpsinfo_t * void tcplsinfo_t *
state-change DTv1 (unknown) (unknown) (unknown)
(unknown) (unknown) (unknown) int int
state-change DTv2 void * csinfo_t * void *
tcpsinfo_t * void * void * tcplsinfo_t * int
Here, "DTv1" refers to legacy DTrace on Linux. I guess we can ignore
that. By "DTv2" I mean your patch. For state-change, Solaris calls
some things "void" (not "void *") and tcplsinfo_t* moves from args[5] to
args[6].
More information about the DTrace-devel
mailing list