[DTrace-devel] [PATCH v3 1/2] print() action: identify ctf object used for print

Alan Maguire alan.maguire at oracle.com
Mon Jul 1 17:21:19 UTC 2024


On 01/07/2024 17:50, Kris Van Hees wrote:
> On Mon, Jul 01, 2024 at 09:06:32AM +0100, Alan Maguire wrote:
>> On 28/06/2024 22:55, Kris Van Hees wrote:
>>> Looking at this again, I am wondering why we don't just add type info to the
>>> data record?  Since we know the type of the data items being added to trace
>>> records (and type is expressed as ctfp and type id), we might as well store
>>> that in the record.  And that would make this entire situation pretty much a
>>> non-issue because the record would already comtain all that we need to know
>>> to consume it and print it.  No need to pass a module id.
>>>  in 
>>> If the issue is that modules could get unloaded and thereby some pointers
>>> might become invalid, then perhaps the solution would be to always copy the
>>
>> Yeah, I think (as well as modules going away) the other worry is record
>> corruption, so dealing in direct pointers is too risky I think.
>>
>>> type to the D dict (I think that is the right one - I always mix C and D).
>>> And since that one is always around, there would be no need to associate an
>>> id with the module and pass that.
>>>
>>> Perhaps doing such a copy is the best action here?  And we may need to look
>>> at other cases where typed data from modules might be stored, and if there are
>>> any others, they may also be subject to possible issues when modules are
>>> unloaded and in that case would require a similar type copying action?
>>>
>>
>> I looked to see if there any CTF APIs to simplify recursive type
>> copying, and  I couldn't find any and that's my worry - this sort of
>> copying could get complicated fast. The other issue is size - a lot of
>> the kernel types will pull in a massive number of dependenent types
>> (print()ing a skb will pull in sockets, task_structs etc). Duplicating
>> all of that in the D dict would be expensive I suspect. The other danger
>> is introducing multiple types of the same name into the D dict might
>> complicate things for other consumers of it.
> 
> I mistakenly thought that we already included code in DTrace that performs
> type copying, but I cannot find it so perhaps I was mistaken (or it is more
> obscure).
> 
> But then something else occured to me...
> 
> This is all compile-type-known data.  In other words, none of this is passed
> in the actual trace buffer.  That means that we have a lot less worries about
> how to store this information.  So, look below at your patch and some comments
> that I think make it even easier.
> 

Thanks; more below..

>> The worst case scenario in the current patch series - that the module
>> goes away - means we can't print the type info, but that's not the end
>> of the world really. Certainly in the kernel case most modules tend to
>> stay loaded.
>>
>>> On Fri, Apr 26, 2024 at 02:33:26PM +0100, Alan Maguire wrote:
>>>> when generating code for print() action we need to identify source
>>>> of CTF; is it shared CTF, cdefs or ddefs or another module?
>>>> Introduce a module id scheme where the module id can be stored
>>>> at cg time for later retrieval at print time; this allows us
>>>> to later look up the module and its associated CTF without relying
>>>> on possibly freed pointers.
>>>>
>>>> Under the hood, the id scheme uses a monotonic counter coupled
>>>> with the hash of the modules name; the latter allows us to quickly
>>>> look up the right bucket via dt_htab_find_hval().
>>>>
>>>> This fixes an issue encountered when using a shared kernel type
>>>> with alloca()ed memory; DTrace assumed the type was in ddefs
>>>> and it wasn't so it wasn't displayed.  This change fixes that and
>>>> all tests continue to pass.  Also tested that it works with
>>>> kernel module-defined types now:
>>>>
>>>> dtrace: description 'fbt::ieee80211_rx_napi:entry ' matched 1 probe
>>>> CPU     ID                    FUNCTION:NAME
>>>>   7 123161          ieee80211_rx_napi:entry 0xffff9f8980fa08e0 = *
>>>>                                             (struct ieee80211_hw) {
>>>>                                              .conf = (struct ieee80211_conf) {
>>>>                                               .listen_interval = (u16)10,
>>>>                                               .long_frame_max_tx_count = (u8)4,
>>>>                                               .short_frame_max_tx_count = (u8)7,
>>>>                                              },
>>>>
>>>> Signed-off-by: Alan Maguire <alan.maguire at oracle.com>
>>>> ---
>>>>  libdtrace/dt_cg.c      |  7 ++++++-
>>>>  libdtrace/dt_consume.c |  7 +++++--
>>>>  libdtrace/dt_htab.c    | 17 +++++++++++++++++
>>>>  libdtrace/dt_htab.h    |  2 ++
>>>>  libdtrace/dt_impl.h    |  1 +
>>>>  libdtrace/dt_module.c  | 30 ++++++++++++++++++++++++++++++
>>>>  libdtrace/dt_module.h  |  1 +
>>>>  libdtrace/dt_printf.c  | 12 +++++++++---
>>>>  libdtrace/dt_printf.h  |  2 +-
>>>>  9 files changed, 72 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
>>>> index 27246a40..30cccdaf 100644
>>>> --- a/libdtrace/dt_cg.c
>>>> +++ b/libdtrace/dt_cg.c
>>>> @@ -20,6 +20,7 @@
>>>>  #include <dt_dctx.h>
>>>>  #include <dt_cg.h>
>>>>  #include <dt_grammar.h>
>>>> +#include <dt_module.h>
>>>>  #include <dt_parser.h>
>>>>  #include <dt_printf.h>
>>>>  #include <dt_provider.h>
>>>> @@ -2832,7 +2833,9 @@ dt_cg_act_print(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>>>>  	ctf_id_t	type = addr->dn_type;
>>>>  	char		n[DT_TYPE_NAMELEN];
>>>>  	size_t		size;
>>>> +	dt_module_t	*dmp;
>>>>  
>>>> +	dmp = dt_module_lookup_by_ctf(dtp, fp);
>>>>  	type = ctf_type_reference(fp, type);
>>>>  	if (type == CTF_ERR)
>>>>  		longjmp(yypcb->pcb_jmpbuf, EDT_CTF);
>>>> @@ -2842,9 +2845,11 @@ dt_cg_act_print(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
>>>>  			"print( ) argument #1 reference has type '%s' with size 0; cannot print( ) it.\n",
>>>>  			ctf_type_name(fp, type, n, sizeof(n)));
>>>>  
>>>> -	/* reserve space for addr/type, data/size */
>>>> +	/* reserve space for addr/type, module identification, data/size */
>>>>  	addr_off = dt_rec_add(dtp, dt_cg_fill_gap, DTRACEACT_PRINT,
>>>>  			      sizeof(uint64_t), 8, NULL, type);
>>>> +	dt_rec_add(dtp, dt_cg_fill_gap, DTRACEACT_PRINT,
>>>> +		   sizeof(uint64_t), 8, NULL, (uint64_t)dmp->dm_id);
>>>>  	data_off = dt_rec_add(dtp, dt_cg_fill_gap, DTRACEACT_PRINT,
>>>>  			      size, 8, NULL, size);
> 
> Here you add a data item in the trace buffer that will actually not store
> anything at all.  That got me thinking...  And then you store the data size
> both as the specified size of the data item *and* as extra argument.  That use
> of the extra argument is unnecessary since the size is already being added
> to the record as dtrd_size (which gets its value from the 4th arg to
> dt_rec_add().  That means we can use the dtrd_arg for something else.
> 
> So...  how about you do the following:
> 
> 	/* reserve space for addr and data (modname and type as extra arg) */
> 	addr_off = dt_rec_add(dtp, dt_cg_fill_gap, DTRACEACT_PRINT,
> 			      sizeof(uint64_t), 8, NULL, (uint64_t)dmp->dm_name);

Maybe I'm misunderstanding here but isn't there a concern if dmp goes
away before we collect trace data this string will no longer be valid? I
_think_ if we were to store the module name we'd need to store the
string itself in the trace record. The module id approach was an attempt
to remove the need to store and retrieve full strings while not exposing
us to the potential risk of invalid pointer dereference. Sounds like the
trace record storage implementation was inefficient but that was the
intent at least.


> 	data_off = dt_rec_add(dtp, dt_cg_fill_gap, DTRACEACT_PRINT,
> 			      size, 8, NULL, type);
> 
> And then the consume code can perform a module lookup by name based on the arg
> that is attached to the 1st record, and get the type from the arg that is
> attached to the 2nd record.
> 
> This also avoids the complication of modules possibly getting unloaded (even if
> that is quite rare) because that will mean that the module cannot be found and
> we can therefore take appropriate action (unknown type so just dump raw data
> or something like that).
> 

See above; my concern is recording the module name via string could
result in ending up with invalid data in the case of a module unload.

Alan



More information about the DTrace-devel mailing list