[DTrace-devel] [PATCH v3 1/2] print() action: identify ctf object used for print

Wed Jul 3 14:56:01 UTC 2024

On Wed, Jul 03, 2024 at 02:00:55PM +0100, Alan Maguire wrote:
> On 02/07/2024 19:59, Kris Van Hees wrote:
> > On Tue, Jul 02, 2024 at 01:30:11PM +0100, Alan Maguire wrote:
> >> On 01/07/2024 18:47, Kris Van Hees wrote:
> >>> On Mon, Jul 01, 2024 at 06:21:19PM +0100, Alan Maguire wrote:
> >>>> On 01/07/2024 17:50, Kris Van Hees wrote:
> >>>>> On Mon, Jul 01, 2024 at 09:06:32AM +0100, Alan Maguire wrote:
> >>>>>> On 28/06/2024 22:55, Kris Van Hees wrote:
> >>>>>>> Looking at this again, I am wondering why we don't just add type info to the
> >>>>>>> data record?  Since we know the type of the data items being added to trace
> >>>>>>> records (and type is expressed as ctfp and type id), we might as well store
> >>>>>>> that in the record.  And that would make this entire situation pretty much a
> >>>>>>> non-issue because the record would already comtain all that we need to know
> >>>>>>> to consume it and print it.  No need to pass a module id.
> >>>>>>>  in 
> >>>>>>> If the issue is that modules could get unloaded and thereby some pointers
> >>>>>>> might become invalid, then perhaps the solution would be to always copy the
> >>>>>>
> >>>>>> Yeah, I think (as well as modules going away) the other worry is record
> >>>>>> corruption, so dealing in direct pointers is too risky I think.
> >>>>>>
> >>>>>>> type to the D dict (I think that is the right one - I always mix C and D).
> >>>>>>> And since that one is always around, there would be no need to associate an
> >>>>>>> id with the module and pass that.
> >>>>>>>
> >>>>>>> Perhaps doing such a copy is the best action here?  And we may need to look
> >>>>>>> at other cases where typed data from modules might be stored, and if there are
> >>>>>>> any others, they may also be subject to possible issues when modules are
> >>>>>>> unloaded and in that case would require a similar type copying action?
> >>>>>>>
> >>>>>>
> >>>>>> I looked to see if there any CTF APIs to simplify recursive type
> >>>>>> copying, and  I couldn't find any and that's my worry - this sort of
> >>>>>> copying could get complicated fast. The other issue is size - a lot of
> >>>>>> the kernel types will pull in a massive number of dependenent types
> >>>>>> (print()ing a skb will pull in sockets, task_structs etc). Duplicating
> >>>>>> all of that in the D dict would be expensive I suspect. The other danger
> >>>>>> is introducing multiple types of the same name into the D dict might
> >>>>>> complicate things for other consumers of it.
> >>>>>
> >>>>> I mistakenly thought that we already included code in DTrace that performs
> >>>>> type copying, but I cannot find it so perhaps I was mistaken (or it is more
> >>>>> obscure).
> >>>>>
> >>>>> But then something else occured to me...
> >>>>>
> >>>>> This is all compile-type-known data.  In other words, none of this is passed
> >>>>> in the actual trace buffer.  That means that we have a lot less worries about
> >>>>> how to store this information.  So, look below at your patch and some comments
> >>>>> that I think make it even easier.
> >>>>>
> >>>>
> >>>> Thanks; more below..
> >>>>
> >>>>>> The worst case scenario in the current patch series - that the module
> >>>>>> goes away - means we can't print the type info, but that's not the end
> >>>>>> of the world really. Certainly in the kernel case most modules tend to
> >>>>>> stay loaded.
> >>>>>>
> >>>>>>> On Fri, Apr 26, 2024 at 02:33:26PM +0100, Alan Maguire wrote:
> >>>>>>>> when generating code for print() action we need to identify source
> >>>>>>>> of CTF; is it shared CTF, cdefs or ddefs or another module?
> >>>>>>>> Introduce a module id scheme where the module id can be stored
> >>>>>>>> at cg time for later retrieval at print time; this allows us
> >>>>>>>> to later look up the module and its associated CTF without relying
> >>>>>>>> on possibly freed pointers.
> >>>>>>>>
> >>>>>>>> Under the hood, the id scheme uses a monotonic counter coupled
> >>>>>>>> with the hash of the modules name; the latter allows us to quickly
> >>>>>>>> look up the right bucket via dt_htab_find_hval().
> >>>>>>>>
> >>>>>>>> This fixes an issue encountered when using a shared kernel type
> >>>>>>>> with alloca()ed memory; DTrace assumed the type was in ddefs
> >>>>>>>> and it wasn't so it wasn't displayed.  This change fixes that and
> >>>>>>>> all tests continue to pass.  Also tested that it works with
> >>>>>>>> kernel module-defined types now:
> >>>>>>>>
> >>>>>>>> dtrace: description 'fbt::ieee80211_rx_napi:entry ' matched 1 probe
> >>>>>>>> CPU     ID                    FUNCTION:NAME
> >>>>>>>>   7 123161          ieee80211_rx_napi:entry 0xffff9f8980fa08e0 = *
> >>>>>>>>                                             (struct ieee80211_hw) {
> >>>>>>>>                                              .conf = (struct ieee80211_conf) {
> >>>>>>>>                                               .listen_interval = (u16)10,
> >>>>>>>>                                               .long_frame_max_tx_count = (u8)4,
> >>>>>>>>                                               .short_frame_max_tx_count = (u8)7,
> >>>>>>>>                                              },
> >>>>>>>>
> >>>>>>>> Signed-off-by: Alan Maguire <alan.maguire at oracle.com>
> >>>>>>>> ---
> >>>>>>>>  libdtrace/dt_cg.c      |  7 ++++++-
> >>>>>>>>  libdtrace/dt_consume.c |  7 +++++--
> >>>>>>>>  libdtrace/dt_htab.c    | 17 +++++++++++++++++
> >>>>>>>>  libdtrace/dt_htab.h    |  2 ++
> >>>>>>>>  libdtrace/dt_impl.h    |  1 +
> >>>>>>>>  libdtrace/dt_module.c  | 30 ++++++++++++++++++++++++++++++
> >>>>>>>>  libdtrace/dt_module.h  |  1 +
> >>>>>>>>  libdtrace/dt_printf.c  | 12 +++++++++---
> >>>>>>>>  libdtrace/dt_printf.h  |  2 +-
> >>>>>>>>  9 files changed, 72 insertions(+), 7 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/libdtrace/dt_cg.c b/libdtrace/dt_cg.c
> >>>>>>>> index 27246a40..30cccdaf 100644
> >>>>>>>> --- a/libdtrace/dt_cg.c
> >>>>>>>> +++ b/libdtrace/dt_cg.c
> >>>>>>>> @@ -20,6 +20,7 @@
> >>>>>>>>  #include <dt_dctx.h>
> >>>>>>>>  #include <dt_cg.h>
> >>>>>>>>  #include <dt_grammar.h>
> >>>>>>>> +#include <dt_module.h>
> >>>>>>>>  #include <dt_parser.h>
> >>>>>>>>  #include <dt_printf.h>
> >>>>>>>>  #include <dt_provider.h>
> >>>>>>>> @@ -2832,7 +2833,9 @@ dt_cg_act_print(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
> >>>>>>>>  	ctf_id_t	type = addr->dn_type;
> >>>>>>>>  	char		n[DT_TYPE_NAMELEN];
> >>>>>>>>  	size_t		size;
> >>>>>>>> +	dt_module_t	*dmp;
> >>>>>>>>  
> >>>>>>>> +	dmp = dt_module_lookup_by_ctf(dtp, fp);
> >>>>>>>>  	type = ctf_type_reference(fp, type);
> >>>>>>>>  	if (type == CTF_ERR)
> >>>>>>>>  		longjmp(yypcb->pcb_jmpbuf, EDT_CTF);
> >>>>>>>> @@ -2842,9 +2845,11 @@ dt_cg_act_print(dt_pcb_t *pcb, dt_node_t *dnp, dtrace_actkind_t kind)
> >>>>>>>>  			"print( ) argument #1 reference has type '%s' with size 0; cannot print( ) it.\n",
> >>>>>>>>  			ctf_type_name(fp, type, n, sizeof(n)));
> >>>>>>>>  
> >>>>>>>> -	/* reserve space for addr/type, data/size */
> >>>>>>>> +	/* reserve space for addr/type, module identification, data/size */
> >>>>>>>>  	addr_off = dt_rec_add(dtp, dt_cg_fill_gap, DTRACEACT_PRINT,
> >>>>>>>>  			      sizeof(uint64_t), 8, NULL, type);
> >>>>>>>> +	dt_rec_add(dtp, dt_cg_fill_gap, DTRACEACT_PRINT,
> >>>>>>>> +		   sizeof(uint64_t), 8, NULL, (uint64_t)dmp->dm_id);
> >>>>>>>>  	data_off = dt_rec_add(dtp, dt_cg_fill_gap, DTRACEACT_PRINT,
> >>>>>>>>  			      size, 8, NULL, size);
> >>>>>
> >>>>> Here you add a data item in the trace buffer that will actually not store
> >>>>> anything at all.  That got me thinking...  And then you store the data size
> >>>>> both as the specified size of the data item *and* as extra argument.  That use
> >>>>> of the extra argument is unnecessary since the size is already being added
> >>>>> to the record as dtrd_size (which gets its value from the 4th arg to
> >>>>> dt_rec_add().  That means we can use the dtrd_arg for something else.
> >>>>>
> >>>>> So...  how about you do the following:
> >>>>>
> >>>>> 	/* reserve space for addr and data (modname and type as extra arg) */
> >>>>> 	addr_off = dt_rec_add(dtp, dt_cg_fill_gap, DTRACEACT_PRINT,
> >>>>> 			      sizeof(uint64_t), 8, NULL, (uint64_t)dmp->dm_name);
> >>>>
> >>>> Maybe I'm misunderstanding here but isn't there a concern if dmp goes
> >>>> away before we collect trace data this string will no longer be valid? I
> >>>> _think_ if we were to store the module name we'd need to store the
> >>>> string itself in the trace record. The module id approach was an attempt
> >>>> to remove the need to store and retrieve full strings while not exposing
> >>>> us to the potential risk of invalid pointer dereference. Sounds like the
> >>>> trace record storage implementation was inefficient but that was the
> >>>> intent at least.
> >>>
> >>> Easy... replace
> >>> 	(uint64_t)dmp->dm_name
> >>> above with
> >>> 	(uint64_t)strdup(dmp->dm_name)
> >>>
> >>
> >> Right but then I've got to
> >>
> >> - allocate a string per print() action
> >> - free them later (presumably on DTrace cleanup?)
> > 
> > These do not really pose an issue.  Unless there are going to be a very large
> > amount of print actions, but then again, we have similar situations elsewhere.
> > 
> >> - when consuming the record, I've got to compute the hash on the name
> >> string to do the module lookup by name, rather than by id where it is
> >> pre-computed at cg time. This adds overhead to each consume action.
> > 
> > I did a comparitive test between your v3 patch and a modified one that stores
> > a copy of the module name as dtrd_arg to the data record.  I honestly did not
> > notice a difference between the two approaches in terms of performance.  While
> > the use of a string causes a hash value to be calculated, using an id also
> > incurs additional cost in its processing.  From the looks of it, it is a bit
> > of a trade-off where we incur a higher cost.
> > 
> > My preference would be for using the name primarily because it does seem to
> > provide the same level of performance, but without adding additional support
> > in htab to make dm_id work, and also without adding the dm_id management code.
> > 
> 
> Okay, sounds good. One thing I'm unclear about still is where to free
> the module name string. I _think_ the right answer is to associate it
> with the pcb, strdup()ing to a pcb->pcb_modname string which can then be
> passed via the arg and later freed via dt_pcb_pop(). Is that the right
> approach or should it be freed in a different way? Thanks!

Ah no, if you do that then it will be freed after compilation of the clause,
which means that the record in the datadesc will have an invalid pointer in it.

Records and associated data are freed in dt_datadesc_release.  There is a loop
already to handle records that hold a printf format.

Which actually got me thinking...  it is a bit of a stretch perhaps, but the
easiest way to do this would actually be to make use of the printf format data
(dt_pfargv_t) which isn't entirely unreasonable because this is a formatted
print and its output is handled by a function in dt_printf.c anyway.  So, in
that case you do something like this in dt_cg_act_print():

	dt_pfargv_t	*pfp = dt_printf_create(dtp, dmp->dm_name);
...

	addr_off = dt_rec_add(dtp, dt_cg_fill_gap, DTRACEACT_PRINT,
			      sizeof(uint64_t), 8, pfp, type);

and then you could even implement dt_print_type) as a regular printf-style
function without needing dt_print_print() to interpose.  Or alternatively,
use dt_print_print() to get the module name from dtrd_format->pfv_format,
and keep dt_print_type() as is.

Either way, the dt_printf_create() performs the strdup() and dt_printf_destroy()
(called from dt_datadesc_release)) will take care of it getting freed.

This is abusing the format string a little but on the other hand, one could
argue that the module name together with dtrd_arg specifying the ctf type does
in fact specify an output formatting.

	Kris