[DTrace-devel] [PATCH v2 10/13 (was 9/12)] Add BPF helper mapping

Thu Jul 21 03:04:15 UTC 2022

On Wed, Jul 20, 2022 at 03:34:30PM -0700, Eugene Loh wrote:
> On 7/20/22 15:33, Kris Van Hees wrote:
> 
> > On Wed, Jul 20, 2022 at 01:20:32PM -0700, Eugene Loh via DTrace-devel wrote:
> > > Mostly, I need some help understanding the patch.  I mean, I get the helper
> > > array, but I do not understand what scenarios we are trying to handle.  More
> > > comments below.
> > We compile against a certain kernel but we may be executing with an older
> > kernel.  And that runtime kernel may have a smaller set of helper functions
> > than the build kernel has.
> > 
> > > On 7/19/22 13:12, Kris Van Hees via DTrace-devel wrote:
> > > > BPF helpers can be very specific to kernel versions, and the set of
> > > > available helpers may differ between the system where DTrace was
> > > > compiled and the system where it is being used.
> > > Maybe it would be good to be more specific about what we anticipate... about
> > > what "very specific" means.  My impression is that helpers are always
> > > numbered consistently, but the range of helpers (__BPF_FUNC_MAX_ID) in the
> > > installed header can be higher or lower than in the running kernel.
> > > Further, the generic probe_read[_str] functions can have widely varying
> > > behaviors vis-a-vis whether they work for kernel or user or even are
> > > available.  I have a discussion of some kernel patches in 2ad269e9; perhaps
> > > that level of detail is unwarranted, but it'd be nice for someone who wants
> > > to know more.
> > I assume that where you say "installed hesder" you mean the header on the build
> > system.
> 
> Yes, but I think the key to all this is at the bottom of this message.
> 
> > The details of the bpf_probe_read_*() helpers and their availability and
> > behaviouor on various kernels is not really important when it comes to this
> > patch.  All we do is ensure that we can determine whether the running kernel
> > supports a specific helper or not.
> 
> Hmm.  Well, and we might map to another function that we do not check will
> do the job for us.  E.g., (I'm doing this from notes without
> double-checking, so I might be getting this wrong), on UEKR6 we would map
> probe_read_user to probe_read, which would then fail on ARM.

If bpf_probe_read_user() does not exist, and bpf_probe_read() cannot handle
userspace addresses, we're kind of out of luck.  I doubt we want to (or even
can) put in code such cases, especially because there is no workaround for it
that I know of.  And maintaining a list helper/kver/arch matrix of what works
and what does not is just not feasible.  Not to mention that disributions often
have kernel versions with backported features, so you cannot even go on the
kernel version itself.

> > > I assume the intent of this patch is to find some sort of mapping for
> > > certain helper functions, but without guaranteeing that those mappings will
> > > work.  Also, it seems we assume that the installed header has all the helper
> > > functions we are interested in.
> > Well, the case where we are executing on a kernel that is *newer* than the one
> > we are compiling for is not really that interesting because we certainly should
> > not expect DTrace to use features that are newer than the kernel we build on.
> > Or, we might decide to include the list of helplers we know about in the DTrace
> > source tree, and then that becomes our set of helpers.
> 
> Yeah.  Including in the source tree may be the way to go.  Again, more
> discussion at the bottom.

I am not sure why more discussion is necessary.  I already point that out
further in my email.

> > But if we then use that
> > DTrace on a kernel that has an extended set of helpers, we obviously wouldn't
> > care because we didn't know about them at the time of writing our code.
> > 
> > This patch provides two things:
> > 
> > - For helpers like bpf_probe_read*(), it offers a mechanism where the code
> >    generator dynamically selects what helper to use to access kernel or user
> >    memory.  If specific user/kernel ones are available, they will be used.  If
> >    they are not, we use the generic one.
> > - For helpers that were introdued in newer kernels while we still need to
> >    support DTrace on older kernels, we can detect whether a helper is supported
> >    and if so, use it.  If not supported, we can either use a workaround (an
> >    alternative imlementation) or report an error.
> 
> Yeah.  Again, I hope (at least) the commit message points out that the
> alternative might not work.

That's too much detail for a commit message in my opinion.  The fact that
bpf_probe_read() (as example) does not work when using userspace addresses
on arm64 on some range of kernel releases is not relevant to this patch.  It
is, if anything, a limitation in BPF but I don't think we're documenting all
those whenever we use a feature.  It would much more sense to capture that
kind of information in BPF-ISSUES (which is in dire need of updating anyway).
I certainly wouldn't be looking for that info in a commit message for DTrace.

> > > > We add runtime checking of specific BPF helpers, with support for
> > > > possible fallback hwlpers.  E.g. if probe_read_user() is not found,
> > > hwlpers
> > > 
> > > > we can use probe_read() instead (though that may not guarantee
> > > > successful execution).
> > > > 
> > > > This can also be used to check whether certain helpers exist.  By
> > > > convention, when the BPF helper function id maps to 0, the helper is
> > > > known not to be supported on the runtime system,
> > > I do not understand this last paragraph... like are you referring to an
> > > existing or new convention?  Do you mean that this mapping can be used to
> > > map helper functions to BPF_FUNC_unspec?
> > Well, obviously a new convention since this code is all new.  Since the map
> > elements are initialized to 0 (as part of the dtp initialization), any
> > helpers that we do not explicitly initialize will map to 0 and that can
> > therefore be used to determine that they are not supported on the runtime
> > kernel.
> 
> Okay but:
> 
> It looks like dt_init_helpers() explicitly starts by setting each [i] to i. 
> That is, each helper is explicitly initialized.

Yes, which is the way it ought to be.

Sure, but we do not have those yet.

> Unspecified alternative might as well be called BPF_FUNC_unspec.
> > > > diff --git a/libdtrace/dt_bpf.c b/libdtrace/dt_bpf.c
> > > > +
> > > > +	/*
> > > > +	 * If an 'invalid func' or 'unknown func' failure is reported, we
> > > > +	 * cannot use the helper.
> > > > +	 */
> > > > +	return strstr(ptr, "invalid func") == NULL &&
> > > > +	       strstr(ptr, "unknown func") == NULL;
> > > > +}
> > > > +
> > > > +void
> > > > +dt_bpf_init_helpers(dtrace_hdl_t *dtp)
> > > > +{
> > > > +	uint32_t	i;
> > > > +
> > > > +	/* Assume all helpers are available. */
> > > > +	for (i = 0; i < __BPF_FUNC_MAX_ID; i++)
> > > > +		dtp->dt_bpfhelper[i] = i;
> > > I would think we care about cases where the __BPF_FUNC_MAX_ID in the
> > > installed header file is not big enough to include all the helper functions
> > > we are interested in.  That is, __BPF_FUNC_MAX_ID is not big enough.
> > If we are compiling DTrace with a header file that lists less helpers than we
> > support, we are doing something very wrong.
> > 
> > As we start using helpers that may not be available in the lowest kernel
> > version we can support, we can add them to the code that follows to adjust the
> > basic assumption that all the helpers we know about are available.  In other
> > words, we'd add a BPF_HELPER_MAP() invocation to the list below and set the
> > mapping to 0 if the helpers is not supported.
> 
> Instead of 0, how about unspec?  Anyhow, we can cross that bridge when we
> come to it.

I was merely pointing out the ability to mark a helper as not supported.
Sure, we can use unspec.  Either way.

> > > > +
> > > > +	/* Verify the existance of some specific helpers. */
> > > > +#define BPF_HELPER_MAP(orig, alt)	\
> > > > +	if (!have_helper(BPF_FUNC_##orig)) \
> > > > +			dtp->dt_bpfhelper[BPF_FUNC_##orig] = BPF_FUNC_##alt;
> > > > +
> > > > +	BPF_HELPER_MAP(probe_read_user, probe_read);
> > > > +	BPF_HELPER_MAP(probe_read_user_str, probe_read_str);
> > > > +	BPF_HELPER_MAP(probe_read_kernel, probe_read);
> > > > +	BPF_HELPER_MAP(probe_read_kernel_str, probe_read_str);
> > > > +#undef BPF_HELPER_MAP
> > > Similar issue here.  Are macros like BPF_FUNC_probe_read_user necessarily
> > > known?  Will this code compile on all systems of interest?
> > It should, and if it does not, we will need to include the header file
> > ourselves to ensure it does.  Perhaps I will just add it in this patch and
> > not worry about that further.  That seems to be the best option anyway (and
> > one that is commonly used in tools).
> 
> Great.
> 
> I just went on Oracle Cloud (OCI) and created an OL7.9 instance. The
> /usr/include/linux/bpf.h is pretty old;  __BPF_FUNC_MAX_ID is 68.  (Note: we
> want to support BPF_FUNC_probe_read_*, which are 112-115.)  If users should
> be able to build DTrace on such a system, then it looks like we need a copy
> of the header file.  Note:
> 
> *)  DTrace otherwise runs the test suite just fine on such a system.
> 
> *)  With such an older header, we also cannot build due to missing macro
> BPF_FUNC_send_signal.
> 
> *)  The OCI OL7 instance was x86.  I think the header I get with ARM is even
> older... __BPF_FUNC_MAX_ID is 60, meaning that BPF_FUNC_getstack is also
> missing, and BPF_PSEUDO_CALL is also not defined.  Again, however, if one
> introduces workarounds to such build issues, the test suite runs just fine.

I already said I might as well include it.  So yes, this is already covered in
my email.