[DTrace-devel] [PATCH 1/2] Add support for the basename() action
Eugene Loh
eugene.loh at oracle.com
Fri Nov 19 19:53:22 UTC 2021
On 11/18/21 11:51 PM, Kris Van Hees wrote:
> On Thu, Nov 18, 2021 at 08:45:04PM -0500, Eugene Loh via DTrace-devel wrote:
>> On 11/18/21 7:04 PM, Kris Van Hees wrote:
>>
>>> On Tue, Nov 16, 2021 at 05:47:59PM -0500, eugene.loh--- via DTrace-devel wrote:
>>>> From: Eugene Loh <eugene.loh at oracle.com>
>>>>
>>>> diff --git a/bpf/basename.S b/bpf/basename.S
>>>> @@ -0,0 +1,139 @@
>>>> +// SPDX-License-Identifier: GPL-2.0
>>>> +/*
>>>> + * Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved.
>>>> + */
>>>> +
>>>> +#define DT_STRLEN_BYTES 2
>>>> +
>>>> +#define BPF_FUNC_probe_read 4
>>>> +#define BPF_FUNC_probe_read_str 45
>>>> +
>>>> +/*
>>>> + * void dt_basename(char *src, char *dst);
>>>> + */
>>>> + .text
>>>> + .align 4
>>>> + .global dt_basename
>>>> + .type dt_basename, @function
>>>> +dt_basename :
>>>> +#define SRC %r6
>>>> +#define DST %r7
>>>> +#define BGN %r8
>>>> +#define END %r9
>>>> +
>>>> + /* store copies of input arguments */
>>>> + mov SRC, %r1
>>>> + mov DST, %r2
>>>> +
>>>> + /* ignore the string-length prefix */
>>>> + add SRC, DT_STRLEN_BYTES
>>>> +
>>>> + /* r0 = bpf_probe_read_str(dst, STRSZ + 1, src) */
>>>> + mov %r1, DST
>>>> + lddw %r2, STRSZ
>>>> + add %r2, 1
>>>> + mov %r3, SRC
>>>> + call BPF_FUNC_probe_read_str
>>>> +
>>>> + /* if (r0 s<= 1) goto Ldot */
>>>> + jsle %r0, 1, .Ldot
>>>> +
>>>> + /* end = r0 - 1 */
>>>> + mov END, %r0
>>>> + sub END, 1
>>>> +
>>>> + /*
>>>> + * Loop over end, backing it up to find a non-'/' char.
>>>> + */
>>>> +.Lend:
>>>> + /* end-- */
>>>> + sub END, 1
>>>> + /* if (end s< 0) goto Lslash */
>>>> + jslt END, 0, .Lslash
>>>> + /* if (src[end] == '/') goto Lend */
>>>> + mov %r1, SRC
>>>> + add %r1, END
>>>> + ldxb %r1, [%r1+0]
>>>> + and %r1, 0xff
>>>> + jeq %r1, '/', .Lend
>>>> +
>>>> + /*
>>>> + * Loop over bgn, backing it up to find a '/' char.
>>>> + */
>>>> + /* bgn = end */
>>>> + mov BGN, END
>>>> +.Lbgn:
>>>> + /* bgn-- */
>>>> + sub BGN, 1
>>>> + /* if (bgn s< 0) goto Lcopy */
>>>> + jslt BGN, 0, .Lcopy
>>>> + /* if (src[bgn] != '/') goto Lbgn */
>>>> + mov %r1, SRC
>>>> + add %r1, BGN
>>>> + ldxb %r1, [%r1+0]
>>>> + and %r1, 0xff
>>>> + jne %r1, '/', .Lbgn
>>>> +
>>>> +.Lcopy:
>>>> + /*
>>>> + * The output string is a copy of the designated substring.
>>>> + */
>>>> + /* end -= bgn (and help the BPF verifier) */
>>>> + sub END, BGN
>>> Err, this kind of breaks your whole scheme of using defines for register
>>> assignments because you just changed the meaning of END here. It was the
>>> index of the end and now it becomes a byte count.
>> I see what you're saying, but I don't think this any more severe than saying
>> "len means STRSZ+1; oh but now it means strlen." The pseudocode is short
>> enough (a few lines) that I think it's not hard to follow. Would it seem
>> better to you to call it "len"? So, "len" originally means from the
>> beginning of the old string but then it means from the beginning of the new
>> string?
> Well, no, because now you replace the issue with END meaning two things simply
> with having a LEN that means two things. And if that is OK then I think that
> the entire purpose of these defines is lost.
I'm really having a hard time understanding these objections. It is
routine that a variable's exact meaning morphs over the course of its
use. I was going to cite specific cases, but this practice is so
routine that that would be ridiculous. Again, all programmers do this
routinely.
We want to know the length of the output string. We call it LEN. We
read the input string and initialize LEN. Is its value really the value
we want? No, but I think most readers will understand that. We
decrement to strip off the NUL byte. Does that require a name change?
I would think readers generally will not think so. We strip away
trailing / chars, decrementing LEN. Should the name change? I would
think not. We figure out how much to strip off the front and subtract
that from LEN. Should the name change now? I would not think so.
Throughout all these changes -- starting with the length of the
NUL-terminated input string and ending with the stripped down length we
sought -- I think a name like "LEN" makes perfect sense.
> Of course, I find the use of
> defines here confusing as is anyway, because you invariably need to know what
> the macro is defined as to know what instructions to use, so the value seems to
> be a bit lost on me. E.g. SRC is a register here but in other functions you
> have defined it as a [%fp + OFFSET] style stack location. But the specifics
> of those definitions affects whether you can do a MOV on it or need to use LD
> to get the value or ST to set the value. But anyway, that is just that.
SRC DST BGN and LEN are all registers.
If there are problems in other patches, we can discuss them separately.
Presumably, however, whenever a macro refers to a stack offset rather
than a register, its name distinguishes it to avoid the very confusion
you mention.
> Point here is though that if you are going to use symbolic names, they should
> mean somrthing and be consistent. I think you'd be better served then to also
> set a define CNT %r9 and when you are done with END and need to start working
> with a byte count, start using CNT. And since you already get to deal in other
> code with needing to know what something is defined as, you simply have the
> same here where you need to know that END and CNT use the same register.
There is other code where a "variable" is no longer needed and the macro
is #undef'ed. Then, when that register is used for something else, a
new #define is introduced. That practice is analogous to what we do in
dt_cg.c with regset_alloc() and regset_free().
Here, though, we want a length that starts as the entirety of the input
string (plus NUL char) and we whittle it down to get what we want.
Changing its name along the way strikes me as arbitrary and confusing.
Meanwhile, I'll post a v3 of these patches with the str_of_str() to
path_helper() name change.
More information about the DTrace-devel
mailing list