[DTrace-devel] [PATCH v3] Add support for cleanpath() subroutine

Kris Van Hees kris.van.hees at oracle.com
Thu Jan 25 07:38:26 UTC 2024


Actually, there is one change that I think does make sense...  I would get rid
of trailing / *unless* that is the only character (i.e. then it is really a
leading /).  If the purpose of cleanpath() is to create a more minimal form of
the given path, then stripping off the trailing / seems to make sense, even if
the documentation doesn't address that case.

And in view of my commentary quoted below, I think it makes sense to implement
cleanpath() to be consistent.  Trailing / is certainly redundant because it is
usually only used as a way to represent that a path is a directory.  But the
more canonical path would not have that trailing /.

Makes sense?

On Thu, Jan 25, 2024 at 12:46:26AM -0500, Kris Van Hees via DTrace-devel wrote:
> On Wed, Jan 24, 2024 at 09:06:44PM -0500, Eugene Loh via DTrace-devel wrote:
> > On 1/24/24 16:04, Kris Van Hees wrote:
> > 
> > > On Thu, Dec 21, 2023 at 01:11:25PM -0500, eugene.loh at oracle.com wrote:
> > > > Replace all "//" with "/".
> > > > Replace all "/./" with "/".
> > > > Replace all "/foo/.." with "/".
> 
> This is not really an accurate description of what the patch does because
> these three transformations would never result in a "." path.
> 
> > > > 
> > > > Notice that the results differ in some cases from what was
> > > > done in legacy DTrace:
> > > Some comments on this (after reviewing the documentation again):
> > > 
> > > >      input string         current patch     legacy DTrace
> > > > 
> > > >      "/."                 "/"               "/."
> > > >      "a/."                "a"               "a/."
> > > >      "./a"                "a"               "./a"
> > > These three seem correct to me in how the current patch handles it.
> > > 
> > > >      "../.."              "../.."           "."
> > > >      "a/.."               "."               "a"
> > > >      "a/../.."            ".."              "a"
> > > A case could be made for all of them being "",
> > 
> > I do not see how "../.." and "" are equivalent.
> 
> It depends on the context.  Given that a relative path presented to cleanpath
> does not carry any information about its root, and in fact could very well be
> relative to / itself, remaining .. elements are meaningless.  In the absence
> of parent elements, .. (backing up to the parent) has no meaning.  So, as I
> said, a case could be made based on that to say that cleanpath("..") could
> validly yield "" as result.
> 
> > > given that cleanpath() is
> > > supposed to remove redundant elements (. and ..)
> > 
> > ".." by itself is not redundant, but the combination "/foo/.." can be
> > collapsed.
> 
> Again, by the reasoning I present above, .. by itself can be considered
> redundant because it represents backing up a level, and within the context of
> a cleanpath() invocation, the meaning of ".." becomes pointless.
> 
> But as I mention below, I agree that a valid case can be made for what you
> propose in this patch, albeit with the one exception I noted.
> 
> > > and no context is known
> > > for relative paths.  But I do agree that there is also a case to be made
> > > to not get rid of .. elements that would apply to the root of the relative
> > > path, becausae discarding them effectively causes information loss.
> > > 
> > > Except for the case of "a/..".  That one should be "" because the .. should
> > > be considered redundant in view of having a present path element, and thus
> > > it is possible to back up one path element, which would yield "".  I do not
> > > see how the result of "." would be considered valid because that in itself
> > > is a redundant element (and thus should still be removed).
> > 
> > I guess I just don't know what standard or use case to consider. Anyhow, the
> > documentation I saw speaks of removing "/./" without talking specifically
> > about "." in isolation.  There are a lot of cases when someone would not
> > consider "." and "" to be the same... e.g., "cd .", "pushd .", etc.  So
> > "a/.." is the same as ".".  Are they also the same as ""?  It depends on the
> > situation, and it's unclear to me where cleanpath() is used.  The
> > documentation I saw (Oracle Linux DTrace Guide) didn't say anything was
> > wrong with a bare ".".
> 
> The documentation talks about /./ and /../ but not in a literal sense.  If
> it was in a literal sense, then you would need to leave "a/.." alone because
> it does not contain /../ or /./.  Clearly, the meaning is in terms of path
> elements and the meaning of . and .. as path elements (which is almost certainly
> why they are shown as /./ and /../, i.e. path elements that are literal .
> and ..).
> 
> While it is not clear where cleanpath() is meant to be used (I do not find any
> real examples of its use cases), it is unlikely it is meant to be used to
> sanitize paths that are to be used in commands, or rather, it is certainly not
> guaranteed to be sensible to do so.  After all, the dpcumentation points out
> that "it is possible that cleanpath could take a valid path and return a
> shorter, invalid path."
> 
> More importantly though, the documentation is clear that cleanpath() should be
> removing redundant path elements, specifically, /./ and /../ (and again, the /
> are here to denote that these are path elements and not portions of path
> elements).  And /./ is certainly redundant in paths.
> 
> Also, the documentation does not talk about adding anything, and the case of
> "a/.." listed above shows "." as output for the current patch, yet the input
> string does not contain a literal . element, so rather than removing redundant
> elements, this ends up introducing an element, and one that is actually one of
> the aforementioned redundant elements that are meant to be removed.
> 
> Now, I fully admit that I am baffled by the mere existance of this cleanpath()
> subroutine, and I really wonder what the use case for it was and why it was
> implemented the way it was.  The legacy output clearly shows that it is not
> even correct in somme cases, and its documented behaviour is vague.  From what
> I can gather from some searches, it seems that cleanpath() was used in some
> translators on other OSes, most likely to present paths in a nicer form.
> No such use exists on Linux right now.
> 
> And (bear with me), then there is the issue that cleanpath() as implemented in
> the legacy implementation does not actually follow the documented behaviour
> either!
> 
> Personally, I would prefer to just get rid of cleanpath() given all of the
> above, because there clearly isn't a good sense of what it is meant to do in
> certainly situations and the documentation is no help.
> 
> So...
> 
> Thinking this through some more, I first apologize for the lengthy discourse
> above but I think it is important to establish that this cleanpath() routine
> and its documentation is a mess.  An ugly mess.
> 
> But I do acknowledge that you have done a fine job trying to make sense of the
> mess, and it provides a consistent behaviour for the subroutine albeit not in
> line with the documentation.  So, under the presumption that you can document
> the behaviour in the commit message (which I usually would object to, but I
> think we need to so that the documentation can be updated based on that), I
> think going with this implementation is the best course of action.
> 
> Maybe in terms of a sequence of search-and-replace regular expressions, in
> order?
> 
> > > >      "a/../b"             "b"               "/b"
> > > I agree.
> > > 
> > > > In the first set, legacy DTrace has unnecessary trailing "." or "/."
> > > > or unnecessary leading "./".  They would seem to be wrong.
> > > > 
> > > > In the next cases, the legacy results are clearly wrong, not
> > > > handling ".." correctly.
> > > > 
> > > > In the final case, a relative path is converted into an absolute
> > > > path, which is also clearly incorrect.
> > 
> > _______________________________________________
> > DTrace-devel mailing list
> > DTrace-devel at oss.oracle.com
> > https://oss.oracle.com/mailman/listinfo/dtrace-devel
> 
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel



More information about the DTrace-devel mailing list