[DTrace-devel] [PATCH v3 01/21] lexer: fix top-level probe ending in wildcard ambiguity handling

Nick Alcock nick.alcock at oracle.com
Tue Jan 16 21:12:57 UTC 2024


The lexer contains code to determine whether

foo*bar

at the top level of a script is a wildcarded probe name or the start of
a variable declaration (which would also need a semicolon, an = or
something similar, but the lexer may not have reached it yet). Back in
the Solaris days this was done by blindly looking up the text before the
* as a type and treating the token as a probe name if the lookup failed,
but this forces loading of all CTF basically no matter what (even the
hardwired definition of NULL forces it). I rejigged this code way back
in 5a7483d86 in 2012 to stop it looking up quite so many types
unnecessarily, but in the process broke the code that pushes the part of
the token from the * onward back into the lexer's input.  You need to be
very unlucky for this to trigger anything, but more recent flex has
started doing what its documentation has long warned of, and unput() now
really does corrupt yytext. Put the two together and you start getting
garbage or repeated junk after the * in cases like int*x (seen in
test/unittest/options/err.pspec-default.d).

Getting it right is actually simpler than getting it wrong: if we use
yyless(), which does not corrupt yytext, rather than unput(), we no
longer need a temporary copy of yytext to guard against corruption, and
the code that preserves the content beyond the * gets simpler as well.
(The rest of the lexer still uses unput() everywhere and needs fixing
later.)
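
As a rough illustration of the prune/lookup/restore logic described
above, here is a minimal standalone sketch in plain C, outside flex.
The names toy_type_lookup() and toy_keep_length() are hypothetical and
do not exist in libdtrace; the toy type table stands in for
dt_type_lookup(), and the returned length is the value that would be
handed to yyless() in the real lexer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for dt_type_lookup(): returns 0 if name is one
 * of a few hardwired type names, -1 otherwise.
 */
static int toy_type_lookup(const char *name)
{
	static const char *const known[] = { "int", "char", "long" };
	size_t i;

	for (i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
		if (strcmp(name, known[i]) == 0)
			return 0;
	}
	return -1;
}

/*
 * Decide how many leading characters of tok should be kept as the
 * current token.  If the text before the first '*' names a type, this
 * is the offset of the '*' (everything from the '*' onward would be
 * handed back to the scanner); otherwise it is the whole token, which
 * stays a probe name.  tok is modified temporarily but is always
 * restored before returning.
 */
static size_t toy_keep_length(char *tok)
{
	char *p = strchr(tok, '*');
	size_t keep = strlen(tok);

	if (p != NULL && p > tok)
		*p = '\0';	/* prune: look up only the part before the '*' */

	if (toy_type_lookup(tok) == 0 && p != NULL && p > tok)
		keep = (size_t)(p - tok);

	if (p != NULL && p > tok)
		*p = '*';	/* restore the token text */

	return keep;
}
```

With this toy table, "int*x" yields 3 (keep "int", rescan "*x"), while
"foo*bar" yields 7 and is left whole as a wildcarded probe name.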

This does not fix err.pspec-default.d, but it's a prerequisite for a fix
that works (so err.pspec-default.d will serve as a test for it).

Signed-off-by: Nick Alcock <nick.alcock at oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees at oracle.com>
---
 libdtrace/dt_lex.l | 23 ++++++-----------------
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/libdtrace/dt_lex.l b/libdtrace/dt_lex.l
index ceebeae5f8e8..2f80035ac073 100644
--- a/libdtrace/dt_lex.l
+++ b/libdtrace/dt_lex.l
@@ -574,6 +574,9 @@ if (yypcb->pcb_token != 0) {
 			 * case of the tick and profile probes in the profile
 			 * provider and a bunch of other possibilities too.
 			 */
+			if ((yylval.l_str = strdup(yytext)) == NULL)
+				longjmp(yypcb->pcb_jmpbuf, EDT_NOMEM);
+
 			if (!(yypcb->pcb_cflags & DTRACE_C_PSPEC) &&
 			    strpbrk(yytext, ":-.?!`") == NULL &&
 			    strcmp(yytext, "BEGIN") != 0 &&
@@ -581,26 +584,15 @@ if (yypcb->pcb_token != 0) {
 			    strcmp(yytext, "ERROR") != 0) {
 
 				char *p = strchr(yytext, '*');
-				char *q = yytext + yyleng - 1;
 
 				if (p != NULL && p > yytext)
 					*p = '\0'; /* prune yytext */
 
 				if (dt_type_lookup(yytext, NULL) == 0) {
-					yylval.l_str = strdup(yytext);
 
-					if (yylval.l_str == NULL) {
-						longjmp(yypcb->pcb_jmpbuf,
-						    EDT_NOMEM);
-					}
-
-					if (p != NULL)
-						p = yylval.l_str + (p - yytext);
-					q = yylval.l_str + (q - yytext);
-
-					if (p != NULL && p > yylval.l_str) {
-						for (*p = '*'; q >= p; q--)
-							unput(*q);
+					if (p != NULL && p > yytext) {
+						*p = '*';
+						yyless(p - yytext);
 					}
 
 					yybegin(YYS_EXPR);
@@ -611,9 +603,6 @@ if (yypcb->pcb_token != 0) {
 					*p = '*'; /* restore yytext */
 			}
 
-			if ((yylval.l_str = strdup(yytext)) == NULL)
-				longjmp(yypcb->pcb_jmpbuf, EDT_NOMEM);
-
 			return DT_TOK_PSPEC;
 		}
 
-- 
2.43.0.272.gce700b77fd



