Documentation for RegExReplace should indicate, for the replacement parameter, that backreferences cannot be used directly in inner functions or as numbers in expressions.
viewtopic.php?f=76&t=113022
RegExReplace backreferences cannot be manipulated
Re: RegExReplace backreferences cannot be manipulated
Replacement:
Type: String
The string to be substituted for each match, which is plain text (not a regular expression).
Re: RegExReplace backreferences cannot be manipulated
"Plain text" is misleading & incomplete, as a variable can be used, and some expressions, but the backreference itself cannot be included in a function that may return plain text.
Code:
#Requires AutoHotkey v2.0
str := "abcd"
regex := "ab(c)d"
f := "123"
MsgBox RegExReplace(str, regex, 'J' f "$1q9")
To me, this is not a description of plain text, though it does evaluate to plain text. A function can also evaluate to plain text.
We need a clearer way to describe why some of the following parameter values work, while others do not.
Code:
#Requires AutoHotkey v2.0
str := "abcd"
regex := "ab(c)d"
f := "123$1q9"
MsgBox RegExReplace(str, regex, 'J' f "$1q9") ; A backreference
MsgBox RegExReplace(str, regex, 'J' InStr(f, "$1q9")) ; Not a backreference
MsgBox RegExReplace(str, regex, 'J' SubStr("$1", 1, 1)) ; Not a backreference
MsgBox RegExReplace(str, regex, 3 + "$1") ; Error
Hence, the idea:
Backreferences cannot be used directly in inner functions or as numbers in expressions.
Re: RegExReplace backreferences cannot be manipulated
+1
mikeyww wrote: ↑23 Jan 2023, 11:12
"Plain text" is misleading & incomplete, as a variable can be used, and some expressions, but the backreference itself cannot be included in a function that may return plain text.
[...]
To me, this is not a description of plain text, though it does evaluate to plain text. A function can also evaluate to plain text.
Re: RegExReplace backreferences cannot be manipulated
I find it quite understandable as it is but maybe a different possibility:
The string that represents the substitution for each match (before it is actually made).
Re: RegExReplace backreferences cannot be manipulated
What could be plainer than plain text, but a string literal?
Edit: link inserted
Last edited by lmstearn on 26 Jan 2023, 21:06, edited 1 time in total.
itros "ylbbub eht tuO kaerB" a ni kcuts m'I pleH
Re: RegExReplace backreferences cannot be manipulated
Since the "Type" definition for the Replacement parameter already says "String", the phrase "plain text" is simply confusing. The normal reader wonders whether it means something special that goes beyond "String". The brackets seem to say what is really meant: "(not a regular expression)". But then, why not leave out "plain text" and the brackets, and write only "The string to be substituted for each match, which is not a regular expression."? On top of that, the following sentence contradicts the impression one gets from reading "plain text", because the "plain" text string may after all contain strings with special meaning, namely backreferences. So it is not just "plain text" after all. It is a string that may have special meaning but is not a regular expression.
Re: RegExReplace backreferences cannot be manipulated
The suggested documentation addition is based on a false expectation: a backreference $n only means something to a specific RegExReplace invocation within its completed replacement string, i.e. within the final calculated result of whatever expression constitutes its third parameter. It doesn't magically acquire meaning (a value) visible to operators and functions within the expression merely because that expression happens to be the third parameter of a RegExReplace invocation.
Documentation could get pretty dense if every possible misreading of an otherwise well-defined item had to be accounted for.
JB
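A minimal sketch of that evaluation order, assuming AutoHotkey v2. The variable name repl and the use of StrUpper are illustrative choices, not something taken from the thread. The point it tries to show: the expression in the third parameter is computed first, and only its finished result is scanned for $1.

```autohotkey
#Requires AutoHotkey v2.0
str := "abcd"
regex := "ab(c)d"

; Step 1: the replacement expression is evaluated like any other expression.
; At this point "$1x" is just three characters; StrUpper sees no capture.
repl := "J" StrUpper("$1x")
MsgBox repl                             ; shows J$1X

; Step 2: RegExReplace receives the finished string and only now
; substitutes $1 with the text captured by (c).
MsgBox RegExReplace(str, regex, repl)   ; shows JcX
```

Note that the captured "c" stays lowercase: StrUpper acted on the literal text "$1x" before any match existed, not on the captured value.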
Re: RegExReplace backreferences cannot be manipulated
Great. So put that explanation in the documentation, for the reader to understand it. I view the documentation as doing exactly that: helping the reader not only with a catalog of functions and parameters, but with tips, guidance, and clarifications about common misunderstandings.
Re: RegExReplace backreferences cannot be manipulated
I guess "common" is the key word here: it never occurred to me that $n or ${n} could possibly be thought to mean anything in an expression.
JB
Re: RegExReplace backreferences cannot be manipulated
That's good. You are more advanced than some others who are trying to figure it out. Some of the AHK users have no experience with regex or expressions.
Re: RegExReplace backreferences cannot be manipulated
I was agreeing with you! If it occurred to you of all people, then maybe it does qualify as a common misunderstanding.
How would you phrase this usage note?
JB
Re: RegExReplace backreferences cannot be manipulated
Thank you. My suggestion:
Backreferences cannot be used directly in inner functions or as numbers in expressions.
Perhaps someone else has a clearer way to put it, but this is how I think of it.
The beauty of AutoHotkey is that it is designed for the user. For example, why can one add a GUI control using "AddText" or "Add('Text')"? Well, I think it's just because it's convenient, so the program was designed to accommodate both approaches. I think it's conceivable that RegExReplace could be designed to handle the following, but it simply isn't. I'm not suggesting that it should be, but am suggesting that some users believe that such a thing could make sense.
Code:
#Requires AutoHotkey v2.0
MsgBox RegExReplace("a1c", "a(\d)c", "$1" + 3)
This clarification about backreferences would be helpful especially due to how AHK handles strings and numbers.
Code:
#Requires AutoHotkey v2.0
f := "3"
MsgBox f + 2
g := "b"
MsgBox g + 2
Thus, some users may be accustomed to adding a number to what appears to be a string-- call it "plain text", too, if you wish-- when the "string" is really a number.
Good discussion here. I conclude that some readers here would like a clarification, while others view it as unnecessary or redundant. I think that one more sentence won't hurt! Redundancy has its place, even in AutoHotkey!
Re: RegExReplace backreferences cannot be manipulated
Hmm, that may be a little too brief, and the meaning of "inner functions" won't be obvious. Maybe something like this, added to the description of the Replacement parameter:
When the Replacement parameter is an expression, backreferences like $1 and ${12} are only meaningful in the result of the expression passed to RegExReplace() -- they cannot be used as arguments and parameters to the operators and functions that make up the expression.
JB
Re: RegExReplace backreferences cannot be manipulated
I appreciate the revision. I have no objections. Using your text, another option is below.
Backreferences cannot be used as operands or function parameters.
Re: RegExReplace backreferences cannot be manipulated
The documentation is already 100% perfectly clear about what Replacement is: it is a string.
You seem to think that RegExReplace takes some special and magical argument, but no, it is nothing more than a string.
The catch here is that RegExReplace also interprets Replacement and replaces any $number, ${number}, ${capturename} etc. in Replacement before replacing the found Needle in Haystack with the munged Replacement.
Consider this:
mikeyww wrote: ↑23 Jan 2023, 11:12
Code:
#Requires AutoHotkey v2.0
str := "abcd"
regex := "ab(c)d"
f := "123$1q9"
MsgBox RegExReplace(str, regex, 'J' f "$1q9") ; A backreference
MsgBox RegExReplace(str, regex, 'J' InStr(f, "$1q9")) ; Not a backreference
MsgBox RegExReplace(str, regex, 'J' SubStr("$1", 1, 1)) ; Not a backreference
MsgBox RegExReplace(str, regex, 3 + "$1") ; Error
Code:
str := "abcd"
regex := "ab(c)d"
f := "123$1q9"
MsgBox 'J' f "$1q9" ; A backreference
MsgBox 'J' InStr(f, "$1q9") ; Not a backreference
MsgBox 'J' SubStr("$1", 1, 1) ; Not a backreference
MsgBox 3 + "$1" ; Error
The first MsgBox will show J123$1q9$1q9
The second MsgBox will show J4
The third MsgBox will show J$
The fourth MsgBox will throw an error: Expected a Number but got a String
Therefore, the code at the top behaves identically to the following:
Code:
MsgBox RegExReplace("abcd", "ab(c)d", "J123$1q9$1q9")
MsgBox RegExReplace("abcd", "ab(c)d", "J4")
MsgBox RegExReplace("abcd", "ab(c)d", "J$")
MsgBox RegExReplace("abcd", "ab(c)d", 3 + "$1")
Notice how there are no "function calls" "inside" Replacement or any "variables". That's because there is no magical behavior in Replacement. Replacement will be the result of whatever expression is in there, exactly as with every single other function call in AHK 2.0.
In the first RegExReplace, every $1 in Replacement was replaced with "c" because the first capture captured the "c". Then, the entire input string (because the Needle matches all of it) is replaced with the processed Replacement string, which is now "J123cq9cq9".
In the second RegExReplace, the entire string is replaced with "J4". That's because InStr(f, "$1q9") evaluates to the number 4 and "J" concatenated with 4 is "J4". There is no backreference in that string: there is no $ followed by a number or a { here, so nothing is changed in Replacement.
The same happens in the third RegExReplace because the $ at the end of the string by itself is not a backreference, either.
The fourth one throws an error not because of any RegExReplace behavior, but because there is no possible way AutoHotkey could determine what you meant by 3 + "$1". What do you think this should result in? It's adding apples and oranges. In JavaScript, perhaps you could get a fruit salad because it specifies that adding two things together with + results in the concatenation of the things after converting them both to strings. But in AutoHotkey, + adds together two numbers and "$1" cannot be cleanly interpreted as a number, so it tells you that it expected a number on the other side of that plus sign.
Now look at this:
Code:
str := "abcd"
regex := "(a)(b)(c)(d)"
f := "123$1q9"
MsgBox RegExReplace(str, regex, 'J$' InStr(f, "$1q9"))
This will output "Jd". This is because Replacement is "J$4": InStr(f, "$1q9") evaluates to 4, and (d) is the fourth capture.
I do agree that the documentation for Replacement should assume less about what the reader knows and introduce the backreferences a bit more cleanly and gently.
All in all, you should have said what you actually expected each of these lines of code to do.
Re: RegExReplace backreferences cannot be manipulated
In my view, the following sum is 26, and so this is the meaning of adding the orange to the apple-- in some situations.
Code:
#Requires AutoHotkey v2.0
num := RegExReplace("number23", ".*?(\d+)", "$1")
MsgBox 3 + num
MsgBox 3 + "23"
num := RegExReplace("number23", ".*?(\d+)", 3 + "$1")
This would replace the entire match with three plus the backreference's value.
Many users are not knowledgeable about how AHK adds numbers, or how AHK differs from JavaScript. They may not have a deep understanding of how functions actually work.
At least for me, a regex tutorial is not what I am suggesting or seeking here. My observation is merely that the issue confuses some readers and users of AHK. If you agree that the documentation should say more, then you can recommend an alternative to my suggested text, unless you agree with it.
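For readers who want the effect described in this post (three plus the captured value), one possible approach is to capture first with RegExMatch and do the arithmetic afterwards. This is only a sketch, assuming AutoHotkey v2. The variable names haystack, m, and sum are illustrative, and nobody in the thread proposed this.

```autohotkey
#Requires AutoHotkey v2.0
haystack := "number23"

; Capture the digits first. On success, m[1] holds the text matched
; by the first subpattern, here the numeric string "23".
if RegExMatch(haystack, "(\d+)", &m)
{
    ; A numeric string can take part in arithmetic, so this is 3 + 23.
    sum := 3 + m[1]
    MsgBox sum                                          ; shows 26

    ; Build the replacement from the computed value; no backreference
    ; is involved, because the arithmetic has already been done.
    MsgBox RegExReplace(haystack, "\d+", String(sum))   ; shows number26
}
```

The arithmetic works here because m[1] is an actual captured value by the time + runs, whereas "$1" inside an expression is only the two characters $ and 1.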
Re: RegExReplace backreferences cannot be manipulated
Looks like the docs are confining the usage of plus and minus operators to (at least) one pair of variables, and of course, "raw" numbers. The effects of adding/subtracting strings to variables/numbers, or strings to strings would want a mention or link-on in there at some stage.
Re: RegExReplace backreferences cannot be manipulated
Already noted.
Backreferences cannot be used as operands or function parameters.