RegEx - Misplaced output. Multi-line and single-line patterns consisting only of groups ().
Errors in the results of the RegExReplace() and RegExMatch() functions.
The error occurs only when the regular expression pattern consists solely of groups, with no other matching conditions outside the groups (apart from the final $ anchor).
A few rules:
1. Only regular expressions consisting solely of groups () are considered. I believe the error occurs only under this main condition.
2. The groups have no quantifiers applied outside them, i.e. each group occurs only once in each line.
3. The goal of the code is to convert each line correctly according to the given regular expression and sample text. The sample provides several control lines, one per test case.
4. The bug hunt focuses on possible compilation/interpretation errors inside the functions themselves, not on a programmer's mistake in the regular expression pattern.
5. The expected result in all examples is the extracted information plus added text (based on the input), in this order:
"third group", then "some symbolic expression", and finally "first group".
Based on rule #5, I want the result "$3 --=-- $1". I emphasize that in your own tests you should not use the short replacement "$3$1" (groups 3 and 1 only), since in that case the error cannot be detected.
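One possible reason the short form hides the problem: if the pattern matches a second time as an empty string at the end of the line, every group in that second match is empty, so "$3$1" adds nothing visible, while "$3 --=-- $1" adds a stray separator. This is only a sketch for checking that idea, assuming AutoHotkey v1.1; the variable names (sample, needle, countShort, countLong, report) are made up for the illustration.

```autohotkey
; Sketch only (AutoHotkey v1.1 assumed). "sample" is a hypothetical one-line input
; copied from the first line of the test text.
sample := "a expression at the end Teach #346"
needle := "(.*?)([\.\s]*?)(Teach[\s]#[\d]*|)$"

; The 4th parameter of RegExReplace receives the number of replacements performed.
shortForm := RegExReplace(sample, needle, "$3$1", countShort)
longForm  := RegExReplace(sample, needle, "$3 --=-- $1", countLong)

; Brackets make leading/trailing spaces in the results visible.
report := "Short: [" . shortForm . "] replacements: " . countShort
report .= "`nLong:  [" . longForm . "] replacements: " . countLong
MsgBox, % report
```

If both counts come back greater than 1, the pattern matched more than once on the line, and the separator in the long form is what makes the extra match visible.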
Introduction to the program code:
1. The code contains a multi-line text.
2. Pay attention to how the results of RegExReplace() and RegExMatch() behave. In the example the functions are called either on their own or inside a loop.
3. The test code consists of three variants separated by comments, for example "; ##3##". Each variant has its own number.
Code:
TestText1 := "
(Join`n
a expression at the end Teach #346
a expression at the end, a expression is part of textTeach #2344
a expression at the end, a period before the expression . without a space .... . .Teach #8542
expression Teach #642 in the middle of the text
expression in the middle of the text, at the end of a period and a space, this line should be considered unchanged in the output Teach #113. .. .
)"
resultX := ""
matchX := ""
; ##1##
; The delimiter is written as a bare `n; quotes here would be treated as extra delimiter characters.
Loop, Parse, TestText1, `n
{
    if (A_Index != 1)
        resultX .= "`n"
    RegExMatch(A_LoopField, "(.*?)([\.\s]*?)(Teach[\s]#[\d]*|)$", matchX)
    resultX .= matchX3 . " --=-- " . matchX1
}
MsgBox, % resultX
; ##2##
resultX := ""
; The delimiter is written as a bare `n; quotes here would be treated as extra delimiter characters.
Loop, Parse, TestText1, `n
{
    if (A_Index != 1)
        resultX .= "`n"
    resultX .= RegExReplace(A_LoopField, "(.*?)([\.\s]*?)(Teach[\s]#[\d]*|)$", "$3 --=-- $1")
    ; Or use RegExMatch instead: it then gives 0/1 (a negative/positive answer) for each line, and in some cases it detects extra ones!
    ; resultX .= RegExMatch(A_LoopField, "(.*?)([\.\s]*?)(Teach[\s]#[\d]*|)$", "$3 --=-- $1")
}
MsgBox, % resultX
; ##3##
TestText11 := RegExReplace(TestText1 ,"m)(*ANYCRLF)(.*?)([\.\s]*?)(Teach[\s]#[\d]*|)$", "$3 --=-- $1")
MsgBox, % TestText11
Statement:
In this example, only the first test variant, ##1##, works correctly.
Test variants ##2## and ##3## do not work correctly.
These variants insert an unnecessary duplicate of "some symbolic expression" into the lines, which is not what basic programming logic would lead one to expect.
An empty 3rd group in the result is normal for some lines, as allowed by the alternative `(Teach[\s]#[\d]*|)`.
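A possible explanation for the difference between the variants, offered as an assumption rather than a confirmed diagnosis: RegExMatch() returns only the first match, while RegExReplace() by default replaces every match it finds. Since every part of this pattern is allowed to match an empty string, a second, empty match may be found at the end of each line after the first match has consumed the line, and that would produce the extra " --=-- ". Under that assumption, two workarounds can be sketched (AutoHotkey v1.1 assumed; the shortened test text and the names needle, replCount, resultA and resultB are made up for the illustration).

```autohotkey
; Sketch only (AutoHotkey v1.1 assumed). It relies on the ASSUMPTION that the stray
; " --=-- " comes from a second, empty match at the end of each line.
TestText1 := "
(
a expression at the end Teach #346
expression Teach #642 in the middle of the text
)"
needle := "(.*?)([\.\s]*?)(Teach[\s]#[\d]*|)$"

; Workaround A (per line): cap RegExReplace at one replacement via its 5th parameter (Limit).
resultA := ""
Loop, Parse, TestText1, `n
{
    if (A_Index != 1)
        resultA .= "`n"
    resultA .= RegExReplace(A_LoopField, needle, "$3 --=-- $1", replCount, 1)
}
MsgBox, % resultA

; Workaround B (whole text): prefix the pattern with ^ so that, in multi-line mode,
; a match can only begin at the start of a line.
resultB := RegExReplace(TestText1, "m)(*ANYCRLF)^" . needle, "$3 --=-- $1")
MsgBox, % resultB
```

Workaround A only suits the per-line loop, because Limit applies to the whole haystack passed to the function. Workaround B keeps the single call of variant ##3## and changes only the pattern.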