RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 17:36
by zvit
I have a text file called test.txt. The file contains the following text:
Code:
<VERSION>1</VERSION>
<TITLE/>
<AUTHOR/>
<COPYRIGHT/>
<WORDS>
<WORD dir="ACROSS" id="0" isTheme="false" num="1"/>
<WORD dir="ACROSS" id="1" isTheme="false" num="4"/>
</WORDS>
<NOTES/>
My code will use RegExReplace to replace the rows between the <WORDS> and </WORDS> tags.
I am first testing this with RegExMatch and have an issue.
If I put the file contents directly into a variable in my code, it runs fine. But when I read the text from the file, it doesn't work. Here's my code:
THIS WORKS - AND THE MESSAGE BOX SHOWS "found."
Code:
SrcString =
(
<VERSION>1</VERSION>
<TITLE/>
<AUTHOR/>
<COPYRIGHT/>
<WORDS>
<WORD dir="ACROSS" id="0" isTheme="false" num="1"/>
<WORD dir="ACROSS" id="1" isTheme="false" num="4"/>
</WORDS>
<NOTES/>
)
if RegExMatch(SrcString, "\<WORDS>.*?\</WORDS>")
    msgbox, found
Else
    msgbox not found
THIS DOES NOT WORK - AND THE MESSAGE BOX SHOWS "not found."
Code:
FileRead, OutputVar, F:\test.txt
if RegExMatch(OutputVar, "\<WORDS>.*?\</WORDS>")
    msgbox, found
Else
    msgbox not found
I know the code is reading the file correctly because a message box shows the correct file contents when I put one after FileRead.
What am I doing wrong?
Re: RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 18:11
by sofista
Looks like a file encoding issue. Try saving the "test.txt" file as UTF-8 with BOM.
Re: RegExMatch works in one scenario but not in another (Topic is solved)
Posted: 20 Jan 2022, 18:14
by teadrinker
Try RegExMatch(OutputVar, "s)\<WORDS>.*?\</WORDS>")
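For context, a likely explanation of why the pasted text matches and the file contents do not, assuming test.txt was saved with Windows `r`n line endings (the thread does not confirm this): a continuation section joins its lines with a lone `n, and by default a dot only refuses to match a `r that is followed by `n. The s) (DotAll) option lets the dot cross those line breaks. A minimal sketch with made-up sample strings:

```autohotkey
; Assumption: the file uses `r`n line endings, while the continuation
; section in the first script joins its lines with a lone `n.
lfText   := "<WORDS>`n<WORD/>`n</WORDS>"
crlfText := "<WORDS>`r`n<WORD/>`r`n</WORDS>"

; Default options: the dot may match a lone `n, so this finds a match.
MsgBox % RegExMatch(lfText, "<WORDS>.*?</WORDS>")     ; 1

; Default options: the dot stops at `r`n, so nothing is found.
MsgBox % RegExMatch(crlfText, "<WORDS>.*?</WORDS>")   ; 0

; DotAll: the dot matches every character, including `r and `n.
MsgBox % RegExMatch(crlfText, "s)<WORDS>.*?</WORDS>") ; 1
```

FileRead's *t option, which converts `r`n to `n while reading, would be another way to get the first behaviour from a file.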
Re: RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 18:14
by zvit
It is already UTF-8. I didn't post the whole file here, but this is the first line:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
Re: RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 18:19
by zvit
teadrinker wrote: ↑20 Jan 2022, 18:14
Try
RegExMatch(OutputVar, "s)\<WORDS>.*?\</WORDS>")
That worked! I'm sure s) is documented in the RegEx Options page, but could you tell me what it does?
Re: RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 18:21
by teadrinker
Re: RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 18:25
by zvit
This has now caused a different issue: it replaces the <WORDS> and </WORDS> tags themselves as well. I only want to replace what's between them and keep the tags.
Re: RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 18:27
by teadrinker
How are you trying?
Re: RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 18:28
by zvit
I see there what the 's' does, but it doesn't mention the parenthesis ')' after the 's'. How did you know what that does?
Re: RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 18:29
by zvit
teadrinker wrote: ↑20 Jan 2022, 18:27
How are you trying?
What do you mean? Just as before, I only added your s)
Re: RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 18:31
by zvit
Here's the exact code I'm using:
Code:
else
{
    xlApp := ComObjCreate("Excel.Application")
    xlWB := xlApp.Workbooks.Open(FilePath,0,0)
    xlApp.Visible := false
    xlWS := xlApp.Sheets("USED")
    xlApp.Range("rng_HELPER_SORT").Sort(xlApp.Range("A1"),,,,,,,1) ; with Header
    HtmlArr := xlApp.Range("rng_HTML")
    for row in HtmlArr
        for cell in row
            data .= cell.text "`n"
    FileRead, OutputVar, %MyPath%003_13x13.cfp
    NewVal := RegExReplace(OutputVar, "s)\<WORDS>.*?\</WORDS>", data)
    FileAppend, %NewVal%, %MyPath%TestHTML.txt
}
Re: RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 18:40
by teadrinker
zvit wrote: ↑the parentheses ')' after the 's'
Options wrote: At the very beginning of a regular expression, specify zero or more of the following options followed by a close-parenthesis. For example, the pattern im)abc ...
Try
NewVal := RegExReplace(OutputVar, "s)<WORDS>\K.*?(?=</WORDS>)", data)
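A short sketch of what the two new pieces in that pattern do, using a made-up sample string: \K discards everything matched so far from the reported match, so the opening tag is kept, and (?=</WORDS>) is a lookahead that requires the closing tag to follow without consuming it.

```autohotkey
; Hypothetical sample text; the real file has more content around it.
src := "<WORDS>`r`nold rows`r`n</WORDS>"

; Only the text between the tags is part of the match, so only that
; text is replaced. The line break after <WORDS> is replaced too.
result := RegExReplace(src, "s)<WORDS>\K.*?(?=</WORDS>)", "NEW")
MsgBox % result  ; <WORDS>NEW</WORDS>
```

Because the line break right after <WORDS> is inside the match, the replacement text ends up on the same line as the opening tag.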
Re: RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 19:26
by zvit
teadrinker wrote: ↑20 Jan 2022, 18:40
Try
NewVal := RegExReplace(OutputVar, "s)<WORDS>\K.*?(?=</WORDS>)", data)
Yes, that works. However, even though it's not a big deal, I'm trying to keep the code clean, and with the last version the <WORDS> tag is no longer on its own line, which is where it needs to be.
I tried adding `n and \`n to the code, but it didn't work. P.S. The last </WORDS> tag IS on a new line, which is great.
Now it looks like this:
I need it like this:
Why is this RegEx so complicated?
Re: RegExMatch works in one scenario but not in another
Posted: 20 Jan 2022, 19:35
by zvit
I figured it out:
Code:
NewVal := RegExReplace(OutputVar, "s)(?=<WORDS>).*?(?=</WORDS>)\K", data)
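One caveat worth checking with this version: since \K comes at the very end, the reported match should be an empty string just before </WORDS>, so the replacement is inserted there and any rows already between the tags are left in place. If the old rows should be replaced while <WORDS> stays on its own line, a possible alternative is sketched below. The sample strings and the \R? variant are assumptions for illustration, not something posted in the thread.

```autohotkey
; Hypothetical sample data standing in for the file and the Excel rows.
src  := "<WORDS>`r`nold rows`r`n</WORDS>"
data := "new rows`r`n"

; Pattern from the post above: \K at the end leaves an empty match,
; so data is inserted before </WORDS> and "old rows" is not removed.
inserted := RegExReplace(src, "s)(?=<WORDS>).*?(?=</WORDS>)\K", data)
MsgBox % inserted

; Alternative sketch: \R? consumes the line break after <WORDS> before
; \K, so the tag keeps its own line and only the rows are replaced.
replaced := RegExReplace(src, "s)<WORDS>\R?\K.*?(?=</WORDS>)", data)
MsgBox % replaced
```

Comparing the two message boxes on a file that already contains <WORD .../> rows shows which behaviour is actually wanted.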