RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 17:36
by zvit
I have a text file called test.txt. The file contains the following text:

Code: Select all

<VERSION>1</VERSION>
<TITLE/>
<AUTHOR/>
<COPYRIGHT/>
<WORDS>
<WORD dir="ACROSS" id="0" isTheme="false" num="1"/>
<WORD dir="ACROSS" id="1" isTheme="false" num="4"/>
</WORDS>
<NOTES/>
My code will use RegExReplace to replace the rows between the <WORDS> and </WORDS> tags.
I am first testing this with RegExMatch and have an issue.
If I put the file contents directly into a variable in my code, it runs fine. But when I read the text from the file, it doesn't work. Here's my code:

THIS WORKS - AND THE MESSAGE BOX SHOWS "found."

Code: Select all

SrcString =
(
<VERSION>1</VERSION>
<TITLE/>
<AUTHOR/>
<COPYRIGHT/>
<WORDS>
<WORD dir="ACROSS" id="0" isTheme="false" num="1"/>
<WORD dir="ACROSS" id="1" isTheme="false" num="4"/>
</WORDS>
<NOTES/>
)

  if RegExMatch(SrcString, "\<WORDS>.*?\</WORDS>")
    msgbox, found
  Else
    msgbox not found
THIS DOES NOT WORK - AND THE MESSAGE BOX SHOWS "not found."

Code: Select all

FileRead, OutputVar, F:\test.txt

  if RegExMatch(OutputVar, "\<WORDS>.*?\</WORDS>")
    msgbox, found
  Else
    msgbox not found
I know the code is reading the file correctly because a message box shows the correct file contents when I put one after FileRead:

Code: Select all

msgbox % OutputVar
What am I doing wrong?

Re: RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 18:11
by sofista
Looks like a file encoding issue. Try saving the "test.txt" file as UTF-8 with BOM.

Re: RegExMatch works in one scenario but not in another  Topic is solved

Posted: 20 Jan 2022, 18:14
by teadrinker
Try RegExMatch(OutputVar, "s)\<WORDS>.*?\</WORDS>")
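
A possible explanation for why the literal string matched while the file contents did not, assuming test.txt uses Windows line endings (`r`n): according to the AutoHotkey RegEx documentation, with the default newline setting a dot matches any single character that is not part of a `r`n sequence. A continuation section joins its lines with a lone `n, which the dot still matches. FileRead returns the file's `r`n pairs unchanged (unless its *t option is used), and the dot stops there. The s) (DotAll) option lets the dot cross them. A minimal sketch with made-up sample strings, not the real file:

```autohotkey
; Same text twice, differing only in line endings (sample data, not the real file).
lf   := "<WORDS>`n<WORD id=""0""/>`n</WORDS>"       ; like a continuation section
crlf := "<WORDS>`r`n<WORD id=""0""/>`r`n</WORDS>"   ; like a file with Windows line endings

pattern := "<WORDS>.*?</WORDS>"

; RegExMatch returns the position of the match, or 0 if there is none.
MsgBox % "LF, no option:   " RegExMatch(lf, pattern) "`n"
       . "CRLF, no option: " RegExMatch(crlf, pattern) "`n"
       . "CRLF with s):    " RegExMatch(crlf, "s)" pattern)
```

If the line-ending assumption holds, I would expect only the middle call to report 0. The backslashes before < in the original pattern are harmless but not needed.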

Re: RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 18:14
by zvit
It is already UTF-8. I didn't post all of the file's content here, but this is the first line:

<?xml version="1.0" encoding="utf-8" standalone="no"?>

Re: RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 18:19
by zvit
teadrinker wrote:
20 Jan 2022, 18:14
Try RegExMatch(OutputVar, "s)\<WORDS>.*?\</WORDS>")
That worked! I'm sure s) is documented on the RegEx Options page, but could you tell me what it does?

Re: RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 18:21
by teadrinker

Re: RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 18:25
by zvit
This has now caused a different issue: it replaces the <WORDS> and </WORDS> tags themselves as well. I only want to replace what's between them and keep the tags.

Re: RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 18:27
by teadrinker
How are you trying?

Re: RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 18:28
by zvit
I see there what the 's' does, but it doesn't mention the parenthesis ')' after the 's'. How did you know what that does?

Re: RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 18:29
by zvit
teadrinker wrote:
20 Jan 2022, 18:27
How are you trying?
What do you mean? Just as before, I only added your s)

Re: RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 18:31
by zvit
Here's the exact code I'm using:

Code: Select all

    else
	{
		xlApp := ComObjCreate("Excel.Application")
		xlWB := xlApp.Workbooks.Open(FilePath,0,0)
		xlApp.Visible := false
		xlWS := xlApp.Sheets("USED")
		xlApp.Range("rng_HELPER_SORT").Sort(xlApp.Range("A1"),,,,,,,1) ; with Header
		
		HtmlArr := xlApp.Range("rng_HTML")
 
		for row in HtmlArr
			for cell in row
				data.= cell.text "`n"

		FileRead, OutputVar, %MyPath%003_13x13.cfp
		NewVal := RegExReplace(OutputVar, "s)\<WORDS>.*?\</WORDS>", data)
		FileAppend, %NewVal%, %MyPath%TestHTML.txt
	}

Re: RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 18:40
by teadrinker
zvit wrote: the parentheses ')' after the 's'
Options wrote: At the very beginning of a regular expression, specify zero or more of the following options followed by a close-parenthesis. For example, the pattern im)abc ...
Try NewVal := RegExReplace(OutputVar, "s)<WORDS>\K.*?(?=</WORDS>)", data)
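
For context on that pattern: \K tells the engine to drop everything matched so far from the reported match, and (?=...) is a look-ahead that checks for </WORDS> without consuming it, so only the text between the tags gets replaced. A small sketch, using a hypothetical sample string and replacement rather than the real file, with a capture-group variant shown as one alternative:

```autohotkey
; Sample input with Windows line endings (hypothetical stand-in for the .cfp file).
src  := "<WORDS>`r`n<WORD id=""0""/>`r`n</WORDS>"
data := "<WORD id=""9""/>`r`n"

; \K keeps <WORDS> out of the match; the look-ahead leaves </WORDS> untouched.
viaK := RegExReplace(src, "s)<WORDS>\K.*?(?=</WORDS>)", data)

; Alternative: capture both tags and put them back with $1 and $2.
viaGroups := RegExReplace(src, "s)(<WORDS>).*?(</WORDS>)", "$1" data "$2")

; In either form, a literal $ in the replacement text must be written as $$.
MsgBox % viaK "`n---`n" viaGroups
```

Both forms also replace the line break that followed <WORDS>, so the first new row ends up on the same line as the opening tag unless the replacement starts with its own line break.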

Re: RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 19:26
by zvit
teadrinker wrote:
20 Jan 2022, 18:40
Try NewVal := RegExReplace(OutputVar, "s)<WORDS>\K.*?(?=</WORDS>)", data)
Yes, that works. However, even though it's not a big deal, I'm trying to keep the output tidy, and with the last version the <WORDS> tag no longer ends up on its own line, which is where it needs to be.
I tried adding `n and \`n to the code, but it didn't work. P.S. The closing </WORDS> tag IS on a new line, which is great.

Now it looks like this:

Code: Select all

<WORDS><WORD dir="ACROSS"
I need it like this:

Code: Select all

<WORDS>
<WORD dir="ACROSS
Why is this RegEx so complicated?

Re: RegExMatch works in one scenario but not in another

Posted: 20 Jan 2022, 19:35
by zvit
I figured it out:

Code: Select all

NewVal := RegExReplace(OutputVar, "s)(?=<WORDS>).*?(?=</WORDS>)\K", data)
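
One thing worth checking with this last pattern: because \K comes at the very end, the reported match is an empty string located just before </WORDS>, so RegExReplace inserts data there and leaves the existing rows in place rather than replacing them. If the goal is still to replace the rows and keep <WORDS> on its own line, one option is to consume the line break after the opening tag before \K. The sketch below uses hypothetical sample data, and \R (any newline sequence) is my own choice here, not something suggested in the thread:

```autohotkey
; Hypothetical sample data standing in for the file contents and the Excel rows.
src  := "<WORDS>`r`n<WORD id=""old""/>`r`n</WORDS>"
data := "<WORD id=""new""/>`r`n"

; Pattern from the post above: the match is empty, so the old row should survive.
inserted := RegExReplace(src, "s)(?=<WORDS>).*?(?=</WORDS>)\K", data)

; Variant: \R consumes the line break after <WORDS> before \K resets the match
; start, so only the rows themselves are replaced.
replaced := RegExReplace(src, "s)<WORDS>\R\K.*?(?=</WORDS>)", data)

MsgBox % inserted "`n---`n" replaced
```

In the script earlier in the thread, data is built with `n only. If the file uses `r`n, the output would mix line endings, and building data with `r`n instead would keep them consistent.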