RegExMatch works in one scenario but not in another Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys

Post by zvit » 20 Jan 2022, 17:36

I have a text file called test.txt. The file contains the following text:

Code:

<VERSION>1</VERSION>
<TITLE/>
<AUTHOR/>
<COPYRIGHT/>
<WORDS>
<WORD dir="ACROSS" id="0" isTheme="false" num="1"/>
<WORD dir="ACROSS" id="1" isTheme="false" num="4"/>
</WORDS>
<NOTES/>

My code will use RegExReplace to replace the rows between the <WORDS> and </WORDS> tags. I am first testing this with RegExMatch and have run into an issue: if I put the file contents directly into a variable in my code, the match succeeds, but when I read the same text from the file, it fails. Here's my code:

THIS WORKS - AND THE MESSAGE BOX SHOWS "found."

Code:

SrcString =
(
<VERSION>1</VERSION>
<TITLE/>
<AUTHOR/>
<COPYRIGHT/>
<WORDS>
<WORD dir="ACROSS" id="0" isTheme="false" num="1"/>
<WORD dir="ACROSS" id="1" isTheme="false" num="4"/>
</WORDS>
<NOTES/>
)

if RegExMatch(SrcString, "\<WORDS>.*?\</WORDS>")
    MsgBox, found
else
    MsgBox, not found

THIS DOES NOT WORK - AND THE MESSAGE BOX SHOWS "not found."

Code:

FileRead, OutputVar, F:\test.txt

if RegExMatch(OutputVar, "\<WORDS>.*?\</WORDS>")
    MsgBox, found
else
    MsgBox, not found

I know the code is reading the file correctly because a message box placed right after FileRead shows the correct file contents:

Code:

MsgBox % OutputVar

What am I doing wrong?


Post by sofista » 20 Jan 2022, 18:11

Looks like a file encoding issue. Try saving the "test.txt" file as UTF-8 with BOM.

Re: RegExMatch works in one scenario but not in another  Topic is solved

Post by teadrinker » 20 Jan 2022, 18:14

Try RegExMatch(OutputVar, "s)\<WORDS>.*?\</WORDS>")
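
A possible explanation, assuming test.txt uses Windows (CRLF) line endings: per the AutoHotkey RegEx documentation, the default newline setting is `r`n, so without options a dot will not cross a `r`n pair, although it does match a lone `n. A continuation section joins its lines with `n only, which would explain why the inline string matched and the file contents did not. The s) (DotAll) option lets the dot match newlines as well. A minimal sketch with invented sample strings (not the real file):

```autohotkey
; Sketch for AutoHotkey v1.1. The sample strings are invented for illustration.
lfText   := "<WORDS>`n<WORD/>`n</WORDS>"       ; LF only, like a continuation section
crlfText := "<WORDS>`r`n<WORD/>`r`n</WORDS>"   ; CRLF, like a typical Windows text file

; RegExMatch returns the position of the match, or 0 when nothing matches.
MsgBox % "LF, no options: "   RegExMatch(lfText,   "<WORDS>.*?</WORDS>")
MsgBox % "CRLF, no options: " RegExMatch(crlfText, "<WORDS>.*?</WORDS>")
MsgBox % "CRLF, with s): "    RegExMatch(crlfText, "s)<WORDS>.*?</WORDS>")
```

If this explanation is right, the first and third boxes should report a nonzero position and the second should report 0. Under the same CRLF assumption, reading the file with FileRead's *t option (which converts `r`n to `n) should be another way to make the original pattern match.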


Post by zvit » 20 Jan 2022, 18:14

It is already UTF-8. I didn't post the whole file here, but this is its first line:

<?xml version="1.0" encoding="utf-8" standalone="no"?>


Post by zvit » 20 Jan 2022, 18:19

teadrinker wrote: (20 Jan 2022, 18:14)
Try RegExMatch(OutputVar, "s)\<WORDS>.*?\</WORDS>")

That worked! I'm sure s) is documented on the RegEx options page, but could you tell me what it does?



Post by zvit » 20 Jan 2022, 18:25

This has now caused a different issue: the replacement also removes the <WORDS> and </WORDS> tags themselves. I only want to replace what's between them and keep the tags.



Post by zvit » 20 Jan 2022, 18:28

I see there what the 's' does, but it doesn't mention the parenthesis ')' after the 's'. How did you know what that does?


Post by zvit » 20 Jan 2022, 18:29

teadrinker wrote: (20 Jan 2022, 18:27)
How are you trying?

What do you mean? Just as before; I only added your s)


Post by zvit » 20 Jan 2022, 18:31

Here's the exact code I'm using:

Code:

    else
    {
        xlApp := ComObjCreate("Excel.Application")
        xlWB := xlApp.Workbooks.Open(FilePath,0,0)
        xlApp.Visible := false
        xlWS := xlApp.Sheets("USED")
        xlApp.Range("rng_HELPER_SORT").Sort(xlApp.Range("A1"),,,,,,,1) ; with header

        HtmlArr := xlApp.Range("rng_HTML")

        ; Collect the text of every cell, one per line
        for row in HtmlArr
            for cell in row
                data .= cell.Text "`n"

        ; Replace the <WORDS>...</WORDS> block in the template and write the result
        FileRead, OutputVar, %MyPath%003_13x13.cfp
        NewVal := RegExReplace(OutputVar, "s)\<WORDS>.*?\</WORDS>", data)
        FileAppend, %NewVal%, %MyPath%TestHTML.txt
    }


Post by teadrinker » 20 Jan 2022, 18:40

zvit wrote: the parentheses ')' after the 's'

Options wrote: At the very beginning of a regular expression, specify zero or more of the following options followed by a close-parenthesis. For example, the pattern im)abc ...

Try NewVal := RegExReplace(OutputVar, "s)<WORDS>\K.*?(?=</WORDS>)", data)
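
For anyone wondering how this keeps the tags: \K tells the engine to drop everything matched so far from the reported match, and (?=...) is a lookahead that checks for text without consuming it. Only the part between the tags is therefore treated as the match and replaced. A small sketch with invented sample strings (src and data are placeholders, not the real file or the Excel data):

```autohotkey
; Sketch for AutoHotkey v1.1 with hypothetical sample data.
src  := "<WORDS>`r`n<WORD id=""0""/>`r`n</WORDS>"
data := "<WORD id=""9""/>`n"

; <WORDS>\K      match the opening tag, then exclude it from the reported match
; .*?            the old rows (the s) option lets the dot span line breaks)
; (?=</WORDS>)   stop right before the closing tag without consuming it
result := RegExReplace(src, "s)<WORDS>\K.*?(?=</WORDS>)", data)
MsgBox % result
```

Note that the line break right after <WORDS> is part of what gets replaced here, so the new rows start on the same line as the opening tag.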


Post by zvit » 20 Jan 2022, 19:26

teadrinker wrote: (20 Jan 2022, 18:40)
Try NewVal := RegExReplace(OutputVar, "s)<WORDS>\K.*?(?=</WORDS>)", data)

Yes, that works. However, although it's not a big deal, I'm trying to keep things clean, and with this version the first row ends up on the same line as the <WORDS> tag, which needs to be on its own line. I tried adding `n and \`n to the code, but it didn't work. P.S. The closing </WORDS> tag IS on a new line, which is great.

Now it looks like this:

Code:

<WORDS><WORD dir="ACROSS"

I need it like this:

Code:

<WORDS>
<WORD dir="ACROSS"

Why is this RegEx so complicated?


Post by zvit » 20 Jan 2022, 19:35

I figured it out:

Code:

NewVal := RegExReplace(OutputVar, "s)(?=<WORDS>).*?(?=</WORDS>)\K", data)
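
A caveat on the pattern above, as far as I can tell: because \K sits at the very end, the reported match is empty and located just before </WORDS>, so data is inserted there and any rows already between the tags stay in place. If the goal is still to replace the old rows while keeping <WORDS> on its own line, one option is to keep the line break after the opening tag out of the match with \R (any newline sequence). A sketch with invented sample strings, assuming <WORDS> is always followed by a line break:

```autohotkey
; Sketch for AutoHotkey v1.1 with hypothetical sample data.
src  := "<WORDS>`r`n<WORD id=""0""/>`r`n</WORDS>"
data := "<WORD id=""9""/>`n"

; <WORDS>\R\K keeps the opening tag and its line break out of the match,
; so only the old rows between the tags are replaced by data.
result := RegExReplace(src, "s)<WORDS>\R\K.*?(?=</WORDS>)", data)
MsgBox % result
```

With this variant the pattern will not match when nothing separates <WORDS> from the next character, so it is only a fit for files laid out like the sample in the first post.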
