RegExMatch help

JoeWinograd
Posts: 1353
Joined: 10 Feb 2014, 20:00

RegExMatch help

28 May 2019, 15:49

Hi Folks,

RegEx continues to elude me. I have a large string with an integer somewhere in it that is between two other strings. Here's an example:

<hold>17,293 Plans</hold>

That appears in the middle of a large string. Note that the integer has commas as the thousands separator. I wrote this code that finds where the integer begins:

Code: Select all

Prefix:="<hold>"
Suffix:=" Plans</hold>"
Needle:=Prefix . ".*" . Suffix
IntPos:=RegExMatch(LargeString,Needle)
Works well...finds the beginning position of the integer in the large string. Now I need to get the integer between the Prefix and Suffix into a variable. That's where I need help. Thanks much, Joe
sinkfaze
Posts: 613
Joined: 01 Oct 2013, 08:01

Re: RegExMatch help

28 May 2019, 15:54

Is there a reason that you can't use this?

Code: Select all

var=<hold>17,293 Plans</hold>
RegExMatch(var,"[\d`,]+",m)
MsgBox % m
jeeswg
Posts: 6904
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegExMatch help

28 May 2019, 16:02

The match variable/object holds the entire match. Use parentheses to capture partial matches (subpatterns), which are stored as 1, 2, 3, etc. (pseudo-array variables such as Match1, or object keys such as Match.1). E.g.:

Code: Select all

q:: ;RegExMatch with parentheses
LargeString := "abcdef<hold>17,293 Plans</hold>abcdef"
Prefix:="<hold>"
Suffix:=" Plans</hold>"
Needle:=Prefix . "(.*)" . Suffix

;to variables:
IntPos:=RegExMatch(LargeString,Needle,Match)
MsgBox, % Match
MsgBox, % Match1

;to an object (more forwards compatible):
IntPos:=RegExMatch(LargeString,"O)" Needle,Match)
MsgBox, % Match.0
MsgBox, % Match.1
return
Instead of parentheses, you could use \K and a look-ahead assertion (?=...):

Code: Select all

w:: ;RegExMatch with \K and a look-ahead assertion (?=...)
LargeString := "abcdef<hold>17,293 Plans</hold>abcdef"
Prefix:="<hold>\K"
Suffix:="(?= Plans</hold>)"
Needle:=Prefix . ".*" . Suffix

;to variables:
IntPos:=RegExMatch(LargeString,Needle,Match)
MsgBox, % Match

;to an object (more forwards compatible):
IntPos:=RegExMatch(LargeString,"O)" Needle,Match)
MsgBox, % Match.0
return
Last edited by jeeswg on 28 May 2019, 16:08, edited 2 times in total.
homepage | tutorials | wish list | fun threads | donate
WARNING: copy your posts/messages before hitting Submit as you may lose them due to CAPTCHA
JoeWinograd
Posts: 1353
Joined: 10 Feb 2014, 20:00

Re: RegExMatch help

28 May 2019, 16:05

Hi sinkfaze,
Can't use that because there are many other integers in the large string. However, there is only one occurrence of the Prefix and Suffix, and only one integer (with commas) between the Prefix and Suffix. That's the one it needs to find. Thanks, Joe
JoeWinograd
Posts: 1353
Joined: 10 Feb 2014, 20:00

Re: RegExMatch help

28 May 2019, 16:44

Hi jeeswg,
That doesn't work because it matches a string that begins with the Prefix but doesn't have the Suffix. For example, it fails on this:

LargeString:="12345<hold>123 <hold>17,293 Plans</hold>abcdef"

It must match on the integer (with commas) between the Prefix and Suffix. There may be other occurrences of PARTS OF the Prefix and Suffix, but there is only one occurrence of the complete Prefix and complete Suffix, and only one integer (with commas) between them...that's the one it must find. Thanks, Joe
jeeswg
Posts: 6904
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegExMatch help

28 May 2019, 16:51

Try [\d,]+ instead of .*.
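
A minimal sketch of why the character class helps, in the AutoHotkey v1 syntax used elsewhere in this thread. It reuses Joe's failing LargeString; the output variable names Greedy and Strict are made up for illustration.

```autohotkey
LargeString := "12345<hold>123 <hold>17,293 Plans</hold>abcdef"
Prefix := "<hold>"
Suffix := " Plans</hold>"

; .* accepts any character, so the match starts at the FIRST "<hold>"
; and the capture runs across the second one.
RegExMatch(LargeString, Prefix . "(.*)" . Suffix, Greedy)
MsgBox, % Greedy1   ; 123 <hold>17,293

; [\d,]+ accepts only digits and commas, so it cannot cross "<hold>".
; The attempt at the first "<hold>" fails and the second one matches.
RegExMatch(LargeString, Prefix . "([\d,]+)" . Suffix, Strict)
MsgBox, % Strict1   ; 17,293
```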
JoeWinograd
Posts: 1353
Joined: 10 Feb 2014, 20:00

Re: RegExMatch help

28 May 2019, 18:27

Hi jeeswg,
Thank you very much...[\d,]+ in there works a charm! While I have your attention, I can use your help on a similar issue. I have the same situation where there's a Prefix and a Suffix, but what is between them can be any string with any characters and any length (and it is inside quotes). I thought that * or .* would work as the needle, but they don't.

Using your code for variables:

Code: Select all

LargeString:="12345""Any string enclosed in quotes""abcde"
Prefix:="12345"
Suffix:="abcde"
Needle:=Prefix . "what is the Needle" . Suffix
IntPos:=RegExMatch(LargeString,Needle,Match)
MsgBox, % Match
MsgBox, % Match1
The RegExMatch needs to return this in Match1:

Any string enclosed in quotes

What is the Needle to achieve that? Thanks again, Joe
jeeswg
Posts: 6904
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegExMatch help

28 May 2019, 19:18

Are you referring to this issue: given PfxPfxPfx Needle SfxSfxSfx, you want the text after the last prefix and before the first suffix? I.e. you want 'Needle', not 'PfxPfx Needle SfxSfx'.

It might be something like: .*Pfx\K.*?(?=Sfx).
.* to get as much as you can, and thus after the last possible Pfx.
.*? to get as little as you can, ungreedy, and thus before the first possible Sfx.
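
As a rough illustration of that pattern (AutoHotkey v1 syntax; the haystack is the made-up PfxPfxPfx example from above, and the brackets in the message box are only there to make the surrounding spaces visible):

```autohotkey
Haystack := "PfxPfxPfx Needle SfxSfxSfx"

; .*Pfx   greedy: backs off only as far as the LAST "Pfx"
; \K      drops everything matched so far from the reported match
; .*?     lazy: stops at the FIRST place the look-ahead succeeds
; (?=Sfx) requires "Sfx" next without including it in the match
RegExMatch(Haystack, ".*Pfx\K.*?(?=Sfx)", Match)
MsgBox, % "[" . Match . "]"   ; [ Needle ]
```

Note that the spaces around Needle are part of the match here, because they sit between the last Pfx and the first Sfx.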

Two general points about needles:

Use the DotAll (s) option to handle CRs/LFs.
Regular Expressions (RegEx) - Quick Reference | AutoHotkey
https://autohotkey.com/docs/misc/RegEx-QuickRef.htm

And to make a RegEx needle literal:

Code: Select all

;simplest way to make a RegEx needle literal? - AutoHotkey Community
;https://autohotkey.com/boards/viewtopic.php?f=5&t=30420

vNeedle := "\Q" RegExReplace(vNeedle, "\\E", "\E\\E\Q") "\E"
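
A hedged usage sketch of that one-liner, in AutoHotkey v1 syntax. The wrapper name EscapeRegEx and the sample prefix/suffix "(total)" and "[end]" are hypothetical, chosen only because they contain characters that are special in RegEx.

```autohotkey
LargeString := "abc(total)17,293[end]def"

; Quote the literal parts, leave the middle as a real pattern.
Needle := EscapeRegEx("(total)") . "([\d,]+)" . EscapeRegEx("[end]")
RegExMatch(LargeString, Needle, Match)
MsgBox, % Match1   ; 17,293
return

; Hypothetical helper: wraps the text in \Q...\E so it is matched
; literally, and handles any \E already present in the text.
EscapeRegEx(str) {
    return "\Q" . RegExReplace(str, "\\E", "\E\\E\Q") . "\E"
}
```

Without the quoting, "(total)" would be read as a capturing group and "[end]" as a character class, so the needle would not match the literal text.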
JoeWinograd
Posts: 1353
Joined: 10 Feb 2014, 20:00

Re: RegExMatch help

28 May 2019, 19:43

I don't think literal needles are the issue.

I have a variable that contains what I'm calling a prefix. For example:

Prefix:="12345"

I have a variable that contains what I'm calling a suffix. For example:

Suffix:="abcde"

The combination of the prefix followed by the suffix (in a large string) is unique...occurs only once. I want the string that is between the prefix and the suffix. That string happens to be in quotes, but if RegExMatch can't easily remove the quotes, that's fine...I'll simply do Match1:=StrReplace(Match1,"""") (that will work because the string has no embedded quotes...only the opening and closing ones). Cheers, Joe
jeeswg
Posts: 6904
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegExMatch help

28 May 2019, 19:46

You may want this:

Code: Select all

Prefix:="12345"""
Suffix:="""abcde"

;RegEx uses \x22 for double quote also:
Prefix:="12345\x22"
Suffix:="\x22abcde"
JoeWinograd
Posts: 1353
Joined: 10 Feb 2014, 20:00

Re: RegExMatch help

28 May 2019, 20:06

So, what is the needle for this:

Code: Select all

LargeString:="12345""Any string enclosed in quotes""abcde"
Prefix:="12345"
Suffix:="abcde"
Needle:=Prefix . "what is the Needle" . Suffix
IntPos:=RegExMatch(LargeString,Needle,Match)
MsgBox, % Match
MsgBox, % Match1
For the integer, [\d,]+ works a charm as the needle. What is the needle for an arbitrary string (enclosed in quotes) between the prefix and suffix?

To make it clearer, here's some non-RegEx code:

Code: Select all

LargeString:="12345""Any string enclosed in quotes""abcde"
Prefix:="12345"
Suffix:="abcde"
BeginPos:=InStr(LargeString,Prefix)+StrLen(Prefix)+1
EndPos:=InStr(LargeString,"""",,BeginPos)
ExtractedString:=SubStr(LargeString,BeginPos,EndPos-BeginPos)
MsgBox % ExtractedString
Of course, that's not really a solution, because it will find the Prefix even if it is not followed by the Suffix (in fact, the Suffix isn't even used in the code). But it should give you the idea of what I'm looking for. Thanks, Joe
jeeswg
Posts: 6904
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegExMatch help

28 May 2019, 20:47

Perhaps:

Code: Select all

;this:
Needle:=Prefix . """(what is the Needle)""" . Suffix

;or this:
NeedleProper:="what is the Needle"
NeedleProper:="\Q" RegExReplace(NeedleProper, "\\E", "\E\\E\Q") "\E"
Needle:=Prefix . """(" NeedleProper ")""" . Suffix
JoeWinograd
Posts: 1353
Joined: 10 Feb 2014, 20:00

Re: RegExMatch help

28 May 2019, 21:32

Sorry, I'm having a difficult time explaining to you what I want. The "(what is the Needle)" text was just a comment for you...I was hoping to have that imply what I was looking for, but obviously it didn't. I'll try again.

I'm looking for a needle that matches an arbitrary string between the prefix and suffix. You gave me the correct needle to extract the integer between the prefix and suffix...it is ([\d,]+)...works great! What is the needle to extract an arbitrary string between the prefix and suffix? If it can also remove the quotes around the string, great; if not, that's fine...I'll remove the quotes with StrReplace after doing the RegExMatch. I hope that clarifies what I'm looking for. Regards, Joe
jeeswg
Posts: 6904
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegExMatch help

28 May 2019, 22:04

.* for 0 or more characters.
.+ for 1 or more characters.
And use the s option for DotAll, if you expect CRs/LFs.

You could play around with character classes, e.g. add more characters or \w to [\d,]+.
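
A small sketch of what the s option changes, in AutoHotkey v1 syntax and assuming AutoHotkey's default newline setting (`r`n). The haystack is made up and contains a CR/LF pair inside the quoted text; the variable names are for illustration only.

```autohotkey
LargeString := "12345""line one`r`nline two""abcde"
Needle := "12345\x22(.*)\x22abcde"

; Without s), the dot stops at the `r`n pair, so .* cannot reach
; the closing quote and nothing is found.
PosPlain := RegExMatch(LargeString, Needle, Plain)

; With s), the dot also matches `r and `n.
PosDotAll := RegExMatch(LargeString, "s)" . Needle, DotAll)

MsgBox, % "Without s: " . PosPlain . "`nWith s: " . PosDotAll . "`n" . DotAll1
```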
JoeWinograd
Posts: 1353
Joined: 10 Feb 2014, 20:00

Re: RegExMatch help

28 May 2019, 23:29

The first thing I tried a while ago was .* and it did not work, but now I realize that it's because I didn't include parentheses to identify it as a subpattern, so Match1 came back as null. Now it is working perfectly! Thanks for your help...much appreciated! Regards, Joe
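
For reference, a sketch of the kind of needle that resolution describes, in AutoHotkey v1 syntax. It combines the parentheses with the \x22 quote escape suggested earlier in the thread; placing the quotes outside the parentheses is an assumption about what was wanted, not necessarily the exact code Joe ended up with.

```autohotkey
LargeString := "12345""Any string enclosed in quotes""abcde"
Prefix := "12345"
Suffix := "abcde"

; \x22 is a double quote. Keeping it outside the parentheses leaves
; the quotes out of Match1, so no StrReplace is needed afterwards.
Needle := Prefix . "\x22(.*)\x22" . Suffix
IntPos := RegExMatch(LargeString, Needle, Match)
MsgBox, % Match1   ; Any string enclosed in quotes
```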