Regex match returns 1 when the variable doesn't even exist.


9 replies to this topic

#1 Honest Abe

Honest Abe
  • Members
  • 307 posts

Posted 28 March 2012 - 03:34 AM

I just noticed that if you pass a nonexistent variable as RegExMatch's second parameter, it returns 1! What am I not understanding?

string := "this is a test"

f1::
result := string ~= z
traytip,, % result
Return

aka

f1::
result := RegExMatch(string, z)
traytip,, % result
Return


#2 sinkfaze

sinkfaze
  • Moderators
  • 6089 posts

Posted 28 March 2012 - 04:07 AM

Where's the first place you're going to find nothing in a string? Position 1.

#3 Honest Abe

Honest Abe
  • Members
  • 307 posts

Posted 28 March 2012 - 04:12 AM

Hmmm. It also seems to find "t" at position 1.
Help me understand that.

#4 sinkfaze

sinkfaze
  • Moderators
  • 6089 posts

Posted 28 March 2012 - 05:32 AM

AHK distinguishes between when it's looking for "nothing" and looking for "something", so you can find both "nothing" and "something" in the same place depending on how you look for it.

#5 Honest Abe

Honest Abe
  • Members
  • 307 posts

Posted 28 March 2012 - 06:28 AM

To be clear, does this behavior exist simply for checking if a string is empty?
e.g.
f4::traytip,, % (var = "") ? ("empty") : ("not empty")

Or is there any other reason you know of?
Thanks for the help.

#6 Lexikos

Lexikos
  • Administrators
  • 8853 posts

Posted 28 March 2012 - 12:13 PM

It has little to do with AutoHotkey and nothing to do with the = comparison operator. An empty string, when compiled as a regex pattern, will match exactly zero characters at whatever position you attempt to match it. Think of it this way: For any position n in any string, the next 0 characters are always the same.
MsgBox % "pos " RegExMatch("abc", "", m, 1) " len " StrLen(m)
MsgBox % "pos " RegExMatch("abc", "", m, 2) " len " StrLen(m)
There are other ways that an empty string can match, such as the * and ? quantifiers. For example, "a*" will match zero or more of the character 'a'. There are also ways that a pattern can match zero characters at one position, but not another. For example, "(?=a)" matches zero characters, but only at a position where "a" matches.
MsgBox % "pos " RegExMatch("abc", "a*", m) " len " StrLen(m)
MsgBox % "pos " RegExMatch("abc", "d*", m) " len " StrLen(m)
MsgBox % "pos " RegExMatch("abc", "(?=a)", m) " len " StrLen(m)
MsgBox % "pos " RegExMatch("abc", "(?=b)", m) " len " StrLen(m)
The regex syntax and matching behaviour are defined by PCRE, with very little customization for AutoHotkey.

... if you pass a nonexistent variable ...

Of course, you can't pass something that doesn't exist. The variable does exist, and it contains an empty string.
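
For readers without AutoHotkey at hand, the zero-length matches above can be reproduced in Python as a rough sketch. It assumes Python's re module agrees with PCRE on these simple patterns (it is a different engine), and the helper name pos_and_len is made up for illustration. Python positions are 0-based, while RegExMatch positions are 1-based.

```python
import re

def pos_and_len(haystack, pattern, start=0):
    """Return (0-based position, match length) of the first match at or after start."""
    m = re.compile(pattern).search(haystack, start)
    return (m.start(), len(m.group())) if m else None

# The empty pattern matches zero characters wherever the search begins.
print(pos_and_len("abc", "", 0))
print(pos_and_len("abc", "", 1))

# Quantifiers may match zero characters too.
print(pos_and_len("abc", "a*"))   # one 'a' is available at the start
print(pos_and_len("abc", "d*"))   # no 'd', so zero characters match

# A lookahead matches zero characters, but only where its condition holds.
print(pos_and_len("abc", "(?=a)"))
print(pos_and_len("abc", "(?=b)"))
```

Adding 1 to each position gives the value RegExMatch would report for the same haystack and pattern.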

#7 rtcvb32

rtcvb32
  • Members
  • 542 posts

Posted 28 March 2012 - 12:32 PM

I remember 'Mastering Regular Expressions' commenting on this very thing, where an expression meant to match a number in its regular, floating-point, and scientific formats was accepted. Let's see if I can put it here. I've dropped unneeded parts from the text.

I once saw in a book by a respected author, in which he describes a regular expression to match a number, either integer or floating point.

His regex is '-?[0-9]*\.?[0-9]*'.

However, do you think it matches in a string like 'this has no number'? Look closely. Everything is optional. Nothing is required. This regex can match all non-number examples, matching the nothingness at the beginning of the string each time.
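
A small Python sketch of the point in that quote: the all-optional pattern happily matches nothing at the start of a string with no number in it. The STRICT pattern is only one possible tightening, written for this illustration and not taken from the book; it requires at least one digit.

```python
import re

# The pattern from the quote: every element is optional.
LOOSE = r'-?[0-9]*\.?[0-9]*'

# Hypothetical stricter variant (an assumption, not from the book):
# digits with an optional fraction, or a fraction that starts with a dot.
STRICT = r'-?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)'

text = 'this has no number'

loose = re.search(LOOSE, text)
print(repr(loose.group()), loose.start())   # empty match at the very start

print(re.search(STRICT, text))              # no match at all
print(re.search(STRICT, 'pi is 3.14').group())
```

With the loose pattern, a check such as "did it match?" is always true, so the matched text has to be inspected as well.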



#8 Honest Abe

Honest Abe
  • Members
  • 307 posts

Posted 28 March 2012 - 12:53 PM

I appreciate the help!

#9 just me

just me
  • Members
  • 1175 posts

Posted 28 March 2012 - 01:02 PM

It's not a RegEx issue, InStr() is yielding the same result: 1.
Haystack := "ABC"
Needle := ""
MsgBox, % InStr(Haystack, Needle)
The search starts at position 1 of Haystack. StrLen(Needle) characters (i.e. 0 characters for an empty needle) of Haystack are compared with the (0) characters in the empty Needle, so the result is "equal" and position 1 will be returned.
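
The comparison described above can be sketched in Python as a naive substring search. This is only an illustration of the reasoning, not AutoHotkey's actual InStr() implementation, and the function name find_first is made up. It returns 1-based positions and 0 for "not found" to mirror InStr().

```python
def find_first(haystack, needle):
    """Naive search: return the 1-based position of needle, or 0 if absent."""
    n = len(needle)
    for pos in range(len(haystack) - n + 1):
        # Compare the next n characters of haystack with needle.
        # When n is 0, both sides are empty strings, so they are equal.
        if haystack[pos:pos + n] == needle:
            return pos + 1
    return 0

print(find_first("ABC", ""))    # empty needle compares equal at the first position
print(find_first("ABC", "C"))
print(find_first("ABC", "D"))
```

Python's built-in "ABC".find("") behaves the same way, apart from counting from 0.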

#10 Honest Abe

Honest Abe
  • Members
  • 307 posts

Posted 28 March 2012 - 10:56 PM

It's not a RegEx issue, InStr() is yielding the same result: 1.


It would be more accurate to say that the issue is not confined to RegEx.

I can confirm the RegEx behavior using Python:

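import re

var = ""  # empty pattern, like the unset AutoHotkey variable
test = "this is a test"

# re.match only tries to match at the start of the string
result = re.match(var, test)
print(result.start())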

Output:
0

Python counts positions from 0, and re.match returns None instead of a number when there is no match.
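
To line the two conventions up, here is a minimal sketch of a wrapper that reports results the way RegExMatch does: a 1-based position, or 0 when nothing matches. The name regex_match_pos is hypothetical. It uses re.search rather than re.match, because RegExMatch looks for a match anywhere in the string, not only at the start.

```python
import re

def regex_match_pos(haystack, pattern):
    """Return the 1-based position of the first match, or 0 if there is none."""
    m = re.search(pattern, haystack)
    return m.start() + 1 if m else 0

test = "this is a test"
print(regex_match_pos(test, ""))    # empty pattern
print(regex_match_pos(test, "t"))   # first character of the string
print(regex_match_pos(test, "z"))   # character that does not occur
```

The empty pattern and "t" both report position 1 here, which is the same pair of results seen earlier in the thread.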