if var in/contains comma-separated list/array

MrDoge
Posts: 151
Joined: 27 Apr 2020, 21:29

Re: if var in/contains comma-separated list/array

02 Aug 2021, 14:52

I'm making a v1 to v2 converter
https://github.com/FuPeiJiang/ahk_parser.js/issues/21

currently, in and contains are reserved for future use.

say I want to convert this:
if var in Mon,Tue,Wed,Thu,Fri,Sat,Sun
I would convert it(automatically) like this:
if (Map("mon",1,"tue",1,"wed",1,"thu1","fri",1,"sat",1,"sun",1).Has(StrLower(var))))
but this loses the point: it loses readability, AND loses Uppercase letters

(also hard to type) as you can see, in my github issue, I mistyped thu1
I've corrected it:
if (Map("mon",1,"tue",1,"wed",1,"thu",1,"fri",1,"sat",1,"sun",1).Has(StrLower(var)))
but this won't happen, because I'll be using script to convert it

I will suggest using this AND saving it to a variable for performance reasons
also, this helps explain what this Map() is about

Code: Select all

daysOfTheWeek := Map()
daysOfTheWeek.CaseSense := "Off"
;...
if (daysOfTheWeek.Has(var))
are there more performant or more readable ways ?
EDIT: I mixed up if var in with if var contains
Last edited by MrDoge on 03 Aug 2021, 11:33, edited 2 times in total.
swagfag
Posts: 6222
Joined: 11 Jan 2017, 17:59

Re: if var in/contains comma-separated list/array

02 Aug 2021, 16:09

  A. do nothing and leave it as is. the converted script is uncompilable, the user gets a syntax error and has to decide for themselves how to handle it(get rid of contains, reimplement it according to their needs, etc)
  B. perform the conversion using whatever mumbo jumbo (most likely incorrect, or at the very least only partially correct) one-liners people have come up with over the years(if u think strings are easy, ive got a bridge to sell u)
  C. perform the conversion using ur own (most likely incorrect, or at the very least only partially correct) contains_v2stub(), the implementation of which u paste into the converted script(disregarding the minuscule likelihood of encountering duped function names). bridge offer is still up btw
id say go with A
MrDoge
Posts: 151
Joined: 27 Apr 2020, 21:29

Re: if var in/contains comma-separated list/array

02 Aug 2021, 17:26

sorry, I've mistaken if var in for if var contains
the issue author on github typed if var in
"contains_v2stub(), the implementation of which u paste into the converted script"
error: maybe the user-defined function already exists
don't like: this would add more lines to tiny scripts
"one-liners people have come up with over the years"
well, I think my converted one-liner has equivalent functionality
I don't know what I did, but I Am taking into account double commas during parsing, so I can correctly convert by script

using a Map() for this functionality should be the most performant

I propose, convert it like this
if (Map("mon",1,"tue",1,"wed",1,"thu",1,"fri",1,"sat",1,"sun",1).Has(StrLower(var)))
then add something to the left to cause a syntax error
if (####Map("mon",1,"tue",1,"wed",1,"thu",1,"fri",1,"sat",1,"sun",1).Has(StrLower(var)))
like this, the user can copy paste the Map() which is difficult to type by hand, to somewhere else, to assign it to a variable
or remove the ####...

OR, do not cause syntax error, and inform the user, which parts of the script needs attention (not by using comments, it can ruin the script)
I can do this because the converter is on a website
some people may just want to convert a library, and it works, and use it (not caring about readability of the library)

option A is appealing, personally, I like to optimize my scripts
JoeSchmoe
Posts: 129
Joined: 08 Dec 2014, 08:58

Re: if var in/contains comma-separated list/array

02 Aug 2021, 19:14

Thanks for bringing up this important question.

I have a 3000 line script that I need to convert from v1 to v2beta. I've put it through @MrDoge's converter and now have a draft of the script in v2beta code.

As a user I love the idea that the converter would provide me with a suggested conversion, but would also draw my attention to it. For this reason I love your suggestion:
MrDoge wrote:
02 Aug 2021, 17:26

I propose, convert it like this
if (Map("mon",1,"tue",1,"wed",1,"thu",1,"fri",1,"sat",1,"sun",1).Has(StrLower(var)))
then add something to the left to cause a syntax error
if (####Map("mon",1,"tue",1,"wed",1,"thu",1,"fri",1,"sat",1,"sun",1).Has(StrLower(var)))
like this, the user can copy paste the Map() which is difficult to type by hand, to somewhere else, to assign it to a variable
or remove the ####...
I love optimizing my scripts, but when it comes to an automatic conversion, I just want something that has a chance of running the first time.

I've converted a couple of scripts just by manually doing control-f search and replace operations. The result wasn't optimized, but it was perfect because it started the process of me coding in v2. Now I code in v2 and much prefer it.

Bottom line: if we can just get code that runs, I think people will appreciate that. As a user, facing a complicated transition from a language I know reasonably well (v1) to a language I'm not familiar with (v2), I would want to minimize the intimidation factor.

Once we hook them into v2, they'll never look back. :D :twisted: :D

Here's another idea: replace "if var in Mon,Tue,Wed,Thu,Fri,Sat,Sun" with:
if (Map("mon",1,"tue",1,"wed",1,"thu",1,"fri",1,"sat",1,"sun",1).Has(StrLower(var))) ; #warn: consider optimizing this code.

As a user, I think that would be my favorite, because it might work immediately, but I can always go back and optimize it later.
kczx3
Posts: 1640
Joined: 06 Oct 2015, 21:39

Re: if var in/contains comma-separated list/array

02 Aug 2021, 19:33

You could try and convert it to something using the shorthand RegExMatch syntax of ~= and pipe-delimit the options.

if (var ~= "Mon|Tue|Wed|Thu|Fri|Sat|Sun")
MrDoge
Posts: 151
Joined: 27 Apr 2020, 21:29

Re: if var in/contains comma-separated list/array

02 Aug 2021, 21:45

this is the cleanest I've seen

this is the perfect equivalent in v2,
but I am one to care about performance..
disclaimer: you will likely never notice a difference

if only a syntax like this could be fastest, I think about python set: {"mon","tue","wed"}, which also uses a hash table, but without the cumbersome 1
I think AutoHotkey can store this in memory, and not recreate it every single time (pre-processing needed though)
it can know because in is a reserved word, what comes after should become a hash table stored in memory
but I wouldn't rely/bet on that coming

@JoeSchmoe
; #warn: consider optimizing this code.
what would you do if there was ALREADY a comment there ?
I think of options:
A. remove the comment
B. add this comment to the left
C. move the comment already there to the line below

I propose showing the suggestions on another textbox/textarea on the website
it will also show line number, column, you know what ? I should make a vscode extension, like that, you can easily go to the places
most importantly, you don't have to copy paste your code to the website AND copy paste back
and vscode has side-by-side diff: "compare function"
it can also read your files :) for #include
you never know what I'll do with them, I say: open source is useless if you can't understand the code, and my code is a mess
why am I inducing paranoia for no good reason? I have no solution, I install extensions without reading the code: for ex: GitLens, too much time/tl;dr
I can get a list of all network stuff, but how?
iseahound
Posts: 1427
Joined: 13 Aug 2016, 21:04

Re: if var in/contains comma-separated list/array

02 Aug 2021, 22:29

I can tell you that regex is extremely fast, and string operations are in general faster than objects. Use the RegEx suggestion with the case insensitive option.

If you were to benchmark your code, you'd only be hitting the regex cache anyways. Even though objects are slow, they are not noticeably slower. I understand you want to benchmark, however, for a v1 to v2 converter, readability is much more important for facilitating adoption.
MrDoge
Posts: 151
Joined: 27 Apr 2020, 21:29

Re: if var in/contains comma-separated list/array

03 Aug 2021, 12:11

the correct versions (case-insensitive) are:
if (var ~= "i)Mon|Tue|Wed|Thu|Fri|Sat|Sun")
vs
if (Map("mon",1,"tue",1,"wed",1,"thu",1,"fri",1,"sat",1,"sun",1).Has(StrLower(var)))

-----------------
I did the benchmark and found out that this regex is not equivalent to if var in
@kczx3 @iseahound

Code: Select all

var:="monday"
if var in Mon,Tue,Wed,Thu,Fri,Sat,Sun
  msgbox 1
else
  msgbox 2
;2

Code: Select all

if (var ~= "i)Mon|Tue|Wed|Thu|Fri|Sat|Sun")
  msgbox 1
else
  msgbox 2
;1
if var in checks whether var exactly equals one of the items
while this unanchored regex matches any substring, like InStr() (that is, like if var contains)

so I can't use this
-----------------
BUT, if you're still interested in the benchmarks :)

Code: Select all

ListLines 0
#SingleInstance force
SendMode "Input" ; Recommended for new scripts due to its superior speed and reliability.
SetWorkingDir A_ScriptDir ; Ensures a consistent starting directory.
#KeyHistory 0

;consider, true, false, lowercase, longer strings;
;consider loop x iterations

; var:="mon"
; var:="non"
; var:="MoN"
; var:="abcdefghijklmnopqrstuvwxyz"
; var:="monday" ;this will fail it
var:="sun"

Frequency := 0
Before := 0
After := 0
DllCall("QueryPerformanceFrequency", "Int64*", Frequency)
DllCall("QueryPerformanceCounter", "Int64*", Before)

QPC(1)
sleep 1000 ;0.988971
testN := QPC(0), QPC(1)

sum1:=0
loop 1234567 {
    if (var ~= "i)Mon|Tue|Wed|Thu|Fri|Sat|Sun") {
      sum1++
    }
}

test1 := QPC(0), QPC(1)

sum2:=0
loop 1234567 {
  if (Map("mon",1,"tue",1,"wed",1,"thu",1,"fri",1,"sat",1,"sun",1).Has(StrLower(var))) {
    sum2++
  }
}
test2 := QPC(0), QPC(1)

daysOfTheWeek:=Map()
daysOfTheWeek.CaseSense:="Off"
daysOfTheWeek.Set("mon",1,"tue",1,"wed",1,"thu",1,"fri",1,"sat",1,"sun",1)
sum3:=0
loop 1234567 {
  if (daysOfTheWeek.Has(var)) {
    sum3++
  }
}
test3 := QPC(0), QPC(1)

d "`"" var "`": " array_p([test1, test2, test3, sum1, sum2, sum3, testN])
; "mon": [0.43705080000000002, 3.2957676, 0.37818469999999998, 1234567, 1234567, 1234567, 0.99087840000000005]
; "sun": [0.50800420000000002, 3.3012972, 0.36907889999999999, 1234567, 1234567, 1234567, 1.0067090999999999]
; "non": [0.74503560000000002, 4.1980212000000003, 0.25033460000000002, 0, 0, 0, 1.0015638]
; "MoN": [0.4249423, 3.2777126000000001, 0.39381719999999998, 1234567, 1234567, 1234567, 0.99355260000000001]
; "abcdefghijklmnopqrstuvwxyz": [3.3660597999999999, 3.2118014000000001, 0.25318950000000001, 0, 0, 0, 0.98656829999999995]

Exitapp

QPC(R := 0)
{
    static P := 0, F := 0, Q := DllCall("QueryPerformanceFrequency", "Int64P", F)
    return ! DllCall("QueryPerformanceCounter", "Int64P", Q) + (R ? (P := Q) / F : (Q - P) / F) 
}

return

f3::Exitapp
interesting part here:
d "`"" var "`": " array_p([test1, test2, test3, sum1, sum2, sum3, testN])
; "mon": [0.43705080000000002, 3.2957676, 0.37818469999999998, 1234567, 1234567, 1234567, 0.99087840000000005]
; "sun": [0.50800420000000002, 3.3012972, 0.36907889999999999, 1234567, 1234567, 1234567, 1.0067090999999999]
; "non": [0.74503560000000002, 4.1980212000000003, 0.25033460000000002, 0, 0, 0, 1.0015638]
; "MoN": [0.4249423, 3.2777126000000001, 0.39381719999999998, 1234567, 1234567, 1234567, 0.99355260000000001]
; "abcdefghijklmnopqrstuvwxyz": [3.3660597999999999, 3.2118014000000001, 0.25318950000000001, 0, 0, 0, 0.98656829999999995]

"mon": regex is surprisingly fast
but saving Map() to variable is slightly faster
I realize that remaking the Map() every time is extremely slow
"sun": since |sun is the last |OR in the regex, it's slightly slower, while Map() is the same
"non": regex a bit slower when no match, Map() a bit faster!, I don't understand how
and somehow remaking Map() becomes a bit slower
"MoN": this is testing uppercase: I think there's no difference since to do case insensitive, you have to turn to lowercase anyways
"abcdefghijklmnopqrstuvwxyz": regex WAY slower, because it's like InStr() every possibility, while for Map() it's no match
"you'd only be hitting the regex cache anyways."
wdym ? can you explain ?
"readability is much more important for facilitating adoption."
can we get python set for this exact purpose ?
daysOfTheWeek:={"mon","tue","wed","thu","fri","sat","sun"}
or a built-in Map() creator function where every value is true or 1 ?
MapFill(true, "mon","tue","wed","thu","fri","sat","sun")
TrueMap("mon","tue","wed","thu","fri","sat","sun")
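Until something like that exists, a small user-defined helper can stand in for the proposed built-in. This is only a minimal sketch, assuming AutoHotkey v2 syntax. TrueMap here is a hypothetical user function named after the suggestion above, not an existing built-in.

```autohotkey
; Hypothetical helper (not a built-in): returns a case-insensitive Map
; whose keys are the given items and whose values are all true.
TrueMap(items*) {
    m := Map()
    m.CaseSense := "Off"      ; has to be set while the Map is still empty
    for item in items
        m[item] := true
    return m
}

; Build the lookup once, then reuse it for every check.
daysOfTheWeek := TrueMap("mon", "tue", "wed", "thu", "fri", "sat", "sun")

if (daysOfTheWeek.Has("MoN"))       ; key lookup ignores case
    MsgBox "MoN is in the list"
if (!daysOfTheWeek.Has("monday"))   ; whole-item match only, no substrings
    MsgBox "monday is not in the list"
```

Because the Map is stored in a variable, the cost of creating it is paid once rather than on every check, which is the case the benchmark's third test measures.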
guest3456
Posts: 3453
Joined: 09 Oct 2013, 10:31

Re: if var in/contains comma-separated list/array

03 Aug 2021, 12:18

MrDoge wrote:
03 Aug 2021, 12:11
this regex is not equivalent to if var in
some ideas in this old thread:

https://www.autohotkey.com/boards/viewtopic.php?f=37&t=23033

MrDoge
Posts: 151
Joined: 27 Apr 2020, 21:29

Re: if var in/contains comma-separated list/array

03 Aug 2021, 13:48

thank you

by HotKeyIt:
!!InStr(del list del, del item del)
how do I convert this ?
if var in hello,,world!,oof
note that ,, is escape sequence for literal comma
this works for "hello,world!", but also works for "world!", which is incorrect
InStr(del "hello,world!,oof" del, del "hello,world!" del) ;1
InStr(del "hello,world!,oof" del, del "world!" del) ;7
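One way around this is to split the list into items first, treating ,, as a literal comma, and only then build the lookup. The following is a rough sketch in AutoHotkey v2. SplitMatchList is a hypothetical name, and the handling of edge cases (three or more commas in a row, leading or trailing commas) is an assumption that may not match v1 exactly.

```autohotkey
; Hypothetical helper: splits a v1-style MatchList into an Array of items.
; A doubled comma ",," is taken as one literal comma inside the current item.
SplitMatchList(list) {
    items := []
    current := ""
    i := 1
    len := StrLen(list)
    while (i <= len) {
        ch := SubStr(list, i, 1)
        if (ch = ",") {
            if (SubStr(list, i + 1, 1) = ",") {
                current .= ","       ; escaped comma: keep one, skip both
                i += 2
                continue
            }
            items.Push(current)      ; single comma: the item ends here
            current := ""
        } else {
            current .= ch
        }
        i += 1
    }
    items.Push(current)              ; last item
    return items
}

; "hello,,world!,oof" splits into two items: "hello,world!" and "oof"
for item in SplitMatchList("hello,,world!,oof")
    MsgBox item
```

With the items separated this way, "world!" on its own is no longer a match, because it never becomes an item.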

by guest3456:
if RegExMatch(var, "^exe$|^bat$|^com$")
wow, that's some regex

Code: Select all

var:="monday"
if (var ~= "i)^Mon$|^Tue$|^Wed$|^Thu$|^Fri$|^Sat$|^Sun$")
  msgbox 1
else
  msgbox 2
; 2
perfect, we've got it

I will have to learn how to escape regex...
and the user too... "why does it have unintended results ?"

-----
I've thought about it: behold, my bias
inline Map() : those who don't care about performance won't notice,
those who care about performance will copy paste it to somewhere else to get best performance
and if they(or I) see someone else's code, it better be a Map() that they can copy to somewhere else than a regex, which takes more time to convert to Map() for best performance
those who care about readability should use a custom function, because this regex is also hard to read, and it gets worse with regex escape sequences
I will provide such function as a suggestion
---
regex is readable inline "^Mon$|^Tue$" vs Map shows the name of the group: daysOfTheWeek
regex is fast
---
here are my benchmarks:

Code: Select all

ListLines 0
#SingleInstance force
SendMode "Input" ; Recommended for new scripts due to its superior speed and reliability.
SetWorkingDir A_ScriptDir ; Ensures a consistent starting directory.
#KeyHistory 0

;consider, true, false, lowercase, longer strings;
;consider loop x iterations

; var:="mon"
; var:="sun"
; var:="non"
; var:="MoN"
; var:="abcdefghijklmnopqrstuvwxyz"
var:="monday" ;this will fail it

Frequency := 0
Before := 0
After := 0
DllCall("QueryPerformanceFrequency", "Int64*", Frequency)
DllCall("QueryPerformanceCounter", "Int64*", Before)

QPC(1)
sleep 1000 ;0.988971
testN := QPC(0), QPC(1)

sum1:=0
loop 1234567 {
    if (var ~= "i)^Mon$|^Tue$|^Wed$|^Thu$|^Fri$|^Sat$|^Sun$") {
      sum1++
    }
}

test1 := QPC(0), QPC(1)

sum2:=0
loop 1234567 {
  if (Map("mon",1,"tue",1,"wed",1,"thu",1,"fri",1,"sat",1,"sun",1).Has(StrLower(var))) {
    sum2++
  }
}
test2 := QPC(0), QPC(1)

daysOfTheWeek:=Map()
daysOfTheWeek.CaseSense:="Off"
daysOfTheWeek.Set("mon",1,"tue",1,"wed",1,"thu",1,"fri",1,"sat",1,"sun",1)
sum3:=0
loop 1234567 {
  if (daysOfTheWeek.Has(var)) {
    sum3++
  }
}
test3 := QPC(0), QPC(1)

d "`"" var "`": " array_p([test1, test2, test3, sum1, sum2, sum3, testN])
; "mon": [0.43705080000000002, 3.2957676, 0.37818469999999998, 1234567, 1234567, 1234567, 0.99087840000000005]
; "sun": [0.50800420000000002, 3.3012972, 0.36907889999999999, 1234567, 1234567, 1234567, 1.0067090999999999]
; "non": [0.74503560000000002, 4.1980212000000003, 0.25033460000000002, 0, 0, 0, 1.0015638]
; "MoN": [0.4249423, 3.2777126000000001, 0.39381719999999998, 1234567, 1234567, 1234567, 0.99355260000000001]
; "abcdefghijklmnopqrstuvwxyz": [3.3660597999999999, 3.2118014000000001, 0.25318950000000001, 0, 0, 0, 0.98656829999999995]

; "mon": regex is surprisingly fast
; but saving Map() to variable is slightly faster
; I realize that remaking the Map() every time is extremely slow
; "sun": since |sun is the last |OR in the regex, it's slightly slower, while Map() is the same
; "non": regex a bit slower when no match, Map() a bit faster!, I don't understand how
; and somehow remaking Map() becomes a bit slower
; "MoN": this is testing uppercase: I think there's no difference since to do case insensitive, you have to turn to lowercase anyways
; "abcdefghijklmnopqrstuvwxyz": regex WAY slower, because it's like InStr() every possibility, while for Map() it's no match

;FIXED REGEX
; "mon": [0.48089080000000001, 3.3009287, 0.3795789, 1234567, 1234567, 1234567, 1.0023203999999999]
; "non": [0.39596930000000002, 4.2182253000000003, 0.2509229, 0, 0, 0, 0.98815609999999998]
; "MoN": [0.4906201, 4.3829558000000004, 0.38037850000000001, 1234567, 1234567, 1234567, 1.0023477000000001]
; "abcdefghijklmnopqrstuvwxyz": [0.40325549999999999, 3.1940783000000001, 0.24873770000000001, 0, 0, 0, 0.99239500000000003]
; "monday": [0.45882659999999997, 3.1579256, 0.25448559999999998, 0, 0, 0, 0.9965735]
; "sun": [0.58207660000000006, 3.3259618999999998, 0.36763050000000003, 1234567, 1234567, 1234567, 0.99519139999999995]

Exitapp

QPC(R := 0)
{
    static P := 0, F := 0, Q := DllCall("QueryPerformanceFrequency", "Int64P", F)
    return ! DllCall("QueryPerformanceCounter", "Int64P", Q) + (R ? (P := Q) / F : (Q - P) / F) 
}

return

f3::Exitapp
;FIXED REGEX
; "mon": [0.48089080000000001, 3.3009287, 0.3795789, 1234567, 1234567, 1234567, 1.0023203999999999]
; "non": [0.39596930000000002, 4.2182253000000003, 0.2509229, 0, 0, 0, 0.98815609999999998]
; "MoN": [0.4906201, 4.3829558000000004, 0.38037850000000001, 1234567, 1234567, 1234567, 1.0023477000000001]
; "abcdefghijklmnopqrstuvwxyz": [0.40325549999999999, 3.1940783000000001, 0.24873770000000001, 0, 0, 0, 0.99239500000000003]
; "monday": [0.45882659999999997, 3.1579256, 0.25448559999999998, 0, 0, 0, 0.9965735]
; "sun": [0.58207660000000006, 3.3259618999999998, 0.36763050000000003, 1234567, 1234567, 1234567, 0.99519139999999995]
regex is never really slower than Map()
regex can be faster if the if var in list has fewer elements
while inline Map() is VERY slow

I have a bias against regex (regex=slow), I can't forget this is regex if I have to edit the list
kczx3
Posts: 1640
Joined: 06 Oct 2015, 21:39

Re: if var in/contains comma-separated list/array

03 Aug 2021, 14:28

if RegExMatch(var, "^exe$|^bat$|^com$")
I considered that approach as well which would work as long as what you're trying to match is only a single line string.

You could probably get rid of all the ^ and $ though.

Code: Select all

var := "tue"
if (var ~= "i)^(Mon|Tue|Wed|Thu|Fri|Sat|Sun)$")
  msgbox 1
else
  msgbox 2
; 1
MrDoge
Posts: 151
Joined: 27 Apr 2020, 21:29

Re: if var in/contains comma-separated list/array

03 Aug 2021, 15:19

omg regex madness, works
I'll add/make it a non-capturing group
"i)^(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)$"
JoeSchmoe
Posts: 129
Joined: 08 Dec 2014, 08:58

Re: if var in/contains comma-separated list/array

03 Aug 2021, 20:24

I'm glad you folks are able to think about this because it is waaaaayyyyy over my head. :facepalm: :facepalm: :bravo: :clap:
SOTE
Posts: 1426
Joined: 15 Jun 2015, 06:21

Re: if var in/contains comma-separated list/array

03 Aug 2021, 21:14

Sorry, but from a readability and user friendly perspective, the v2 version looks worse, not better. Guess that's just how it is.
vvhitevvizard
Posts: 454
Joined: 25 Nov 2018, 10:15
Location: Russia

Re: if var in/contains comma-separated list/array

04 Aug 2021, 04:04

In my humble opinion, readability and user-friendliness of V2 is MUCH BETTER compared to V1!!! Since I had moved from V1 to V2 ~5 years ago, I never thought to come back.
V1's syntax, weirdness and nonuniformity is atrocious.
MrDoge
Posts: 151
Joined: 27 Apr 2020, 21:29

Re: if var in/contains comma-separated list/array

04 Aug 2021, 11:37

ty @kczx3

I've benchmarked
"i)^Mon$|^Tue$|^Wed$|^Thu$|^Fri$|^Sat$|^Sun$"
"i)^(Mon|Tue|Wed|Thu|Fri|Sat|Sun)$"
"i)^(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)$"

Code: Select all

ListLines 0
#SingleInstance force
SendMode "Input" ; Recommended for new scripts due to its superior speed and reliability.
SetWorkingDir A_ScriptDir ; Ensures a consistent starting directory.
#KeyHistory 0

; https://www.autohotkey.com/boards/viewtopic.php?p=413490#p413507

;consider, true, false, lowercase, longer strings;
;consider loop x iterations

; var:="mon"
; var:="sun"
; var:="non"
; var:="MoN"
; var:="abcdefghijklmnopqrstuvwxyz"
var:="monday" ;this will fail it

Frequency := 0
Before := 0
After := 0
DllCall("QueryPerformanceFrequency", "Int64*", Frequency)
DllCall("QueryPerformanceCounter", "Int64*", Before)

QPC(1)
sleep 1000 ;0.988971
testN := QPC(0), QPC(1)

sum1:=0
loop 1234567 {
    if (var ~= "i)^Mon$|^Tue$|^Wed$|^Thu$|^Fri$|^Sat$|^Sun$") {
      sum1++
    }
}
test1 := QPC(0), QPC(1)

sum2:=0
loop 1234567 {
  if (var ~= "i)^(Mon|Tue|Wed|Thu|Fri|Sat|Sun)$") {
      sum2++
  }
}
test2 := QPC(0), QPC(1)

sum3:=0
loop 1234567 {
  if (var ~= "i)^(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)$") {
    sum3++
  }
}
test3 := QPC(0), QPC(1)

d "`"" var "`": " array_p([test1, test2, test3, sum1, sum2, sum3, testN])
; "mon": [0.47936200000000001, 0.47338550000000001, 0.48064449999999997, 1234567, 1234567, 1234567, 0.98830070000000003]
; "sun": [0.58661470000000004, 0.56225239999999999, 0.53599859999999999, 1234567, 1234567, 1234567, 0.98620229999999998]
; "non": [0.39592349999999998, 0.37366129999999997, 0.36871480000000001, 0, 0, 0, 1.0000568000000001]
; "MoN": [0.48930030000000002, 0.4619935, 0.46496920000000003, 1234567, 1234567, 1234567, 0.99696839999999998]
; "abcdefghijklmnopqrstuvwxyz": [0.3966017, 0.37425510000000001, 0.36768050000000002, 0, 0, 0, 0.99253190000000002]
; "monday": [0.44674700000000001, 0.43358439999999998, 0.41844819999999999, 0, 0, 0, 0.99075380000000002]

Exitapp

QPC(R := 0)
{
    static P := 0, F := 0, Q := DllCall("QueryPerformanceFrequency", "Int64P", F)
    return ! DllCall("QueryPerformanceCounter", "Int64P", Q) + (R ? (P := Q) / F : (Q - P) / F) 
}

return

f3::Exitapp
; "mon": [0.47936200000000001, 0.47338550000000001, 0.48064449999999997, 1234567, 1234567, 1234567, 0.98830070000000003]
; "sun": [0.58661470000000004, 0.56225239999999999, 0.53599859999999999, 1234567, 1234567, 1234567, 0.98620229999999998]
; "non": [0.39592349999999998, 0.37366129999999997, 0.36871480000000001, 0, 0, 0, 1.0000568000000001]
; "MoN": [0.48930030000000002, 0.4619935, 0.46496920000000003, 1234567, 1234567, 1234567, 0.99696839999999998]
; "abcdefghijklmnopqrstuvwxyz": [0.3966017, 0.37425510000000001, 0.36768050000000002, 0, 0, 0, 0.99253190000000002]
; "monday": [0.44674700000000001, 0.43358439999999998, 0.41844819999999999, 0, 0, 0, 0.99075380000000002]

pretty much no difference
grouping it even makes it faster

I found a function to escape regex : https://stackoverflow.com/questions/3561493/is-there-a-regexp-escape-function-in-javascript#3561711
OR I could wrap each match in \Q...\E : https://lexikos.github.io/v2/docs/misc/RegEx-QuickRef.htm#fundamentals
I also need to escape doubleQuote first: " -> `"

I'll convert to this by default, and suggest Map() for maximum performance..
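The \Q...\E route can be sketched like this in AutoHotkey v2. BuildInRegex is a hypothetical name, and this shows the idea at runtime rather than the converter's actual output. A converter that emits the pattern as a string literal would additionally have to escape double quotes, as noted above.

```autohotkey
; Hypothetical helper: builds an anchored, case-insensitive pattern that
; matches only when the whole haystack equals one of the items.
BuildInRegex(items) {
    pattern := ""
    for item in items {
        ; \Q...\E makes everything in between literal. A literal "\E" inside
        ; an item would end the quoting early, so it is closed, written as an
        ; escaped backslash plus E, and the quoting is reopened.
        ; The 4th parameter makes StrReplace case-sensitive, so "\e" is untouched.
        quoted := "\Q" StrReplace(item, "\E", "\E\\E\Q", true) "\E"
        pattern .= (A_Index > 1 ? "|" : "") quoted
    }
    return "i)^(?:" pattern ")$"
}

pattern := BuildInRegex(["Mon", "Tue", "a.b"])
; pattern is now: i)^(?:\QMon\E|\QTue\E|\Qa.b\E)$

if ("a.b" ~= pattern)
    MsgBox "a.b matches"
if !("axb" ~= pattern)
    MsgBox "axb does not match, the dot is literal"
```

As mentioned earlier in the thread, the ^...$ anchors assume a single-line value.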
nnnik
Posts: 4500
Joined: 30 Sep 2013, 01:01
Location: Germany

Re: if var in/contains comma-separated list/array

25 Sep 2021, 06:31

iseahound wrote:
02 Aug 2021, 22:29
I can tell you that regex is extremely fast, and string operations are in general faster than objects. Use the RegEx suggestion with the case insensitive option.

If you were to benchmark your code, you'd only be hitting the regex cache anyways. Even though objects are slow, they are not noticeably slower. I understand you want to benchmark, however, for a v1 to v2 converter, readability is much more important for facilitating adoption.
What makes you think that objects are slower?
In the backend the lookup operation of an object is a limited subset of what regex can do with a more optimized implementation.
So I do not see how you can assume that regex is faster.

On another note I want to tell anyone that does care that the Map far outperforms the RegExMatch method for larger (still small but in comparison to 7 entries) lookup sizes.
This is due to how the algorithms search.

Example script:

Code: Select all

var:="2000" ; last entry

QPC(1)
sleep 1000 ;0.988971
testN := QPC(0), QPC(1)

sum1:=0
regex := generateRegex()
loop 1234567 {
    if (var ~= regex) {
      sum1++
    }
}

test1 := QPC(0), QPC(1)

daysOfTheWeek:=Map()
daysOfTheWeek.CaseSense:="Off"
addToMap(daysOfTheWeek)
sum3:=0
loop 1234567 {
  if (daysOfTheWeek.Has(var)) {
    sum3++
  }
}
test3 := QPC(0), QPC(1)

Msgbox "RegexMatch produced " . sum1 " hits in " . test1 . " seconds"
      . "`nthe map produced " . sum3 " hits in " . test3 . " seconds"

Exitapp

QPC(R := 0)
{ ; Had to adapt some stuff here - not sure why - did I not install the latest version?
    static P := 0, F := 0, Q := DllCall("QueryPerformanceFrequency", "Int64*", &F)
    return ! DllCall("QueryPerformanceCounter", "Int64*", &Q) + (R ? (P := Q) / F : (Q - P) / F)
}

generateRegex() {
    s := "iS)^(" ;S does not appear to make a large difference
    loop (1999) {
        s .= A_Index . "|"
    }
    return s . "2000)$" ; no leading "|": the loop already appended one after 1999, and "||" would add an empty alternative
}

addToMap(mapToAddTo) {
    Loop 2000 {
        mapToAddTo.Set(A_Index . "", 1)
    }
}

return

f3::Exitapp
The results are:
RegexMatch produced 1234567 hits in 25.340122900000001 seconds
the map produced 1234567 hits in 0.21450569999999999 seconds
For such small lookups you usually don't need to care about the speed of such an operation.
Recommends AHK Studio
iseahound
Posts: 1427
Joined: 13 Aug 2016, 21:04

Re: if var in/contains comma-separated list/array

25 Sep 2021, 15:45

Sorry, can you use ^(?i:Mon|Tue|Wed|Thu|Fri|Sat|Sun)$ instead? Placing the case insensitive option as part of the non-capturing group makes it less dependent on AutoHotkey specific syntax, and looks cleaner.



"Even though objects are slow, they are not noticeably slower"
@nnnik I'm talking about the fact that the object overhead is greater than that of strings and regex. Both are fine, but for a converter... you are more likely to encounter one-off matching... and therefore the fixed overhead of calling Map() would be greater than a regex string literal. Likewise, hitting the regex cache is a good thing... in a converter a map object would have to be identified and saved in order to achieve efficiency... but if you use regex strings they are compiled and stored in the cache, resulting in cleaner overall code if a script relies on the same matches.

I also modified your benchmark script to run the tests side-by-side. This minimizes conditional branching (x86 processor optimization) and your map lookup time has doubled. I also removed the study option on the regex, it helps but it's not that noticeable. Since you put map creation outside the loop, you are not really benchmarking a map vs a regex.
code
Just to give you an idea of what is going on: A map lookup is a binary lookup of O(log n). A Regex search is linear search O(n). So here map wins.
But the cost of sorting your list via Map() is probably whatever the sort algorithm uses: at least O(n). A regex string literal is a fixed constant k. So here regex wins.
When I put the map creation inside the loop it took forever so i just exited the program.
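For the one-off case, a wrapper that keeps its Maps in a static cache would pay the creation cost only the first time a given list is seen, much like the regex cache does for patterns. This is only a sketch in AutoHotkey v2. InList is a hypothetical function name, and the plain StrSplit on commas ignores the ,, escape of v1 lists.

```autohotkey
; Hypothetical helper: checks whether value equals one item of a
; comma-separated list, building the lookup Map once per distinct list.
InList(value, list) {
    static cache := Map()            ; list string -> Map of its items
    if !cache.Has(list) {
        items := Map()
        items.CaseSense := "Off"     ; set before any key is added
        for item in StrSplit(list, ",")
            items[item] := true
        cache[list] := items
    }
    return cache[list].Has(value)
}

var := "Sun"
if InList(var, "Mon,Tue,Wed,Thu,Fri,Sat,Sun")
    MsgBox var " is in the list"
```

The trade-off is the one discussed above: the call site stays as short as the regex version, but the converted script has to carry the helper function.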

EDIT: I meant to say Branch Prediction instead of conditional branching. Got the terms mixed up.
Last edited by iseahound on 03 Oct 2021, 20:05, edited 1 time in total.
nnnik
Posts: 4500
Joined: 30 Sep 2013, 01:01
Location: Germany

Re: if var in/contains comma-separated list/array

25 Sep 2021, 17:49

@iseahound I don't think I will adapt my regex as most regexes are language specific so I don't see the point in even attempting at making that one common syntax.
"string operations are in general faster than objects"
Could you explain what makes you say that?
This clearly isn't about object creation - this is about normal string operations generally being faster than objects.
I do not deny that object creation in AutoHotkey is painfully slow.

I also doubt that most of the performance gain comes from the branch lookup; I think it comes from the caching instead - the branch prediction should still happen anyway for that particular piece of code.
I also think that most of the performance losses in your code stem from executing a DllCall every time you do a Map lookup.
If we do want to prevent that though we will have to try accessing random entries in the list.

Weren't there plans to switch to a hash based map or am I misremembering things?
Just for fun I scaled up the size of the entries in the map and had time increases that were essentially negligible, indicating that most of the execution time does not stem from the actual binary search but is mostly overhead from the interpreter.
Recommends AHK Studio
iseahound
Posts: 1427
Joined: 13 Aug 2016, 21:04
Contact:

Re: if var in/contains comma-separated list/array

25 Sep 2021, 18:39

^(?i:Mon|Tue|Wed|Thu|Fri|Sat|Sun)$ is for the person writing the interpreter.
