 |
AutoHotkey Community Let's help each other out
|
| View previous topic :: View next topic |
| Author |
Message |
jaco0646
Joined: 07 Oct 2006 Posts: 2300 Location: MN, USA
|
Posted: Sat Mar 21, 2009 12:54 am Post subject: "if Var in MatchList" is very slooooow |
|
|
[Moved from Ask for Help forum.]
Could the solution offered here be added to the official release?
Can someone explain why "if Var in MatchList" is so much slower than a parsing loop, when they should be doing the same thing? Actually, I would expect the parsing loop to be a bit slower since it references a third variable, A_LoopField, instead of just a needle and matchlist.
| Code: | SetBatchLines, -1
Loop, 100000
matchlist .= A_Index ","
needle = 100000
Start_Time := A_TickCount
If needle IN %matchlist%
{
Stop_Time := A_TickCount
time1 := (Stop_Time - Start_Time)/1000 " seconds."
}
Start_Time := A_TickCount
Loop, Parse, matchlist, `,
If (A_LoopField = needle)
{
Stop_Time := A_TickCount
time2 := (Stop_Time - Start_Time)/1000 " seconds."
}
MsgBox, The "IN" operator took:`t%time1%`nThe parsing loop took:`t%time2% |
On my laptop, the IN operator is more than 10 times slower (1.53 seconds versus 0.13)  _________________ http://autohotkey.net/~jaco0646/
Last edited by jaco0646 on Wed Mar 25, 2009 5:31 am; edited 2 times in total |
|
| Back to top |
|
 |
Guest
|
Posted: Sat Mar 21, 2009 3:37 am Post subject: |
|
|
I believe that the number of comaprisons required for the IN command is what is causing the extra time.
Look at it this way. Your needle is 588895 bytes long and contains 100000 ","s.
For the loop, it must do:
588895 searches for ","
and compare the previous 6 bytes (100000 compares - one compare for each found ",")
For the IN it must do:
588895 compares (one 6 byte compare at each character postion in matchlist) |
|
| Back to top |
|
 |
Frankie
Joined: 02 Nov 2008 Posts: 828
|
Posted: Sat Mar 21, 2009 3:39 am Post subject: |
|
|
I tested it and your right. A work around: | Code: | Fruits := "Apple,Orange,Banana"
If In("Apple", Fruits)
Msgbox In("Apple", Fruits) = True
Else
Msgbox In("Apple", Fruits) = False
If In("Cookie", Fruits)
Msgbox In("Cookie", Fruits) = True
Else
Msgbox In("Cookie", Fruits) = False
In(String, MatchList, D=",") {
Loop, Parse, matchlist, `,
If (A_LoopField = String)
Return, True
Return, False
} |
_________________ Click here to join #AHK channel in IRC for general chat and quick help. |
|
| Back to top |
|
 |
jaco0646
Joined: 07 Oct 2006 Posts: 2300 Location: MN, USA
|
Posted: Sat Mar 21, 2009 4:37 am Post subject: |
|
|
I guess my question was, "Is there any difference between the IN operator and a parsing loop?" Because I don't see one... whatever the IN operator is doing is wasteful: it's accomplishing the same task, but 10x slower. It's like a compact car with the fuel mileage of a Hummer. What's the point?
Shouldn't it be programmed to just perform a parsing loop internally if that's the faster method? _________________ http://autohotkey.net/~jaco0646/ |
|
| Back to top |
|
 |
Lexikos
Joined: 17 Oct 2006 Posts: 4726 Location: Australia
|
Posted: Sat Mar 21, 2009 5:08 am Post subject: |
|
|
They work differently, since if-in must support ,, to mean a literal comma. The actual work is done by IsStringInList() in util.cpp. I think the "wasteful" part might be:
| Code: | while (*this_field) // For each field in aList.
{
// To avoid the need to constantly check for buffer overflow (i.e. to keep it simple),
// just copy up to the limit of the buffer:
strlcpy(buf, this_field, sizeof(buf));
... | For each field in the list, it copies up to LINE_SIZE (16384+1) characters from the list itself. That is, it potentially copies from the beginning of the current field to the end of the list, including all fields after it. (If each field was exactly 16384 characters, there would be no wastage.)
Parsing loops, on the other hand, create a single copy of the entire input string once. This copy is referred to for A_LoopField, rather than copying each field/at each iteration.
Note that "10x slower" isn't really accurate, as it depends on the length of each field and the overall length of the list. On my system, your test shows it is "20x slower". |
|
| Back to top |
|
 |
Lexikos
Joined: 17 Oct 2006 Posts: 4726 Location: Australia
|
Posted: Sat Mar 21, 2009 5:34 am Post subject: |
|
|
Unless I have made an error,
| Code: | char *this_field = aList, *next_field, *cp;
while (*this_field) // For each field in aList.
{
// To avoid the need to constantly check for buffer overflow (i.e. to keep it simple),
// just copy up to the limit of the buffer:
strlcpy(buf, this_field, sizeof(buf));
// Find the end of the field inside buf. In keeping with the tradition set by the Input command,
// this always uses comma rather than g_delimiter.
for (cp = buf, next_field = this_field; *cp; ++cp, ++next_field)
{
if (*cp == ',')
{
if (cp[1] == ',') // Make this pair into a single literal comma.
{
memmove(cp, cp + 1, strlen(cp + 1) + 1); // +1 to include the zero terminator.
++next_field; // An extra increment since the source string still has both commas of the pair.
}
else // this comma marks the end of the field.
{
*cp = '\0'; // Terminate the buffer to isolate just the current field.
break;
}
}
}
if (*next_field) // The end of the field occurred prior to the end of aList.
++next_field; // Point it to the character after the delimiter (otherwise, leave it where it is). | may be replaced with | Code: | char *this_field = aList, *next_field, *cp, *end_buf = buf + sizeof(buf) - 1;
while (*this_field) // For each field in aList.
{
for (cp = buf, next_field = this_field; cp < end_buf && *next_field; ++cp, ++next_field)
{
if (*next_field == ',') // Check if this is a delimiter or literal comma.
{
++next_field;
if (*next_field != ',') // Was it a delimiter?
break;
// Otherwise next_field now points at the second comma, so continue copying.
}
*cp = *next_field;
}
// Terminate the string in the buffer.
*cp = '\0'; | to the same effect, but considerably faster in most cases. Actually, with fields of approximately 16300 characters, both if-in methods were roughly the same, and both were faster than Loop,Parse. I think short fields are probably more common. |
|
| Back to top |
|
 |
nick
Joined: 24 Aug 2005 Posts: 550 Location: Berlin / Germany
|
Posted: Sat Mar 21, 2009 1:30 pm Post subject: |
|
|
"If Var in" should be marked as obsolet:
| Code: | #NoEnv
SetBatchLines, -1
SetFormat, Float, 0.15
Loop, 100
matchlist .= A_Index . ","
needle = 50
T()
If needle IN %matchlist%
time1 := T() . " seconds."
T()
If InStr("," . Matchlist, "," . Needle . ",")
time2 := T() . " seconds."
T()
Loop, Parse, matchlist, `,
If (A_LoopField = needle) {
time3 := T() . " seconds."
break
}
MsgBox, The "IN" operator took:`t%time1%`nInStr() function took:`t%time2%`nThe parsing loop took:`t%time3%
ExitApp
T() {
Static freq, last_count
if !freq
DllCall("QueryPerformanceFrequency","int64*",freq)
DllCall("QueryPerformanceCounter","int64*",count)
return (count-last_count)/freq, last_count:=count
} |
_________________ nick  |
|
| Back to top |
|
 |
Lexikos
Joined: 17 Oct 2006 Posts: 4726 Location: Australia
|
Posted: Sat Mar 21, 2009 3:55 pm Post subject: |
|
|
It is pointless having 15 digits of "precision" when the "benchmark" results are inconsistent even with 5 digits. To be more accurate, repeat the test many times and average the result.
| Code: | #NoEnv
SetBatchLines, -1
SetFormat, Float, 0.7
i = 10000
Loop, 100
matchlist .= A_Index . ","
needle = 50
T()
Loop %i%
If needle IN %matchlist%
continue ; Do nothing.
time1 := T()/i . " seconds."
T()
Loop %i%
If InStr("," . Matchlist, "," . Needle . ",")
continue ; Do nothing.
time2 := T()/i . " seconds."
T()
Loop %i%
Loop, Parse, matchlist, `,
If (A_LoopField = needle)
break ; Found match, just break inner loop and continue benchmark.
time3 := T()/i . " seconds."
MsgBox, The "IN" operator took:`t%time1%`nInStr() function took:`t`t%time2%`nThe parsing loop took:`t%time3%
ExitApp | Anyway, what was your point? I already pointed out an optimization, which brings if-in close to the InStr method. If-in still has readability benefits, which is likely to be more important than performance in many cases. |
|
| Back to top |
|
 |
jaco0646
Joined: 07 Oct 2006 Posts: 2300 Location: MN, USA
|
Posted: Sun Mar 22, 2009 5:24 am Post subject: |
|
|
| Lexikos wrote: | | If-in still has readability benefits, which is likely to be more important than performance in many cases. | Absolutely, I would guess that readability is the main reason IF-IN was added to AHK at all. I just thought the performance gap between the two methods could be shortened, which your optimization proves. Thank you for weighing in on this, Lexikos; I had hoped you would. _________________ http://autohotkey.net/~jaco0646/ |
|
| Back to top |
|
 |
nick
Joined: 24 Aug 2005 Posts: 550 Location: Berlin / Germany
|
Posted: Sun Mar 22, 2009 9:44 am Post subject: |
|
|
Well, so keep it, I don't need it.  _________________ nick  |
|
| Back to top |
|
 |
hugov
Joined: 27 May 2007 Posts: 3273
|
Posted: Mon Mar 23, 2009 1:10 pm Post subject: |
|
|
Thanks jaco0646, I noticed IF IN Matchlist was slow for some of the
scripts I made, but never considered to change to InStr, it really
is much much faster now I changed it. _________________ Tut 4 Newbies
TF : Text file & string lib, TF Forum |
|
| Back to top |
|
 |
Lexikos
Joined: 17 Oct 2006 Posts: 4726 Location: Australia
|
Posted: Thu Mar 26, 2009 11:52 am Post subject: |
|
|
| Anyone interested in testing my optimization may download the latest revision (currently 22) of AutoHotkey_L. |
|
| Back to top |
|
 |
hugov
Joined: 27 May 2007 Posts: 3273
|
Posted: Fri Mar 27, 2009 10:52 am Post subject: |
|
|
| Lexikos wrote: | | Anyone interested in testing my optimization may download the latest revision (currently 22) of AutoHotkey_L. | Yep, If IN Matchlist is now as fast as InStr for the scripts I've tested it with. _________________ Tut 4 Newbies
TF : Text file & string lib, TF Forum |
|
| Back to top |
|
 |
Chris Site Admin
Joined: 02 Mar 2004 Posts: 10666
|
Posted: Wed Apr 15, 2009 6:45 pm Post subject: |
|
|
| This change has been applied to v1.0.48.01. Thanks for the code and the testing. |
|
| Back to top |
|
 |
|
|
You can post new topics in this forum You can reply to topics in this forum
|
Powered by phpBB © 2001, 2005 phpBB Group
|