Atomhrt
Joined: 02 Sep 2004 Posts: 128 Location: Sunnyvale
|
Posted: Thu Apr 06, 2006 4:59 pm Post subject: |
|
|
| Laszlo wrote: | | You can use the CharChange function below, which is slower, but can replace a char with an arbitrary string. |
Cool. Thanks! _________________ I am he of whom he speaks! |
 |
corrupt
Joined: 29 Dec 2004 Posts: 2436
|
Posted: Thu Apr 06, 2006 8:58 pm Post subject: |
|
|
| A safer way would probably be to use the addresses returned for the Null characters to retrieve the results and do what needs to be done at that time and/or save each result to a separate variable. The function could be modified to create an array (well... a pseudo-array in AHK) containing the results and to output the number of results that were stored. That way you would have each result in a separate variable and know how many results were received. Instead of an array, the results could be stored in a ListView control or wherever needed. As the position of each Null character is known, the strings can be extracted and stored (maybe using lstrcpyn with DllCall) and/or used as needed. |
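The idea of using the known Null positions to collect each result and count them can be sketched in C. This is only an illustration under assumptions, not code from the thread: `struct segment` and `split_on_nul` are hypothetical names, and empty segments (consecutive Nulls) are skipped by choice.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: where one result starts and how long it is. */
struct segment {
    const char *start;
    size_t len;
};

/* Walks buf[0..size), storing one entry per non-empty run of bytes
   between NUL characters. Returns the number of entries stored
   (never more than max_out). */
size_t split_on_nul(const char *buf, size_t size,
                    struct segment *out, size_t max_out)
{
    size_t count = 0;
    size_t pos = 0;

    while (pos < size && count < max_out) {
        /* Bounded search for the next NUL, starting at pos. */
        const char *nul = memchr(buf + pos, '\0', size - pos);
        size_t len = nul ? (size_t)(nul - (buf + pos)) : size - pos;

        if (len > 0) {
            out[count].start = buf + pos;
            out[count].len = len;
            count++;
        }
        pos += len + 1; /* step past the NUL itself */
    }
    return count;
}
```

The caller gets both the individual results and their count, which is what the pseudo-array approach would provide in AHK.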
 |
Laszlo
Joined: 14 Feb 2005 Posts: 4078 Location: Pittsburgh
|
Posted: Thu Apr 06, 2006 9:10 pm Post subject: |
|
|
| corrupt wrote: | | A safer way would probably be to use the addresses returned for the Null characters to retrieve the results and do what needs to be done at that time and/or save each result to a separate variable. | What dangers do you see? The functions only manipulate as many characters as the DLL call sets, and the CharChange function only reads from there. If you replace all the NULs with `n, which should never occur in the result, you can follow up with any of the standard AHK commands, like StringSplit, Loop, Parse or StringReplace. It is flexible and intuitive, and you don't compute what is not needed. |
 |
corrupt
Joined: 29 Dec 2004 Posts: 2436
|
Posted: Thu Apr 06, 2006 10:11 pm Post subject: |
|
|
| Maybe I'm misunderstanding (I'll take a closer look at your code)... Is there an advantage to using the CharChange function over the method I used in the generic function I posted (other than changing the default character used) for replacing the null characters? At a glance, the CharChange function seems to loop through the resulting string character by character which doesn't look like it would be as efficient. |
 |
Laszlo
Joined: 14 Feb 2005 Posts: 4078 Location: Pittsburgh
|
Posted: Thu Apr 06, 2006 10:36 pm Post subject: |
|
|
| corrupt wrote: | | ...it would be as efficient. | You have to do some benchmarking to learn the relative speeds in typical applications. StringReplace goes over the characters in the buffer once, that is, its running time is proportional to the length of the buffer. Your code checks the length of the string, which requires a scan to the first NUL (internal to AHK). If the string length is less than the buffer size, it replaces the NUL and repeats. The beginning of the buffer is scanned again and again, which leads to a quadratic running time. For long buffers containing many NULs, this is slower.
But the main advantage of StringReplace or CharChange is their generality: you can use them for replacing other funny characters, too. I am not sure whether "StringReplace x, x, % Chr(c1), % Chr(c2), All" always works. |
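The repeated rescan described above can be made concrete with a small C sketch. It is not the code from the thread: `replace_nuls_restarting` is a hypothetical function that mimics "measure the string from the start, replace the NUL, repeat", and the `examined` counter is added purely to show how much scanning takes place.

```c
#include <stddef.h>

/* Replaces every NUL in buf[0..size) with `to`, but restarts the length
   scan from the beginning after each replacement. *examined receives the
   total number of bytes inspected. Returns the number of replacements. */
size_t replace_nuls_restarting(char *buf, size_t size, char to,
                               size_t *examined)
{
    size_t replaced = 0;
    *examined = 0;

    if (to == '\0')
        return 0; /* replacing NUL with NUL would never terminate */

    for (;;) {
        /* Bounded equivalent of strlen(buf), always from offset 0. */
        size_t len = 0;
        while (len < size && buf[len] != '\0')
            len++;

        /* Count the bytes read, including the NUL that stopped the scan. */
        *examined += (len < size) ? len + 1 : len;

        if (len >= size)
            break; /* no NUL left inside the buffer */

        buf[len] = to;
        replaced++;
    }
    return replaced;
}
```

For a 4-byte buffer consisting only of NULs, this sketch inspects 14 bytes, whereas a single pass would inspect 4; the gap grows roughly with the square of the number of NULs.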
 |
PhiLho
Joined: 27 Dec 2005 Posts: 6721 Location: France (near Paris)
|
Posted: Thu Apr 06, 2006 10:43 pm Post subject: |
|
|
CharChange loops over the buffer only once, changing chars on the fly.
Your loop seems efficient, but I think that internally (inside AutoHotkey), StrLen calls the C library's strlen, which loops over the string until it finds a zero byte. So you loop over the first part, replace the zero, loop again over the first part and then the second part, replace, loop over the first, second and third parts, and so on... Even if C is faster than an AHK loop, it is probably still more costly than Laszlo's code.
Again, I doubt a user will see a difference on a modern computer, but if you like micro-optimizations...
In C, you could have optimized your code by starting the StrLen at the last replacement position, but I am not sure if that is possible in AutoHotkey.
[EDIT] Laszlo typed faster than me.  _________________
vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2") |
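The optimization mentioned for C, resuming the scan at the last replacement position instead of at the start, might look like the sketch below. It is an illustration only: `replace_nuls_resuming` is a hypothetical name, and `memchr` stands in for `strlen` so that the search cannot run past the end of the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Replaces every NUL in buf[0..size) with `to`, resuming each search
   just after the previous replacement, so every byte is visited once.
   Returns the number of replacements made. */
size_t replace_nuls_resuming(char *buf, size_t size, char to)
{
    size_t pos = 0;
    size_t replaced = 0;

    while (pos < size) {
        char *nul = memchr(buf + pos, '\0', size - pos);
        if (nul == NULL)
            break; /* no further NULs in the buffer */

        *nul = to;
        replaced++;
        pos = (size_t)(nul - buf) + 1; /* continue after this position */
    }
    return replaced;
}
```

Because the search never revisits earlier bytes, the running time is proportional to the buffer length.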
 |
corrupt
Joined: 29 Dec 2004 Posts: 2436
|
Posted: Sat Apr 08, 2006 12:07 am Post subject: |
|
|
I was aware that the method I used scanned multiple times internally, but it does test faster with the files I used for testing. However, I needed to run both methods in large loops to see much of a difference, and actual usage probably wouldn't require multiple loops and will depend on the data being read, so it's probably more a matter of preference. Thanks for the answers. |
 |
corrupt
Joined: 29 Dec 2004 Posts: 2436
|
Posted: Sun Apr 09, 2006 1:10 am Post subject: |
|
|
After doing a bit more testing, I found a couple of ways to speed things up a bit more, for fun. Instead of using StrLen to search for the next NULL character from the beginning each time, lstrlen can be used with DllCall to specify a different starting address to search from. Surprisingly though, one of the largest speed increases came from putting quotes around the variable types in the DllCall lines. For example, using "UInt" seems to be approximately twice as fast as UInt. Here's a generic function for either replacing or removing NULL characters that seems quite fast. It only removes/replaces NULL characters though. My interest in this code is that it will also help to speed up CMDret functions.
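| Code: |
NullReplace(Byref StrInOut, StrSize=0, RepChar=0)
{
    ; NULLptr is the offset of the NULL character currently being handled
    NULLptr=0
    If (StrSize > 0)
        TRead = %StrSize%
    Else {
        ; No size given: process the variable's whole capacity
        TRead := VarSetCapacity(StrInOut)
        StrSize = %TRead%
    }
    ; Only do any work if a NULL occurs before the end of the data
    IF (StrLen(StrInOut) < TRead) {
        Repeat, %StrSize%
            ; Resume the search at the current offset rather than at the start
            NULLptr += DllCall("lstrlen", "UInt", (&StrInOut + NULLptr))
            IF (NULLptr = TRead)
                break
            If RepChar
                ; Overwrite the NULL with the replacement character
                DllCall("RtlFillMemory", "UInt",(&StrInOut + NULLptr), "UInt","1", "UChar",RepChar)
            Else {
                ; Remove the NULL by shifting the rest of the data left by one byte
                DllCall("RtlMoveMemory", "UInt", (&StrInOut + NULLptr), "UInt", (&StrInOut + NULLptr + 1), "Int", Tread - NULLptr)
                Tread --
            }
        EndRepeat
    }
    Return
}
|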
If StrSize isn't specified or is set to 0 the function will search through the entire variable (VarSetCapacity). If RepChar isn't specified or is set to 0 the function will remove the NULL characters instead of replacing them.
Edit: I had missed adding quotes around UInt in the lstrlen call, which now seems to add quite a bit more speed.
Last edited by corrupt on Sun Apr 09, 2006 3:36 am; edited 1 time in total |
 |
Laszlo
Joined: 14 Feb 2005 Posts: 4078 Location: Pittsburgh
|
Posted: Sun Apr 09, 2006 2:41 am Post subject: |
|
|
Nice! (I changed the <UInt>'s to <"UInt">'s in my version of Dippy46's analog clock, which speeds it up noticeably, as you say. Thanks!)
When removing the NULs you shift the whole tail of the string one position to the left, repeatedly, which makes this version still quadratic in running time. If you moved only the portion up to the next NUL, and remembered to skip over the garbage left behind, you could save a lot of memory move operations. I wonder if the added complexity is worth the speedup for long buffers. |
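Moving only the portion up to the next NUL, rather than the whole tail, can be sketched in C with separate read and write offsets. This is a hedged illustration, not a drop-in replacement for NullReplace: `remove_nuls` is a hypothetical name, and the caller is assumed to know the data length.

```c
#include <stddef.h>
#include <string.h>

/* Removes every NUL from buf[0..size) by compacting the data in place.
   Each run of non-NUL bytes is moved at most once, so the total work is
   proportional to size. Returns the new length of the data. */
size_t remove_nuls(char *buf, size_t size)
{
    size_t r = 0; /* read offset: next unprocessed byte */
    size_t w = 0; /* write offset: end of the compacted data */

    while (r < size) {
        /* Find the end of the current run of non-NUL bytes. */
        char *nul = memchr(buf + r, '\0', size - r);
        size_t run = nul ? (size_t)(nul - (buf + r)) : size - r;

        /* Move only this run; the tail beyond it is left untouched. */
        if (run > 0 && w != r)
            memmove(buf + w, buf + r, run);

        w += run;
        r += run + 1; /* skip the run and the NUL that ended it */
    }
    return w;
}
```

The bytes between the returned length and the original size are leftover garbage and should be ignored or overwritten by the caller.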
 |
corrupt
Joined: 29 Dec 2004 Posts: 2436
|
Posted: Sun Apr 09, 2006 3:53 am Post subject: |
|
|
Good point about the memory move operations. I'm a bit surprised that the speed is as good as it is, considering... |
 |
Chris Site Admin
Joined: 02 Mar 2004 Posts: 10480
|
Posted: Sun Apr 09, 2006 11:39 pm Post subject: |
|
|
| Thanks for mentioning the performance difference between a quoted "int" and an unquoted one. I'll check if that can be remedied. |
 |
PhiLho
Joined: 27 Dec 2005 Posts: 6721 Location: France (near Paris)
|
Posted: Mon Apr 10, 2006 5:52 am Post subject: |
|
|
Aha, I always used the version with double quotes, because it looked nicer in my editor... So it was a good deal too... _________________
vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2") |
 |
Chris Site Admin
Joined: 02 Mar 2004 Posts: 10480
|
Posted: Mon Apr 10, 2006 1:21 pm Post subject: |
|
|
I think the reason for the worse performance is that any empty variable is considered a possible environment variable, which causes a call to the OS's GetEnvironmentVariable() [which is slow in performance].
There is a plan to have a directive such as #EnvVar Off to turn off automatic fetching of environment variables, which should solve this (as most of you know, it will also solve other serious problems that have been raised). |
 |
PhiLho
Joined: 27 Dec 2005 Posts: 6721 Location: France (near Paris)
|
Posted: Mon Apr 10, 2006 1:26 pm Post subject: |
|
|
So these are really variables falling into the now-classic pitfall?
I suppose these variables are defined only when calling DllCall, because if I write MsgBox %UInt%/%Str%, I get empty strings... _________________
vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2") |
 |
Chris Site Admin
Joined: 02 Mar 2004 Posts: 10480
|
Posted: Mon Apr 10, 2006 1:54 pm Post subject: |
|
|
Edit: In v1.0.43.08+, if you use #NoEnv, the performance issue with DllCall is solved.
Older comments (somewhat obsolete):
When you use an unquoted type like "int" with DllCall, all the early stages of expression evaluation see it as an empty variable. Only when DllCall is actually called does it recognize these special variable names as what they were intended. This does not make variables like "int" into reserved names (keywords): they can still be used as though they're normal variables. In fact, you could assign a single character to each of them to solve the performance reduction mentioned earlier. |