Strlen(str) and StrSplit(str).Length should be the same number but aren't for some strings Topic is solved

20170201225639

15 Aug 2023, 12:17

Code:

Str := ComObject("Scriptlet.TypeLib").Guid
Msgbox StrLen(Str) ; 40
Msgbox StrSplit(Str).Length ; 38


What's the reason for this? Is some hidden trimming operation being performed?

SKAN

15 Aug 2023, 12:38

20170201225639 wrote:
15 Aug 2023, 12:17
Is it because of any hidden trimming operation performed?
 
Seems like incorrect length!
 

Code:

#Requires AutoHotkey v2.0
#SingleInstance

Str := ComObject("Scriptlet.TypeLib").Guid
VarSetStrCapacity(&Str, -1)
MsgBox StrLen(Str) ; 38

20170201225639

15 Aug 2023, 13:28

SKAN wrote:
15 Aug 2023, 12:38
20170201225639 wrote:
15 Aug 2023, 12:17
Is it because of any hidden trimming operation performed?
 
Seems like incorrect length!
 

Code:

#Requires AutoHotkey v2.0
#SingleInstance

Str := ComObject("Scriptlet.TypeLib").Guid
VarSetStrCapacity(&Str, -1)
MsgBox StrLen(Str) ; 38

Interesting! So I guess the '40' result represents the internally-cached string length?

https://www.autohotkey.com/docs/v2/lib/VarSetStrCapacity.htm
Specify -1 for RequestedCapacity to update the variable's internally-stored string length to the length of its current contents. This is useful in cases where the string has been altered indirectly, such as by passing its address via DllCall or SendMessage. In this mode, VarSetStrCapacity returns the length rather than the capacity.


I tripped over this problem when converting my v1 script to v2. Both v1 and v2 report the StrLen of the initial GUID string as 40, which is 2 more than the expected 38 (36 characters plus the 2 curly brackets). However, in my v1 script I first StringLower-ed the initial string, which (unbeknownst to me) had the side effect of correcting the StrLen to 38, so when I then did SubStr(2,-1) to remove the brackets, I got the 36-character GUID.

In v2, however, StrLower-ing the initial GUID string does not update the internal length. So when I then do SubStr(2,-1) to try to remove the brackets, the result is very unexpected ...

This v1 code

Code:

❶ := ComObjCreate("Scriptlet.TypeLib").Guid
StringLower, ❷, ❶
❸ := SubStr(❷, 2, -1)
print_as_json([❶, StrLen(❶), ❷, StrLen(❷), ❸, StrLen(❸)])
produces:
[
"{A944D6AF-1B95-4D92-ABF4-91DED0B9757E}",
40,
"{a944d6af-1b95-4d92-abf4-91ded0b9757e}",
38,
"a944d6af-1b95-4d92-abf4-91ded0b9757e",
36
]

But this v2 code

Code:

❶ := ComObject("Scriptlet.TypeLib").Guid
❷ := StrLower(❶)
❸ := SubStr(❷, 2, -1)
print_as_json([❶, StrLen(❶), ❷, StrLen(❷), ❸, StrLen(❸)])
produces:
[
"{61123CC8-D58C-44B4-AE55-8A1DFF3DFD51}",
40,
"{61123cc8-d58c-44b4-ae55-8a1dff3dfd51}",
40,
"61123cc8-d58c-44b4-ae55-8a1dff3dfd51}", {{👈!}}
38
]

SKAN

15 Aug 2023, 13:51

20170201225639 wrote:
15 Aug 2023, 13:28
when I then do SubStr(2,-1) to remove the brackets, I get the 36 character GUID.
 
I would use Trim(), though even that doesn't work properly without VarSetStrCapacity().
 

Code:

#Requires AutoHotkey v2.0
#SingleInstance

Str := ComObject("Scriptlet.TypeLib").Guid
Str := Trim(Str, "{}")
MsgBox StrLen(Str) "`n" Str ; 39

Str := ComObject("Scriptlet.TypeLib").Guid
VarSetStrCapacity(&Str, -1)
Str := Trim(Str, "{}")
MsgBox StrLen(Str) "`n" Str ; 36

iseahound

15 Aug 2023, 14:58

The extra two characters at the end are two zeros. That's because a zero-terminated string was allocated. For some functions, you need those two extra zeros; for other functions you don't. It all depends on the specific function you're dealing with.

lexikos (reply marked as the solution)

17 Aug 2023, 04:24

It is not because the string is zero-terminated. It is because the .Guid property returns a string with two embedded zeros. The length is exactly as returned by SysStringLen.

IDispatch uses BSTR for all strings. BSTR includes the length of the string, and therefore supports embedded null characters. AutoHotkey uses this length exactly. It would not be safe to assume that two embedded zeroes at the end of the string were unintended. BSTR guarantees that there will be a null terminator not included in the length.

AutoHotkey v1 primarily relies on null-termination. Expression evaluation does not support temporary strings with embedded null characters, so the string is truncated.

AutoHotkey v2 uses counted strings throughout expression evaluation, variables, properties, array elements, etc. but also null-terminates them. There are still some things that do not use counted strings, like parameters of many functions, property names and map keys.

Trim(Str, "{}") shouldn't work, because "{}" does not include the null character. Similarly, Trim("{..}!!", "{}") should not remove "!!".

Perhaps Trim(Str, Chr(0)) should work, but it does not, because Trim doesn't support embedded null characters in the second parameter.
Note: Due to reliance on null-termination, many built-in functions and most expression operators do not support strings with embedded null characters, and instead read only up to the first null character. However, basic manipulation of such strings is supported; e.g. concatenation, ==, !==, Chr(0), StrLen, SubStr, assignments, parameter values and return.
Source: Concepts and Conventions | AutoHotkey v2
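
As a minimal sketch of the difference between counted and null-terminated handling (not taken from the posts above; the string below is a hypothetical stand-in for the GUID, built by hand with two trailing nulls), the lengths reported earlier in the thread can be reproduced in v2 without any COM object:

```autohotkey
#Requires AutoHotkey v2.0

; Hypothetical stand-in for the GUID: "{x}" followed by two embedded nulls.
s := "{x}" Chr(0) Chr(0)

counted := StrLen(s)           ; 5: StrLen uses the stored length, nulls included
split   := StrSplit(s).Length  ; 3: StrSplit reads only up to the first null
inner   := SubStr(s, 2, -1)    ; drops "{" and the LAST null, leaving "x}" plus one null

MsgBox counted "`n" split "`n" StrLen(inner)  ; 5, 3, 3
```

This mirrors the v2 result shown earlier, where SubStr(❷, 2, -1) left the closing bracket in place: the character removed from the end was a null, not the bracket.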

SKAN

17 Aug 2023, 07:12

lexikos wrote:
17 Aug 2023, 04:24
Trim(Str, "{}") shouldn't work, because "{}" does not include the null character. Similarly, Trim("{..}!!", "{}") should not remove "!!".
I see. Noted.
I believed it would work because the API function does work:
 

Code:

#Requires AutoHotkey v2.0
#SingleInstance

Str := ComObject("Scriptlet.TypeLib").Guid
DllCall("Shlwapi\StrTrimW", "str",Str, "str","{}")
MsgBox StrLen(Str) "`n" Str ; 36

20170201225639

17 Aug 2023, 07:34

Thanks for the explanations!

What's a recommended way to reliably get rid of any potentially present embedded zeros from strings, preferably a way that works for both v1 and v2?

Currently I just StrSplit() the string and then join the pieces back together, which seems to work in both v1 and v2.
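
A rough sketch of that split-and-rejoin approach in v2 syntax follows. StripByRejoin is a hypothetical helper name, not a built-in, and the test string is a hand-built stand-in for the GUID. A v1 version would presumably need the two-variable form of the for-loop (for i, ch in ...).

```autohotkey
#Requires AutoHotkey v2.0

; Hypothetical helper: rebuilds the string from the characters StrSplit returns.
StripByRejoin(str) {
    out := ""
    for ch in StrSplit(str)  ; StrSplit reads only up to the first null
        out .= ch
    return out
}

s := "{x}" Chr(0) Chr(0)     ; stand-in for the GUID with two trailing nulls
MsgBox StrLen(s) " -> " StrLen(StripByRejoin(s))  ; 5 -> 3
```

Note that this relies on StrSplit stopping at the first null character, so any text following an embedded null would be dropped as well, not only trailing nulls.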

I also found some relevant discussions here:
https://stackoverflow.com/questions/413367/outputting-a-guid-in-vbscript-ignores-all-text-after-it
I encountered the same problem. When I tried to print (FileAppend to stdout) the returned GUID to VS Code's debug console, I noticed a very strange behavior: only the first attempt sort of succeeds, and each subsequent attempt removes a bit of text from the console ...

SKAN

17 Aug 2023, 07:40

20170201225639 wrote:
17 Aug 2023, 07:34
What's a recommended way to reliably get rid of any potentially present embedded zeros from strings, preferably a way that works for both v1 and v2?
 
Pass -1 to:
V1: VarSetCapacity()
V2: VarSetStrCapacity()
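
A minimal v2 sketch of this, using a hand-built string as a hypothetical stand-in for the GUID (the v1 counterpart would be VarSetCapacity(s, -1), as listed above):

```autohotkey
#Requires AutoHotkey v2.0

s := "{x}" Chr(0) Chr(0)             ; stand-in for the GUID with two trailing nulls
before := StrLen(s)                  ; 5: stored length still counts the nulls
after  := VarSetStrCapacity(&s, -1)  ; recalculates the length from the contents and returns it

MsgBox before " -> " after " -> " StrLen(s)  ; 5 -> 3 -> 3
```

The recalculated length ends at the first null character, so this also discards anything that follows an embedded null in the middle of a string.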

lexikos

17 Aug 2023, 16:46

SKAN wrote:
17 Aug 2023, 07:12
I believed it would work because API function does work:
 
You aren't passing the string length to this API, only a pointer to a null-terminated string. The API doesn't trim null characters; it just doesn't consider them to be part of the string in the first place. If the string contains an embedded null character followed by some non-null characters, the latter will be lost as well.

Because you used the "str" type, DllCall expects that the string may be modified and adjusts the length after the call.
If the called function modifies the string and the argument is a naked variable or VarRef, its contents will be updated.
That isn't quite accurate: DllCall doesn't know whether the function modifies the string; it updates the variable unconditionally. It is assumed that this update has no effect if the function didn't modify the string, but that isn't necessarily the case.

Code:

z := Chr(0)
MsgBox StrLen(z)  ; 1
DllCall("MulDiv", "str", z, "int", 1, "int", 1)
MsgBox StrLen(z)  ; 0

lexikos

30 Aug 2023, 03:34

It didn't occur to me until now to mention that the reason for the inconsistency in the topic title is that StrSplit is one of the functions to which the quote from before applies.
Note: Due to reliance on null-termination, many built-in functions and most expression operators do not support strings with embedded null characters, and instead read only up to the first null character. However, basic manipulation of such strings is supported; e.g. concatenation, ==, !==, Chr(0), StrLen, SubStr, assignments, parameter values and return.
Source: Concepts and Conventions | AutoHotkey v2
In other words, StrLen is correct and StrSplit is not (but I'd call it a known limitation, rather than a bug).

Loop Parse shares some code with StrSplit and is subject to the same limitations.

These core string functions should be fixed to support embedded null characters consistently, but I doubt it will happen in the near future.
