Hi all,
I am wondering how StrGet(addressA) figures out the string length when I don't specify it as a parameter of the function.
Does ahk use a string terminator? If so, is it the null-terminator mentioned in the help? (The help does not seem very clear to me, or I did not find the right page. For reference purposes: https://www.autohotkey.com/docs/commands/StrPutGet.htm)
Thanks for any helpful feedback. Regards, S.
Basic question about the string terminator in ahk Topic is solved
-
- Posts: 4331
- Joined: 29 Mar 2015, 09:41
- Contact:
Re: Basic question about the string terminator in ahk
Thx. And in the ANSI version, I assume it is a one-byte null character?
Besides, if I wrote the string to memory somehow without a null-terminator character at all, and I don't specify a length in StrGet, what happens then?
Re: Basic question about the string terminator in ahk
Right.
AHK will read bytes until the first null byte appears (in ANSI).
Re: Basic question about the string terminator in ahk
Thank u very much, teadrinker,
do u think it is different in the Unicode version since u wrote "(in ANSI)"?
Re: Basic question about the string terminator in ahk Topic is solved
In the unicode version AHK will read two bytes at a time until two zero bytes are encountered.
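To make the two replies above concrete, here is a minimal sketch for AHK v1.1. It is an illustration added for clarity, not code from the thread; the variable name buf and the 16-byte size are arbitrary. It assumes the buffer is zero-filled, so a null terminator is guaranteed to follow the text.

```autohotkey
; Assumption: a 16-byte, zero-filled buffer is large enough for "abc"
; plus its terminator in either build (4 bytes in ANSI, 8 bytes in Unicode).
VarSetCapacity(buf, 16, 0)

; Encoding omitted: the string is written in the build's native encoding
; (UTF-16 on Unicode builds, the system ANSI code page on ANSI builds).
StrPut("abc", &buf)

; Length omitted: StrGet reads until it meets the null character
; (one zero byte in ANSI, one zero 16-bit unit in Unicode).
MsgBox, % StrGet(&buf)

; Length given: StrGet reads at most 2 characters and does not
; depend on a terminator being present.
MsgBox, % StrGet(&buf, 2)
```

If a string was written to memory without a terminator, passing Length as in the last call is the way to keep StrGet from reading past the data.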
Re: Basic question about the string terminator in ahk
- ANSI and UTF-8 use 1 null byte as the null character.
- UTF-16 uses 2 null bytes (at an even offset) as the null character.
- StrGet/StrPut should work identically on AHK v1.1 ANSI and AHK v1.1 Unicode. [EDIT:] One exception is when Encoding is omitted: it is then UTF-16 in AHK Unicode, and CP0 in AHK ANSI.
- Based on some tests:
  - A script should load correctly with AHK v1.1 (ANSI or Unicode) as long as the script is ANSI, UTF-8 (with a BOM), or UTF-16 LE (with a BOM).
  - If AHK v1.1 ANSI tries to run a script that contains literal non-ASCII characters, those characters are converted to their 'best-fit' equivalent, e.g. square root to 'v', or otherwise to a question mark.
  - AHK v1.0 (AHK Basic) can only handle ANSI files. It can open a UTF-8 file with a BOM (it ignores the BOM), but it treats the file as though it's ANSI, e.g. square root to '√'.
Last edited by jeeswg on 20 Jun 2019, 21:29, edited 1 time in total.
homepage | tutorials | wish list | fun threads | donate
WARNING: copy your posts/messages before hitting Submit as you may lose them due to CAPTCHA
Re: Basic question about the string terminator in ahk
- Thanks teadrinker, I've added a clarification above.
- Here is a test demonstrating what happens when the Encoding parameter is omitted.
- (I've read through the StrPut/StrGet documentation multiple times, but haven't noticed any other AHK v1.1 Unicode/ANSI differences.)
Code: Select all
q:: ;test StrGet/StrPut (AHK v1.1 Unicode/ANSI)
VarSetCapacity(vData, 10*2, 0) ;20 zero-filled bytes
Loop 4
    NumPut(96+A_Index, &vData, A_Index-1, "UChar") ;write the bytes 0x61 0x62 0x63 0x64 ("abcd" in ANSI)
MsgBox, % StrGet(&vData) ;Encoding omitted: the bytes are read in the native encoding
VarSetCapacity(vData, 10*2, 0) ;reset the buffer to zeros
MsgBox, % StrPut("abcd", &vData) ;Encoding omitted: the string is written in the native encoding
MsgBox, % Format("0x{:08X}", NumGet(&vData, 0, "UInt")) ;show the first 4 bytes that were written
return
- There are some issues with the documentation:
  - Under 'Encoding', it half-implies that omitting the parameter means CP0, but actually omitting it means UTF-16 or CP0. (As omitting a parameter and using a blank string are commonly equivalent in functions.)
Quote: Specify an empty string or "CP0" to use the system default ANSI code page.
  - Also, people might expect Encoding to match A_FileEncoding, if Encoding is omitted.
  - Under 'Encoding', it should say something like:
    - If Encoding is not specified, it is UTF-16 (on Unicode versions) or CP0 (on ANSI versions).
  - The documentation does say this, but its meaning is not immediately apparent:
Quote: If no Encoding is specified, the string is simply measured or copied without any conversion taking place.
- I've mentioned the problem here:
Suggestions on documentation improvements - Page 29 - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=13&t=1434&p=281716#p281716
Re: Basic question about the string terminator in ahk
teadrinker wrote (20 Jun 2019, 16:05): In the unicode version AHK will read two bytes at a time until two zero bytes are encountered.
Of course, I meant "if both parameters (length and encoding) are omitted".
Re: Basic question about the string terminator in ahk
remarks wrote: Note that the String parameter of StrPut and return value of StrGet are always in the native encoding of the current executable, whereas Encoding specifies the encoding of the string written to or read from the given Address. If no Encoding is specified, the string is simply measured or copied without any conversion taking place.
No Encoding parameter means no conversion and the return value is in the native encoding, it is pretty clear imo.
Quote: omitting a parameter and using a blank string are commonly equivalent in functions.
Any example?
Quote: Also, people might expect Encoding to match A_FileEncoding, if Encoding is omitted.
FileEncoding wrote: Sets the default encoding for FileRead, FileReadLine, Loop Read, FileAppend, and FileOpen.
Cheers.
Re: Basic question about the string terminator in ahk
Quote: If no Encoding is specified, the string is simply measured or copied without any conversion taking place
IMO, not quite an accurate definition.
Code: Select all
str := "hello"
VarSetCapacity(buff, StrPut(str, "UTF-8"))
StrPut(str, &buff, "UTF-8")
MsgBox, % StrGet(&buff)
Re: Basic question about the string terminator in ahk
Hi teadrinker.
Your example works exactly as I expected, and the definition is accurate: there is no conversion when you omit the encoding parameter. (Edit: Disregarding that the string is not guaranteed to be null-terminated on a Unicode build, which could cause undefined behaviour.)
Cheers
Re: Basic question about the string terminator in ahk
Hi @Helgef
Yes, you are right, now I see.
Cheers
Re: Basic question about the string terminator in ahk
How the buffer is filled in the example doesn't matter. UTF-16 strings require a two-byte null to be zero-terminated, whereas UTF-8 requires only one byte. This means that if you try to interpret what was meant to be a UTF-8 string as UTF-16, the double zero isn't guaranteed.
Disregarding that,
return value wrote: If Length is exactly the length of the converted string, the string is not null-terminated; otherwise the returned count includes the null-terminator.
Re: Basic question about the string terminator in ahk
Got it. You were referring only to that particular example. I thought that was a general statement. Thx.
Re: Basic question about the string terminator in ahk
The general statement would be that you shouldn't specify the wrong encoding, either explicitly or by omitting the parameter.
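As a small illustration of that rule, here is a variant of the earlier "hello" example that names the same encoding on both sides. This is a sketch added for clarity, assuming AHK v1.1; only the "UTF-8" argument on the StrGet call differs from the example earlier in the thread.

```autohotkey
str := "hello"

; With an encoding and no address, StrPut returns the required size in
; characters, including the null terminator. For UTF-8 that is a byte count.
VarSetCapacity(buff, StrPut(str, "UTF-8"))

; Write the string as UTF-8, terminated by a single zero byte.
StrPut(str, &buff, "UTF-8")

; Read it back with the same encoding, so StrGet looks for a one-byte
; terminator instead of assuming the native encoding.
MsgBox, % StrGet(&buff, "UTF-8")
```

Because the buffer holds UTF-8, omitting Encoding in the last line would make a Unicode build interpret the bytes as UTF-16, which is the mismatch discussed above.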
Re: Basic question about the string terminator in ahk
[EDIT] Never mind. Found the answer in the next post on the next page.[/EDIT]
Help,
this test example with a 20-char string does not work right:
Code: Select all
string := "12345678901234567890" ;"TEST"
address := ""
StrPut(string, &address, 21)
InputBox, OutputVar, , , , , , , , , , % """" StrGet(&address) """"
The result I get in return is "12345678901234567890ݸȚ" (that is how it looks when I paste it here, although in the MsgBox it looked like in the picture)
and the script seems to crash as well (upon closing the dialog), since the tray icon does not go away until I hover the mouse over it (without clicking).
The returned string seems to be OK with 19 chars or fewer, but the tray icon still hangs.
The tray icon only disappears on its own with 12 chars or fewer.
Ideas, anyone?
Last edited by autocart on 21 Jun 2019, 19:34, edited 1 time in total.
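The answer the poster refers to is on a page not included here. One likely cause, offered only as an assumption, is that the variable address was never given any capacity, so StrPut writes past the end of its memory. A minimal sketch that reserves space first (the names chars and buf are illustrative, assuming AHK v1.1):

```autohotkey
string := "12345678901234567890"

; With no address, StrPut returns the required size in characters,
; including the null terminator.
chars := StrPut(string)

; Reserve that many characters: 2 bytes each on a Unicode build,
; 1 byte each on an ANSI build (A_IsUnicode is blank there).
VarSetCapacity(buf, chars * (A_IsUnicode ? 2 : 1), 0)

; Write at most chars characters into memory the script actually owns.
StrPut(string, &buf, chars)

MsgBox, % """" StrGet(&buf) """"
```

Sizing the buffer from StrPut's own return value avoids hard-coding 21 and keeps the script correct on both builds.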