Passing UTF-8 strings to AHK script Topic is solved

et2010
Posts: 18
Joined: 25 Jan 2017, 22:37

Passing UTF-8 strings to AHK script

14 Feb 2017, 12:47

I have an external app whose output is a UTF-8 string. I can check {output} with cmd:

Code:

 cmd /K echo {output} >> output.txt
When I open output.txt in Notepad.exe, it has UTF-8 encoding and the Unicode string displays correctly:

Code:

 D:\James\Dropbox\zim\data\Notebooks\Notes\Test_Page_中文
But when I pass that same {output} to an AHK script like this:

Code:

 AutoHotKey.exe test.ahk {output}
test.ahk:

Code:

tooltip, %1%
sleep 5000
return
I get a strange result:
[Image: snipaste20170215_013211.png]
As you can see, the last two Chinese characters are not showing correctly in the tooltip. Any idea what is causing this problem?

BTW, I'm using AHK 1.1.24.02 Unicode on Windows 8.1 64-bit with Simplified Chinese as the default locale.
Guest

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 13:11

I take it you use FileRead to read {output}? FileRead has a specific parameter (see the docs) for setting the encoding. I also assume you've saved your script as UTF-8 (Unicode as well); by default your editor may be set to ANSI, so it can't hurt to check.
tmplinshi
Posts: 1604
Joined: 01 Oct 2013, 14:57

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 13:33

Try:

Code:

FileRead, str, *P65001 output.txt
Run, AutoHotKey.exe test.ahk "%str%"
et2010
Posts: 18
Joined: 25 Jan 2017, 22:37

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 19:08

Guest wrote:I take it you use FileRead to read {output}? FileRead has a specific parameter (see the docs) for setting the encoding.
Sorry, I didn't make it clear. I didn't use FileRead to read {output}.
Guest wrote:I also assume you've saved your script as UTF-8 (Unicode as well); by default your editor may be set to ANSI, so it can't hurt to check.
Yes, my script is saved as UTF-8. I double-checked that.
tmplinshi wrote:Try:

Code:

FileRead, str, *P65001 output.txt
Run, AutoHotKey.exe test.ahk "%str%"
Thanks for the suggestion, but it's not what I want. Both output.txt and test.ahk exist only to demonstrate the issue. In the real situation, the string is just a parameter passed to my AHK script, the same script I use to send a message.

I hope I didn't make things more complicated, and thanks for your help.
et2010
Posts: 18
Joined: 25 Jan 2017, 22:37

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 19:32

To elaborate a bit, the external app I mentioned is actually Zim, a desktop wiki application. It has a Custom Tool feature that provides several parameters to pass to your custom command. In my case, the custom command is an AHK script.
et2010
Posts: 18
Joined: 25 Jan 2017, 22:37

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 21:15

OK, I found a way to reproduce this issue. Here is the script:

Code:

UTF8Str = 中文
tooltip, %UTF8Str%
sleep, 5000
return
Save the script as UTF-8 (without BOM) and run it. You should get something like this:
[Image: snipaste20170215_101443.png]
Last edited by et2010 on 14 Feb 2017, 22:10, edited 1 time in total.
qwerty12
Posts: 468
Joined: 04 Mar 2016, 04:33

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 21:41

et2010 wrote:save the script in UTF-8 (without BOM) and run
AFAIK, the BOM is required if you want the actual contents of the script to be handled as Unicode: https://autohotkey.com/board/topic/9171 ... /?p=578486

But, saying that, I'd be surprised if that's actually your issue. Presumably the Chinese characters are set on the command line by Zim. AutoHotkey calls the Unicode version of GetCommandLine which should return the Chinese characters properly assuming Zim starts AutoHotkey properly. I tested that theory by saving test.ahk as a plain ANSI file. It appears to work fine here - I grabbed the portable version of Zim, made sure the page name and the location of the notebook contained "中文" as per your post, added a new tool that runs AutoHotkey_L Unicode with test.ahk and %n after that and the tooltip appears to display correctly.
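
One way to check what the script actually received, independent of how the tooltip renders it, is to dump the code units of the first parameter. This is only a minimal sketch for an AHK v1.1 Unicode build; it assumes Format() is available (v1.1.17+), and the variable names are just for illustration:

```autohotkey
; Diagnostic sketch, assuming an AHK v1.1 Unicode build.
arg = %1%                 ; copy the first command-line parameter (legacy v1 syntax)
codes := ""
Loop, Parse, arg          ; one iteration per UTF-16 code unit
    codes .= Format("{:04X} ", Asc(A_LoopField))
MsgBox, % "Received: " arg "`nCode units: " codes
```

If the parameter ends in 中文 and arrives intact, the last two values should be 4E2D and 6587. Anything else suggests the text was already altered before the script started.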
et2010
Posts: 18
Joined: 25 Jan 2017, 22:37

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 21:52

qwerty12 wrote:
et2010 wrote:save the script in UTF-8 (without BOM) and run
AFAIK, the BOM is required if you want the actual contents of the script to be handled as Unicode: https://autohotkey.com/board/topic/9171 ... /?p=578486

But, saying that, I'd be surprised if that's actually your issue. Presumably the Chinese characters are set on the command line by Zim. AutoHotkey calls the Unicode version of GetCommandLine which should return the Chinese characters properly assuming Zim starts AutoHotkey properly. I tested that theory by saving test.ahk as a plain ANSI file. It appears to work fine here - I grabbed the portable version of Zim, made sure the page name and the location of the notebook contained "中文" as per your post, added a new tool that runs AutoHotkey_L Unicode with test.ahk and %n after that and the tooltip appears to display correctly.
This is weird, because it's not working here even when I save it as ANSI. Is this a system-specific problem? What is your code page in cmd?
et2010
Posts: 18
Joined: 25 Jan 2017, 22:37

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 21:58

@tmplinshi has also reproduced my issue
qwerty12
Posts: 468
Joined: 04 Mar 2016, 04:33

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 22:02

(I hope I'm on the right track here and not going off on a tangent, because I don't really have any other ideas...)

What does your actual Zim tool command look like? Mine is "C:\Program Files\AutoHotkey\AutoHotkey.exe" "C:\Users\Me\Desktop\New AutoHotkey Script.ahk" %n. I know that AutoHotkey.exe on my system is the 64-bit Unicode build. What happens if you try, say, AutoHotkeyU32.exe or AutoHotkeyU64.exe?
et2010 wrote:@tmplinshi has also reproduced my issue
Hmm, evidently I'm the odd one out here, then. My locale is set to "English (United Kingdom)" - maybe that's playing a part?
guest3456
Posts: 3463
Joined: 09 Oct 2013, 10:31

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 22:05

et2010 wrote:OK, I got a way to reproduce this issue, here is the script:

Code:

UTF8Str = 中文
tooltip, UTF8Str
sleep, 5000
return
save the script in UTF-8 (without BOM) and run. You should get something like this:
snipaste20170215_101443.png
you're missing the %%s to deref the tooltip var. but yes, it fails for me with no BOM. and WITH BOM, it works.

but this is as expected:

https://autohotkey.com/docs/Scripts.htm#cp

sounds like you just need to use the /CPn command line ahk option

maybe something like:

AutoHotKey.exe /CP65001 test.ahk {output}

et2010
Posts: 18
Joined: 25 Jan 2017, 22:37

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 22:07

@qwerty12, thanks very much for your reply.
I use exactly the same command as yours, and my autohotkey is 64bit too. So maybe I should try to switch my locale?
qwerty12
Posts: 468
Joined: 04 Mar 2016, 04:33

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 22:17

et2010 wrote:I use exactly the same command as yours, and my autohotkey is 64bit too. So maybe I should try to switch my locale?
Doing so might be a bit drastic. IMHO, if there is a problem, it's likely to be in Zim. What happens if you run test.ahk D:\James\Dropbox\zim\data\Notebooks\Notes\Test_Page_中文 from the command line manually? (With chcp saying my active code page is 850, the tooltip's contents are as expected - even though Command Prompt won't actually show the Chinese characters on my system) If it works there, you can probably rule out AutoHotkey being the problem
et2010
Posts: 18
Joined: 25 Jan 2017, 22:37

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 22:19

guest3456 wrote:you're missing the %%s to deref the tooltip var. but yes, it fails for me with no BOM. and WITH BOM, it works.
Thanks for the heads up.
guest3456 wrote:maybe something like:

AutoHotKey.exe /CP65001 test.ahk {output}
I didn't know about that option, thanks, but it still doesn't work.
Last edited by et2010 on 14 Feb 2017, 22:32, edited 2 times in total.
et2010
Posts: 18
Joined: 25 Jan 2017, 22:37

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 22:20

qwerty12 wrote:
et2010 wrote:I use exactly the same command as yours, and my autohotkey is 64bit too. So maybe I should try to switch my locale?
Doing so might be a bit drastic. IMHO, if there is a problem, it's likely to be in Zim. What happens if you run test.ahk D:\James\Dropbox\zim\data\Notebooks\Notes\Test_Page_中文 from the command line manually? (With chcp saying my active code page is 850, the tooltip's contents are as expected - even though Command Prompt won't actually show the Chinese characters on my system) If it works there, you can probably rule out AutoHotkey being the problem
It works as expected if I run it directly in cmd.
qwerty12
Posts: 468
Joined: 04 Mar 2016, 04:33

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 22:32

et2010 wrote:It works as expected if I run it directly in the cmd
Right, at this point my gut feeling says it's a Zim issue, but I note that in your first post, you say writing to a file instead works fine. What happens if you save this script to a file and get Zim to run that instead? Do the Chinese characters show up properly (they do here) in the MessageBox produced by cscript?
et2010
Posts: 18
Joined: 25 Jan 2017, 22:37

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 22:44

Still the same: the Chinese characters are not showing correctly here.
Attachments
[Image: snipaste20170215_114424.png]
qwerty12
Posts: 468
Joined: 04 Mar 2016, 04:33

Re: Passing UTF-8 strings to AHK script  Topic is solved

14 Feb 2017, 22:54

Thanks. To me, then, this appears to be a Zim issue, as cscript also shows the characters properly here when invoked from Zim with the %n specifier. I'm not 100% sure about that, but I also don't see why you should have to change your locale. If that actually is a possible workaround, I'm surprised I'm having no problems with my en_GB locale.

What version of Zim are you running? I've been using 0.65 here.

EDIT: Definitely a Zim issue. I had set the "Command does not modify data" option (I'm sorry for only bringing that up now; I forgot I had even done it). As soon as I untick it, I do not see the expected result at all.
4GForce
Posts: 553
Joined: 25 Jan 2017, 03:18

Re: Passing UTF-8 strings to AHK script

14 Feb 2017, 23:54

I think you guys are looking in the wrong place; the script file encoding has nothing to do with it.
et2010 wrote:And when I check the output.txt with Notepad.exe, it has UTF-8 encoding and the unicode string is showing correctly:
So the output was UTF-8! (You've read a UTF-8 value and written it to a UTF-8 file.)
et2010 wrote:OK, I got a way to reproduce this issue, here is the script:

Code:

UTF8Str = 中文
tooltip, %UTF8Str%
sleep, 5000
return
save the script in UTF-8 (without BOM) and run.
Hard-coding the symbols in a UTF-8 script file and sending them to a Unicode output is sure to fail (the tooltip is produced by Unicode AutoHotkey.exe in a Windows Unicode control).
If you want to test, you should read the value from a Unicode environment like the clipboard.

So, known facts: you're reading a UTF-8 value from outside the script and you want to output that value as Unicode...
The script file encoding has nothing to do with that; everything happens in memory, and the value is never written to that file.

I don't know how... but you must read that UTF-8 value, translate it to Unicode, then output it!
Last edited by 4GForce on 14 Feb 2017, 23:58, edited 1 time in total.
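
If the parameter really does arrive as UTF-8 bytes that were decoded with the system ANSI code page (an assumption, not something confirmed in this thread), one possible way to undo that in an AHK v1.1 Unicode build is to encode the string back to bytes with StrPut and then decode those bytes as UTF-8 with StrGet. ReinterpretAsUtf8 is a made-up helper name, and the round trip only recovers the text if the ANSI code page maps every character back to its original bytes:

```autohotkey
; Sketch only: assumes the text in %1% is UTF-8 that was misread as ANSI (CP0).
arg = %1%                               ; first command-line parameter (v1 syntax)
ToolTip, % ReinterpretAsUtf8(arg)
Sleep, 5000
ExitApp

; Hypothetical helper, not a built-in function.
ReinterpretAsUtf8(str) {
    ; Encode the misread text back into bytes using the ANSI code page...
    size := StrPut(str, "CP0")          ; required size, including the terminator
    VarSetCapacity(buf, size, 0)
    StrPut(str, &buf, size, "CP0")
    ; ...then decode those same bytes as UTF-8.
    return StrGet(&buf, "UTF-8")
}
```

If the parameter was passed correctly in the first place, this conversion would corrupt it instead, so it is worth checking the raw value first.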
