How to detect Chinese input

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
woshichuanqilz72
Posts: 117
Joined: 05 Oct 2015, 21:23

How to detect Chinese input

20 Oct 2015, 06:25

I can use the Input command to detect English text typed by the user, but I want to capture Chinese input as it is typed. What should I do?
Image
As you can see, the input is first held by the input method's panel.

What I want is to make the PC speak the Chinese text aloud as I type it.
Last edited by woshichuanqilz72 on 21 Oct 2015, 04:17, edited 1 time in total.
Heezea
Posts: 59
Joined: 30 Sep 2013, 21:33

Re: How to detect Chinese input

20 Oct 2015, 08:13

I'd try a RegExMatch against the input. I found some reference material for the \P{Han} character parameter here: http://www.pcre.org/pcre.txt

Code: Select all

Test := "键"
Test := RegExMatch(Test, "\P{Han}")
MsgBox, 0x40000,, % Test, 2
ExitApp
woshichuanqilz72
Posts: 117
Joined: 05 Oct 2015, 21:23

Re: How to detect Chinese input

21 Oct 2015, 04:17

Heezea wrote:I'd try a RegExMatch against the input. I found some reference material for the \P{Han} character parameter here: http://www.pcre.org/pcre.txt

Code: Select all

Test := "键"
Test := RegExMatch(Test, "\P{Han}")
MsgBox, 0x40000,, % Test, 2
ExitApp
Thanks for your reply. I ran your test and the result is "0", which means it did not match the character "键".
What I want is to make the PC speak the Chinese text aloud as I type it.
just me
Posts: 9553
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: How to detect Chinese input

21 Oct 2015, 07:10

What AHK version are you running?

Code: Select all

MsgBox, % "AHK " . A_AhkVersion . " " . (A_IsUnicode ? "U" : "A") . (A_PtrSize = 8 ? "64" : "32")
Heezea
Posts: 59
Joined: 30 Sep 2013, 21:33

Re: How to detect Chinese input

22 Oct 2015, 10:21

Sorry about that, I'm not getting good results either. We'll need somebody else to jump in, I'm stuck. I tried the following, but none worked. I attached a screenshot of the results, but *I think* AHK is translating the Chinese symbols into ASCII characters.

Code: Select all

i := 0
i++, Test%i% := "aaaccc我但是对于女孩子有工作的ddd"
i++, Test%i% := RegExMatch(Test1, "\p{Han}")
i++, Test%i% := RegExReplace(Test1, "\p{Han}", "X")
i++, Test%i% := RegExReplace(Test1, "[\u4e00-\u9fa5]", "X")
i++, Test%i% := RegExReplace(Test1, "\X", "X")
Loop, % i
	strMsg := strMsg "Test" A_Index ">" Test%A_Index% "<`n"
MsgBox, 0x40000,, % strMsg, 5
ExitApp
Image

I also tried the Hifi Regex Tester here: http://www.gethifi.com/tools/regex. Here I was able to get matches by entering [\u4e00-\u9fa5]. However, I believe that is Java based.
trismarck
Posts: 506
Joined: 30 Sep 2013, 01:48
Location: Poland

Re: How to detect Chinese input

22 Oct 2015, 18:54

Assuming you use the Unicode version of AutoHotkey: the file you saved is in UTF-8 without BOM. Make sure that you save the file as UTF-8 _with BOM_ or as UTF-16LE _with BOM_, or specify that the file is in UTF-8 on the command line with /CP65001.
Scripts wrote:The characters a script file may contain are restricted by the codepage used to load the file.
  • If the file begins with a UTF-8 or UTF-16 (LE) byte order mark, the appropriate codepage is used and the /CPn switch is ignored.
  • If the /CPn switch is passed on the command-line, codepage n is used. For a list of valid numeric codepage identifiers, see MSDN.
  • In all other cases, the system default ANSI codepage is used.
More info: the character 我 has the code point U+6211, is encoded in UTF-8 as 0xE6 0x88 0x91, and is saved into the file that way. Because the file lacks a BOM, AHK interprets it according to the system default code page. In the Windows-1252 code page, those bytes correspond to the characters æ, ˆ and ‘.
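
The mis-decoding described above can be reproduced with a small script. This is only a sketch, assuming a Unicode build of AutoHotkey v1.1.17 or later (Format() needs that version); the variable names are illustrative. It builds the character from its code point U+6211, so the encoding of the script file itself does not matter here.

```autohotkey
; Sketch: encode U+6211 as UTF-8, then decode the same bytes as Windows-1252.
; Assumes a Unicode build of AutoHotkey v1.1.17+; variable names are illustrative.
Char := Chr(0x6211)                      ; the character with code point U+6211
VarSetCapacity(Buf, 8, 0)                ; room for 3 UTF-8 bytes plus the null terminator
Len := StrPut(Char, &Buf, "UTF-8")       ; number of bytes written, including the terminator

Hex := ""
Loop, % Len - 1                          ; skip the terminator
    Hex .= Format("{:02X} ", NumGet(Buf, A_Index - 1, "UChar"))

Wrong := StrGet(&Buf, "CP1252")          ; what a loader using code page 1252 would see
MsgBox, 0x40000,, % "UTF-8 bytes: " Hex "`nRead as CP1252: " Wrong, 5
```

If the second line of the message shows three unrelated characters instead of one Chinese character, that is the same effect a BOM-less UTF-8 script suffers when it is loaded with the ANSI code page.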

Code: Select all

Test := "键"
Test := RegExMatch(Test, "\P{Han}")
MsgBox, 0x40000,, % Test, 2
ExitApp
This would work with a lowercase p: \p{Han} matches a character that has the Han property, whereas \P{Han} matches one that does not.
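
To see the difference between the two spellings side by side, here is a minimal sketch. It assumes a Unicode build of AutoHotkey v1.1 and uses Chr(0x6211) instead of a literal character so that the result does not depend on how the script file is saved; the variable names are made up for the example.

```autohotkey
; Sketch: compare \p{Han} and \P{Han} on a single Han character.
; Assumes a Unicode build of AutoHotkey v1.1; variable names are illustrative.
Test := Chr(0x6211)                        ; one Han character, built from its code point
IsHan  := RegExMatch(Test, "\p{Han}")      ; position of the first character WITH the Han property
NotHan := RegExMatch(Test, "\P{Han}")      ; position of the first character WITHOUT it
MsgBox, 0x40000,, % "\p{Han}: " IsHan "`n\P{Han}: " NotHan, 5
```

RegExMatch returns the position of the first match, or 0 when nothing matches, so a nonzero value for the lowercase pattern means the string contains at least one Han character.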

//edit-2015-11-10: I guess this could also work if the text only contained English / Chinese characters and the file was saved in DBCS non-Unicode CP936. Wonder how that would work with PCRE in non-Unicode mode.
Another point would be that has different character numbers in ~Unicode and CP1252.
Last edited by trismarck on 09 Nov 2015, 18:43, edited 2 times in total.
woshichuanqilz72
Posts: 117
Joined: 05 Oct 2015, 21:23

Re: How to detect Chinese input

06 Nov 2015, 03:12

Heezea wrote:I'd try a RegExMatch against the input. I found some reference material for the \P{Han} character parameter here: http://www.pcre.org/pcre.txt

Code: Select all

Test := "键"
Test := RegExMatch(Test, "\P{Han}")
MsgBox, 0x40000,, % Test, 2
ExitApp
This is what I found on the page you posted. Thanks a lot.

\p{xx} a character with the xx property
\P{xx} a character without the xx property
So the \p should be lowercase.

Anyway, thanks a lot.
woshichuanqilz72
Posts: 117
Joined: 05 Oct 2015, 21:23

Re: How to detect Chinese input

06 Nov 2015, 03:12

trismarck wrote:Assuming you use the Unicode version of AutoHotkey: the file you saved is in UTF-8 without BOM. Make sure that you save the file as UTF-8 _with BOM_ or as UTF-16LE _with BOM_, or specify that the file is in UTF-8 on the command line with /CP65001.
Scripts wrote:The characters a script file may contain are restricted by the codepage used to load the file.
  • If the file begins with a UTF-8 or UTF-16 (LE) byte order mark, the appropriate codepage is used and the /CPn switch is ignored.
  • If the /CPn switch is passed on the command-line, codepage n is used. For a list of valid numeric codepage identifiers, see MSDN.
  • In all other cases, the system default ANSI codepage is used.
More info: the character 我 has the code point U+6211, is encoded in UTF-8 as 0xE6 0x88 0x91, and is saved into the file that way. Because the file lacks a BOM, AHK interprets it according to the system default code page. In the Windows-1252 code page, those bytes correspond to the characters æ, ˆ and ‘.

Code: Select all

Test := "键"
Test := RegExMatch(Test, "\P{Han}")
MsgBox, 0x40000,, % Test, 2
ExitApp
This would work with a lowercase p: \p{Han} matches a character that has the Han property, whereas \P{Han} matches one that does not.
Thanks a lot for your help; the 'p' should be lowercase.
