How to detect Chinese input

woshichuanqilz72 · 20 Oct 2015, 06:25

I can use the Input command to detect English text typed by the user, but I want to capture Chinese input as it is typed. What should I do?

As you can see, the input is first held by an input method panel.

What I want is to make the PC speak the Chinese aloud as I type it.

Heezea · 20 Oct 2015, 08:13

I'd try a RegExMatch against the input. I found some reference material for the \P{Han} character parameter here: http://www.pcre.org/pcre.txt

Code: Select all

Test := "键"
Test := RegExMatch(Test, "\P{Han}")
MsgBox, 0x40000,, % Test, 2
ExitApp

woshichuanqilz72 · 21 Oct 2015, 04:17

Heezea wrote:I'd try a RegExMatch against the input. I found some reference material for the \P{Han} character parameter here: http://www.pcre.org/pcre.txt
Code: Select all
Test := "键"
Test := RegExMatch(Test, "\P{Han}")
MsgBox, 0x40000,, % Test, 2
ExitApp

Thanks for your reply. I ran your test and the result was "0", which means it did not match the character "键".
What I want is to make the PC speak the Chinese aloud as I type it.

just me · 21 Oct 2015, 07:10

What AHK version are you running?

Code: Select all

MsgBox, % "AHK " . A_AhkVersion . " " . (A_IsUnicode ? "U" : "A") . (A_PtrSize = 8 ? "64" : "32")

Heezea · 22 Oct 2015, 10:21

Sorry about that, I'm not getting good results either. We'll need somebody else to jump in, I'm stuck. I tried the following, but none worked. I attached a screenshot of the results, but *I think* AHK is translating the Chinese symbols into ASCII characters.

Code: Select all

i := 0
i++, Test%i% := "aaaccc我但是对于女孩子有工作的ddd"
i++, Test%i% := RegExMatch(Test1, "\p{Han}")
i++, Test%i% := RegExReplace(Test1, "\p{Han}", "X")
i++, Test%i% := RegExReplace(Test1, "[\u4e00-\u9fa5]", "X")
i++, Test%i% := RegExReplace(Test1, "\X", "X")
Loop, % i
	strMsg := strMsg "Test" A_Index ">" Test%A_Index% "<`n"
MsgBox, 0x40000,, % strMsg, 5
ExitApp

I also tried the Hifi Regex Tester here: http://www.gethifi.com/tools/regex. Here I was able to get matches by entering [\u4e00-\u9fa5]. However, I believe that is Java based.
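
A note on the [\u4e00-\u9fa5] attempt: PCRE, the engine behind AutoHotkey's RegEx functions, does not support the \u escape. Its syntax for a code point is \x{hhhh}. Below is a minimal sketch, assuming a Unicode build of AutoHotkey v1.1. The sample string is an illustration only and is built with Chr() so the result does not depend on how the script file is saved.

```autohotkey
; Sketch only. Assumes AutoHotkey v1.1, Unicode build.
; Build "aaa我是ddd" from code points: 我 = U+6211, 是 = U+662F
Test := "aaa" . Chr(0x6211) . Chr(0x662F) . "ddd"
; PCRE writes code points as \x{hhhh}, not \uhhhh
Result := RegExReplace(Test, "[\x{4e00}-\x{9fa5}]", "X")
MsgBox, % Result
ExitApp
```

If the build and pattern behave as assumed, each character in the U+4E00 to U+9FA5 range should be replaced by X while the Latin letters are left alone.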

trismarck · 22 Oct 2015, 18:54

Assuming you are running the Unicode version of AutoHotkey, the script file is probably saved as UTF-8 without a BOM. Make sure you save the file as UTF-8 _with BOM_ or as UTF-16LE _with BOM_. Alternatively, tell AutoHotkey that the file is UTF-8 by passing /CP65001 on the command line.

Scripts wrote:The characters a script file may contain are restricted by the codepage used to load the file.

If the file begins with a UTF-8 or UTF-16 (LE) byte order mark, the appropriate codepage is used and the /CPn switch is ignored.

If the /CPn switch is passed on the command-line, codepage n is used. For a list of valid numeric codepage identifiers, see MSDN.

In all other cases, the system default ANSI codepage is used.

More info: the character 我 has the code point U+6211, which is encoded in UTF-8 as 0xE6 0x88 0x91 and saved into the file in that form. Because the file lacks a BOM, AHK interprets it according to the system default code page. In the Windows-1252 code page, those bytes correspond to the characters æ, ˆ and ‘.
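
The byte-level explanation above can be reproduced inside a script. This is a minimal sketch, assuming a Unicode build of AutoHotkey v1.1.17 or later (for Format()). The variable names are illustrative. It encodes 我 as UTF-8 and then decodes the same bytes as Windows-1252, which is roughly what happens when a BOM-less UTF-8 script is loaded on a system whose default code page is 1252.

```autohotkey
; Sketch only. Assumes AutoHotkey v1.1.17+, Unicode build.
str := Chr(0x6211)                    ; 我, built from its code point
size := StrPut(str, "UTF-8")          ; bytes needed, including the null terminator
VarSetCapacity(buf, size, 0)
StrPut(str, &buf, "UTF-8")            ; write the UTF-8 encoding into buf

hex := ""
Loop, % size - 1                      ; skip the null terminator
    hex .= Format("0x{:02X} ", NumGet(buf, A_Index - 1, "UChar"))

misread := StrGet(&buf, "CP1252")     ; decode the same bytes as Windows-1252
MsgBox, % "UTF-8 bytes: " . hex . "`nRead as CP1252: " . misread
ExitApp
```

On a system with a different default code page (for example CP936), the misread characters would differ, but the mechanism is the same.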

Code: Select all

Test := "键"
Test := RegExMatch(Test, "\P{Han}")
MsgBox, 0x40000,, % Test, 2
ExitApp

This would work with a lowercase p.

//edit-2015-11-10: I guess this could also work if the text contained only English and Chinese characters and the file was saved in CP936, the non-Unicode DBCS code page. I wonder how that would work with PCRE in non-Unicode mode.
Another point is that ‘ has different character numbers in Unicode (U+2018) and CP1252 (0x91).
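
For comparison, here is a sketch of the same test with both spellings side by side, assuming a Unicode build of AutoHotkey v1.1. The character is built with Chr() so the file encoding does not affect the result. The variable names HasHan and NonHan are illustrative.

```autohotkey
; Sketch only. Assumes AutoHotkey v1.1, Unicode build.
Test := Chr(0x952E)                       ; 键, built from its code point
HasHan := RegExMatch(Test, "\p{Han}")     ; position of the first Han character, 0 if none
NonHan := RegExMatch(Test, "\P{Han}")     ; position of the first non-Han character, 0 if none
MsgBox, % "\p{Han}: " . HasHan . "`n\P{Han}: " . NonHan
ExitApp
```

With a correctly decoded 键, the lowercase form should report a match and the uppercase form should report none, since \P{Han} looks for a character that is not Han.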

woshichuanqilz72 · 06 Nov 2015, 03:12

Heezea wrote:I'd try a RegExMatch against the input. I found some reference material for the \P{Han} character parameter here: http://www.pcre.org/pcre.txt
Code: Select all
Test := "键"
Test := RegExMatch(Test, "\P{Han}")
MsgBox, 0x40000,, % Test, 2
ExitApp

This is what I found on the page you linked. Thanks a lot.

\p{xx} a character with the xx property
\P{xx} a character without the xx property

So the \p should be lowercase.

Anyway, thanks a lot.

woshichuanqilz72 · 06 Nov 2015, 03:12

trismarck wrote:Supposing that one uses the Unicode version of Autohotkey; the format of the file that you have saved is in UTF-8 without BOM. Make sure that you're saving the file as UTF-8 _with BOM_ or as UTF-16LE _with BOM_. Or specify that the file is in UTF-8 through command line with /CP65001.
Scripts wrote:The characters a script file may contain are restricted by the codepage used to load the file.

If the file begins with a UTF-8 or UTF-16 (LE) byte order mark, the appropriate codepage is used and the /CPn switch is ignored.

If the /CPn switch is passed on the command-line, codepage n is used. For a list of valid numeric codepage identifiers, see MSDN.

In all other cases, the system default ANSI codepage is used.
More info. The character 我 has the code point U+006211, gets encoded in UTF-8 as 0xE6 0x88 0x91 and saved into the file. Because the file lacks BOM, AHK interprets the file according to the system default code page. In Windows-1252 code page, the bytes are equivalent to the following characters : æ, ˆ and ‘.
Code: Select all
Test := "键"
Test := RegExMatch(Test, "\P{Han}")
MsgBox, 0x40000,, % Test, 2
ExitApp
This would work if one would use the small letter p.

Thanks a lot for your help. The 'p' should be lowercase.