Format multiple characters by char code? Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
william_ahk
Posts: 469
Joined: 03 Dec 2018, 20:02

Format multiple characters by char code?

Post by william_ahk » 04 Oct 2022, 01:46

I read in the docs that Format("{:c}", charcode) can be used to get a single character from its character code. What is the correct way to get multiple characters from a string of character-code escapes such as "\u4f60\u597d", which is common in JavaScript? Do I have to use RegExReplace() for this?

Rohwedder
Posts: 7509
Joined: 04 Jun 2014, 08:33
Location: Germany

Re: Format multiple characters by char code?

Post by Rohwedder » 04 Oct 2022, 02:20

Hello,
yes, if the result should be something like this:

Code: Select all

q::MsgBox,% Chr(0x4f60) Chr(0x597d)
then RegEx is the easiest way to get there.
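
For comparison with the Format() call from the question: when the character codes are already available as numbers, no RegEx is needed. A minimal sketch, assuming a Unicode build of AutoHotkey v1.1; the variable names codes and out are only illustrative:

```autohotkey
; Format() accepts one {:c} placeholder per character code.
MsgBox % Format("{:c}{:c}", 0x4f60, 0x597d)

; Equivalent loop for an arbitrary list of codes, using Chr().
codes := [0x4f60, 0x597d]
out := ""
for index, code in codes
    out .= Chr(code)
MsgBox % out
```

Both message boxes should show 你好. RegEx only becomes necessary when the codes first have to be extracted from a string such as "\u4f60\u597d".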

mikeyww
Posts: 26437
Joined: 09 Sep 2014, 18:38

Re: Format multiple characters by char code?

Post by mikeyww » 04 Oct 2022, 06:54

Code: Select all

Send % decode("\u00a9\u2191") ; ©↑
Send % decode("\u4f60\u597d") ; 你好

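decode(uCode) { ; Convert JavaScript \uXXXX escapes to Send's {U+XXXX} notation
 ; {4}: a JavaScript \u escape is exactly four hex digits, so a hex letter that follows it is left alone
 Return RegExReplace(uCode, "i)\\u([0-9A-F]{4})", "{U+$1}")
}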
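
Note that decode() does not return the decoded characters themselves. It returns Send's {U+nnnn} key notation, which only Send and its variants turn into characters. A small sketch of the difference, with toSendNotation as a hypothetical name for the same replacement:

```autohotkey
; Displayed as text, the result is still the Send notation:
MsgBox % toSendNotation("\u4f60\u597d") ; {U+4f60}{U+597d}

toSendNotation(uCode) {
    ; Rewrite each \uXXXX escape as {U+XXXX}; nothing is decoded at this point.
    Return RegExReplace(uCode, "i)\\u([0-9A-F]{4})", "{U+$1}")
}
```

Because the result goes through Send, characters that have a special meaning to Send (such as !, +, ^, # and braces) elsewhere in the input would be interpreted as keys rather than typed literally.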

william_ahk

Re: Format multiple characters by char code?

Post by william_ahk » 04 Oct 2022, 23:25

Thanks y'all for the pointers. I repurposed the Deref() function from the docs to decode the char code string.

Code: Select all

msgbox % decode_utf8("\u4f60\u597d") ; 你好

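decode_utf8(charcode_string) { ; despite the name, this decodes \uXXXX escapes, not UTF-8 bytes
    spo := 1 ; start position of the next search
    out := ""
    ; {4}: a JavaScript \u escape is exactly four hex digits
    while (fpo:=RegexMatch(charcode_string, "i)\\u([0-9A-F]{4})", m, spo))
    {
        out .= SubStr(charcode_string, spo, fpo-spo) ; copy the literal text before the match
        spo := fpo + StrLen(m)
        if (m1)
            out .= Chr("0x" m1)
    }
    return out SubStr(charcode_string, spo) ; append whatever follows the last match
}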

Rohwedder

Re: Format multiple characters by char code?  Topic is solved

Post by Rohwedder » 06 Oct 2022, 05:49

Shorter:

Code: Select all

msgbox % decode_utf8("\u4f60\u597d") ; 你好

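decode_utf8(charcode_string) {
    Global Out := ""
    ; Each match is replaced with nothing; the callout (?COut) collects the decoded characters in Out.
    ; Any literal text in the input therefore ends up before the decoded characters.
    Return RegExReplace(charcode_string, "i)(\\u)([0-9A-F]{4})(?COut)") Out
}

Out(m) {
    Global Out .= Chr("0x" m2)
}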
or:

Code: Select all

MsgBox,% (Out:="") RegExReplace("\u4f60\u597d", "i)(\\u)([0-9A-F]{4})(?COut)")
; 你好
Out(m) {
	Global Out .= Chr("0x" m2)
}
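
Both variants rely on a RegEx callout: (?COut) makes the RegEx engine call the function Out() each time the pattern up to that point has matched, and the function appends the decoded character to a global variable. The sketch below writes the same mechanism out step by step and feeds it mixed input to show the ordering. It assumes a Unicode build of AutoHotkey v1.1; decode_collect, Collect and Collected are hypothetical names, not part of the posts above:

```autohotkey
MsgBox % decode_collect("x\u4f60y") ; should display xy你

decode_collect(str) {
    global Collected
    Collected := ""
    ; No replacement text is given, so every \uXXXX escape is removed from str.
    ; The callout (?CCollect) runs once per match and stores the decoded character.
    rest := RegExReplace(str, "i)\\u([0-9A-F]{4})(?CCollect)")
    return rest Collected ; literal text first, decoded characters last
}

Collect(m) {
    global Collected
    ; m1 holds the four hex digits captured by the first subpattern.
    Collected .= Chr("0x" m1)
}
```

The literal x and y come out first and the decoded character last, so this approach keeps the original order only when the input consists of nothing but escapes. For mixed text, the loop-based version earlier in the thread preserves the order.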

william_ahk

Re: Format multiple characters by char code?

Post by william_ahk » 27 Nov 2022, 05:15

@Rohwedder Thanks. :thumbup: Apologies for the late reply, I didn't check the thread after finding the solution.
I also recently discovered that RegEx is faster than iterating over the string. I had assumed the opposite :facepalm:
