Regex for detecting two language sets Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
paik1002
Posts: 355
Joined: 28 Nov 2015, 02:45

Regex for detecting two language sets

30 May 2017, 02:50

I am still learning the ABCs of regex, so this may not be a difficult question.

The clipboard contains text, including but not limited to characters from one or two languages,
say Latin (0x0020 ~ 0x007F, 0x00A0 ~ 0x00FF) and/or Greek (0x0370 ~ 0x03FF).
FYI, I have the Unicode version of AHK installed.

In essence, this is what I would like to do:
1. Detect if the clipboard includes both Greek AND Latin characters.
2. Detect if the clipboard includes Greek but NOT Latin characters (or vice versa).

Code:

^+!F12::
{
    ; The quoted patterns below are placeholders for the regexes being asked about.
    if (clipboard) ~= "[latin patterns AND greek pattern]"
    {
        tooltip % "both latin AND greek detected"
    }
    else if (clipboard) ~= "[greek pattern but NOT latin patterns]"
    {
        tooltip % "greek but NOT latin detected"
    }
    else if (clipboard) ~= "[latin patterns but NOT greek pattern]"
    {
        tooltip % "latin but NOT greek detected"
    }
    else
    {
        tooltip % "none of the two were detected"
    }
}
return

The goal is to design a regex pattern for each of these cases.
Any help would be appreciated.
Ovg
Posts: 23
Joined: 19 Feb 2017, 01:13

Re: Regex for detecting two language sets  Topic is solved

30 May 2017, 08:17

Code:

^+!F12::
; Latin = 0x0020-0x007F or 0x00A0-0x00FF, Greek = 0x0370-0x03FF.
if (RegExMatch(clipboard,"[\x{0020}-\x{007F}]") || (RegExMatch(clipboard,"[\x{00A0}-\x{00FF}]"))) && RegExMatch(clipboard,"[\x{0370}-\x{03FF}]")
 ToolTip, Both Latin AND Greek detected
else if (RegExMatch(clipboard,"[\x{0370}-\x{03FF}]")) && (!RegExMatch(clipboard,"[\x{0020}-\x{007F}]") && !RegExMatch(clipboard,"[\x{00A0}-\x{00FF}]"))
 ToolTip, Greek but NOT Latin detected
else if (RegExMatch(clipboard,"[\x{0020}-\x{007F}]") || (RegExMatch(clipboard,"[\x{00A0}-\x{00FF}]"))) && !RegExMatch(clipboard,"[\x{0370}-\x{03FF}]")
 ToolTip, Latin but NOT Greek detected
else
 ToolTip, None of the two were detected
return
Last edited by Ovg on 30 May 2017, 11:23, edited 1 time in total.
It's impossible to lead us astray for we don't care even to choose the way.
IMEime
Posts: 750
Joined: 20 Sep 2014, 06:15

Re: Regex for detecting two language sets

30 May 2017, 09:22

Code:

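;	~= is shorthand for RegExMatch: it yields the position of the first match, or 0 if there is none.
;	Each value below is a single character, so every line of the MsgBox shows 1 (match) or 0 (no match).
Kor := "가"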
Grk := "α"
Num := 9
MsgBox % ""
.	"1.  " (Kor ~= "\p{Hangul}") "`n"
.	"2.  " (Kor ~= "\p{Greek}") "`n"
.	"3.  " (Kor ~= "\p{N}") "`n`n"    
.	"4.  " (Grk ~= "\p{Hangul}") "`n"
.	"5.  " (Grk ~= "\p{Greek}") "`n"
.	"6.  " (Grk ~= "\p{N}") "`n`n"    
.	"7.  " (Num ~= "\p{Hangul}") "`n"
.	"8.  " (Num ~= "\p{Greek}") "`n"
.	"9.  " (Num ~= "\p{N}")

;	POSIX Bracket Expressions
;	http://www.regular-expressions.info/posixbrackets.html

;	PCRE - Perl-compatible regular expressions 
;	http://www.pcre.org/pcre.txt
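
Applied to the original question, the same script properties could stand in for the hex ranges. This is only a minimal sketch, assuming a Unicode build of AutoHotkey v1.1 and assuming that PCRE's \p{Latin} script name behaves like the \p{Greek} and \p{Hangul} tests above. The function name ClassifyScripts is made up for illustration. Keep in mind that \p{Latin} and \p{Greek} match by script rather than by code-point block, so the results can differ from the ranges in the question: for example, 0x0020 ~ 0x007F also covers spaces, digits and punctuation, which \p{Latin} does not.

```autohotkey
; Hypothetical helper (not from the thread): reports which of the two scripts occur in text.
ClassifyScripts(text)
{
    hasLatin := (text ~= "\p{Latin}")   ; position of the first Latin-script character, or 0
    hasGreek := (text ~= "\p{Greek}")   ; position of the first Greek-script character, or 0
    if (hasLatin && hasGreek)
        return "Both Latin AND Greek detected"
    else if (hasGreek)
        return "Greek but NOT Latin detected"
    else if (hasLatin)
        return "Latin but NOT Greek detected"
    return "None of the two were detected"
}

^+!F12::
ToolTip % ClassifyScripts(Clipboard)
return
```

Running each test once and branching on the two flags avoids repeating the same RegExMatch call in every branch.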

Code:

Pattern          Hex equivalent                          Unicode block(s)
"[가-힣ㄱ-ㅣ]"    [\x{AC00}-\x{D7A3}\x{3131}-\x{3163}]    Hangul Syllables and Hangul Compatibility Jamo
"[가-힣]"        [\x{AC00}-\x{D7A3}]                     Hangul Syllables
"[ㄱ-ㅣ]"        [\x{3131}-\x{3163}]                     Hangul Compatibility Jamo
paik1002
Posts: 355
Joined: 28 Nov 2015, 02:45

Re: Regex for detecting two language sets

30 May 2017, 19:30

Nicely done! Thanks to both.
And thanks for the tutorial link.

I prefer the shorthand version, but I will probably fuse the two versions for future extensibility.
I didn't know regex had Unicode categories! Very convenient.
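
One way the fused, extensible version might look is a small lookup table that maps a display name to a pattern, where each pattern can be either a \p{...} property or an explicit \x{...} range. This is a rough sketch rather than a tested solution: the function name DetectScripts and the table entries are my own assumptions, and \p{Latin} is assumed to be accepted the same way \p{Greek} is.

```autohotkey
; Hypothetical helper (names are illustrative): lists every script from the table found in text.
DetectScripts(text)
{
    ; Display name -> pattern. Either style from this thread can be used as a value.
    patterns := {}
    patterns["Latin"]  := "\p{Latin}"
    patterns["Greek"]  := "\p{Greek}"
    patterns["Hangul"] := "[\x{AC00}-\x{D7A3}\x{3131}-\x{3163}]"

    found := ""
    for name, pattern in patterns
    {
        ; ~= yields the match position (non-zero) when the pattern occurs in text.
        if (text ~= pattern)
            found .= (found = "" ? "" : ", ") . name
    }
    return (found = "") ? "None detected" : found
}

^+!F12::
ToolTip % DetectScripts(Clipboard)
return
```

Adding another language is then a matter of adding one more line to the table instead of another else-if branch.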
