regex problem

awcrt9316
Posts: 61
Joined: 03 Mar 2020, 20:06

16 May 2020, 20:38

I'd like to get the second word of a string that contains random words separated by random amounts of whitespace, with an occasional accent mark. Because of the accent mark, my regex does not match the entire second word and leaves off the é. How do I make it so that an accented letter at the end or at the beginning of a word is still considered part of the word?

line:= "él     	esté   	ellos	         estén"

RegExMatch(Line, "^(?:.*?\K\b\S+\b){2}", word1)	; <-- finds the 2nd word
msgbox, % word1 ; prints "est" instead of "esté"

return
esc::exitapp
boiler
Posts: 6601
Joined: 21 Dec 2014, 02:44

Re: regex problem

16 May 2020, 20:55

By default, \b and \w treat only ASCII letters, digits, and the underscore as word characters, so an accented letter such as é counts as a non-word character. That is why \b finds a boundary between the t and the é in "esté", and the match stops at "est". To get around that, work only with whitespace and non-whitespace characters by removing both \b markers from your needle. Alternatively:

RegExMatch(Line, "\S+\s+\K\S+", word1)
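For comparison, here is a minimal sketch that runs both suggestions against the sample string from the question. It assumes AutoHotkey v1.1 (Unicode build) and a script file saved as UTF-8 with BOM so the accented literals are read correctly. The variable names are made up for illustration. The third needle is an extra option not mentioned above: it relies on the `(*UCP)` prefix described in the AutoHotkey RegEx quick reference, which is meant to make \w and \b use Unicode properties, so treat it as an assumption to check on your build.

```autohotkey
; Sample string from the question; `t is a tab character.
line := "él     `testé   `tellos`t         estén"

; 1) The original needle with both \b markers removed:
;    a "word" is simply a run of non-whitespace characters.
RegExMatch(line, "^(?:.*?\K\S+){2}", viaNoBoundary)

; 2) The alternative needle: consume the first word and the whitespace
;    after it, then \K discards that part so only the second word is returned.
RegExMatch(line, "\S+\s+\K\S+", viaSkipFirst)

; 3) Assumption: (*UCP) at the very start of the needle switches \b and \w
;    to Unicode-aware matching, so the original \b markers can stay.
RegExMatch(line, "(*UCP)^(?:.*?\K\b\S+\b){2}", viaUcp)

MsgBox, % "No \b: " viaNoBoundary "`nSkip first: " viaSkipFirst "`n(*UCP): " viaUcp
```

The first two needles should both return "esté" for this input. The whitespace-based needles treat attached punctuation such as a trailing comma as part of the word, so they fit best when words are separated only by spaces and tabs, as in the sample.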
