Extract all words that are separated by special characters? Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
zZB
Posts: 20
Joined: 13 Feb 2021, 20:02

Extract all words that are separated by special characters?

21 Apr 2021, 23:44

Sorry if my request is confusing; I don't really know how to explain it. Is there an easy way to take a long string of words like...
  • Hello One - Hello Two_Hello Three (Hello Four)
and create variables from all the words that are separated by special characters? So the result from the string above would be something like...
  • Var1 = Hello One
    Var2 = Hello Two
    Var3 = Hello Three
    Var4 = Hello Four
I could probably put something together but it would likely look pretty ugly. Just wondering if there was an easier way.
boiler
Posts: 16900
Joined: 21 Dec 2014, 02:44

Re: Extract all words that are separated by special characters?

22 Apr 2021, 00:19

I think it would be better to use an array for the output, but here it is using separate variables since that's what you requested:

Code:

Str := "Hello One - Hello Two_Hello Three (Hello Four)"

Count := 0
Start := 1
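; Match each run of letters and spaces, resuming right after the previous match.
while Pos := RegExMatch(Str, "[ a-zA-Z]+", Match, Start) {
	Var%A_Index% := Trim(Match)  ; A_Index is the iteration number: Var1, Var2, ...
	Start := Pos + StrLen(Match)
	Count++
}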

loop, % Count
	Output .= "Var" A_Index ": " Var%A_Index% "`n"
MsgBox, % Output
AHKStudent
Posts: 1472
Joined: 05 May 2018, 12:23

Re: Extract all words that are separated by special characters?

22 Apr 2021, 02:32

Like @boiler said, an array is better, and you could even use the built-in StrSplit, which creates an array:

Code:

var := "Hello One - Hello Two_Hello Three (Hello Four)"
Arr := StrSplit(var,["-","_","("])
loop, % arr.maxindex()
	MsgBox % Arr[a_index]
ExitApp
boiler
Posts: 16900
Joined: 21 Dec 2014, 02:44

Re: Extract all words that are separated by special characters?

22 Apr 2021, 06:48

The StrSplit approach is OK, but it has a couple of drawbacks. You have to know in advance which special characters exist (or could exist) in the string and list them explicitly, rather than treating anything other than a letter or a space as a special character. Also notice that the fourth resulting element still has the ")" on the end.

Here is how I would change my approach to use an array:

Code:

Str := "Hello One - Hello Two_Hello Three (Hello Four)"
Var := []
Start := 1
while Pos := RegExMatch(Str, "[ a-zA-Z]+", Match, Start) {
	Var.Push(Trim(Match))
	Start := Pos + StrLen(Match)
}

for k, v in Var
	Output .= "Var[" k "]: " v "`n"
MsgBox, % Output
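
The trailing ")" noted above, along with the spaces left around each element, can be trimmed with StrSplit's third parameter, OmitChars, which strips the listed characters from both ends of every element. A minimal sketch, assuming the delimiters are known in advance to be -, _ and ( (so the first drawback still applies); the names Parts and PartsOutput are only illustrative:

```autohotkey
Str := "Hello One - Hello Two_Hello Three (Hello Four)"

; Third parameter (OmitChars): a space and ")" are stripped
; from the beginning and end of each resulting element.
Parts := StrSplit(Str, ["-", "_", "("], " )")

PartsOutput := ""
for k, v in Parts
	PartsOutput .= "Parts[" k "]: " v "`n"
```

PartsOutput can then be shown with MsgBox, as in the examples above.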
Chunjee
Posts: 1417
Joined: 18 Apr 2014, 19:05

Re: Extract all words that are separated by special characters?

22 Apr 2021, 13:51

https://biga-ahk.github.io/biga.ahk/#/?id=split allows a regexp separator to split by. Because of the parenthesis at the end, however, you would end up with one blank element.

Code:

A := new biga() ; requires https://www.npmjs.com/package/biga.ahk

Str := "Hello One - Hello Two_Hello Three (Hello Four)"
array := A.split(Str, "/\s*[_\-\(\)]\s*/")
; => ["Hello One", "Hello Two", "Hello Three", "Hello Four", ""]
array := A.compact(array)
; => ["Hello One", "Hello Two", "Hello Three", "Hello Four"]
I used the pattern \s*[_\-\(\)]\s*
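
A similar regex-driven split can be sketched with built-in functions only. This is an alternative to the library call above, not a description of how biga.ahk works internally. It assumes, as in the earlier posts, that anything other than a letter or a space counts as a separator; the names Normalized and Arr are illustrative.

```autohotkey
Str := "Hello One - Hello Two_Hello Three (Hello Four)"

; Replace each separator run (optional whitespace, one special character,
; then any further non-letters) with a single "|".
Normalized := RegExReplace(Str, "\s*[^ a-zA-Z][^a-zA-Z]*", "|")

; Drop a leading or trailing "|" (or space) so no blank element is produced at the ends.
Normalized := Trim(Normalized, "| ")

Arr := StrSplit(Normalized, "|")
```

Because "|" is itself neither a letter nor a space, any "|" already present in the input is absorbed by the pattern, so it is safe to use as the temporary delimiter here.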
zZB
Posts: 20
Joined: 13 Feb 2021, 20:02

Re: Extract all words that are separated by special characters?

22 Apr 2021, 22:31

The reason I asked for Variables is because arrays confuse me. I don't use them often because I don't know how to deal with them. I'll try to use it but maybe I'll come back to this one if I can't figure it out. Anyway, I added some characters to the RegExMatch to suit my needs but there is one thing I couldn't figure out.

Code:

Str := "Hello One - Hello Two_Hello Three (Hello Four)   _List One, Two & Three - String's End_"
Var := []
Start := 1
while Pos := RegExMatch(Str, "[ a-zA-Z0-9'&,]+", Match, Start) {
	Var.Push(Trim(Match))
	Start := Pos + StrLen(Match)
}

for k, v in Var
	Output .= "Var[" k "]: " v "`n"
MsgBox, % Output

MsgBox % Var[Var.MaxIndex()]
exitapp
I made the String longer to test a few extra things
  • Hello One - Hello Two_Hello Three (Hello Four) _List One, Two & Three - String's End_
and it all works perfectly except that in this particular case it creates 1 variable in the array that is blank.
  • Var[1]: Hello One
    Var[2]: Hello Two
    Var[3]: Hello Three
    Var[4]: Hello Four
    Var[5]:
    Var[6]: List One, Two & Three
    Var[7]: String's End
It creates a variable from the space in between ) and _. Is there any way to filter out blank variables?
boiler
Posts: 16900
Joined: 21 Dec 2014, 02:44

Re: Extract all words that are separated by special characters?  Topic is solved

22 Apr 2021, 23:31

zZB wrote: It creates a variable from the space in between ) and _. Is there any way to filter out blank variables?
Yes:

Code:

Str := "Hello One - Hello Two_Hello Three (Hello Four)   _List One, Two & Three - String's End_"
Var := []
Start := 1
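; Match each run of allowed characters, resuming right after the previous match.
while Pos := RegExMatch(Str, "[ a-zA-Z0-9'&,]+", Match, Start) {
	; Skip matches that consist only of spaces.
	if (Trim(Match) != "")
		Var.Push(Trim(Match))
	Start := Pos + StrLen(Match)
}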

for k, v in Var
	Output .= "Var[" k "]: " v "`n"
MsgBox, % Output

MsgBox % Var[Var.MaxIndex()]
exitapp
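
Another way to avoid blank entries is to make a blank match impossible, by requiring the first character of each match to be a non-space character from the allowed set. A minimal sketch, where ExtractSegments is a hypothetical helper name (not something from AutoHotkey or this thread) and the character set is the one used in the code above:

```autohotkey
; Hypothetical helper: returns an array of the non-blank segments found in Str.
ExtractSegments(Str) {
	Segments := []
	Start := 1
	; The first character class contains no space, so a match can never be spaces alone.
	while Pos := RegExMatch(Str, "[a-zA-Z0-9'&,][ a-zA-Z0-9'&,]*", Match, Start) {
		Segments.Push(RTrim(Match))  ; only trailing spaces are possible here
		Start := Pos + StrLen(Match)
	}
	return Segments
}

Var := ExtractSegments("Hello Four)   _List One, Two & Three - String's End_")
```

One consequence of this pattern is that a segment can begin with ', & or a comma if the text does, since those are in the allowed set.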
zZB
Posts: 20
Joined: 13 Feb 2021, 20:02

Re: Extract all words that are separated by special characters?

22 Apr 2021, 23:50

Nice. Thank You. Now I just have to figure out how to incorporate this array into a GUI...
