Split String by RegEx Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
Delta Pythagorean
Posts: 627
Joined: 13 Feb 2017, 13:44
Location: Somewhere in the US

Split String by RegEx

23 May 2020, 16:11

I have looked everywhere I could for a function that splits a string by a regular expression, and I even checked how other languages handle it, with no luck on the AutoHotkey side.
Python can do this (natively, I think), but I haven't seen anyone write an AutoHotkey function for it.
If anyone knows a function that can split a string based on a regular expression, I'd be more than happy to hear about it.
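
For comparison, the Python behaviour mentioned here is provided by `re.split` in the standard library. A minimal sketch follows; the sample string and delimiter are chosen only for illustration (they are the ones used in the replies), and the variable names are arbitrary.

```python
import re

text = "ABCDEFGCVBCRTYECEFGH"

# Delimiter: a "c" preceded by "b", or an "e" followed by "f", ignoring case.
delimiter = re.compile(r"(?<=b)c|e(?=f)", re.IGNORECASE)

# re.split returns the pieces of text that lie between the matches.
parts = delimiter.split(text)
print(parts)  # ['AB', 'D', 'FGCVB', 'RTYEC', 'FGH']
```

Note that `re.split` also returns the text of any capturing groups in the delimiter, so a delimiter with groups gives a longer list than a plain split would.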

[AHK]......: v2.0.12 | 64-bit
[OS].......: Windows 11 | 23H2 (OS Build: 22621.3296)
[GITHUB]...: github.com/DelPyth
[PAYPAL]...: paypal.me/DelPyth
[DISCORD]..: tophatcat

Meroveus
Posts: 44
Joined: 23 May 2016, 17:38

Re: Split String by RegEx

23 May 2020, 20:17

Can you give an example of what you want?
The question is a little vague.

Could you not just use SubStr()?

I did a search on "split a string using regular expression" and got a host of replies, but there's no way to tell if any of them would help.
teadrinker
Posts: 4326
Joined: 29 Mar 2015, 09:41

Re: Split String by RegEx  Topic is solved

23 May 2020, 21:17

str := "ABCDEFGCVBCRTYECEFGH"
arr := StrSplitRegEx(str, "(?<=b)c|e(?=f)", "i")
for k, v in arr
   MsgBox, % v

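; Splits str at every match of pattern; options holds RegEx options such as "i".
StrSplitRegEx(str, pattern, options := "") {
   arr := []
   prevPos := 1, prevLen := 0
   ; "O)" stores a match object in m; each search resumes right after the previous match.
   ; Caution: a pattern that can match an empty string never advances the start position, so the loop would not end.
   while RegExMatch(str, options . "O)" . pattern, m, m ? m.Pos + m.Len : 1) {
      ; Push the text between the end of the previous match and the start of this one.
      arr.Push( SubStr(str, prevPos + prevLen, m.Pos - prevPos - prevLen) )
      prevPos := m.Pos, prevLen := m.Len
   }
   ; Push whatever follows the last match (the whole string if nothing matched).
   arr.Push( SubStr(str, prevPos + prevLen) )
   Return arr
}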
Something like this?
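
The loop above walks the string match by match, pushes the text between the end of the previous match and the start of the current one, then pushes the tail. For anyone who wants to cross-check results outside AutoHotkey, here is a rough Python rendering of the same idea. `split_between_matches` is a hypothetical name, and Python's `re` engine is not identical to the PCRE engine AutoHotkey uses, so treat it as a sketch rather than an exact port.

```python
import re

def split_between_matches(text, pattern, flags=0):
    """Return the pieces of text that lie between successive matches of pattern."""
    parts = []
    prev_end = 0  # index just past the previous match (0 before the first match)
    for m in re.finditer(pattern, text, flags):
        # Text between the end of the previous match and the start of this one.
        parts.append(text[prev_end:m.start()])
        prev_end = m.end()
    # Whatever follows the last match (the whole string if nothing matched).
    parts.append(text[prev_end:])
    return parts

pieces = split_between_matches("ABCDEFGCVBCRTYECEFGH", r"(?<=b)c|e(?=f)", re.IGNORECASE)
print(pieces)  # ['AB', 'D', 'FGCVB', 'RTYEC', 'FGH']
```

`re.finditer` always moves forward after an empty match, so this version terminates even for delimiters that can match an empty string.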
swagfag
Posts: 6222
Joined: 11 Jan 2017, 17:59

Re: Split String by RegEx

23 May 2020, 21:53

for k, v in RegExSplit("ABCDEFGCVBCRTYECEFGH", "i)(?<=b)c|e(?=f)")
	s .= k " -> " v "`n"
MsgBox % RTrim(s)

RegExSplit(ByRef String, Delimiter := "", OmitChars := "", MaxParts := -1) {
	static uFFFF := Chr(0xFFFF) ; placeholder delimiter; assumes String never contains U+FFFF

	; early exit, split by chars
	if (Delimiter = "")
		return StrSplit(String, Delimiter, OmitChars, MaxParts)

	; has regex flags?
	if RegExMatch(Delimiter, "^([^\(\)]*\))(.*)$", m) 
		Delimiter := m1 "(" m2 ")"
	else
		Delimiter := "(" Delimiter ")"

	return StrSplit(RegExReplace(String, Delimiter, uFFFF), uFFFF, OmitChars, MaxParts)
}
StrSplit accepts an array of delimiters, so it's unclear how that should be handled in the case of regex (specifically, when subsequent patterns match what has already been matched by previous ones).
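
One possible way to handle a list of regex delimiters, offered only as an assumption about what a reasonable behaviour could be: join the patterns into a single alternation, so that at any position the earliest pattern in the list wins. The Python sketch below illustrates that choice. `split_by_any` is a hypothetical helper, and backreferences inside the individual patterns are not handled, because joining the patterns renumbers their groups.

```python
import re

def split_by_any(text, patterns, flags=0):
    """Split text on any of several regex delimiters; earlier patterns take priority."""
    # Each pattern goes in a non-capturing group so a "|" inside one pattern
    # cannot spill into its neighbours.
    combined = "|".join("(?:" + p + ")" for p in patterns)
    parts, prev_end = [], 0
    for m in re.finditer(combined, text, flags):
        parts.append(text[prev_end:m.start()])
        prev_end = m.end()
    parts.append(text[prev_end:])
    return parts

print(split_by_any("a1b22c-d", [r"\d+", r"-"]))  # ['a', 'b', 'c', 'd']

# When two delimiters can match at the same position, list order decides:
print(split_by_any("xabcx", ["ab", "abc"]))  # ['x', 'cx']
print(split_by_any("xabcx", ["abc", "ab"]))  # ['x', 'x']
```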
Delta Pythagorean
Posts: 627
Joined: 13 Feb 2017, 13:44
Location: Somewhere in the US

Re: Split String by RegEx

23 May 2020, 21:58

teadrinker wrote:
23 May 2020, 21:17

Something like this?
Surprisingly, yes! That works perfectly!

Delta Pythagorean
Posts: 627
Joined: 13 Feb 2017, 13:44
Location: Somewhere in the US

Re: Split String by RegEx

23 May 2020, 22:00

Meroveus wrote:
23 May 2020, 20:17
[...] Could you not just use SubStr()? [...]
That would imply I already know the position of the text I want to split on. I can see how RegExMatch's output could be used to find that position, though. Either way, the person right below you beat us to the punch.

Delta Pythagorean
Posts: 627
Joined: 13 Feb 2017, 13:44
Location: Somewhere in the US

Re: Split String by RegEx

23 May 2020, 22:14

Both swagfag and teadrinker solved this, but teadrinker was first and their solution seems the most reliable to me. If anyone has a better solution, by all means post it.

teadrinker
Posts: 4326
Joined: 29 Mar 2015, 09:41

Re: Split String by RegEx

23 May 2020, 22:30

swagfag wrote:

	if RegExMatch(Delimiter, "^([^\(\)]*\))(.*)$", m) 
		Delimiter := m1 "(" m2 ")"
	else
		Delimiter := "(" Delimiter ")"
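This wraps the whole pattern in an extra capturing subpattern, which shifts the numbering of the pattern's own groups, so a backreference such as \1 no longer points at the group the caller meant. Compare the two versions with a pattern that uses \1: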

str := "ABCDEFGCVBCRTYECEFGH"
arr := StrSplitRegEx(str, "(?<=cd)(..).*?\1", "i")
for k, v in arr
   MsgBox, % v

StrSplitRegEx(str, pattern, options := "") {
   arr := []
   prevPos := 1, prevLen := 0
   while RegExMatch(str, options . "O)" . pattern, m, m ? m.Pos + m.Len : 1) {
      arr.Push( SubStr(str, prevPos + prevLen, m.Pos - prevPos - prevLen) )
      prevPos := m.Pos, prevLen := m.Len
   }
   arr.Push( SubStr(str, prevPos + prevLen) )
   Return arr
}

for k, v in RegExSplit("ABCDEFGCVBCRTYECEFGH", "i)(?<=cd)(..).*?\1")
   s .= k " -> " v "`n"
MsgBox % RTrim(s)

RegExSplit(ByRef String, Delimiter := "", OmitChars := "", MaxParts := -1) {
   static uFFFF := Chr(0xFFFF)

   ; early exit, split by chars
   if (Delimiter = "")
      return StrSplit(String, Delimiter, OmitChars, MaxParts)

   ; has regex flags?
   if RegExMatch(Delimiter, "^([^\(\)]*\))(.*)$", m) 
      Delimiter := m1 "(" m2 ")"
   else
      Delimiter := "(" Delimiter ")"

   return StrSplit(RegExReplace(String, Delimiter, uFFFF), uFFFF, OmitChars, MaxParts)
}
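
The group-numbering problem can be reproduced outside AutoHotkey. The Python sketch below wraps the same test pattern in a non-capturing group `(?:...)`, which leaves `\1` pointing at the caller's `(..)`, and then shows that a capturing wrapper is a problem. Python's `re` rejects the capturing version outright, whereas the PCRE engine behind AutoHotkey may react differently (for example by simply not matching), so only the general point carries over. The variable names are illustrative.

```python
import re

text = "ABCDEFGCVBCRTYECEFGH"
user_pattern = r"(?<=cd)(..).*?\1"  # \1 is meant to refer to the caller's own (..) group
SENTINEL = "\uffff"                 # placeholder assumed not to occur in text

# Non-capturing wrapper: group numbering is unchanged, so \1 still means (..).
wrapped = "(?:" + user_pattern + ")"
parts = re.sub(wrapped, SENTINEL, text, flags=re.IGNORECASE).split(SENTINEL)
print(parts)  # ['ABCD', 'GH']

# Capturing wrapper: the outer parentheses become group 1, so \1 now refers to
# the wrapper itself, which is still open at that point. Python refuses to compile it.
try:
    re.compile("(" + user_pattern + ")")
    capturing_ok = True
except re.error:
    capturing_ok = False
print(capturing_ok)  # False
```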
swagfag
Posts: 6222
Joined: 11 Jan 2017, 17:59

Re: Split String by RegEx

23 May 2020, 23:32

yeah, no clean way of resolving that using this approach unfortunately
scratch that, see https://www.autohotkey.com/boards/viewtopic.php?p=331034#p331034
Last edited by swagfag on 24 May 2020, 08:01, edited 1 time in total.
Chunjee
Posts: 1420
Joined: 18 Apr 2014, 19:05

Re: Split String by RegEx

24 May 2020, 00:30

https://biga-ahk.github.io/biga.ahk/#/?id=split does this, I believe, though regex patterns currently need to be wrapped in / characters.

A := new biga() ; requires https://www.npmjs.com/package/biga.ahk

A.split("a--b-c", "/[\-]+/")
; => ["a", "b", "c"]

There is currently no way to specify the "i" regex option.

A.split("ABCDEFGCVBCRTYECEFGH", "/(?<=B)C|E(?=F)/")
; => ["AB", "D", "FGCVB", "RTYEC", "FGH"]
swagfag
Posts: 6222
Joined: 11 Jan 2017, 17:59

Re: Split String by RegEx

24 May 2020, 08:01

actually i don't remember why i put an extra set of parens in there

RegExSplit(ByRef String, Delimiter := "", OmitChars := "", MaxParts := -1) {
	static uFFFF := Chr(0xFFFF)

	; early exit, split by chars
	if (Delimiter = "")
		return StrSplit(String, Delimiter, OmitChars, MaxParts)

	return StrSplit(RegExReplace(String, Delimiter, uFFFF), uFFFF, OmitChars, MaxParts)
}
this seems to work just fine
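
The simplified version relies on one trick: replace every match with a placeholder character, then split on that placeholder with the ordinary string splitter. A Python sketch of the same technique is below. `regex_split_via_replace` is a hypothetical name, and the explicit check for the placeholder is an addition for illustration, since the AutoHotkey version simply assumes U+FFFF never occurs in the input.

```python
import re

SENTINEL = "\uffff"  # placeholder; the technique only works if the input never contains it

def regex_split_via_replace(text, pattern, flags=0):
    """Split text on a regex by replacing each match with a placeholder first."""
    if SENTINEL in text:
        raise ValueError("input already contains the placeholder character")
    # Step 1: turn every regex match into the single placeholder character.
    marked = re.sub(pattern, SENTINEL, text, flags=flags)
    # Step 2: a plain (non-regex) split on the placeholder does the rest.
    return marked.split(SENTINEL)

result = regex_split_via_replace("ABCDEFGCVBCRTYECEFGH", r"(?<=b)c|e(?=f)", re.IGNORECASE)
print(result)  # ['AB', 'D', 'FGCVB', 'RTYEC', 'FGH']
```

Because the pattern is passed through unmodified, the caller's own groups and backreferences keep their numbering.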
teadrinker
Posts: 4326
Joined: 29 Mar 2015, 09:41

Re: Split String by RegEx

24 May 2020, 13:50

Yeah, this seems to work :)
