Split String by RegEx Topic is solved

Delta Pythagorean
Posts: 546
Joined: 13 Feb 2017, 13:44
GitHub: DelPyth
Location: Somewhere in the US

Split String by RegEx

23 May 2020, 16:11

I have looked everywhere I could for a function that splits a string by a regular expression. I've even looked at how other languages do it, with no luck.
Python can do this (natively, I think), but I haven't seen anyone write an AutoHotkey function for it.
If anyone knows of a function that can split a string based on a regular expression, I'd be more than happy.

- [AHK].......: 1.1.32.00 Unicode 64-bit
- [OS].........: Windows 10.0.18362
- [GITHUB]...: github.com/DeltaPyth
- [PAYPAL]....: paypal.me/DelPyth
- [DISCORD]..: Delta#3324

Remember to use [code]CODE[/code] for your multi-line scripts.
Stay safe, stay inside, and remember to wash your hands for 20 seconds!
Meroveus
Posts: 31
Joined: 23 May 2016, 17:38

Re: Split String by RegEx

23 May 2020, 20:17

Can you give an example of what you want?
The question is a little vague.

Could you not just use SubStr()?

I did a search on "split a string using regular expression" and got a host of replies, but there's no way to tell if any of them would help.
teadrinker
Posts: 1458
Joined: 29 Mar 2015, 09:41

Re: Split String by RegEx  Topic is solved

23 May 2020, 21:17

Code: Select all

str := "ABCDEFGCVBCRTYECEFGH"
arr := StrSplitRegEx(str, "(?<=b)c|e(?=f)", "i")
for k, v in arr
   MsgBox, % v

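StrSplitRegEx(str, pattern, options := "") {
   arr := []
   ; prevPos/prevLen hold the position and length of the previous delimiter match
   prevPos := 1, prevLen := 0
   ; "O)" stores a match object in m; each search resumes right after the previous match
   ; note: a pattern that can match an empty string never advances, so the loop would not end
   while RegExMatch(str, options . "O)" . pattern, m, m ? m.Pos + m.Len : 1) {
      ; push the text between the previous delimiter and this one
      arr.Push( SubStr(str, prevPos + prevLen, m.Pos - prevPos - prevLen) )
      prevPos := m.Pos, prevLen := m.Len
   }
   ; push whatever follows the last delimiter
   arr.Push( SubStr(str, prevPos + prevLen) )
   Return arr
}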
Something like this?
swagfag
Posts: 3617
Joined: 11 Jan 2017, 17:59

Re: Split String by RegEx

23 May 2020, 21:53

Code: Select all

for k, v in RegExSplit("ABCDEFGCVBCRTYECEFGH", "i)(?<=b)c|e(?=f)")
	s .= k " -> " v "`n"
MsgBox % RTrim(s, "`n") ; RTrim strips only spaces and tabs by default, so name the newline explicitly

RegExSplit(ByRef String, Delimiter := "", OmitChars := "", MaxParts := -1) {
	static uFFFF := Chr(0xFFFF)

	; early exit, split by chars
	if (Delimiter = "")
		return StrSplit(String, Delimiter, OmitChars, MaxParts)

	; has regex flags?
	if RegExMatch(Delimiter, "^([^\(\)]*\))(.*)$", m) 
		Delimiter := m1 "(" m2 ")"
	else
		Delimiter := "(" Delimiter ")"

	return StrSplit(RegExReplace(String, Delimiter, uFFFF), uFFFF, OmitChars, MaxParts)
}
StrSplit accepts an array of delimiters, so it's unclear how that should be handled in the regex case (specifically, when a later pattern matches text that an earlier pattern has already matched).
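For reference, a small sketch of the array form of StrSplit mentioned above, assuming AutoHotkey v1.1 Unicode. Each array element is treated as a literal delimiter. The second line is only an illustration of the nearest regex equivalent, a single alternation, and is not something RegExSplit does with an array; the variable names are made up for this example.

Code: Select all

; literal delimiters passed as an array
literal := StrSplit("a,b;c", [",", ";"])
; regex equivalent: one alternation, tried left to right at each position
; (assumes the input does not already contain U+FFFF)
viaRegex := StrSplit(RegExReplace("a,b;c", ",|;", Chr(0xFFFF)), Chr(0xFFFF))

for k, v in literal
	s1 .= v " "
for k, v in viaRegex
	s2 .= v " "
MsgBox % s1 "`n" s2 ; both lines list a, b and c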
Delta Pythagorean
Posts: 546
Joined: 13 Feb 2017, 13:44
GitHub: DelPyth
Location: Somewhere in the US

Re: Split String by RegEx

23 May 2020, 21:58

teadrinker wrote:
23 May 2020, 21:17

Something like this?
Surprisingly, yes! That works perfectly!

Delta Pythagorean
Posts: 546
Joined: 13 Feb 2017, 13:44
GitHub: DelPyth
Location: Somewhere in the US

Re: Split String by RegEx

23 May 2020, 22:00

Meroveus wrote:
23 May 2020, 20:17
[...] Could you not just use SubStr()? [...]
That would require me to already know the position of the delimiter I want to split by. I can see how RegExMatch's output could be used to find that position, but the person right below you beat us to the punch.

Delta Pythagorean
Posts: 546
Joined: 13 Feb 2017, 13:44
GitHub: DelPyth
Location: Somewhere in the US

Re: Split String by RegEx

23 May 2020, 22:14

Both swagfag and teadrinker solved this, but teadrinker was first, and that solution seems to me the more reliable of the two. If anyone has a better solution, by all means post it.

teadrinker
Posts: 1458
Joined: 29 Mar 2015, 09:41

Re: Split String by RegEx

23 May 2020, 22:30

swagfag wrote:

Code: Select all

	if RegExMatch(Delimiter, "^([^\(\)]*\))(.*)$", m) 
		Delimiter := m1 "(" m2 ")"
	else
		Delimiter := "(" Delimiter ")"
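This wraps the whole pattern in an extra capturing subpattern, so the groups in the caller's pattern are renumbered and a backreference such as \1 no longer points where the caller intended. That may cause confusion. Compare the two functions with a pattern that uses \1: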

Code: Select all

str := "ABCDEFGCVBCRTYECEFGH"
arr := StrSplitRegEx(str, "(?<=cd)(..).*?\1", "i")
for k, v in arr
   MsgBox, % v

StrSplitRegEx(str, pattern, options := "") {
   arr := []
   prevPos := 1, prevLen := 0
   while RegExMatch(str, options . "O)" . pattern, m, m ? m.Pos + m.Len : 1) {
      arr.Push( SubStr(str, prevPos + prevLen, m.Pos - prevPos - prevLen) )
      prevPos := m.Pos, prevLen := m.Len
   }
   arr.Push( SubStr(str, prevPos + prevLen) )
   Return arr
}

Code: Select all

for k, v in RegExSplit("ABCDEFGCVBCRTYECEFGH", "i)(?<=cd)(..).*?\1")
   s .= k " -> " v "`n"
MsgBox % RTrim(s, "`n") ; RTrim strips only spaces and tabs by default, so name the newline explicitly

RegExSplit(ByRef String, Delimiter := "", OmitChars := "", MaxParts := -1) {
   static uFFFF := Chr(0xFFFF)

   ; early exit, split by chars
   if (Delimiter = "")
      return StrSplit(String, Delimiter, OmitChars, MaxParts)

   ; has regex flags?
   if RegExMatch(Delimiter, "^([^\(\)]*\))(.*)$", m) 
      Delimiter := m1 "(" m2 ")"
   else
      Delimiter := "(" Delimiter ")"

   return StrSplit(RegExReplace(String, Delimiter, uFFFF), uFFFF, OmitChars, MaxParts)
}
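
A minimal sketch of the group renumbering behind that difference, assuming AutoHotkey v1.1. When a pattern is wrapped in one more pair of parentheses, the wrapper becomes group 1 and the original (..) becomes group 2, so the backreference has to be renumbered by hand to keep its meaning. The variable names are illustrative only.

Code: Select all

str := "ABCDEFGCVBCRTYECEFGH"
userPattern := "i)(?<=cd)(..).*?\1" ; \1 refers to (..)
renumbered := "i)((?<=cd)(..).*?\2)" ; wrapped by hand: (..) is now group 2, so \1 became \2

; Both calls report the same match position, because the backreference was renumbered with its group.
MsgBox % RegExMatch(str, userPattern) " / " RegExMatch(str, renumbered)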
swagfag
Posts: 3617
Joined: 11 Jan 2017, 17:59

Re: Split String by RegEx

23 May 2020, 23:32

yeah, no clean way of resolving that using this approach unfortunately
scratch that, see https://www.autohotkey.com/boards/viewtopic.php?p=331034#p331034
Last edited by swagfag on 24 May 2020, 08:01, edited 1 time in total.
Chunjee
Posts: 430
Joined: 18 Apr 2014, 19:05
GitHub: Chunjee

Re: Split String by RegEx

Yesterday, 00:30

https://biga-ahk.github.io/biga.ahk/#/?id=split does this, I believe, though regex patterns currently need to be wrapped in / characters.

Code: Select all

A := new biga() ; requires https://www.npmjs.com/package/biga.ahk

A.split("a--b-c", "/[\-]+/")
; => ["a", "b", "c"]

There is currently no way to specify the "i" regex option.

Code: Select all

A.split("ABCDEFGCVBCRTYECEFGH", "/(?<=B)C|E(?=F)/")
; => ["AB", "D", "FGCVB", "RTYEC", "FGH"]
swagfag
Posts: 3617
Joined: 11 Jan 2017, 17:59

Re: Split String by RegEx

Yesterday, 08:01

actually i don't remember why i put an extra set of parens in there

Code: Select all

RegExSplit(ByRef String, Delimiter := "", OmitChars := "", MaxParts := -1) {
	static uFFFF := Chr(0xFFFF)

	; early exit, split by chars
	if (Delimiter = "")
		return StrSplit(String, Delimiter, OmitChars, MaxParts)

	return StrSplit(RegExReplace(String, Delimiter, uFFFF), uFFFF, OmitChars, MaxParts)
}
this seems to work just fine
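
One caveat of this approach, sketched under the assumption of AutoHotkey v1.1 Unicode: the function relies on U+FFFF never occurring in the input, since every such character is treated as a split point. RegExSplitChecked below is a hypothetical wrapper, not something posted in this thread, that refuses such input instead of splitting it incorrectly.

Code: Select all

for k, v in RegExSplitChecked("a1b22c", "\d+")
	out .= k " -> " v "`n"
MsgBox % RTrim(out, "`n") ; shows 1 -> a, 2 -> b, 3 -> c

; Hypothetical wrapper: throws if the input already contains the placeholder character.
RegExSplitChecked(String, Delimiter) {
	static sentinel := Chr(0xFFFF) ; same placeholder character RegExSplit uses
	if InStr(String, sentinel)
		throw Exception("Input already contains U+FFFF, so the split would be ambiguous")
	return StrSplit(RegExReplace(String, Delimiter, sentinel), sentinel)
}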
teadrinker
Posts: 1458
Joined: 29 Mar 2015, 09:41

Re: Split String by RegEx

Yesterday, 13:50

Yeah, this seems to work :)
