Regex help: Capture content of broken ini key? Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
kunkel321
Posts: 972
Joined: 30 Nov 2015, 21:19

Regex help: Capture content of broken ini key?

11 Oct 2021, 16:14

Hi Folks,
I’m using ini files for an unusual purpose. I’ll have a key that consists of an `n separated list. Of course this is not “proper” use of an ini file. In my haystack, below, you can see that the first key first main content= is a normal key. I can use iniRead to get its content with no problem. However the key, first optional content= is messed up because of the line breaks. If I try to retrieve its content with iniRead, then I don’t get the list of fruits because it stops looking when it gets to the end of the line (at least I assume that’s the problem).

If I use iniRead to capture the entire [section] however, it does show the embedded lists, as desired.

My challenge is to get the embedded “sub lists” into variables… Really it’s just the first sub list that I need. I’ll then remove the key name and find the next one, and so on. It is relevant that the keys which are lists, all contain the word “optional” (lowercase) in the name of the key.

My thought is to have a RegEx that is like the following, which I've put on three lines for readability:
(non-captured group that isolates the first key name that contains ‘optional’)
(captured group that contains my list)
(non-captured group that isolates the next key which is found, and therefore indicates that I’ve reached the end of the list).


Hopefully that makes sense. I can explain more about what I'm trying to accomplish, if it helps.

Code: Select all

haystack =
(
[My Key List]
first main content=The food eater's name is [n].  His favorite fruits are;  
first optional content=
1 apples
1 bananas
0 cherries
1 oranges
0 mangoes
0 kiwis
second main content=He prefers them 
second optional content=
1 fresh
0 canned
0 frozen
third main=[n] eats fruits
third optional=
1 daily.
0 weekly.
0 less often than he probably should.
)
needle = 
RegExMatch(haystack, needle, outVar)
MsgBox  outVar is`n`n|%outVar%|
ExitApp 
ste(phen|ve) kunkel
mikeyww
Posts: 26596
Joined: 09 Sep 2014, 18:38

Re: Regex help: Capture content of broken ini key?

11 Oct 2021, 16:19

If RegEx seems tricky for this, you can always use Loop, Read.
kunkel321
Posts: 972
Joined: 30 Nov 2015, 21:19

Re: Regex help: Capture content of broken ini key?

11 Oct 2021, 16:47

Thanks Mike. I was also thinking of using a parse loop with a regExMatch. Maybe Loop, Read would be better (and faster) though. I will investigate.
Xtra
Posts: 2744
Joined: 02 Oct 2015, 12:15

Re: Regex help: Capture content of broken ini key?

11 Oct 2021, 17:02

You can keep using it as a normal ini file by storing each list on one line with a unique delimiter:

Code: Select all

[My Key List]
first main content=The food eater's name is [n].  His favorite fruits are;  
first optional content=1 apples\n1 bananas\n0 cherries\n1 oranges\n0 mangoes\n0 kiwis
second main content=He prefers them 
second optional content=\n1 fresh\n0 canned\n0 frozen
third main=[n] eats fruits
third optional=1 daily.\n0 weekly.\n0 less often than he probably should.

Code: Select all

IniRead, first_optional_content, myinifile.ini, My Key List, first optional content
first_optional_content := StrReplace(first_optional_content, "\n", "`n")
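
A minimal sketch of the full round trip with this approach, in case it helps. It assumes a file called myinifile.ini in the script's folder (an illustrative name only) and that the list text never contains a literal \n sequence or `r characters of its own:

```autohotkey
; Sketch (AutoHotkey v1): keep a multi-line list in a single ini key by escaping newlines.
iniFile := A_ScriptDir . "\myinifile.ini"   ; assumed file name, for illustration only
list := "1 apples`n1 bananas`n0 cherries"

; Escape: turn each real newline into the two characters \n before writing.
escaped := StrReplace(list, "`n", "\n")
IniWrite, %escaped%, %iniFile%, My Key List, first optional content

; Unescape: read the one-line value back (blank default) and restore the newlines.
IniRead, raw, %iniFile%, My Key List, first optional content, %A_Space%
restored := StrReplace(raw, "\n", "`n")

MsgBox, % restored
```

The trade-off is that the ini file is no longer pleasant to edit by hand, since every list sits on one long line.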
teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41

Re: Regex help: Capture content of broken ini key?  Topic is solved

11 Oct 2021, 17:22

Code: Select all

haystack =
(
[My Key List]
first main content=The food eater's name is [n].  His favorite fruits are;  
first optional content=
1 apples
1 bananas
0 cherries
1 oranges
0 mangoes
0 kiwis
second main content=He prefers them 
second optional content=
1 fresh
0 canned
0 frozen
third main=[n] eats fruits
third optional=
1 daily.
0 weekly.
0 less often than he probably should.
)
while RegExMatch(haystack, "sO)\R\V*\boptional\b\V*=\R\K.+?(?=\R\V+=\V*\R|$)", m, m ? m.Pos + m.Len : 1)
   MsgBox, % m[0]
kunkel321
Posts: 972
Joined: 30 Nov 2015, 21:19

Re: Regex help: Capture content of broken ini key?

12 Oct 2021, 14:38

Thanks Everyone!

Re Xtra: I also was thinking about separating|with|pipe|characters. I figured keeping them as a list would make the ini files easier to work with, though.

Re Mike: I experimented with using a line-by-line loop. That was difficult too though. The problem is that once you detect a "key=", you then have to capture the *next* line. Then at the end, you have to stop capturing before the next key. I'm guessing that this can be done with an incrementing variable.
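
For reference, a minimal sketch of that line-by-line idea. Instead of an incrementing counter it uses a variable that remembers which key is currently "open". The names lists and currentKey are made up for illustration, and the sketch assumes that list items never contain an equal sign:

```autohotkey
; Sketch (AutoHotkey v1): collect the lines that follow each empty "...optional...=" key.
haystack =
(
[My Key List]
first main content=His favorite fruits are
first optional content=
1 apples
0 cherries
second main content=He prefers them
second optional content=
1 fresh
0 canned
)

lists := {}          ; key name -> collected list text
currentKey := ""     ; stays empty while we are not inside a list

Loop, Parse, haystack, `n, `r
{
    line := A_LoopField
    eqPos := InStr(line, "=")
    if (eqPos)                                ; any line with "=" is treated as a key line
    {
        keyName := SubStr(line, 1, eqPos - 1)
        ; Only keys containing the word "optional" with an empty value open a list.
        if (RegExMatch(keyName, "\boptional\b") && SubStr(line, eqPos + 1) = "")
        {
            currentKey := keyName
            lists[currentKey] := ""
        }
        else
            currentKey := ""                  ; a normal key closes any open list
        continue
    }
    if (currentKey != "" && line != "")       ; a list item belonging to the open key
        lists[currentKey] .= (lists[currentKey] = "" ? "" : "`n") . line
}

MsgBox, % lists["first optional content"]
MsgBox, % lists["second optional content"]
```

Because the next key line closes the list, there is no need to look ahead or to count lines.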

Re Teadrinker: Indeed your solution does isolate those broken keys!!! I have to experiment with your code, then maybe ask some clarifying questions. I tried simply taking your regex and using it for the needle variable in my original script, but of course it wasn't as simple as that.

Thanks Again Folks!
teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41

Re: Regex help: Capture content of broken ini key?

12 Oct 2021, 14:51

kunkel321 wrote: maybe ask some clarifying questions.
I will be glad to help!
SpeedMaster
Posts: 494
Joined: 12 Nov 2016, 16:09

Re: Regex help: Capture content of broken ini key?

12 Oct 2021, 16:55

I would rather create a dedicated Myiniread() function that calls the xStr() function of SKAN 8-)

Code: Select all

haystack =
(
[My Key List]
first main content=The food eater's name is [n].  His favorite fruits are;  
first optional content=
1 apples
1 bananas
0 cherries
1 oranges
0 mangoes
0 kiwis
second main content=He prefers them 
second optional content=
1 fresh
0 canned
0 frozen
third main=[n] eats fruits
third optional=
1 daily.
0 weekly.
0 less often than he probably should.
)

msgbox, % Myiniread(haystack,"first optional content")
msgbox, % Myiniread(haystack,"second optional content")
msgbox, % Myiniread(haystack,"third optional")

Myiniread(haystack,key){
haystack .= "`n="
return xStr(xStr(haystack,0, key . "=" , "="),,,"`n",,0)
}

;https://www.autohotkey.com/boards/viewtopic.php?t=74050
xStr(ByRef H, C:=0, B:="", E:="",ByRef BO:=1, EO:="", BI:=1, EI:=1, BT:="", ET:="") {                           
Local L, LB, LE, P1, P2, Q, N:="", F:=0                 ; xStr v0.97 by SKAN on D1AL/D343 @ tiny.cc/xstr  
Return SubStr(H,!(ErrorLevel:=!((P1:=(L:=StrLen(H))?(LB:=StrLen(B))?(F:=InStr(H,B,C&1,BO,BI))?F+(BT=N?LB
:BT):0:(Q:=(BO=1&&BT>0?BT+1:BO>0?BO:L+BO))>1?Q:1:0)&&(P2:=P1?(LE:=StrLen(E))?(F:=InStr(H,E,C>>1,EO=N?(F
?F+LB:P1):EO,EI))?F+LE-(ET=N?LE:ET):0:EO=N?(ET>0?L-ET+1:L+1):P1+EO:0)>=P1))?P1:L+1,(BO:=Min(P2,L+1))-P1)  
}

teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41

Re: Regex help: Capture content of broken ini key?

12 Oct 2021, 17:56

SpeedMaster wrote:

Code: Select all

xStr(ByRef H, C:=0, B:="", E:="",ByRef BO:=1, EO:="", BI:=1, EI:=1, BT:="", ET:="") {                           
Local L, LB, LE, P1, P2, Q, N:="", F:=0                 ; xStr v0.97 by SKAN on D1AL/D343 @ tiny.cc/xstr  
Return SubStr(H,!(ErrorLevel:=!((P1:=(L:=StrLen(H))?(LB:=StrLen(B))?(F:=InStr(H,B,C&1,BO,BI))?F+(BT=N?LB
:BT):0:(Q:=(BO=1&&BT>0?BT+1:BO>0?BO:L+BO))>1?Q:1:0)&&(P2:=P1?(LE:=StrLen(E))?(F:=InStr(H,E,C>>1,EO=N?(F
?F+LB:P1):EO,EI))?F+LE-(ET=N?LE:ET):0:EO=N?(ET>0?L-ET+1:L+1):P1+EO:0)>=P1))?P1:L+1,(BO:=Min(P2,L+1))-P1)  
}
Oh my God! Why not in one line? :crazy:
kunkel321
Posts: 972
Joined: 30 Nov 2015, 21:19

Re: Regex help: Capture content of broken ini key?

13 Oct 2021, 14:54

That xStr function is VERY cool! I suppose I should use the "KISS" principle though -- "Keep It Simple Stephen."

I think I've modified Teadrinker's regex to do what I need...

As you might have guessed, the lists (which are embedded in certain keys), get used in simple "sub-GUIs" of checkboxes. The entire ini [section] is a text replacement that gets called from a parent GUI. At run-time, if I choose the replacement text about fruit eating (i.e. the section), then the sub-GUIs pop up one at a time, letting me customize the replacement text on-the-fly.

My plan is that I'll capture the name of the first key that has the string "optional" in it, i.e. theKey, and the corresponding key value, i.e. theList. After processing theList (by choosing the desired items), I'll use StrReplace to remove theKey and to replace theList with the desired items. Then I'll run the RegExMatches again and do the next key and list.

Code: Select all

haystack =
(
[Text Replacement about Food Eating]
first main content=The food eater's name is [n].  His favorite fruits are;  
first optional content=
1 apples
1 bananas
0 cherries
1 oranges
0 mangoes
0 kiwis
second main content=He prefers them 
second optional content=
1 fresh
0 canned
0 frozen
third main=[n] eats fruits
third optional=
1 daily.
0 weekly.
0 less often than he probably should.
)

RegExMatch(haystack, "s)\R\V*\boptional\b\V*=\R\K.+?(?=\R\V+=\V*\R|$)", theList)
RegExMatch(haystack, "s)\R\V*\boptional\b\V*=\R", theKey)

MsgBox, the list is %theList% `n`nthe key is %theKey% ;works! :)
   
; Notes for newbs (like me).  Hopefully correct.
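; O) stores a match object in OutputVar instead of pseudo-array variables.
; s) is DotAll: a dot also matches newlines. (Uppercase S) is the option that studies the pattern for performance.)
; \R means "any single newline of any type"
; \V matches a character that is not a vertical whitespace character. 
; * matches zero or more of the preceding character, class, or subpattern.
; ? matches zero or one of the preceding item; directly after + or * (as in .+?) it makes that quantifier lazy instead.
; \b means "word boundary", which is like an anchor because it doesn't consume any characters.
; \K requires the preceding part to be present, but leaves it out of the reported match.
; . a dot matches any single character which is not part of a newline (with s), newlines are matched too).
; + A plus sign matches one or more of the preceding character, class, or subpattern.
; | alternation (this|or|that)
; $ matches at the end of the haystack (or just before a final newline); it matches at every line end only with the m) option.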
Thanks again Folks!
teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41

Re: Regex help: Capture content of broken ini key?

13 Oct 2021, 15:15

Perhaps this is a bit easier:

Code: Select all

haystack =
(
[My Key List]
first main content=The food eater's name is [n].  His favorite fruits are;  
first optional content=
1 apples
1 bananas
0 cherries
1 oranges
0 mangoes
0 kiwis
second main content=He prefers them 
second optional content=
1 fresh
0 canned
0 frozen
third main=[n] eats fruits
third optional=
1 daily.
0 weekly.
0 less often than he probably should.
)
obj := {}
while RegExMatch(haystack, "sO)\R(\V*\boptional\b\V*)=\R\K.+?(?=\R\V+=\V*\R|$)", m, m ? m.Pos + m.Len : 1)
   obj[ m[1] ] := m[0]

MsgBox, % obj["first optional content"]
MsgBox, % obj["second optional content"]
MsgBox, % obj["third optional"]
kunkel321
Posts: 972
Joined: 30 Nov 2015, 21:19

Re: Regex help: Capture content of broken ini key?

14 Oct 2021, 14:41

teadrinker wrote:
13 Oct 2021, 15:15
Perhaps this is a bit easier: [...]
I think "easier" might be a relative term here! LOL.

Interesting thing:
I've integrated the RegEx into my script, and it works great, however I found an ini section that doesn't get processed correctly...
Have a look.
The last line main key=... is erroneously getting included in theList. Maybe it's because of all the square brackets I have in there (??)

Code: Select all

haystack =
(
[pronoun tester]
main key=[n] prefers
radio optional key=
0 male
0 female
0 gender neutral

main key=pronouns.  [e] uses [e]/[m] when describing [m]self.  [e] likes the freedom to express [s] rights.  The choice is [r]. 
)

if RegExMatch(haystack, "s)\R\V*\boptional\b\V*=\R", theKey)
{
	RegExMatch(haystack, "s)\R\V*\boptional\b\V*=\R\K.+?(?=\R\V+=\V*\R|$)", theList)
	MsgBox, the key is %theKey%`n`nthe list is`n%theList%
}
else
	MsgBox No more options keys.
UPDATE: Actually, if I put an extra new line at the very end of the string (end of last line), then it does work. HOWEVER my actual working ini file does indeed have a couple of newline characters there and it is not working in that circumstance...

SECOND UPDATE: I figured it out. In my ini file, the list has to be followed by two keys, or by no key at all.
This is okay:

Code: Select all

[pronoun tester]
main key=[n] prefers
radio optional key=
0 male
0 female
0 gender neutral
 
And this is okay:

Code: Select all

[pronoun tester]
main key=[n] prefers
radio optional key=
0 male
0 female
0 gender neutral
main key=pronouns.  [e] uses [e]/[m] when describing [m]self.  [e] likes the freedom to express [s] rights.  The choice is [r]. 
another key=
But if there is only a single key under the list, then the key gets captured with the list. Teadrinker (or others), would it be feasible to adjust the regex to accommodate this?

Third Update: Zipped up project attached, if anyone is curious. It is meant to be run from another (already running) script.
MultiTool.zip
(1.18 MiB) Downloaded 26 times
teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41

Re: Regex help: Capture content of broken ini key?

14 Oct 2021, 17:33

Fixed:

Code: Select all

haystack =
(
[pronoun tester]
main key=[n] prefers
radio optional key=
0 male
0 female
0 gender neutral

main key=pronouns.  [e] uses [e]/[m] when describing [m]self.  [e] likes the freedom to express [s] rights.  The choice is [r]. 
)

while RegExMatch(haystack, "sO)\R(?:\V*\boptional\b\V*)=\R(.+?)\R*(\V+=\V*(\R|$)|$)", m, m ? m.Pos(1) + m.Len(1) : 1)
   MsgBox, % "|" m[1] "|"
kunkel321
Posts: 972
Joined: 30 Nov 2015, 21:19

Re: Regex help: Capture content of broken ini key?

15 Oct 2021, 10:28

Thank you for the fast reply Teadrinker.

I see that in the context of your script with the loop, it does work as expected. When I use it in this script
Spoiler
however, the last line is still present. I noticed (with your previous version) that I had to remove the O) option if I want to include the outVar in the body of the command. Maybe this breaks the regex(??) It doesn't seem like it would affect it though. I will tinker with this more later.
teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41

Re: Regex help: Capture content of broken ini key?

15 Oct 2021, 10:37

In this case you should use subpatterns:

Code: Select all

haystack =
(
[pronoun tester]
main key=[n] prefers
radio optional key=
0 male
0 female
0 gender neutral

main key=pronouns.  [e] uses [e]/[m] when describing [m]self.  [e] likes the freedom to express [s] rights.  The choice is [r].
key=
)

RegExMatch(haystack, "s)\R(\V*\boptional\b\V*)=\R(.+?)\R*(\V+=\V*(\R|$)|$)", theList)
MsgBox, % "key: " . theList1 . "`n" . "value:`n" . theList2
kunkel321
Posts: 972
Joined: 30 Nov 2015, 21:19

Re: Regex help: Capture content of broken ini key?

15 Oct 2021, 14:56

teadrinker wrote:
15 Oct 2021, 10:37
In this case you should use subpatterns: [...]
Yep -- That fixed it! Many thanks Teadrinker!

EDIT: Actually I made a minor change. Hopefully this won't cause other problems... I need it to capture the ending equal sign in the "optional key=" so I can then do a stringReplace and cull the keys out. Removing the equal sign that follows the first subpattern, i.e. the = between \V*) and \R in
s)\R(\V*\boptional\b\V*)=\R(.+?)\R*(\V+=\V*(\R|$)|$)
seemed to work.
teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41

Re: Regex help: Capture content of broken ini key?

15 Oct 2021, 16:34

kunkel321 wrote: Removing the equal sign that follows the first subpattern in s)\R(\V*\boptional\b\V*)=\R(.+?)\R*(\V+=\V*(\R|$)|$) seemed to work
Better to just move the closing parenthesis of the first subpattern: s)\R(\V*\boptional\b\V*=)\R(.+?)\R*(\V+=\V*(\R|$)|$)
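
A rough sketch of how this final pattern could be used to collect every "optional" key in one pass, combining it with the O) option and the loop from the earlier posts. The names needle, found and startPos are only illustrative, and the haystack is a shortened version of the sample data:

```autohotkey
; Sketch (AutoHotkey v1.1): collect every "optional" key and its list in one pass.
haystack =
(
[My Key List]
first main content=His favorite fruits are
first optional content=
1 apples
0 cherries
second main content=He prefers them
second optional content=
1 fresh
0 canned
)

; Subpattern 1 = key name including "=", subpattern 2 = the list under it.
needle := "sO)\R(\V*\boptional\b\V*=)\R(.+?)\R*(\V+=\V*(\R|$)|$)"

found := []      ; array of {key, list} pairs, in file order
startPos := 1
while RegExMatch(haystack, needle, m, startPos)
{
    found.Push({key: m[1], list: m[2]})
    ; Resume right after the list, so the newline in front of the next key is still unconsumed.
    startPos := m.Pos(2) + m.Len(2)
}

for index, item in found
    MsgBox, % "key: " . item.key . "`nlist:`n" . item.list
```

Restarting the search at the end of subpattern 2 rather than at the end of the whole match matters here: the third subpattern consumes the following key line, and the next "optional" key can only match if the newline before it has not been consumed.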
kunkel321
Posts: 972
Joined: 30 Nov 2015, 21:19

Re: Regex help: Capture content of broken ini key?

15 Oct 2021, 16:51

teadrinker wrote:
15 Oct 2021, 16:34
Better to just move the closing parenthesis of the first subpattern: s)\R(\V*\boptional\b\V*=)\R(.+?)\R*(\V+=\V*(\R|$)|$)
Excellent! :thumbup:
