Hi all,
I just wanted to come back to this and post my final response, as I found a solution that I wanted to share.
Sir Teddy and the other commenters on this post were very helpful and their code did work. However, I realised that there is a dedicated library for what I wanted to do. The library is grep, which is discussed on the AHK forums here
https://autohotkey.com/board/topic/14817-grep-global-regular-expression-match/
The link to the grep library code in that forum post is dead, so I will post the code here for future users.
Grep.ahk:
/*
Function: grep
Sets the output variable to all the entire or specified subpattern matches and returns their offsets within the haystack.
Parameters:
h - haystack
n - regex
v - output variable (ByRef)
s - (optional) starting position (default: 1)
e - (optional) subpattern to save in the output variable, where 0 is the entire match (default: 0)
d - (optional) delimiter - the character that separates multiple values (default: EOT (0x04))
Returns:
The position (or offset) of each entire match.
Remarks:
Since multiple values are separated with the delimiter, any delimiter characters found within the haystack will be removed.
License:
- Version 2.0 <http www.autohotkey.net /~polyethene/#grep> Broken Link for safety
- Dedicated to the public domain (CC0 1.0) <http creativecommons.org /publicdomain/zero/1.0/> Broken Link for safety
*/
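grep(h, n, ByRef v, s = 1, e = 0, d = "") {
	; The copy posted here shows an empty default delimiter. The header above
	; documents EOT (0x04) as the default, so fall back to it explicitly.
	if (d = "")
		d := Chr(4)
	v =
	StringReplace, h, h, %d%, , All ; remove any delimiter characters from the haystack
	Loop
		If s := RegExMatch(h, n, c, s) ; next match at or after position s
			p .= d . s, s += StrLen(c), v .= d . (e ? c%e% : c)
		Else Return, SubStr(p, 2), v := SubStr(v, 2) ; strip the leading delimiter
}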
The code above is a library (see https://www.autohotkey.com/docs/commands/_Include.htm), meaning you should create a new ahk file (preferably in the same directory as the ahk file you are working on) called grep.ahk and paste the grep code above into it. At the top of the ahk file you are working on, you should include the following line:
#Include grep.ahk
You can now use the function grep(put parameters here) in your ahk script. The comment block at the top of the code above describes each parameter.
Why use grep over RegExMatch? grep is essentially RegExMatch, but it returns all the matches in the string rather than just the first one. In other words it works like Sir Teddy's code above.
To answer my own initial question, I will need to use grep and RegEx to retrieve the text above and below a string.
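To make the difference concrete, here is a minimal sketch (AutoHotkey v1) that runs both on the same text. The haystack, the variable names and the "|" delimiter are just my choices for illustration, and it assumes grep.ahk is saved next to the script.

```autohotkey
#Include grep.ahk  ; assumption: grep.ahk sits in the same folder as this script

haystack := "loved, love, lovers"

; RegExMatch stops at the first match and returns its position.
firstPos := RegExMatch(haystack, "i)love", firstMatch)

; grep collects every match in `matches` and returns every position.
; "|" is passed as the delimiter so the results are easy to read.
positions := grep(haystack, "i)love", matches, 1, 0, "|")

MsgBox, % "RegExMatch: " firstMatch " at " firstPos "`ngrep: " matches " at " positions
```

Passing the delimiter explicitly also means the example does not depend on the default delimiter.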
First I need to decide what my string is. In the following example text:
"Fifty years, and you remain a child,
Infinitely valued, loved, and treasured.
Fierce winds may rip away at autumn leaves,
The kind of turn by which one's life is measured.
Yet Eden lingers, innocent and wild.
Years matter not, nor chance, nor choice, nor change.
Ever you must be a child still.
Ambition matters not, nor joy, nor grief,
Reason, passion, temper, fortune, will,
Since you know love that nothing can estrange."
Let's say I choose love as the string/word I am looking for.
The RegEx would be something like this (of course it will need to be adapted to each individual use case):
Love
Now, this would match nothing in the example text above. That is because Love starts with a capital letter and the example text only contains love in lower case.
I would therefore have to use this regex:
i)love
(Remember that when using RegExMatch or grep the actual regex will need to be enclosed in quotes ".) The "i)" at the start makes the RegEx case-insensitive. Do not put a space after the closing parenthesis: everything after it, including a space, is treated as part of the pattern.
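A quick way to see the effect of the options is to compare the positions RegExMatch returns. This is only a sketch (AutoHotkey v1); the variable names are mine.

```autohotkey
sentence := "Since you know love that nothing can estrange."

posCapital := RegExMatch(sentence, "Love")    ; case-sensitive, so nothing is found (0)
posAnyCase := RegExMatch(sentence, "i)love")  ; case-insensitive, finds "love"
posSpace   := RegExMatch(sentence, "i) love") ; the space is part of the pattern, so this finds " love"

MsgBox, % "Love: " posCapital "`ni)love: " posAnyCase "`ni) love: " posSpace
```

The last two positions differ by one, because the pattern with the space starts its match on the space before "love".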
Now that will match "love" in the last line of the example text. It also finds the "love" inside "loved" in the second line, but the text it returns is just "love", without the "d". In RegEx, a single dot (i.e. a .) matches any one character, and a single dot followed by a question mark (i.e. .?) means either nothing or any one character. So, the regex "i)love.?" will return "loved". However, it will not return all of "lovers", because there are two characters after "love". To cover "love", "loved" and "lovers" we can use the regex "i)love.?.?". Note that the question marks are what allow the regex to still match plain "love"; without them (i.e. "i)love..") it would only match when at least two more characters follow "love".
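Here is a small sketch (AutoHotkey v1) that tries one and two optional dots against the three words. The word list and variable names are only for illustration.

```autohotkey
for index, word in ["love", "loved", "lovers"]
{
    RegExMatch(word, "i)love.?", oneDot)     ; at most one extra character
    RegExMatch(word, "i)love.?.?", twoDots)  ; at most two extra characters
    result .= word ": " oneDot " / " twoDots "`n"
}
MsgBox, % result
```

For "lovers" the first pattern stops one character short, while the second returns the whole word.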
So, to come back to our example text, if we used grep with the regex "i)love.?" it would return "loved" and "love " (a dot also matches spaces and punctuation, so the second match picks up the space after "love"). Brilliant. But what I wanted was to retrieve the text above and below the string. As Sir Teddy pointed out in his code, a single dot . is any character (whether a letter, a space or punctuation), so if we use grep with the regex "i)...love.?..." it returns each match together with the three characters before it and the three characters after it: "d, loved, a" and "ow love tha". You can put as many dots before and after your searched string as you like.
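The context idea can be sketched like this (AutoHotkey v1). It assumes grep.ahk is next to the script; the two-line sample and the "|" delimiter are my own choices.

```autohotkey
#Include grep.ahk  ; assumption: grep.ahk sits in the same folder as this script

sample := "Infinitely valued, loved, and treasured.`nSince you know love that nothing can estrange."

; Three characters of context on each side of love.?
grep(sample, "i)...love.?...", matches, 1, 0, "|")

MsgBox, % matches
```

If you want a lot of context, a quantifier such as .{20} is standard RegEx shorthand for writing twenty dots in a row.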
What I eventually realised I wanted was all the text before the string back to the start of its line, and all the text after the string up to the end of its line. To do this I used grep with the RegEx "im)^.*?love.?.*?$" and this got me "Infinitely valued, loved, and treasured." and "Since you know love that nothing can estrange.". Here is how it breaks down:
^ means the start of a line (the position just after a new line, a.k.a. what you get when you press Enter in Microsoft Word). Note that ^ and $ only work line by line when the m (multiline) option is set, which is why the options are "im)" rather than just "i)". Without m they only match at the very start and the very end of the whole text.
.*? means any run of characters, as few as possible. A dot does not match the new line sequence, so it can never run past the current line. Putting love.? after it (so ^.*?love.?) means: start at the beginning of a line and take everything up to and including love.?. Lines with no love in them simply fail to match, so each line containing love.? becomes its own match instead of one big match covering both.
$ means the end of a line (just before a new line), so the final .*?$ takes everything after love.? up to the end of that same line.
One caveat: by default AutoHotkey treats `r`n as the new line. If your text uses a bare `n instead, add the `n option as well, i.e. "im`n)^.*?love.?.*?$".
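A minimal sketch of the whole-line version (AutoHotkey v1). It assumes grep.ahk is next to the script, and it uses the m (multiline) and `n (linefeed) options because the lines of a continuation section are joined with a bare `n. The three-line excerpt and the "|" delimiter are my own choices.

```autohotkey
#Include grep.ahk  ; assumption: grep.ahk sits in the same folder as this script

; The lines of a continuation section are joined with `n.
poem =
(
Infinitely valued, loved, and treasured.
Fierce winds may rip away at autumn leaves,
Since you know love that nothing can estrange.
)

; m   : ^ and $ match at the start and end of each line
; `n  : treat a bare linefeed as the new line
grep(poem, "im`n)^.*?love.?.*?$", lines, 1, 0, "|")

MsgBox, % lines
```

The middle line has no "love" in it, so it is skipped and only the first and last lines come back.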
I appreciate this is a very crude explanation of RegEx and grep, and that a basic understanding of RegEx is probably required for any of this to make sense. I just found the answer and wanted to briefly give some feedback during my lunch break.
RegEx quick reference guide can be found here
https://www.autohotkey.com/docs/misc/RegEx-QuickRef.htm
And a very useful tutorial on RegEx can be found here
https://www.autohotkey.com/boards/viewtopic.php?t=28031
Thanks again to everyone who helped me with this!