 |
AutoHotkey Community Let's help each other out
quicktest
Joined: 30 Jul 2004 Posts: 42
|
Posted: Thu Feb 08, 2007 8:39 pm Post subject: RegEx results contain unwanted characters |
|
|
I have been trying to learn regular expressions today, and thought I was beginning to understand them when I ran into this problem. Any help would be appreciated.
My task is to grab the value on the right-hand side of each ID line:
| Quote: | ID 1 : ABC
...
ID 21 = DEF
...
ID 307 - GHI
.... |
The results I want (in separate iterations) are ABC, DEF, and GHI.
What I am doing right now is reading each ID line in a loop, then running the following code on each line:
| Code: | RegExp = (?::|-|=)\s(.*)\R
RegExMatch(Line, RegExp, Result) |
But even though I specified ?: for the first set of (), I am still getting the separator in Result (for the first line, something like ": ABC").
Not only are the : = - included, a space is included as well, and I believe the newline characters too. I guess I am not too clear on what qualifies as a subpattern to be returned and what does not.
2nd question: is there a way for RegExMatch to return the RIGHT-most position of the match instead of the left-most? At least that way I can obtain my values via crude calculations and string functions. |
|
Titan
Joined: 11 Aug 2004 Posts: 5068 Location: imaginationland
|
Posted: Thu Feb 08, 2007 9:13 pm Post subject: |
|
|
Try %Result1%. You might not even need a loop if ids := RegExReplace(text, "ID.*?[:=\-]\s*|`n(?!ID).*?(?=$|`n)") works. _________________
RegExReplace("irc.freenode.net/ahk", "^(?=(.(?=[\0-r\[]*((?<=\.).))))(?:[c-\x73]{2,8}(\S))+((2)|\b[^\2-]){2}\D++$", "$u3$1$3$4$2") |
|
quicktest
Joined: 30 Jul 2004 Posts: 42
|
Posted: Thu Feb 08, 2007 10:47 pm Post subject: |
|
|
Thanks much, Titan; it seems the result I wanted was stored in Result1. Could you explain a bit why this is? I understand RegExMatch can store multiple values in an expandable array, but I only had one value in the line, and I do not understand why my RegExp would require more than one result to be created.
The ids expression you proposed appears to store the entire ID list in ids, although with all : and = and - removed. I will have to study it a little to understand why this is.
Thanks again for your help. |
|
Titan
Joined: 11 Aug 2004 Posts: 5068 Location: imaginationland
|
Posted: Thu Feb 08, 2007 11:56 pm Post subject: |
|
|
| quicktest wrote: | | 2nd question is if there is a way for RegExMatch to return the RIGHT-most position of the match instead of the left-most? At least that way I can obtain my values via crude calculations and string functions. |
You need a lazy match, e.g. .*?
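For the "crude calculations" route in the quoted question, one possible approach is to add the length of the overall match to the start position that RegExMatch returns. The following is only a minimal sketch in AutoHotkey v1 syntax; the sample line and the variable names (FoundPos, EndPos, Value) are assumptions made for illustration and are not taken from anyone's post.

```autohotkey
; Hypothetical sample line, copied from the data in the question.
Line := "ID 307 - GHI"

; RegExMatch returns the 1-based position where the overall match starts
; (0 if there is no match); the overall match itself is stored in Match.
FoundPos := RegExMatch(Line, "[:=-]\s", Match)

if (FoundPos)
{
    ; Position of the last character of the match.
    EndPos := FoundPos + StrLen(Match) - 1

    ; Everything after the match, taken with a plain string function.
    Value := SubStr(Line, EndPos + 1)
    MsgBox % "Start: " FoundPos "`nEnd: " EndPos "`nValue: " Value
}
```

This only locates the first separator on the line, so it shares the "one : - = per line" limitation of the patterns discussed in this thread.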
The result is stored in Result1 because:
| RegExMatch() - UnquotedOutputVar wrote: | | If any capturing subpatterns are present inside NeedleRegEx, their matches are stored in an array whose base name is OutputVar. For example, if the variable's name is Match, the substring that matches the first subpattern would be stored in Match1, the second would be stored in Match2, and so on. |
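The quoted rule can be illustrated with a short sketch (AutoHotkey v1 syntax, as used elsewhere in this thread). The pattern below is a hypothetical variation written for this illustration, not the one from the question: it has two capturing groups and one non-capturing group, so only Result1 and Result2 are filled in addition to the overall match in Result.

```autohotkey
; Sample line from the question.
Line := "ID 21 = DEF"

; (\d+) is capturing group 1 and (.*) is capturing group 2.
; (?::|-|=) groups the alternatives without capturing, so it gets no number.
RegExMatch(Line, "ID (\d+) (?::|-|=)\s(.*)", Result)

; Result  holds the overall match:  ID 21 = DEF
; Result1 holds the first group:    21
; Result2 holds the second group:   DEF
MsgBox % "Result:  " Result "`nResult1: " Result1 "`nResult2: " Result2
```

Groups are numbered by the order of their opening parentheses, so a pattern with a single capturing group only ever fills Result1.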
| quicktest wrote: | | The ids expression you proposed appears to store the entire ID list in ids, although with all : and = and - removed. I will have to study it a little to understand why this is. |
Didn't you want that? If you remove the [:=\-]\s* part they should be left in. |
|
PhiLho
Joined: 27 Dec 2005 Posts: 6721 Location: France (near Paris)
|
Posted: Fri Feb 09, 2007 10:57 am Post subject: |
|
|
I don't understand your \R.
If that's meant to be the carriage return character, you must write it \r.
A simple way to get what you want, if your IDs have no spaces inside, is just:
| Code: | RegExp = \s(\S+)$
RegExMatch(Line, RegExp, Result)
| Badly written hasty test:
| Code: | Line1 = ID 1 : ABC
Line2 = ID 21 = DEF
Line3 = ID 307 - GHI
RegExp = \s(\S+)$
RegExMatch(Line1, RegExp, Result1)
RegExMatch(Line2, RegExp, Result2)
RegExMatch(Line3, RegExp, Result3)
MsgBox %Result11% %Result21% %Result31%
|
_________________
vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2") |
|
YMP
Joined: 23 Dec 2006 Posts: 265 Location: Russia
|
Posted: Fri Feb 09, 2007 12:12 pm Post subject: |
|
|
Regular Expressions (RegEx) - Quick Reference:
| Quote: |
In v1.0.46.06+, \R means "any single newline of any type" (namely `r, `n, or `r`n).
|
|
|
PhiLho
Joined: 27 Dec 2005 Posts: 6721 Location: France (near Paris)
|
Posted: Fri Feb 09, 2007 2:48 pm Post subject: |
|
|
OK, I forgot it, and I just quickly skimmed the left side of the reference... And I didn't find it in http://mushclient.com/pcre/pcrepattern.html, which is probably obsolete now... The info is indeed in http://www.pcre.org/pcre.txt
Thanks for the reminder. |
|
quicktest
Joined: 30 Jul 2004 Posts: 42
|
Posted: Fri Feb 09, 2007 3:21 pm Post subject: |
|
|
Titan: Sorry, I must be going blind. Could you point me to where the lazy match .*? documentation is? Or was it .* that you meant? I don't seem to see how to make it return the right-most position though...
I took the subpattern explanation as meaning if I were processing ID1 and ID2 at the same time, then I would get the value for ID1 in Result1 and ID2 in Result2. Does the explanation instead mean that each () pair will cause a subpattern to be generated? How would I know if my results are not in Result2 or Result3?
Sorry, I made myself unclear; I wanted to obtain values like
Although your regexp seems to give me the value
by removing the : sign. I didn't check the rest of the array though; I will give that another try later today.
PhiLho: Your (much) simpler solution is working well, thank you. It would seem I have a long way to go before fully understanding regular expressions. What if the values contain spaces, or are sometimes blank? I am trying to make this regexp general purpose, so it can be used to obtain values other than IDs. The criterion I am thinking of is to grab any value after the : - or = sign, all the way to the end of the line. This is why I came up with (?::|-|=)\s(.*)\R. Would there be a better way to achieve this?
Thanks much to all who helped this newbie. |
|
PhiLho
Joined: 27 Dec 2005 Posts: 6721 Location: France (near Paris)
|
Posted: Fri Feb 09, 2007 4:12 pm Post subject: |
|
|
If you replace the \R with $, which is more traditional and portable, your expression is OK (as long as there is only one : - = in the line).
Another way:
| Code: | | RegExp = ^ID \d+ . (.*)$ |
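The difference between ending the pattern with \R and with $ can be seen with a line that carries no trailing newline. This is only a sketch in AutoHotkey v1 syntax; the variable names (PosR, PosD, R, D) are made up for the illustration.

```autohotkey
; A line as it might arrive from a parsing loop: no newline at the end.
Line := "ID 21 = DEF"

; \R requires an actual newline after the value, so this finds no match
; here and RegExMatch returns 0.
PosR := RegExMatch(Line, "(?::|-|=)\s(.*)\R", R)

; $ matches at the end of the string, so the value is captured in D1.
PosD := RegExMatch(Line, "(?::|-|=)\s(.*)$", D)

MsgBox % "With \R: pos=" PosR " value=[" R1 "]`nWith $:  pos=" PosD " value=[" D1 "]"
```

If the lines do keep their newline characters, the \R version matches as well, which may be why the original expression produced results at all.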
|
JSLover
Joined: 20 Dec 2004 Posts: 542 Location: LooseChange911.com... the WTC attacks were done by the US Gov't... the official story is a lie...
|
Posted: Sun Feb 11, 2007 2:19 pm Post subject: |
|
|
| quicktest wrote: | | Could you explain a bit on why this is? |
First, I want to mention that using Result1/match1 is the correct way to solve this; the following code is only to demonstrate why a non-capturing subpattern (?:, question-colon) still appears to be "capturing":
- The overall match ("match") contains every part of the string that matched at all, including non-capturing subpatterns, because ?: only means the group doesn't get its own "capture number" or "capture slot"; it is still included in the overall match.
- Zero-length assertions aren't in the overall match.
| Code: | data=
(LTrim
ID 1 : ABC
ID 21 = DEF
ID 307 - GHI
)
;//normal regex, result in match1, not overall match
;//regex=(?::|-|=)\s(.*)
;//Zero-length look behind assertion, result in match1 *AND* overall match
regex=(?<=(?::|-|=)\s)(.*)
Loop, Parse, data, `n
{
line:=A_LoopField
RegExMatch(line, regex, match)
msgbox,
(LTrim
line(%line%)
match(%match%)
match1(%match1%)
match2(%match2%)
match3(%match3%)
)
} |
|