AutoHotkey Community Let's help each other out
dash
Joined: 29 Nov 2006 Posts: 53
|
Posted: Sat Dec 08, 2007 4:34 am Post subject: |
|
|
Thanks engunneer,
but I wanted to do it with regex, since it's shorter (currently I'm using my own non-regex version as well ^_^').
I just don't get why the multiline option m) for RegExMatch isn't working for all lines that match the search criteria...
The m) option seems to work only up to the first match in the multiline list, then it just stops...
Was it intended to work like that? If so, are there any additional options that would make m) work for multiple matches? |
|
engunneer
Joined: 30 Aug 2005 Posts: 6772 Location: Pacific Northwest, US
|
Posted: Sat Dec 08, 2007 6:10 am Post subject: |
|
|
I can't find it right now, but someone (maybe Titan) had a good explanation of the confusion. I am fairly sure m) lets it search in multiline haystacks, but it still only returns one result. I am certain that you must loop it somehow to get all the results.
I know it is here somewhere, but I can't find the right search terms.
_________________
Unless otherwise noted, all code is untested.
Common Answers: 1.(Loops, Viruses, etc.) 2. Search 3.RTFM |
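For what it's worth, the "loop it somehow" idea can be sketched like this in AutoHotkey v1 syntax. This is only an illustration, not Titan's grep function: the sample text and the variable names (Haystack, Matches, Pos) are made up, and the `n option is used on the assumption that the haystack comes from a continuation section, which joins lines with a lone linefeed.

```autohotkey
; Made-up sample data; only the lines starting with * should be collected.
Haystack =
(
*Ham
Eggs
*Jam
*Spam
)

Matches := ""
Pos := 1
Loop
{
    ; The 4th parameter (StartingPos) resumes the search after the previous match.
    ; m = multiline anchors, `n = treat a lone linefeed as the newline.
    Pos := RegExMatch(Haystack, "m`n)^\*(\w+)", Match, Pos)
    if (Pos = 0)
        break                    ; no further matches
    Matches .= Match1 "`n"       ; Match1 holds the first subpattern
    Pos += StrLen(Match)         ; step past the text just matched
}
MsgBox % Matches
```

RegExMatch reports one match per call, so the loop feeds the position just past each match back in as the next starting position until the function returns 0.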
|
Titan
Joined: 11 Aug 2004 Posts: 5068 Location: imaginationland
|
Posted: Sat Dec 08, 2007 11:21 am Post subject: |
|
|
| dash wrote: | But that code above only "catches" the first found line that matches the regex pattern... and i have no idea why not the rest. |
Try grep, e.g.:
| Code: |
grep(code, "\b(?:http://)?[\w.:]+/[\w\%\.]+\.(?:jpe?g|gif)", var)
MsgBox, %var%
Loop, Parse, var,
MsgBox, URI #%A_Index% is: %A_LoopField%
|
RegExMatch only returns the first match, so you need to use a function like mine or rewrite your expression to use nested subpatterns with greedy quantifiers.
_________________
RegExReplace("irc.freenode.net/ahk", "^(?=(.(?=[\0-r\[]*((?<=\.).))))(?:[c-\x73]{2,8}(\S))+((2)|\b[^\2-]){2}\D++$", "$u3$1$3$4$2") |
|
dash
Joined: 29 Nov 2006 Posts: 53
|
Posted: Sat Dec 08, 2007 5:07 pm Post subject: |
|
|
Thank you both, engunneer & Titan!
I'm definitely gonna need your grep function for my other parsing scripts.
Thanks again! |
|
Predated
Joined: 06 Nov 2007 Posts: 42
|
Posted: Mon Dec 10, 2007 12:06 am Post subject: |
|
|
I have some data (folder names) which I need to convert to something a little more user friendly. I have something that I hacked together right now that 'works' but it's not flexible at all. First the data I need to clean:
| Code: |
;All of these can assume a-z,A-Z,0-9,-,_,*,. will be in the words
;Case 1
;name@server
jpbruckl@itcsc-home ;should return jpbruckl
the_rahven@hotmail.com ;should return the_rahven
;Case 2
;service_name@conference.server
;@conference.server can be different values, like .hotmail.com or
;.itcsc-home
whiteboard@conference.itcsc-home ;should return whiteboard (itcsc-home)
group_chat@some-service.someISP6.net ;should return group_chat (ISP6.net)
;Case 3
;service_name@conference.server%2fNick%20XXXX
;this one really threw me for a loop, %2f is / and %20 is a space (which I want shown as -)
;these are URI encoded values
;the XXXX is there to indicate numbers or letters, etc. Any symbols
;in the username are URI encoded
whiteboard@conference.itcsc%2fEric%202212 ;should return Eric-2212 (Whiteboard)
|
I ~think~ that's it. Piece of cake right?
What I currently have (inflexible) is as follows:
| Code: |
Nick := RegExReplace(A_LoopFileName, "@itcsc","")
Nick := RegExReplace(Nick, "whiteboard@","")
Nick := RegexReplace(Nick, "`%20","-")
Nick := RegExReplace(Nick, "conference.itcsc`%2f", "Conference - ")
|
Note that the above code doesn't return what I asked for it to return in my case examples. This was proof of concept, and I'm moving forward from there. This is the last 'big hurdle' I have to overcome.
I understand that matches can be placed in variables, which is pretty much what I want, since I'll be sending these values to a TreeView. I have Titan's grep function available to use, if that makes it any easier.
I know I'm long-winded, but I wanted to give as much information as I can. |
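A rough sketch of how Case 3 alone might be pulled apart with subpatterns, in AutoHotkey v1 syntax. It is not a full solution: Cases 1 and 2 are not handled, because the examples do not say how to tell a conference host from an ordinary server. The variable names (FileName, Part, Nick, Service) are made up for illustration.

```autohotkey
FileName := "whiteboard@conference.itcsc%2fEric%202212"   ; sample from Case 3

; Subpattern 1 = service name before the @, subpattern 2 = everything after %2f.
; i) lets %2f also match %2F.
if (RegExMatch(FileName, "i)^([^@]+)@[^%]+%2f(.+)$", Part))
{
    Nick := RegExReplace(Part2, "%20", "-")   ; show the encoded space as a hyphen
    StringUpper, Service, Part1, T            ; title case: whiteboard -> Whiteboard
    MsgBox % Nick " (" Service ")"
}
else
    MsgBox % "Not a Case 3 name: " FileName
```

The captured pieces land in Part1 and Part2, so they can be rearranged freely before being sent to the TreeView.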
|
ManaUser
Joined: 24 May 2007 Posts: 901
|
Posted: Mon Dec 10, 2007 8:49 am Post subject: |
|
|
I've had trouble with the m) option too. I sort of think it's broken, but before I report a bug, let me run this by you guys.
| Code: | TempText =
(
*Ham
*Jam
*Spam
)
TempText := RegExReplace(TempText, "m)^\*", "[*]")
MsgBox %TempText% |
The intended result is:
[*]Ham
[*]Jam
[*]Spam
but I get:
[*]Ham
*Jam
*Spam |
|
DerRaphael
Joined: 23 Nov 2007 Posts: 456 Location: Heidelberg, Germany
|
Posted: Mon Dec 10, 2007 9:17 am Post subject: |
|
|
| Code: |
TempText =
(
*Ham
*Jam
*Spam
)
TempText := RegExReplace(TempText, "m)\*", "[*]") ; omitting the ^ gives the wanted result
MsgBox %TempText%
|
on the other hand:
| Code: |
TempText = *Ham`r`n*Jam`r`n*Spam`r`n
MsgBox % RegExReplace(TempText, "m)^\*","[*]")
|
That one works too, but this one will not:
| Code: |
TempText = *Ham`n*Jam`n*Spam`n
MsgBox % RegExReplace(TempText, "m)^\*","[*]")
|
Editing the continuation section like this works:
| Code: |
TempText =
( Join`r`n
*Ham
*Jam
*Spam
)
TempText := RegExReplace(TempText, "m)^\*", "[*]")
MsgBox %TempText%
|
greets
derRaphael |
|
Ian
Joined: 15 Jul 2007 Posts: 1157 Location: Enterprise, Alabama
|
Posted: Mon Dec 10, 2007 9:25 am Post subject: |
|
|
Only the last 3 will work for the intended results, because:
By removing the "^", you tell the regex to no longer check only the first char on each line, but all chars, and to replace every star with [*].
| Code: |
; Note the added star after Ham.
TempText =
(
*Ham*
*Jam
*Spam
)
TempText := RegExReplace(TempText, "m)\*", "[*]") ; omitting the ^ also replaces the trailing star
MsgBox %TempText%
|
_________________ ScriptPad/~dieom/dieom/izwian2k7/Trikster/God
 |
|
DerRaphael
Joined: 23 Nov 2007 Posts: 456 Location: Heidelberg, Germany
|
Posted: Mon Dec 10, 2007 9:29 am Post subject: |
|
|
From the AutoHotkey help on RegExMatch, under Options (near the bottom):
| Quote: | `n Switches from the default newline character (`r`n) to a solitary linefeed (`n), which is the standard on UNIX systems. The chosen newline character affects the behavior of anchors (^ and $) and the dot/period pattern.
`r Switches from the default newline character (`r`n) to a solitary carriage return (`r).
`a Recognizes any type of newline, namely `r, `n, or `r`n. [requires v1.0.46.06+]
|
In conclusion:
| Code: |
TempText =
(
*Ham
*Jam
*Spam
)
TempText := RegExReplace(TempText, "m`a)^\*", "[*]")
MsgBox %TempText%
|
works as wanted.
greets
derRaphael |
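To see which newline a haystack actually contains before picking an option, a quick check along these lines may help. It is a sketch in AutoHotkey v1 syntax; HasCR, Result and Count are made-up variable names, and the `n option is chosen on the assumption that the text comes from a plain continuation section.

```autohotkey
TempText =
(
*Ham
*Jam
*Spam
)
; A continuation section joins its lines with `n unless a Join option says otherwise,
; so no carriage return is expected here.
HasCR := InStr(TempText, "`r") ? "yes" : "no"

; m`n) = multiline anchors with a lone linefeed as the newline.
; The 4th parameter (OutputVarCount) receives the number of replacements made.
Result := RegExReplace(TempText, "m`n)^\*", "[*]", Count)
MsgBox % "Contains CR: " HasCR "`nReplacements: " Count "`n`n" Result
```

With the newline option matching the data, all three leading stars should be replaced. The `a option from the quote above covers the case where the newline type is not known in advance.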
|
biotech as guest Guest
|
Posted: Mon Dec 10, 2007 9:39 am Post subject: |
|
|
I am trying to come up with an expression that will detect Roman numerals in a text file.
For example:
I.
II.
III.
...
and the same without a period at the end.
The Roman numerals are always at the beginning of the text.
Can somebody help me?
Thanks. |
|
DerRaphael
Joined: 23 Nov 2007 Posts: 456 Location: Heidelberg, Germany
|
Posted: Mon Dec 10, 2007 10:18 am Post subject: |
|
|
| Code: |
text =
( Join`r`n
I. my text containing some words including some numbers such as roman
II. this is still my text
mumbo jumboII
III. Fin
sugar
IV. real fin
(c) MMVII.
salt
)
re := "S)((M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3}))?\.)"
loop, Parse, text, `n, `r
{
Position := RegExMatch(A_LoopField, re)
if !(Position)
MsgBox % A_LoopField "`r`nLine " A_Index ": has no Roman number"
else
MsgBox % A_LoopField "`r`nLine " A_Index ":" Position
}
|
The RegEx matches a Roman numeral followed by a period, and the MsgBox shows the position of the match.
This expression has been found here
greets
DerRaphael
Edit: to check whether a Roman numeral is in the text, all you have to do is check whether Position is different from 0. In that case an occurrence was found. |
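One caveat with the pattern above: the numeral group is optional, so a bare period anywhere in a line also counts as a match, and nothing ties the numeral to the start of the line. A possible variation, sketched in AutoHotkey v1 syntax, anchors the numeral to the line start and makes the period optional. The sample text and the variable names (found, numeral) are made up for illustration.

```autohotkey
text =
( Join`r`n
I. my text
II. this is still my text
mumbo jumboII
IV real fin
copyright MMVII.
salt
)
; ^ anchors to the start of the line; (?=[MDCLXVI]) rejects an empty numeral.
; \.? makes the period optional; (?=\s|$) requires the numeral to end there.
re := "^(?=[MDCLXVI])M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\.?(?=\s|$)"

found := ""
Loop, Parse, text, `n, `r
{
    ; Each line is tested on its own, so no m) option is needed.
    if (RegExMatch(A_LoopField, re, numeral))
        found .= "Line " A_Index ": " numeral "`n"
}
MsgBox % found
```

Only lines 1, 2 and 4 of the sample should be reported. Note that an ordinary word that happens to be a valid numeral at the start of a line, such as "I" or "MIX", would still match.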
|
Ian
Joined: 15 Jul 2007 Posts: 1157 Location: Enterprise, Alabama
|
Posted: Mon Dec 10, 2007 4:26 pm Post subject: |
|
|
| Quote: | | 1) Circumflex (^) matches immediately after all internal newlines -- as well as at the start of haystack where it always matches (but it does not match after a newline at the very end of haystack). |
_________________ ScriptPad/~dieom/dieom/izwian2k7/Trikster/God
 |
|
ManaUser
Joined: 24 May 2007 Posts: 901
|
Posted: Mon Dec 10, 2007 5:41 pm Post subject: |
|
|
| DerRaphael wrote: | autohotkeyhelp on regexmatch - options (quite at the bottom)
| Quote: | `n Switches from the default newline character (`r`n) to a solitary linefeed (`n), which is the standard on UNIX systems. The chosen newline character affects the behavior of anchors (^ and $) and the dot/period pattern.
`r Switches from the default newline character (`r`n) to a solitary carriage return (`r).
`a Recognizes any type of newline, namely `r, `n, or `r`n. [requires v1.0.46.06+]
|
|
Ah-ha! I missed that part. Thanks. |
|
Predated
Joined: 06 Nov 2007 Posts: 42
|
Posted: Tue Dec 11, 2007 5:58 pm Post subject: |
|
|
I think my question got lost in the shuffle somewhere...
I have this regex: i)([A-Z0-9._%+-]+)@([A-Z0-9.-]+)
Which, according to the RegEx library, will match jpbruckl@itcsc-home. When I use this code in an AHK script, I get nothing. Well, specifically I get [] in the MsgBox, which tells me that it's not returning t, which in turn tells me that my regex is wrong somehow.
Corrected Code
| Code: | t = jpbruckl@itcsc-home
s = `n ; separator character between matches, like "|" or ","
p = ([a-zA-Z0-9._`%+-]+)@([a-zA-Z0-9.-]+) ; pattern to search for
t := RegExReplace(t, "(.*?)((" . p . ")|$)", "$2" . s)
StringTrimRight t, t, SubStr(t,-2,1) = s ? 3 : 2
MsgBox [%t%] |
The i) was throwing it off. I assume that I could also have modified the t := line to get the i) in there, but I found the pattern I needed regardless. |
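The likely reason the i) caused trouble: AutoHotkey only recognizes options at the very start of the needle, so an i) stored inside p ends up in the middle of the combined pattern and is read as literal pattern text. A small sketch in AutoHotkey v1 syntax of keeping the case-insensitive form by prepending the option to the whole needle; Part is a made-up output variable name.

```autohotkey
t := "jpbruckl@itcsc-home"
p := "([A-Z0-9._%+-]+)@([A-Z0-9.-]+)"   ; pattern without any options

; Options go in front of the complete needle, not inside the piece
; that gets concatenated into a larger pattern.
if (RegExMatch(t, "i)" . p, Part))
    MsgBox % "[" Part "]`nname: " Part1 "`nserver: " Part2
else
    MsgBox No match
```

The same idea applies to the t := line: the i) would go at the front of the string that starts with (.*?), with p left free of options.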
|
Guest
|
Posted: Mon Feb 11, 2008 3:45 am Post subject: |
|
|
When searching the forums for regexreplace I get every thread PhilHo has ever posted to, since the word regexreplace is part of his sig. Is there some way of excluding personal sigs from forum search, so this doesn't happen? Phil is invaluable as an instructor here, but it's screwing up my search. |
|