AutoHotkey Homepage AutoHotkey Community
Let's help each other out
 

Put here requests of problems with regular expressions
dash



Joined: 29 Nov 2006
Posts: 53

Posted: Sat Dec 08, 2007 4:34 am

Thanks engunneer,
but I wanted to do it with regex, since it's shorter (currently I'm using my own non-regex version as well ^_^').

I just don't get why the multi-line option m) for RegExMatch isn't working for all lines that match the search criteria...
The m) option seems to work only until the first match in the multi-line list, then it just stops...
Was it intended to work like that? If so, are there any additional options that would make m) work for multiple matches?
engunneer



Joined: 30 Aug 2005
Posts: 6772
Location: Pacific Northwest, US

Posted: Sat Dec 08, 2007 6:10 am

I can't find it right now, but someone (maybe Titan) had a good explanation of the confusion. I am fairly sure m) lets it search in multi-line haystacks, but it still only returns one result. I am certain that you must loop it somehow to get all the results.

I know it is here somewhere, but can't find the right terms.
_________________
Unless otherwise noted, all code is untested.
Common Answers: 1.(Loops, Viruses, etc.) 2. Search 3.RTFM
Titan



Joined: 11 Aug 2004
Posts: 5068
Location: imaginationland

Posted: Sat Dec 08, 2007 11:21 am

dash wrote:
But that code above only "catches" the first found line that matches the regex pattern... and i have no idea why not the rest.
Try grep, e.g.:

Code:
grep(code, "\b(?:http://)?[\w.:]+/[\w\%\.]+\.(?:jpe?g|gif)", var)
MsgBox, %var%

Loop, Parse, var, 
   MsgBox, URI #%A_Index% is: %A_LoopField%


RegExMatch only returns the first match, so you need to use a function like mine or rewrite your expression to use nested subpatterns with greedy quantifiers.
_________________

RegExReplace("irc.freenode.net/ahk", "^(?=(.(?=[\0-r\[]*((?<=\.).))))(?:[c-\x73]{2,8}(\S))+((2)|\b[^\2-]){2}\D++$", "$u3$1$3$4$2")
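
For reference, a minimal sketch of the looping approach engunneer and Titan describe: call RegExMatch repeatedly, each time starting just past the previous match. MatchAll is a made-up helper name for illustration (it is not Titan's grep and not a built-in), and the sketch assumes AutoHotkey v1 syntax.

```autohotkey
; MatchAll is a hypothetical helper, not part of AutoHotkey or of Titan's grep.
; It returns every match of needle in haystack, joined by sep.
MatchAll(haystack, needle, sep = "`n")
{
    out := ""
    pos := 1
    Loop
    {
        ; Search again, starting at the current position.
        pos := RegExMatch(haystack, needle, m, pos)
        ; Stop when nothing is found, or on an empty match (avoids an endless loop).
        if (!pos || m = "")
            break
        out .= (out = "" ? "" : sep) . m
        ; Continue right after the text that was just matched.
        pos += StrLen(m)
    }
    return out
}
```

For example, MatchAll("a1b22c333", "\d+", ",") gives 1,22,333.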
dash



Joined: 29 Nov 2006
Posts: 53

Posted: Sat Dec 08, 2007 5:07 pm

Thank you both, engunneer & Titan!
I'm definitely going to need your grep function for my other parsing scripts.
Thanks again!
Predated



Joined: 06 Nov 2007
Posts: 42

Posted: Mon Dec 10, 2007 12:06 am

I have some data (folder names) which I need to convert to something a little more user friendly. I have something that I hacked together right now that 'works' but it's not flexible at all. First the data I need to clean:

Code:

;All of these can assume a-z,A-Z,0-9,-,_,*,. will be in the words
;Case 1
;name@server 
jpbruckl@itcsc-home ;should return jpbruckl
the_rahven@hotmail.com ;should return the_rahven

;Case 2
;service_name@conference.server
;@conference.server can be different values, like .hotmail.com or
;.itcsc-home
whiteboard@conference.itcsc-home ;should return whiteboard (itcsc-home)
group_chat@some-service.someISP6.net ;should return group_chat (ISP6.net)

;Case 3
;service_name@conference.server%2fNick%20XXXX
;this one really threw me for a loop: %2f is / and %20 is a space (I want it shown as -)
;these are URI encoded values
;the XXXX is there to indicate numbers or letters, etc. Any symbols
;in the username are URI encoded
whiteboard@conference.itcsc%2fEric%202212 ;should return Eric-2212 (Whiteboard)


I ~think~ that's it. Piece of cake right?

What I currently have (inflexible) is as follows:
Code:

Nick := RegExReplace(A_LoopFileName, "@itcsc","")
Nick := RegExReplace(Nick, "whiteboard@","")
Nick := RegexReplace(Nick, "`%20","-")
Nick := RegExReplace(Nick, "conference.itcsc`%2f", "Conference - ")


Note that the above code doesn't return what I asked for it to return in my case examples. This was proof of concept, and I'm moving forward from there. This is the last 'big hurdle' I have to overcome.

I understand that matches can be placed in variables, which is pretty much what I want, since I'll be sending these values to a TreeView. I have Titan's grep function available to use, if that makes it any easier.

I know I'm long-winded, but I wanted to give as much information as I can.
ManaUser



Joined: 24 May 2007
Posts: 901

Posted: Mon Dec 10, 2007 8:49 am

I've had trouble with the m) option too. I sort of think it's broken, but before I report a bug, let me run this by you guys.
Code:
TempText =
(
*Ham
*Jam
*Spam
)
TempText := RegExReplace(TempText, "m)^\*", "[*]")
MsgBox %TempText%

The intended result is:
[*]Ham
[*]Jam
[*]Spam


but I get:
[*]Ham
*Jam
*Spam
DerRaphael



Joined: 23 Nov 2007
Posts: 456
Location: Heidelberg, Germany

Posted: Mon Dec 10, 2007 9:17 am

Code:

TempText =
(
*Ham
*Jam
*Spam
)
TempText := RegExReplace(TempText, "m)\*", "[*]")        ; omitting the ^ gives the wanted result
MsgBox %TempText%


on the other hand:

Code:

TempText = *Ham`r`n*Jam`r`n*Spam`r`n
MsgBox % RegExReplace(TempText, "m)^\*","[*]")


This one works too. The following, however, does not work:

Code:

TempText = *Ham`n*Jam`n*Spam`n
MsgBox % RegExReplace(TempText, "m)^\*","[*]")


Editing the continuation section like this works:

Code:

TempText =
( Join`r`n
*Ham
*Jam
*Spam
)
TempText := RegExReplace(TempText, "m)^\*", "[*]")
MsgBox %TempText%


greets
derRaphael


Ian



Joined: 15 Jul 2007
Posts: 1157
Location: Enterprise, Alabama

Posted: Mon Dec 10, 2007 9:25 am

Only the last 3 will work for the intended results, because:

By removing the "^", you tell the script to no longer check only the first char on each line, but all chars, and to replace every * with [*].

Code:
; Note the added star at the end of the first line.
TempText =
(
*Ham*
*Jam
*Spam
)
TempText := RegExReplace(TempText, "m)\*", "[*]")        ; without the ^, the trailing star is replaced too
MsgBox %TempText%

_________________
ScriptPad/~dieom/dieom/izwian2k7/Trikster/God

DerRaphael



Joined: 23 Nov 2007
Posts: 456
Location: Heidelberg, Germany

Posted: Mon Dec 10, 2007 9:29 am

From the AutoHotkey help on RegExMatch, under Options (near the bottom):
Quote:
`n Switches from the default newline character (`r`n) to a solitary linefeed (`n), which is the standard on UNIX systems. The chosen newline character affects the behavior of anchors (^ and $) and the dot/period pattern.
`r Switches from the default newline character (`r`n) to a solitary carriage return (`r).
`a Recognizes any type of newline, namely `r, `n, or `r`n. [requires v1.0.46.06+]


in conclusion
Code:

TempText =
(
*Ham
*Jam
*Spam
)
TempText := RegExReplace(TempText, "m`a)^\*", "[*]")
MsgBox %TempText%


works as wanted

greets
derRaphael
biotech as guest
Guest





Posted: Mon Dec 10, 2007 9:39 am

I am trying to come up with an expression that will detect Roman numerals in a text file.

for example
I.
II.
III.

...


and the same without the period at the end.
The Roman numerals are always at the beginning of the text.

Can somebody help me?
thanks.
DerRaphael



Joined: 23 Nov 2007
Posts: 456
Location: Heidelberg, Germany

Posted: Mon Dec 10, 2007 10:18 am

Code:

text =
( Join`r`n
I. my text containing some words including some numbers such as roman
II. this is still my text
mumbo jumboII
III. Fin
sugar
IV. real fin
(c) MMVII.
salt
)

re := "S)((M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3}))?\.)"

loop, Parse, text, `n, `r
{

Position := RegExMatch(A_LoopField, re)
if !(Position)
   MsgBox % A_LoopField "`r`nLine " A_Index ": has no Roman number"
else
   MsgBox % A_LoopField "`r`nLine " A_Index ":" Position

}


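The RegEx matches a Roman numeral followed by a period, and the MsgBox shows the position of the match. Note that the numeral part of this pattern is optional, so a bare period matches as well.

This expression has been found here :)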


greets
DerRaphael


Edit: to check whether a Roman numeral is in the text, all you have to do is check whether Position is nonzero. If it is, an occurrence was found.
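
A possible tightening of the pattern above for the original request (numeral at the very start of the text, period optional), offered as a sketch rather than a tested drop-in. LeadingRoman is a hypothetical helper name; the lookahead (?=[MDCLXVI]) is there so that an empty numeral cannot match, and the final (?=\s|$) requires whitespace or end of text after the numeral. It assumes AutoHotkey v1 and upper-case numerals, and it will still treat an English word such as the pronoun I as a numeral.

```autohotkey
; LeadingRoman is a hypothetical helper name used only for this illustration.
; It returns the Roman numeral at the very start of line, or "" if there is none.
LeadingRoman(line)
{
    ; ^               anchor at the start of the text
    ; (?=[MDCLXVI])   at least one numeral letter must follow
    ; \.?             the trailing period is optional
    ; (?=\s|$)        the numeral must end at whitespace or at the end of the text
    re := "^(?=[MDCLXVI])(M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3}))\.?(?=\s|$)"
    return RegExMatch(line, re, m) ? m1 : ""
}
```

With the sample text from the post, LeadingRoman("IV. real fin") gives IV, while LeadingRoman("mumbo jumboII") and LeadingRoman("sugar") give an empty string.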
Ian



Joined: 15 Jul 2007
Posts: 1157
Location: Enterprise, Alabama

Posted: Mon Dec 10, 2007 4:26 pm

Quote:
1) Circumflex (^) matches immediately after all internal newlines -- as well as at the start of haystack where it always matches (but it does not match after a newline at the very end of haystack).

_________________
ScriptPad/~dieom/dieom/izwian2k7/Trikster/God
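
The quoted rule can be seen with a small sketch (AutoHotkey v1 assumed; MarkLineStarts is a hypothetical name). The haystack uses `r`n line endings because that is the default newline for the regex engine, as the help excerpt quoted earlier in the thread says.

```autohotkey
; MarkLineStarts is a hypothetical helper name used only for this illustration.
; With the m) option, ^ matches at the start of the haystack and after each
; internal newline, so "> " is inserted at the start of every line.
MarkLineStarts(text)
{
    return RegExReplace(text, "m)^", "> ")
}
```

MarkLineStarts("a`r`nb`r`n") gives "> a`r`n> b`r`n": both lines are marked, but nothing is added after the newline at the very end of the haystack.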

ManaUser



Joined: 24 May 2007
Posts: 901

Posted: Mon Dec 10, 2007 5:41 pm

DerRaphael wrote:
autohotkeyhelp on regexmatch - options (quite at the bottom)
Quote:
`n Switches from the default newline character (`r`n) to a solitary linefeed (`n), which is the standard on UNIX systems. The chosen newline character affects the behavior of anchors (^ and $) and the dot/period pattern.
`r Switches from the default newline character (`r`n) to a solitary carriage return (`r).
`a Recognizes any type of newline, namely `r, `n, or `r`n. [requires v1.0.46.06+]

Ah-ha! I missed that part. Thanks.
Predated



Joined: 06 Nov 2007
Posts: 42

Posted: Tue Dec 11, 2007 5:58 pm

I think my question got lost in the shuffle somewhere...


I have this regex: i)([A-Z0-9._%+-]+)@([A-Z0-9.-]+)

Which, according to the RegEx library, will match jpbruckl@itcsc-home. When I use this code in an AHK script, I get nothing. Well, specifically I get [] in the MsgBox, which tells me that t is coming back empty, which in turn tells me that my regex is wrong somehow.

Corrected Code
Code:
t = jpbruckl@itcsc-home
s = `n                ; separator character between matches, like "|" or ","
p =  ([a-zA-Z0-9._`%+-]+)@([a-zA-Z0-9.-]+) ; pattern to search for

t := RegExReplace(t, "(.*?)((" . p . ")|$)", "$2" . s)
StringTrimRight t, t, SubStr(t,-2,1) = s ? 3 : 2

MsgBox [%t%]


The i) was throwing it off. I assume that I could also have modified the t := line to get the i) in there, but found the pattern I needed, regardless.
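
A likely explanation, stated with some caution: options such as i) are only recognised at the very start of the complete pattern string. When p begins with i) and is pasted into the middle of "(.*?)((" . p . ")|$)", the i is treated as literal text and its ) closes a group early, which leaves the parentheses unbalanced. A sketch of keeping the option at the front (AutoHotkey v1 assumed; SplitAddress is a hypothetical helper name):

```autohotkey
; SplitAddress is a hypothetical helper name used only for this illustration.
; It splits name@server into its two parts and returns the match position (0 if no match).
SplitAddress(addr, ByRef user, ByRef server)
{
    ; Pattern body only; no options are stored here.
    body := "([A-Z0-9._%+-]+)@([A-Z0-9.-]+)"
    ; The i) option is placed at the very start of the full pattern.
    found := RegExMatch(addr, "i)^" . body . "$", m)
    user := m1
    server := m2
    return found
}
```

Calling SplitAddress("jpbruckl@itcsc-home", user, server) leaves jpbruckl in user and itcsc-home in server.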
Guest






Posted: Mon Feb 11, 2008 3:45 am

When searching the forums for regexreplace I get every thread PhilHo has ever posted to, since the word regexreplace is part of his sig. Is there some way of excluding personal sigs from forum search, so this doesn't happen? Phil is invaluable as an instructor here, but it's screwing up my search.
All times are GMT
Page 16 of 19

 