AutoHotkey Community

All times are UTC [ DST ]




Posted: August 30th, 2008, 12:16 pm
Joined: March 11th, 2008, 11:36 pm
Posts: 291
Objective: Challenge and improve your regex skills.

The shortest pattern will be posted for reference in the second post of this thread.
Since this is aimed more at training than at competition, the exercises will proceed step by step, from easier to harder.
Note that only the part shown in red in the following example (the pattern) is counted.

Code:
RegExMatch(Haystack, "Pattern", Output)



Regex Training One: Extract the names and dates of birth of the people born between 1987 and 1992

    Rules:
    1) You must solve it by using Regex.
    2) Loop and If statements are allowed.

    Original Source :
    Code:
    Haystack=
    (
    Horace Martin 1980-04-20
    Lyndsey Wilkerson 1986-02-12
    Clarissa Kuster 1991-05-28
    Tamika Minnie 1973-11-14
    Shania Jerome 1979-08-30
    Rylee Millhouse 1984-04-21
    Orrell Zundel 1988-01-24
    Daniel Kim 1973-10-10
    Hatty Franks 1987-07-24
    Pene Woodworth 1990-02-12
    )


    Expected Output :
    Code:
    Clarissa Kuster 1991-05-28
    Orrell Zundel 1988-01-24
    Hatty Franks 1987-07-24
    Pene Woodworth 1990-02-12

Regex Training Two: Extract the Google logo image URL from the HTML source

    Rules:
    1) You must solve it by using Regex.
    2) Start from the following code.
    3) Matching the literal expected output, i.e. RegExMatch(Haystack, "Literal Expected Output"), does not count.

    Original Source :
    Code:
    URLDownloadToFile, http://www.google.com/ncr, %A_Temp%\g_index.htm
    FileRead, Haystack, %A_Temp%\g_index.htm


    Expected Output :
    Code:
    http://www.google.com/intl/en_ALL/images/logo.gif


Let me know if you have any suggestions or ideas for upcoming challenges.
Hope you guys have fun with it.

_________________
Easy WinAPI - Dive into Windows API World
Benchmark your AutoHotkey skills at PlayAHK.com


Last edited by heresy on August 31st, 2008, 1:13 pm, edited 5 times in total.

Posted: August 30th, 2008, 12:17 pm
Joined: March 11th, 2008, 11:36 pm
Posts: 291
reserved

Posted: August 30th, 2008, 12:39 pm
Joined: November 8th, 2004, 12:46 am
Posts: 1271
Code:
Haystack=
(
Horace Martin 1980-04-20
Lyndsey Wilkerson 1986-02-12
Clarissa Kuster 1991-05-28
Tamika Minnie 1973-11-14
Shania Jerome 1979-08-30
Rylee Millhouse 1984-04-21
Orrell Zundel 1988-01-24
Daniel Kim 1973-10-10
Hatty Franks 1987-07-24
Pene Woodworth 1990-02-12
)

loop, parse, haystack, `n
{
   year := regexreplace( a_loopfield, "(^\w+ \w+|-\d+-\d+$)" )
   if year between 1987 and 1992
      list := a_loopfield . "`n" . list
}
msgbox % list

_________________
"Anything worth doing is worth doing slowly." - Mae West

Posted: August 30th, 2008, 12:49 pm
Joined: March 11th, 2008, 11:36 pm
Posts: 291
Hi serenity.

Rule 1) You must solve it by using Regex.

If statements are allowed, but you solved it with If ... between rather than with regex.
The year-of-birth validation also needs to be done with regex.

Posted: August 30th, 2008, 12:59 pm
Joined: November 8th, 2004, 12:46 am
Posts: 1271
Code:
loop, parse, haystack, `n
{
   year := regexreplace( a_loopfield, "(^\w+ \w+|-\d+-\d+$)" )
   if regexmatch( year, "(198[7-9]|199[0-2])" )
      list := a_loopfield . "`n" . list
}
msgbox % list


:)

Posted: August 30th, 2008, 1:02 pm
Joined: March 11th, 2008, 11:36 pm
Posts: 291
Yeah, that's a valid attempt, though you've used regex twice, so your count will be 39. Good luck.

Code:
(^\w+ \w+|-\d+-\d+$)
(198[7-9]|199[0-2])
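
For anyone checking the arithmetic, here is a minimal sketch. It assumes the count is simply the combined character length of the patterns used; p1 and p2 are illustrative names only.

```autohotkey
; Tally the score as the total number of characters in the two patterns.
p1 := "(^\w+ \w+|-\d+-\d+$)"   ; 20 characters
p2 := "(198[7-9]|199[0-2])"    ; 19 characters
MsgBox % StrLen(p1) + StrLen(p2)   ; shows 39
```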

Posted: August 30th, 2008, 1:16 pm
Joined: November 8th, 2004, 12:46 am
Posts: 1271
Code:
loop, parse, haystack, `n
{
   if regexmatch( a_loopfield, "(^\w+ \w+ )(198[7-9]|199[0-2])(\-\d+-\d+$)" )
      list := a_loopfield . "`n" . list
}
msgbox % list

Posted: August 30th, 2008, 7:44 pm
Joined: August 11th, 2004, 1:47 am
Posts: 5347
Location: UK
Code:
h =
(
Horace Martin 1980-04-20
Lyndsey Wilkerson 1986-02-12
Clarissa Kuster 1991-05-28
Tamika Minnie 1973-11-14
Shania Jerome 1979-08-30
Rylee Millhouse 1984-04-21
Orrell Zundel 1988-01-24
Daniel Kim 1973-10-10
Hatty Franks 1987-07-24
Pene Woodworth 1990-02-12
)

h := RegExReplace(h, "\D+\b19(?!8[7-9]|9[0-2])[\d-]+")
MsgBox, %h%


Code:
t = %A_Temp%\g
URLDownloadToFile, *0 http://www.google.co.uk/, %t%
FileRead, h, %t%
FileDelete, %t%

RegExMatch(h, "i)<img\b[^>]*\bsrc=(""|')([^\-1]+?)(?-2)", h)
MsgBox, %h2%


Can I ask why you put unnecessary quote tags around your message like Skan often does? On my screen it makes text hard to read, so I only went by the code examples as a guideline.

_________________
GitHub, Scripts, IronAHK. Contact by email, not private message.


Posted: August 31st, 2008, 7:16 am
Joined: March 11th, 2008, 11:36 pm
Posts: 291
Titan wrote:
Can I ask why you put unnecessary quote tags around your message like Skan often does? On my screen it makes text hard to read, so I only went by the code examples as a guideline.


I was trying to improve readability by splitting the questions into quote boxes. I didn't realize it could look like that; I'll reformat it. Anyway, your 2nd regex doesn't match the expected output :P

Posted: August 31st, 2008, 8:30 am
Joined: April 18th, 2008, 7:57 am
Posts: 1390
Location: The Interwebs
#1:
Code:
Haystack=
(
Horace Martin 1980-04-20
Lyndsey Wilkerson 1986-02-12
Clarissa Kuster 1991-05-28
Tamika Minnie 1973-11-14
Shania Jerome 1979-08-30
Rylee Millhouse 1984-04-21
Orrell Zundel 1988-01-24
Daniel Kim 1973-10-10
Hatty Franks 1987-07-24
Pene Woodworth 1990-02-12
)
Loop, Parse, Haystack, `n
  If (RegExMatch(A_LoopField,"19(8[7-9]|9[0-2])"))
    Output .= A_LoopField "`n"
MsgBox % Output

Total of 17.

#2:
Code:
URLDownloadToFile, http://www.google.com/ncr, %A_Temp%\g_index.htm
FileRead, Haystack, %A_Temp%\g_index.htm
RegExMatch(Haystack,"ue=(.+?)/"".+110 src=""(.+?)""",Output)
MsgBox % Output1 Output2

Total of 30.


Posted: August 31st, 2008, 10:30 am
Joined: August 11th, 2004, 1:47 am
Posts: 5347
Location: UK
heresy wrote:
your 2nd regex doesn't match the expected output
I get the correct output. google.com/ncr redirects me to .co.uk so I probably don't have the same source as you.

Posted: August 31st, 2008, 1:09 pm
Joined: March 11th, 2008, 11:36 pm
Posts: 291
@ Krogdor
Hey, you've provoked the regex genius :lol:

@ Titan
I know you're the one who could claim the whole hall of fame for regex stuff,
but I was talking about the host part of the URL: http://www.google.com versus http://www.google.co.uk :P

me myself wrote:
Expected Output :
Code:
http://www.google.com/intl/en_ALL/images/logo.gif


Posted: August 31st, 2008, 1:37 pm
Joined: August 11th, 2004, 1:47 am
Posts: 5347
Location: UK
heresy wrote:
I was talking about the host part of the URL: http://www.google.com versus http://www.google.co.uk
The image URI is relative, so prepending a string to it has nothing to do with regex. You could easily pull "www.google.co.uk" from another part of the HTML source, but there is no guarantee that it is the same host as the resource in question, i.e. my copy could be from a proxy or downloaded directly from an IP address.
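
As a rough illustration of that point, joining a host to the captured relative path is plain string concatenation. The host below is an assumption supplied by the script author, not something the regex or the HTML guarantees, and Src, Host and Url are illustrative names.

```autohotkey
Src  := "/intl/en_ALL/images/logo.gif"   ; relative path, as a regex capture would return it
Host := "http://www.google.com"          ; assumed host; the HTML source does not guarantee it
; Prepend the host only when the path is relative (starts with a slash).
Url := (SubStr(Src, 1, 1) = "/") ? Host . Src : Src
MsgBox % Url   ; shows http://www.google.com/intl/en_ALL/images/logo.gif
```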

Posted: August 31st, 2008, 2:27 pm
Joined: November 8th, 2004, 12:46 am
Posts: 1271
Krogdor wrote:
#2:
Code:
URLDownloadToFile, http://www.google.com/ncr, %A_Temp%\g_index.htm
FileRead, Haystack, %A_Temp%\g_index.htm
RegExMatch(Haystack,"ue=(.+?)/"".+110 src=""(.+?)""",Output)
MsgBox % Output1 Output2

Total of 30.


This returns blank for me. I wonder if it's a locale thing.

For some reason AHK won't let me use (110 src="(.+?)") or (110 src="([\w\D]+?)") in a script. I've run into this before when trying to match a " character in a regex.

Code:
URLDownloadToFile, http://www.google.com/ncr, %A_Temp%\g_index.htm
FileRead, Haystack, %A_Temp%\g_index.htm
RegExMatch( Haystack, "(110 src="([\w\D]+?)")", m ) ; (110 src="(.+?)")
msgbox % "http://www.google.com" . m2

Posted: August 31st, 2008, 8:56 pm
Joined: April 18th, 2008, 7:57 am
Posts: 1390
Location: The Interwebs
Code:
URLDownloadToFile, http://www.google.com/ncr, %A_Temp%\g_index.htm
FileRead, Haystack, %A_Temp%\g_index.htm
RegExMatch( Haystack, "(110 src=""([\w\D]+?)"")", m ) ; (110 src="(.+?)")
msgbox % "http://www.google.com" . m2


You need to put two quotes in a row to escape them inside a quoted string.
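
A minimal sketch of that rule on a made-up sample line. The Sample string is hypothetical, not Google's actual source, and Needle and m are illustrative names.

```autohotkey
; Inside a quoted string in an expression, "" produces one literal " character.
Needle := "src=""(.+?)"""   ; the regex engine receives: src="(.+?)"
Sample := "<img width=276 height=110 src=""/intl/en_ALL/images/logo.gif"">"   ; hypothetical HTML
if (RegExMatch(Sample, Needle, m))
    MsgBox % m1   ; shows /intl/en_ALL/images/logo.gif
```

Writing the needle with single quotes inside, as in "src="(.+?)"", ends the string at the second quote, which is why the script refuses to load.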

