AutoHotkey Homepage AutoHotkey Community

simple problem with RegExMatch() or InStr()

 
peteryy



Joined: 12 Mar 2009
Posts: 16

Posted: Sat Jul 04, 2009 4:24 am    Post subject: simple problem with RegExMatch() or InStr()

I'm new to AutoHotkey. I need to copy every pic ID and its price and save them somewhere else. The file is around 500 KB, so it would take me about half a day to search, copy, and paste manually. I was trying to get the ID number, or to select by price, from the file, but I failed to create a working search pattern. Can anyone please help?
This is the content of the file that I want to search:
Quote:
<meta http-equiv=refresh content=4>
<PRE id="TRANS_DATA">
<TR><TD colspan=7 class='x2'>Race 1<TR><TD colspan=7 class='x3'>unconfirmed orders<TR><a href="javascript:mr(663362,'PIC','AU','SG')"><TD>1<TD class='x5'>4<TD>12<TD>0<TD>4.65<TD>44/0<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664258,'PIC','AU','SG')"><TD>1<TD class='x5'>4<TD>10<TD>10<TD>4.75<TD>51/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664413,'PIC','AU','SG')"><TD>1<TD class='x5'>4<TD>22<TD>22<TD>4.80<TD>42/12<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664616,'PIC','AU','SG')"><TD>1<TD class='x5'>4<TD>17<TD>17<TD>4.90<TD>51/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663995,'PIC','AU','SG')"><TD>1<TD class='x5'>4<TD>0<TD>15<TD>4.90<TD>0/10<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663136,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>30<TD>0<TD>4.50<TD>25/0<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663779,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>0<TD>13<TD>4.75<TD CLASS=RD>0/6<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664412,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>22<TD>22<TD>4.80<TD>44/12<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(665290,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>12<TD>12<TD>4.80<TD CLASS=RD>8/7<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(665051,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>17<TD>17<TD>4.85<TD CLASS=RD>11/6<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664615,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>22<TD>22<TD>4.90<TD>58/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663980,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>0<TD>15<TD>4.90<TD>0/10<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664841,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>26<TD>26<TD>5.00<TD>55/15<TD><font class='del2'>Del</font></TD></a><TR><a 
href="javascript:mr(663580,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>13<TD>13<TD>4.10<TD>56/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(662945,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>10<TD>10<TD>4.10<TD>54/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663123,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>30<TD>0<TD>4.50<TD>27/0<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663554,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>12<TD>0<TD>4.65<TD>44/0<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664171,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>4<TD>4<TD>4.75<TD>52/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663775,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>0<TD>13<TD>4.75<TD CLASS=RD>0/6<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664397,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>16<TD>16<TD>4.80<TD>40/12<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(665235,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>12<TD>12<TD>4.80<TD CLASS=RD>18/7<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(665074,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>14<TD>14<TD>4.85<TD CLASS=RD>8/6<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664620,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>14<TD>14<TD>4.90<TD>59/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663988,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>0<TD>15<TD>4.90<TD>0/10<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664835,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>34<TD>34<TD>5.00<TD>57/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663397,'PIC','AU','SG')"><TD>1<TD class='x5'>7<TD>8<TD>8<TD>4.10<TD>60/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(662942,'PIC','AU','SG')"><TD>1<TD class='x5'>7<TD>12<TD>12<TD>4.10<TD>60/15<TD><font 
class='del2'>Del</font></TD></a>




This is my test code:
Code:
filedelete, test.txt



    Loop,5
   {
   Loop, read, pc.txt, var
   {

      FileRead,var1,pc.txt
      StringReplace, var1, var1, </TD>,`n , All


      start1:=InStr(var1,"java",0,start1)+14
      end1:=InStr(var1,",'EAT",0,start1)
      StringMid,var1,var1,% start1,% end1-start1
      FileAppend, %var1%`n, Test.txt
      ;count += 1
      ;msgbox <%start1%>  <%end1%>
   }
   
   }





And lastly, this is the result that I managed to produce:
Quote:
665043
equiv=refresh content=4>
<PRE id="TRANS_DATA">
<TR><TD colspan=7 class='x2'>Race 1<TR><TD colspan=7 class='x3'>unconfirmed orders<TR><a href="javascript:mr(663362
663362
664258
664413
664616
663995
663136
663779
664412
665290
665051
664615
663980
664841
663580
662945
663123
663554
664171
663775
664397
665235
665074
664620
663988
664835
663397
662942
663181
663457
664229
663771
664420
665141




Now my problem is that my code ends up in an endless loop and accidentally outputs non-ID data.
Can anyone please help me solve this?
I don't want to waste so much time just searching, copying, and pasting.
animeaime



Joined: 04 Nov 2008
Posts: 1045

Posted: Sat Jul 04, 2009 4:45 am

Dumb question, but can you give a "snip" (i.e. example) of a pic ID? By using this example, you can extract the pattern.

For example, I think this is a pic ID, but I want to be sure.
Code:
"javascript:mr(664258,'PIC','AU','SG')"


In the above example, you want the 664258, right?
_________________
As always, if you have any further questions, don't hesitate to ask.

Add OOP to your scripts via the Class Library. Check out my scripts.
animeaime



Joined: 04 Nov 2008
Posts: 1045

Posted: Sat Jul 04, 2009 5:00 am

OK, below is my initial attempt. Please give feedback on what it does right and wrong so I can adjust the code. Of course, you can adjust the code as you see fit. For testing, I copied the contents of the quote block (the massive listing of HTML) to "pc.txt". I then ran the code below, and it created a list of 27 pic IDs in "test.txt".

Code:
ReadFile := "pc.txt"
WriteFile := "test.txt"

;reads the text to match from <ReadFile>
FileRead, text, %ReadFile%
FileDelete, %WriteFile%
start := 1

;finds all occurrences of "javascript:mr(number,'PIC'"
;   and appends the number, followed by a newline (\n) to the specified <WriteFile>
while pos := RegExMatch(text, "javascript:mr\((?<PicID>\d++),'PIC'", match, start)
{
    FileAppend %MatchPicID%`n, %WriteFile%
    start := pos + strLen(match)
}

MsgBox, Done

peteryy



Joined: 12 Mar 2009
Posts: 16

Posted: Sat Jul 04, 2009 5:40 am

Thanks, animeaime!
It works! And thanks for the fast response, it's wonderful.
Sorry for another stupid question, but can you explain this part for me?
(text, "javascript:mr\((?<PicID>\d++),'PIC'", match, start)
And is PicID the same as %MatchPicID%?
I'm quite confused by RegExMatch's Perl-style expressions.
Thanks again.
peteryy



Joined: 12 Mar 2009
Posts: 16

Posted: Sat Jul 04, 2009 5:45 am

What if I want to find only the pic ID numbers whose price is 4.90?
animeaime



Joined: 04 Nov 2008
Posts: 1045

Posted: Sat Jul 04, 2009 6:13 am

So, it does the job? Good.

OK, here's the breakdown.

When working with regular expressions, you are looking for a pattern. By using this pattern, you find what you seek. So, from your examples, it seems you were looking for "javascript:mr(" followed by a number, followed by ",'PIC'". For example, "javascript:mr(664258,'PIC'" would be one such match.

Once you figure out the pattern, you want to figure out what you want to capture, if anything. You wanted the pic ID, the number, so we want to capture it. Also, you need to figure out what remains the same, and what changes. In this case, only the number changes, the rest of it remains the same. So, in the resulting RegEx, everything but the number is literal text.

Code:
;the RegEx I used (the literal text is javascript:mr\( at the start and ,'PIC' at the end)
"javascript:mr\((?<PicID>\d++),'PIC'"


When working with regular expressions, don't forget that some characters need to be escaped, because they have special meaning to the RegEx engine. Check out the RegEx - Quick Reference for a full list of these "metacharacters".

For example, an open parenthesis has special meaning, so to match a literal opening parenthesis, we need to precede it with a backslash ("\").

Code:
;the RegEx I used
"javascript:mr\((?<PicID>\d++),'PIC'"


OK, hopefully you are still following me. If not, feel free to ask questions on anything I mention.
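
To make the escaping point concrete, here is a minimal sketch in the same AutoHotkey v1 syntax as the script above. The sample string is one entry from the input; the variable names sample, escapedPos, and unescapedPos are only illustrative.

```autohotkey
sample := "javascript:mr(664258,'PIC','AU','SG')"

; "\(" matches a literal "(", so this pattern lines up with the text
; and RegExMatch returns the position where the match starts.
escapedPos := RegExMatch(sample, "javascript:mr\((\d+)")

; Without the backslash, "(" only opens a capture group, so this pattern
; asks for digits directly after "mr". The sample has "(" there instead,
; so RegExMatch returns 0 (no match).
unescapedPos := RegExMatch(sample, "javascript:mr(\d+)")

MsgBox % "escaped: " escapedPos "`nunescaped: " unescapedPos
```

The escaped version should report a non-zero position, while the unescaped version should report 0.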

Now, after handling the literal text (the part that doesn't change), we need to handle the part that does - in this case, the number. So, where the number was in the example, "javascript:mr(664258,'PIC'", we need to replace it with a RegEx that matches "a number". Since the number is a decimal number (base 10) and can be "any length" (for all I know), I decided on this RegEx, "(?<PicID>\d++)". This RegEx will match one or more digits and store them in the named group, "PicID".

The "(?<PicID>" starts a named-capture group. This allows storing a value which can later be retrieved - see the use of "MatchPicID" in the code. "PicID" is the group, and "Match" is the "UnquotedOutputVar" I specified when calling RegExMatch.

The "\d++" matches one or more digits. "\d" means "a digit" (0-9). The "+" after it means "one or more". The next "+" makes it possessive - this means that no backtracking will be done. Since the next character is a comma, there is no reason to backtrack. If you don't understand what I mean by backtracking, I refer you to an AWESOME RegEx tutorial, which is what I studied when I first learned RegExes. The link for possessive is to the page in the tutorial for possessive quantifiers.
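
As a small illustration of the named group and the possessive quantifier, here is a sketch in AutoHotkey v1 syntax. The sample string comes from the input; greedyPos and possessivePos are names made up for this example, and the "123" test string is only there to show the difference in behavior.

```autohotkey
sample := "javascript:mr(664258,'PIC','AU','SG')"

; "Match" is the output variable, so the named group "PicID"
; ends up in the variable MatchPicID.
if (RegExMatch(sample, "javascript:mr\((?<PicID>\d++),'PIC'", Match))
    MsgBox % "Whole match: " Match "`nPicID: " MatchPicID

; Greedy "\d+" first takes all three digits, then gives one back
; so the trailing "\d" can still match.
greedyPos := RegExMatch("123", "\d+\d")

; Possessive "\d++" never gives digits back, so the trailing "\d"
; has nothing left to match and the result is 0.
possessivePos := RegExMatch("123", "\d++\d")

MsgBox % "greedy: " greedyPos "`npossessive: " possessivePos
```

In the pic ID pattern the possessive form changes nothing about what is matched, because a comma can never be a digit; it only tells the engine not to bother retrying shorter digit runs.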


So, merging "what stays the same" with "what changes", I ended up with the RegEx I used. As stated before, if you have any questions, feel free to ask. I would also recommend checking out the tutorial I mentioned. I would say it's a must-have for anyone serious about learning regular expressions.
animeaime



Joined: 04 Nov 2008
Posts: 1045

Posted: Sat Jul 04, 2009 6:14 am

peteryy wrote:
What if I want to find only the pic ID numbers whose price is 4.90?

Where is the price found in the input? Can you give an example? In other words, how can you tell the price for a pic?
peteryy



Joined: 12 Mar 2009
Posts: 16

Posted: Sat Jul 04, 2009 6:32 am

Haha, I'm still digesting this...
I can't believe you responded so fast.
Quote:
<TR><TD colspan=7 class='x2'>Race 1<TR><TD colspan=7 class='x3'>unconfirmed orders<TR><a href="javascript:mr(663362,'PIC','AU','SG')"><TD>1<TD class='x5'>4<TD>12<TD>0<TD>4.65<TD>44/0<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664258,'PIC','AU','SG')"><TD>1<TD class='x5'>4<TD>10<TD>10<TD>4.75<TD>51/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664413,'PIC','AU','SG')"><TD>1<TD class='x5'>4<TD>22<TD>22<TD>4.80<TD>42/12<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664616,'PIC','AU','SG')"><TD>1<TD class='x5'>4<TD>17<TD>17<TD>4.90<TD>51/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663995,'PIC','AU','SG')"><TD>1<TD class='x5'>4<TD>0<TD>15<TD>4.90<TD>0/10<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663136,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>30<TD>0<TD>4.50<TD>25/0<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663779,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>0<TD>13<TD>4.75<TD CLASS=RD>0/6<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664412,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>22<TD>22<TD>4.80<TD>44/12<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(665290,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>12<TD>12<TD>4.80<TD CLASS=RD>8/7<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(665051,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>17<TD>17<TD>4.85<TD CLASS=RD>11/6<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664615,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>22<TD>22<TD>4.90<TD>58/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663980,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>0<TD>15<TD>4.90<TD>0/10<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664841,'PIC','AU','SG')"><TD>1<TD class='x5'>5<TD>26<TD>26<TD>5.00<TD>55/15<TD><font class='del2'>Del</font></TD></a><TR><a 
href="javascript:mr(663580,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>13<TD>13<TD>4.10<TD>56/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(662945,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>10<TD>10<TD>4.10<TD>54/15<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663123,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>30<TD>0<TD>4.50<TD>27/0<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(663554,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>12<TD>0<TD>4.65<TD>44/0<TD><font class='del2'>Del</font></TD></a><TR><a href="javascript:mr(664171,'PIC','AU','SG')"><TD>1<TD class='x5'>6<TD>4<TD>4<TD>4.75<TD>52/15<TD><font class='del2'>Del</font></TD></a>
animeaime



Joined: 04 Nov 2008
Posts: 1045

Posted: Sat Jul 04, 2009 6:43 am

OK, I'm not sure if this simplification leads to false positives, but here's the new version. This version assumes that a price in the form "#.##" follows every pic ID, and that the first number in the form "#.##" after the pic ID is its price. If these assumptions are not valid, I can modify it accordingly. The changes from the previous version are the added ".*?(?<Price>\d\.\d{2})" at the end of the pattern and the added %MatchPrice% in the FileAppend line.

Code:
ReadFile := "pc.txt"
WriteFile := "test.txt"

;reads the text to match from <ReadFile>
FileRead, text, %ReadFile%
FileDelete, %WriteFile%
start := 1

/*
finds all occurrences of "javascript:mr(number,'PIC'"
   followed by a price (#.##) zero or more characters away
For each pic ID / price, the following is appended
"PicID Price`n" - the PicID, a space, the price, and the newline character, "`n"
*/
while pos := RegExMatch(text, "javascript:mr\((?<PicID>\d++),'PIC'.*?(?<Price>\d\.\d{2})", match, start)
{
    FileAppend %MatchPicID% %MatchPrice%`n, %WriteFile%
    start := pos + strLen(match)
}

MsgBox, Done



The output (test.txt):
Code:
663362 4.65
664258 4.75
664413 4.80
664616 4.90
663995 4.90
663136 4.50
663779 4.75
664412 4.80
665290 4.80
665051 4.85
664615 4.90
663980 4.90
664841 5.00
663580 4.10
662945 4.10
663123 4.50
663554 4.65
664171 4.75
663775 4.75
664397 4.80
665235 4.80
665074 4.85
664620 4.90
663988 4.90
664835 5.00
663397 4.10
662942 4.10
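
For the earlier question about keeping only the pic IDs priced at 4.90, one possible approach (a sketch under the same assumptions as the script above, not the only way to do it) is to capture every ID/price pair as before and compare the captured price inside the loop. Putting 4\.90 directly into the pattern after the lazy ".*?" is tempting, but ".*?" could then run past other entries until it reaches a 4.90 and pair the wrong ID with that price. The names WantedPrice, result, and "pic_490.txt" are placeholders chosen for this example.

```autohotkey
ReadFile := "pc.txt"
WriteFile := "pic_490.txt"   ; hypothetical output file name
WantedPrice := "4.90"        ; the price to keep

FileRead, text, %ReadFile%
FileDelete, %WriteFile%
start := 1
result := ""

; same pattern as above: each pic ID together with the first #.## after it
while pos := RegExMatch(text, "javascript:mr\((?<PicID>\d++),'PIC'.*?(?<Price>\d\.\d{2})", match, start)
{
    ; keep the ID only when the captured price equals the wanted price
    if (MatchPrice = WantedPrice)
        result .= MatchPicID "`n"
    start := pos + StrLen(match)
}

; write the collected IDs in one go instead of once per match
FileAppend, %result%, %WriteFile%
MsgBox, Done
```

With the sample input, this should keep the six IDs that appear next to 4.90 in the output listing above.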

peteryy



Joined: 12 Mar 2009
Posts: 16

Posted: Sat Jul 04, 2009 6:46 am

It's working, it's working!
Thanks for giving me a tutorial to follow.
Now I get what you mean by (?<PicID>\d++).
Thank you very much, you really saved my life.
I need to put more time into understanding regular expressions.
I don't know what else to say... thanks.
animeaime



Joined: 04 Nov 2008
Posts: 1045

Posted: Sat Jul 04, 2009 6:52 am

You're welcome, glad to help. Hope you check out the tutorial - it's an even bigger life saver than me.
peteryy



Joined: 12 Mar 2009
Posts: 16

Posted: Sat Jul 04, 2009 6:55 am

OK, I will stick to it, I promise. Thanks again.
All times are GMT