AutoHotkey Community

All times are UTC [ DST ]




PostPosted: November 13th, 2009, 2:27 pm 
Joined: March 19th, 2008, 12:43 am
Posts: 5480
Location: the tunnel(?=light)
That shouldn't be too difficult. To keep the billing and delivery variables separate we'll use two searches instead of one, and let the first search tell the second one where to start:

Code:
#NoEnv
F7::
clipboard =
send ^a^c
ClipWait

Pos:=RegExMatch(Clipboard,"s)Billing Address.*?Firma: (?P<Firma>\V+).*?Name: (?P<Name>\V+).*?Adresse : (?P<Adresse>\V+).*?Stadt : (?P<Stadt>\V+)",Billing)
RegExMatch(Clipboard,"s)Delivery Address.*?Firma: (?P<Firma>\V+).*?Name: (?P<Name>\V+).*?Adresse : (?P<Adresse>\V+).*?Stadt : (?P<Stadt>\V+)",Delivery,Pos+StrLen(Billing))


MsgBox % "Billing Firma: " BillingFirma "`n"
  . "Billing Name: " BillingName "`n"
  . "Billing Adresse: " BillingAdresse "`n"
  . "Billing Stadt: " BillingStadt "`n`n"
  . "Delivery Firma: " DeliveryFirma "`n"
  . "Delivery Name: " DeliveryName "`n"
  . "Delivery Adresse: " DeliveryAdresse "`n"
  . "Delivery Stadt: " DeliveryStadt
Return

_________________
Try Quick Search for Autohotkey or see the tutorial for newbies.


PostPosted: November 13th, 2009, 7:58 pm 
Joined: August 16th, 2009, 2:02 pm
Posts: 20
Thanks for your suggestions.

BTW, here's the structure of the order mail. For the Germans here: All spelling errors are intentional.

Code:
my company
221, Rev Road
1234 Swiss Town
Eine neue Bestellung ist eingetroffen.

Bestell-Informationen
Bestellnummer: xxxx
Bestelldatum: Wednesday, 11 November 2009
Bestellstatus:

Ihre Informationen
Rechungsadresse
Firma:
Name: Jane Doe
Adresse : 123, Some Road



Stadt : Swiss village
Postleitzahl : 4321
Land : CHE
Telefon : 012 345 67 89
Fax :
Email : jane.doe@example.com
Lieferadresse
Firma: Acme inc.
Name : Jane Doe
Adresse : 123, Some other Road


Stadt : Big Swiss town
Postleitzahl : 4567
Land : CHE
Telefon : 039 345 67 89
Fax :




Bestellte Produkte
Anzahl Name Artikel-Nr. Preis Summe
1 widget xy-01-001 64.45 chf 64.45 chf

Zwischensumme : 64.45 chf


I have some further questions:

The "Adresse" part can consist of two lines. I only get the first line with sinkfaze's command. What do I have to add to the code to get the second line? I'd like to learn something here.

What if I wanted to extract all the article numbers (strings that start with xy) and the number of items (widgets)?

Thanks!


PostPosted: November 13th, 2009, 8:44 pm 
Joined: February 17th, 2008, 7:09 am
Posts: 536
You can simply extend the current regex to cover what is required and what MAY be there. So, two addresses, each with an 'Adresse' line? No problem :) Combine them.

BTW it may be better to replace the .*? between the fields with \s*, so that only whitespace can be skipped.

Adresse : (?P<Adresse>\V+)\s*(?:Adresse : (?P<Adresse2>\V+))?

The modified version makes a second Adresse line optional; if it is present, it is captured in Adresse2.

Finding the xy- numbers and the other items is just a matter of modifying the regex. I find the following links VERY helpful.

http://perldoc.perl.org/perlre.html
http://www.autohotkey.com/forum/viewtopic.php?t=32161


Last edited by rtcvb32 on November 14th, 2009, 2:41 am, edited 1 time in total.

PostPosted: November 13th, 2009, 9:11 pm 
Joined: March 19th, 2008, 12:43 am
Posts: 5480
Location: the tunnel(?=light)
Could you give an example of what the data looks like when a second line exists?

PostPosted: November 14th, 2009, 10:01 am 
Joined: August 16th, 2009, 2:02 pm
Posts: 20
Quote:
Could you give an example of what the data looks like when a second line exists?

Sure. Second line is just below the Adresse line. It looks like this:
Code:
Name: Jane Doe
Adresse : 123, Some Road
Apt 3

Quote:
The modified version makes a second Adresse line optional; if it is present, it is captured in Adresse2.

Very neat! It's just that the 2nd line is just a line of text below the Adresse part.
And thanks for the link. I've tried tools and sites that offer RegEx help, but I could not wrap my brain around it in a reasonable amount of time. But I'll get there eventually!


PostPosted: November 14th, 2009, 11:23 am 
Joined: August 16th, 2009, 2:02 pm
Posts: 20
I've been fooling around with the Sandbox for the last 2 hrs or so.
I've tried to match all strings that start with xy and can be about as long as xy-00-000ab (the last two characters are optional).

My problems lie in matching more than one occurrence of the string and making some chars optional.

Code:
xy-......

matches exactly one string up to 000. Adding more dots is counterproductive as it includes unwanted characters when the string is unusually long.

Code:
m)xy-\S\S\S\S\S\S\S\S

matches the whole string (up to "ab"), but only if all the chars are present.

I simply want all the strings that start with xy and end at the next whitespace to be assigned to a variable. To reduce the potential for errors, the matching should only take place after "Bestellte Produkte".

I haven't even gotten to the part with the Number of items in the shopping cart.

What am I missing?


PostPosted: November 14th, 2009, 12:37 pm 
Joined: February 17th, 2008, 7:09 am
Posts: 536
This is where quantifiers come in: they let you repeat a pattern and specify how many repetitions you want. The following patterns grab anything that starts with xy- and keep going until they hit whitespace. If you want a specific length of, say, 5 to 7 characters, you can ask for that too. Here are several examples.

Code:
xy-\S*     ; xy- followed by 0 or more non-whitespace characters
xy-\S+     ; xy- followed by 1 or more non-whitespace characters
xy-\S{5,7} ; xy- followed by 5 to 7 non-whitespace characters
xy-\S{5,}  ; xy- followed by at least 5 non-whitespace characters
xy-\S{0,7} ; xy- followed by at most 7 non-whitespace characters (PCRE treats {,7} as literal text, so write {0,7})

; If you need exactly 5 or exactly 7 non-whitespace characters, use alternation.
xy-(\S{7}|\S{5})(?!\S)  ; xy- followed by either 7 or 5 non-whitespace characters; the lookahead rules out other lengths

; Remember, {} applies to the element just before it: after a single character, a{7} matches seven a's.
; After a group, it repeats the whole group: (bar|foo){7} matches seven foo's or bar's in a row.



This is only a small sample; there is quite a bit more you can do, but it should suffice for your needs.
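
To try the quantifiers above quickly, a minimal AutoHotkey v1 sketch may help. The sample line and the variable names (Line, Whole, Short, Strict) are made up for illustration; only RegExMatch and the xy- patterns come from this thread.

```autohotkey
#NoEnv
; Hypothetical sample line, modeled on the product lines of the order mail.
Line := "1 widget xy-01-001ab 64.45 chf"

; Everything from xy- up to the next whitespace.
RegExMatch(Line, "xy-\S+", Whole)
; xy- plus at most 4 further non-whitespace characters.
RegExMatch(Line, "xy-\S{0,4}", Short)
; Digits, a hyphen, digits, then up to two optional lowercase letters.
RegExMatch(Line, "xy-\d+-\d+[a-z]{0,2}", Strict)

MsgBox % "Whole: " Whole "`nShort: " Short "`nStrict: " Strict
```

With this sample, Whole and Strict should both hold xy-01-001ab, while Short stops early at xy-01-0. The stricter third pattern is probably the safer choice if the article numbers always follow the digits-digits-letters shape.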


PostPosted: November 14th, 2009, 6:20 pm 
Joined: August 16th, 2009, 2:02 pm
Posts: 20
rtcvb32 wrote:
This is where quantifiers come in: they let you repeat a pattern and specify how many repetitions you want. ...
This is just a little bit, there is quite a bit more you can do, but should suffice for your needs

Thanks! Why does it always have to be simple, yet hard? sigh...

Clean-up:
Okay, what remains missing from my script are the following elements:
- Get second "Adresse" line, which is optional
- loop the article numbers (and create a loop that pastes them)
- get the number of items

Address line:
It can't be that hard! What parameter do I have to add in order to get the optional second line below the Adresse line?

Loop:
rtcvb32, I've tried your suggestion from your previous post.
I've adapted it a little and put it inside a loop. Well, it doesn't work...
Code:
pos :=1
loop, {
;pos should point to starting of regex match now.
;match holds the contents of interest.
  pos := RegExMatch(Clipboard,"s)mf-(?P<Artikelnr>)\S*",Artikelnr, pos)
;no more patterns.
  if pos = 0
    break
;~   pos += strlen(Artikelnr)
pos += pos + 1
}


- get the number of items
I suspect that I would merely have to identify the line where the article number is on and copy the first word, i.e. number. Can this be done within the same Regexmatch statement and inside the same loop?


Here - for better overview, an example of the input that I might have.
Code:
my company
221, Rev Road
1234 Swiss Town
Eine neue Bestellung ist eingetroffen. 
 
Bestell-Informationen
Bestellnummer: xxxx
Bestelldatum: Wednesday, 11 November 2009
Bestellstatus: 
 
Ihre Informationen
Rechungsadresse
Firma: 
Name: Jane Doe
Adresse : 123, Some Road

 
Stadt : Swiss village
Postleitzahl : 4321
Land : CHE
Telefon : 012 345 67 89
Fax : 
Email : jane.doe@example.com
 Lieferadresse
Firma:  Acme inc.
Name : Jane Doe
Adresse : 123, Some other Road
Apt. 6
 
Stadt : Big Swiss town
Postleitzahl : 4567
Land : CHE
Telefon : 039 345 67 89
Fax : 
 
 
 
 
Bestellte Produkte
Anzahl Name Artikel-Nr. Preis Summe
1 widget xy-01-001 64.45 chf 64.45 chf
2 widgets xy-01-002 34.34 chf 34.34 chf


PostPosted: November 14th, 2009, 6:26 pm 
Joined: March 19th, 2008, 12:43 am
Posts: 5480
Location: the tunnel(?=light)
Actually it shouldn't be too much of a change from where we started. You'll notice in your original data set that the address (in both places) has an extra newline before the next set of data begins, so all you have to do is match everything up to the first set of two or more vertical whitespace characters:

Code:
var=
(
my company
221, Rev Road
1234 Swiss Town
Eine neue Bestellung ist eingetroffen.

Bestell-Informationen
Bestellnummer: xxxx
Bestelldatum: Wednesday, 11 November 2009
Bestellstatus:

Ihre Informationen
Rechungsadresse
Firma:
Name: Jane Doe
Adresse : 123, Some Road
Apt 3


Stadt : Swiss village
Postleitzahl : 4321
Land : CHE
Telefon : 012 345 67 89
Fax :
Email : jane.doe@example.com
Lieferadresse
Firma: Acme inc.
Name : Jane Doe
Adresse : 456, Some other Road
Apt 7

Stadt : Big Swiss town
Postleitzahl : 4567
Land : CHE
Telefon : 039 345 67 89
Fax :




Bestellte Produkte
Anzahl Name Artikel-Nr. Preis Summe
1 widget xy-01-001 64.45 chf 64.45 chf

Zwischensumme : 64.45 chf
)


Pos:=RegExMatch(var,"Adresse\s?: (?P<Adresse>.*?)\v{2,}",Billing)
RegExMatch(var,"Adresse\s?: (?P<Adresse>.*?)\v{2,}",Delivery,Pos+StrLen(Billing))
MsgBox % BillingAdresse "`n`n"
 . DeliveryAdresse
return
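
One caveat worth checking: the test data above sits in a continuation section, which joins its lines with a bare `n. Text copied to the clipboard usually has `r`n line endings, and then \v{2,} can be satisfied by a single `r`n (two vertical whitespace characters), so the match may stop after the first address line. Below is a minimal sketch of a variant that, as far as I can tell, copes with both kinds of line ending by using the s) option and \R (one complete line break). The sample string is hypothetical.

```autohotkey
#NoEnv
; Hypothetical fragment with Windows line endings, as clipboard text usually has.
var := "Adresse : 123, Some Road`r`nApt 3`r`n`r`nStadt : Swiss village`r`n"

; s) lets the dot cross line breaks; \R{2,} needs two or more complete line breaks,
; so the lazy capture runs until the first blank line.
RegExMatch(var, "s)Adresse\s?: (?P<Adresse>.*?)\R{2,}", Billing)
MsgBox % BillingAdresse
```

Here BillingAdresse should contain both address lines. Note that a "blank" line that actually holds a space would not count as a second line break for \R{2,}, so real mails may need a slightly looser pattern.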

PostPosted: November 14th, 2009, 6:52 pm 
Joined: March 19th, 2008, 12:43 am
Posts: 5480
Location: the tunnel(?=light)
TalksWithComputers wrote:
I've adapted it a little and put it inside a loop. Well, it doesn't work...


Of course it doesn't work...well, actually it does work, just not the way you intended. What are you trying to match here?

Code:
RegExMatch(Clipboard,"s)mf-(?P<Artikelnr>)\S*",Artikelnr, pos)


Your capturing subpattern contains nothing, and nothing is real easy to match. :lol: I think what you meant to do was this:

Code:
RegExMatch(Clipboard,"s)mf-(?P<Artikelnr>\S*)",Artikelnr, pos)


At any rate, you were on the right track, just the wrong type of loop. The While-loop is the beast we're looking for and in order to use it you'll need to make sure that your starting position constantly updates. Your method to update the starting position was good, but generally it's easier to use the last found position plus the length of the last found string. I believe this is what you're looking for:

Code:
var=
(
my company
221, Rev Road
1234 Swiss Town
Eine neue Bestellung ist eingetroffen.
 
Bestell-Informationen
Bestellnummer: xxxx
Bestelldatum: Wednesday, 11 November 2009
Bestellstatus:
 
Ihre Informationen
Rechungsadresse
Firma:
Name: Jane Doe
Adresse : 123, Some Road

 
Stadt : Swiss village
Postleitzahl : 4321
Land : CHE
Telefon : 012 345 67 89
Fax :
Email : jane.doe@example.com
 Lieferadresse
Firma:  Acme inc.
Name : Jane Doe
Adresse : 123, Some other Road
Apt. 6
 
Stadt : Big Swiss town
Postleitzahl : 4567
Land : CHE
Telefon : 039 345 67 89
Fax :
 
 
 
 
Bestellte Produkte
Anzahl Name Artikel-Nr. Preis Summe
1 widget xy-01-001 64.45 chf 64.45 chf
2 widgets xy-01-002 34.34 chf 34.34 chf
)

Pos=1 ; need to initially set pos to 1 for the expression to evaluate correctly while a match is found
While Pos:=RegExMatch(var,"(?P<Amt>\d+) widgets? xy-(?P<Item>\S+)",Widget,Pos+StrLen(Widget))
  res.="Item No.: " WidgetItem "`n"
    . "Amount: " WidgetAmt "`n`n"
MsgBox % RegExReplace(res,"\v+$")
return

PostPosted: November 14th, 2009, 7:58 pm 
Joined: February 17th, 2008, 7:09 am
Posts: 536
Quote:
Ihre Informationen
Rechungsadresse
Firma:
Name: Jane Doe
Adresse : 123, Some Road
Apt 3


Well, capturing the second address line without a label may cause some problems, as we have to assume it really belongs to the address, unless of course you can guarantee a fixed number of newlines before the next block of information.

So I guess the best way, building on my previous post, is:

Code:
Adresse : (?P<Adresse>\V+)\s{1,2}(?P<Adresse2>\V+)?
; Afterwards continue with your previous pattern. The idea is that the line break is 1-2 whitespace characters (\r\n or just \n, although it could be followed by spaces). Then, if there is text to match, it is saved in Adresse2. ({,2} is not a valid quantifier in PCRE, hence {1,2}.)


Of course, if this optional part does not match, the remainder of your regex should still work just fine.

Btw, your example was also faulty in that it looked for mf- rather than xy-, but that was already fixed in the other example.


PostPosted: November 15th, 2009, 11:44 am 
Joined: August 24th, 2005, 5:29 pm
Posts: 549
Location: Berlin / Germany
Two suggestions:
Code:
RXAdresse := "Firma\s*:\s(?P<Firma>\V*)\s*Name\s*:\s*(?P<Name>\V*)\s*Adresse\s*:\s*(?P<Adresse>\V*\R?\V*)\s*Stadt\s*:\s*(?P<Stadt>\V*)"
RXArtikel := "m)^(?P<Anzahl>\d+).*?(?P<ArtikelNr>xy-\d+-\w+)"
Remember that you have to determine the StartingPosition!

BTW: Should xy really be a constant value?
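
One possible way to determine the starting position is to look up the "Bestellte Produkte" heading with InStr and begin the RegExMatch loop there. This is only a sketch under assumptions: the sample text (including the decoy line before the heading), the pattern, and the names Match, MatchAmt and MatchItem are invented for illustration, and While needs a reasonably recent AutoHotkey release.

```autohotkey
#NoEnv
; Hypothetical sample data; in the real script this would be the clipboard text.
var =
(
1 sample xy-00-000
Bestellte Produkte
Anzahl Name Artikel-Nr. Preis Summe
1 widget xy-01-001 64.45 chf 64.45 chf
2 widgets xy-01-002 34.34 chf 34.34 chf
)

res := ""
Match := ""
; InStr returns the 1-based position of the heading, or 0 if it is missing.
Pos := InStr(var, "Bestellte Produkte")
If (Pos = 0)
    Pos := 1   ; heading not found: fall back to searching the whole text

; Amount, one word (the item name), then the article number.
While (Pos := RegExMatch(var, "(?P<Amt>\d+)\s+\S+\s+(?P<Item>xy-\S+)", Match, Pos + StrLen(Match)))
    res .= "Item No.: " MatchItem "`nAmount: " MatchAmt "`n`n"

MsgBox % RegExReplace(res, "\v+$")
```

Because the search starts at the heading, the decoy line above it is skipped and only xy-01-001 and xy-01-002 should be reported.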

_________________
nick :wink:


PostPosted: November 15th, 2009, 10:05 pm 
Joined: August 16th, 2009, 2:02 pm
Posts: 20
Thanks everybody for helping!

I always get an error message when trying to run the while loop.
I found out - all by myself, mind you - that I have to use {..} for this kind of loop, but the error persists. It also happens with the original script posted by sinkfaze. I have added Nick's suggestion to improve the script. "Widget" is a synonym for "item"; it can be any word of any length.

Quote:
==> This line does not contain a recognized action.
Specifically: While RegExMatch(Clipboard,"(?P<Amt>\d+).*?xy-(?P<Item>\S+)",Widget,Pos+StrLen(Widget))
>Exit code: 2

Which part has to loop?

Code:
Pos:=1
While RegExMatch(Clipboard,"(?P<Amt>\d+).*?xy-(?P<Item>\S+)",Widget,Pos+StrLen(Widget))
{
res.="Item No.: " WidgetItem "`n"
    . "Amount: " WidgetAmt "`n`n"
}
MsgBox % RegExReplace(res,"\v+$")


Quote:
BTW: Should xy really be a constant value?

Yes, all article numbers start with the same two letters. I got confused with my examples, that's why mf and xy appear. They are supposed to be the same.

Quote:
Remember that you have to determine the StartingPosition!

I'd love to, but how?

Once I got this covered, I might even be able to put the different pieces together.


PostPosted: November 16th, 2009, 2:44 am 
TalksWithComputers wrote:
I always get an error message when trying to run the while loop.
Are you using the most recent release of AHK?
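
The wording of that error suggests the interpreter did not recognize While at all, which, as far as I know, is what happens on AutoHotkey releases older than v1.0.48, where the While-loop was introduced. If upgrading is not an option, a plain Loop with an explicit break should behave the same way. A minimal sketch, assuming a made-up two-line sample in place of the clipboard and the "widgets?" pattern posted earlier in this thread:

```autohotkey
#NoEnv
; Hypothetical sample text standing in for the clipboard contents.
var := "1 widget xy-01-001 64.45 chf`n2 widgets xy-01-002 34.34 chf"

res := ""
Widget := ""
Pos := 1
Loop
{
    ; Resume right after the previous match; Widget is empty on the first pass.
    Pos := RegExMatch(var, "(?P<Amt>\d+) widgets? xy-(?P<Item>\S+)", Widget, Pos + StrLen(Widget))
    If (Pos = 0)   ; no further match
        break
    res .= "Item No.: " WidgetItem "`nAmount: " WidgetAmt "`n`n"
}
MsgBox % RegExReplace(res, "\v+$")
```

The important detail in either loop form is that Pos is reassigned on every pass; without that the search keeps starting from the same place.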


PostPosted: November 16th, 2009, 3:21 am 
Joined: March 19th, 2008, 12:43 am
Posts: 5480
Location: the tunnel(?=light)
I don't think you copied my script correctly, it should be this:

Code:
Pos:=1
While Pos:=RegExMatch(Clipboard,"(?P<Amt>\d+).*?xy-(?P<Item>\S+)",Widget,Pos+StrLen(Widget))
{
res.="Item No.: " WidgetItem "`n"
    . "Amount: " WidgetAmt "`n`n"
}
MsgBox % RegExReplace(res,"\v+$")
