| Author |
Message |
Titan
Joined: 11 Aug 2004 Posts: 5031 Location: /b/
|
Posted: Sat Mar 17, 2007 8:28 pm Post subject: xpath v3 - read and write XML documents with XPath syntax |
|
|
A simple, easy-to-use set of functions for parsing XML content with xpath, including save and load routines. Extremely fast and lightweight for AutoHotkey; nodes and attributes can be created and removed directly within your expressions without DOM traversal.
The included manual covers some common uses of xpath and demonstrates how to use this library. Unlike my other scripts, this one is commented in the source for anyone who wants to know how it works.
Download
I'd like to thank everyone who posted suggestions, bug reports and advice. _________________ Chat (IRC) • PlusNet • Scripts • IronAHK • Contact by email not private message.
Last edited by Titan on Thu Aug 27, 2009 10:11 am; edited 6 times in total |
|
|
 |
n-l-i-d Guest
|
Posted: Sat Mar 17, 2007 8:47 pm Post subject: |
|
|
Wow! I'm baffled. Great work! |
|
|
 |
Ace_NoOne
Joined: 10 Oct 2005 Posts: 299 Location: Germany
|
Posted: Sun Mar 18, 2007 10:35 am Post subject: |
|
|
That's just awesome, Titan!
Seriously impressive! _________________ Improving my world, one script at a time.
Join the AutoHotkey IRC channel: irc.freenode.net #autohotkey |
|
|
 |
Tuncay
Joined: 07 Nov 2006 Posts: 868 Location: Berlin, DE
|
Posted: Sun Mar 18, 2007 9:45 pm Post subject: |
|
|
Wow, thanks! I've been waiting for this ever since I heard you were working on it.  _________________ Download Ahk Standard Library Collection - An archive with stdlib compatible function libraries |
|
|
 |
majkinetor
Joined: 24 May 2006 Posts: 4114 Location: Belgrade
|
Posted: Mon Mar 19, 2007 8:43 am Post subject: |
|
|
/applaud |
|
|
 |
Ace_NoOne
Joined: 10 Oct 2005 Posts: 299 Location: Germany
|
Posted: Fri Mar 23, 2007 2:12 pm Post subject: |
|
|
Hmm ... I can't seem to get this working - either I'm stupid, or it's a bug:
I'm trying to retrieve the latest link from the xkcd feed (see below for a snapshot of its current state).
For that I use the following code: | Code: | xkcdFeed = http://xkcd.com/rss.xml
rss := XmlDoc(xkcdFeed)
latest := XPath(rss, "/rss/channel/item[1]/link") | (not sure whether the nodes array is zero-based or one-based, but that doesn't matter here)
However, this returns the full ITEM node of the last (third) item: | Code: | <item>
<title>Keyboards are Disgusting
</title>
<link>http://xkcd.com/c237.html
</link>
<description><img src="http://imgs.xkcd.com/comics/keyboards_are_disgusting.png" title="Alternate method: convince them to pretend it's an Etch-a-Sketch and try to erase it." alt="Alternate method: convince them to pretend it's an Etch-a-Sketch and try to erase it." />
</description>
<guid isPermaLink="true">http://xkcd.com/c237.html
</guid>
<pubDate>2007-03-19
</pubDate>
</item> |
I can't explain this - can anyone else?
Current contents of the xkcd feed (for reference): | Code: | <?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:blogChannel="http://backend.userland.com/blogChannelModule">
<channel>
<title>xkcd.com</title>
<link>http://www.xkcd.com</link>
<description>xkcd.com: A webcomic of romance and math humor.</description>
<language>en</language>
<copyright>Copyright 2005-2006 Randall Munroe</copyright>
<pubDate>Fri, 23 Mar 2007 07:47:44 -0400</pubDate>
<lastBuildDate>Fri, 23 Mar 2007 07:47:44 -0400</lastBuildDate>
<managingEditor>rmunroe@gmail.com</managingEditor>
<webMaster>davean@sciesnet.net</webMaster>
<item>
<title>Blagofaire</title>
<link>http://xkcd.com/c239.html</link>
<description><img src="http://imgs.xkcd.com/comics/blagofaire.png" title="Things were better before the Structuring and the Levels." alt="Things were better before the Structuring and the Levels." /></description>
<guid isPermaLink="true">http://xkcd.com/c239.html</guid>
<pubDate>2007-03-23</pubDate>
</item>
<item>
<title>Pet Peeve #114</title>
<link>http://xkcd.com/c238.html</link>
<description><img src="http://imgs.xkcd.com/comics/pet_peeve_114.png" title="I'm reading a goddamn book, thank you very much." alt="I'm reading a goddamn book, thank you very much." /></description>
<guid isPermaLink="true">http://xkcd.com/c238.html</guid>
<pubDate>2007-03-21</pubDate>
</item>
<item>
<title>Keyboards are Disgusting</title>
<link>http://xkcd.com/c237.html</link>
<description><img src="http://imgs.xkcd.com/comics/keyboards_are_disgusting.png" title="Alternate method: convince them to pretend it's an Etch-a-Sketch and try to erase it." alt="Alternate method: convince them to pretend it's an Etch-a-Sketch and try to erase it." /></description>
<guid isPermaLink="true">http://xkcd.com/c237.html</guid>
<pubDate>2007-03-19</pubDate>
</item>
</channel>
</rss> |
|
|
 |
Titan
Joined: 11 Aug 2004 Posts: 5031 Location: /b/
|
Posted: Fri Mar 23, 2007 7:19 pm Post subject: |
|
|
The following gives me the correct result:
| Code: | xkcdFeed = http://xkcd.com/rss.xml
rss := XmlDoc(xkcdFeed)
latest := XPath(rss, "/rss/channel/item[1]/link/text()")
MsgBox, %latest% |
Try re-downloading XPath.ahk if it doesn't work for you. Note that the function preserves whitespace regardless of PI, so the returned string may carry leading or trailing whitespace; you can remove it with the regex ^\s*|\s*$ (for example via RegExReplace) to turn the string into a valid URI. |
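For illustration, here is a minimal sketch of that trimming step in AutoHotkey v1 syntax. The sample string is an assumption standing in for whatever XPath() returned; the only function used is the built-in RegExReplace.

```autohotkey
; Hypothetical sample value: a link node's text with surrounding newlines,
; standing in for the result of XPath(rss, ".../link/text()").
latest := "`nhttp://xkcd.com/c239.html`n"

; Delete any whitespace at the very start or the very end of the string.
latest := RegExReplace(latest, "^\s*|\s*$")

; The brackets make any leftover whitespace visible in the message box.
MsgBox, [%latest%]
```

Leaving out the replacement argument makes RegExReplace delete every match, so the leading and the trailing whitespace are both removed in a single call.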
|
|
 |
Ace_NoOne
Joined: 10 Oct 2005 Posts: 299 Location: Germany
|
Posted: Sat Mar 24, 2007 7:57 am Post subject: |
|
|
Thanks Titan; I found the problem:
I had opened XPath.ahk in the browser (Firefox) and just copied the code into my file. For whatever reason, that seems to create problems; it works just fine if I actually download the file and take the code from there. |
|
|
 |
Guest+ Guest
|
|
|
 |
Titan
Joined: 11 Aug 2004 Posts: 5031 Location: /b/
|
Posted: Sat Mar 24, 2007 6:47 pm Post subject: |
|
|
Thanks, sorry about that. |
|
|
 |
Guest+ Guest
|
Posted: Sat Mar 24, 2007 10:26 pm Post subject: |
|
|
No problem
btw, very nice script. It will help me a lot. |
|
|
 |
chris_lee
Joined: 04 Apr 2007 Posts: 6
|
Posted: Wed Apr 04, 2007 10:13 am Post subject: Why do I get extra trailing 0xA0 bytes? |
|
|
I tried the code
| Code: | xkcdFeed = http://xkcd.com/rss.xml
rss := XmlDoc(xkcdFeed)
latest := XPath(rss, "/rss/channel/item[1]/link/text()")
MsgBox, %latest%
|
but I got some extra trailing 0xA0 bytes.
Any idea? |
|
|
 |
Titan
Joined: 11 Aug 2004 Posts: 5031 Location: /b/
|
Posted: Wed Apr 04, 2007 12:13 pm Post subject: |
|
|
The feed at http://www.xkcd.com/rss.xml is not well-formed: it has unescaped closing tags within /rss/channel/item/description. For the next version I'll improve the parser to ignore closing tags that have no opening counterpart and to strip the generated whitespace (&nbsp;, 0xA0). In the meantime you could use:
| Code: | xkcdFeed = http://www.xkcd.com/rss.xml ; URI
xkcdLocFeed = %A_Temp%\xkcd.xml ; local path
UrlDownloadToFile, %xkcdFeed%, %xkcdLocFeed%
FileRead, rss, %xkcdLocFeed%
StringReplace, rss, rss, /></description>, /&gt;</description>, All ; escape closing tags
rss := XmlDoc(rss) ; load as var ...
latest := XPath(rss, "/rss/channel/item[1]/link/text()")
MsgBox, %latest% |
Thanks for the feedback.
Edit: now fixed in version 1.01 thanks. |
|
|
 |
chris_lee
Joined: 04 Apr 2007 Posts: 6
|
Posted: Mon Apr 09, 2007 3:38 am Post subject: |
|
|
I tried version 1.01 and it works just fine.
Thanks a lot!  |
|
|
 |
Venia Legendi
Joined: 27 May 2005 Posts: 35
|
Posted: Tue Apr 10, 2007 12:53 pm Post subject: Operators ">" and "=" |
|
|
Hello, first of all: THIS IS JUST WHAT I NEEDED, thanks.
Nevertheless, I don't understand the following: why is Book2, with a price of 2.0, found by [price>2.0] but not by [price=2.0]?
| Code: |
#Include XPath.ahk
xPath(t, "/bookstore[+1]")
loop, 5 {
xPath(t, "/bookstore/book[+1]/title[+1]", "title" A_index)
xPath(t, "/bookstore/book[" A_index "]/price[+1]", A_index ".0")
}
E := "[price>=2.0] " XPath(t, "/bookstore/book[price>=2.0]/title/text()")
E := E "`n[price>2.0] " XPath(t, "/bookstore/book[price>2.0]/title/text()") " -> also Book2 found?"
E := E "`n[price=2.0] " XPath(t, "/bookstore/book[price=2.0]/title/text()") " -> not found?"
E := E "`n[price=2] " XPath(t, "/bookstore/book[price=2]/title/text()") " -> not found?"
E := E "`n---" t
Gui, Add, Edit, w400 h400 ReadOnly -Wrap, %E%
Gui, Show, , xPath
|
|
|
|
 |
|