 |
AutoHotkey Community Let's help each other out
|
| Author |
Message |
polyethene
Joined: 11 Aug 2004 Posts: 5248 Location: UK
|
Posted: Tue Aug 28, 2007 11:54 pm Post subject: |
|
|
| Sean wrote: | | You're really obsessed with the speed. | *sniff*...*cough* sorry?
Kidding lol, but I have good reason: I need to parse several megabytes of XML data in an ahk environment, where it takes (literally) minutes compared to milliseconds in C#. Unfortunately .NET is not cross-platform and I'm tied down to ahk with all the dependencies I created.
I tried vtable but ran into a few problems, and since xpath 2.0 is almost complete anyway I decided to put it on hold. There are XSLT and DOM functions that need wrapping, so it's a topic I'll revisit soon.
| Sean wrote: | | It always has to convert the (ANSI) string to a Unicode string, then has to allocate and fill another buffer via SysAllocString. | I see. _________________ GitHub Scripts IronAHK Contact by email not private message. |
|
|
 |
polyethene
Joined: 11 Aug 2004 Posts: 5248 Location: UK
|
Posted: Tue Oct 30, 2007 5:06 pm Post subject: |
|
|
xpath 2.03 released. This includes stdlib compliance and removal of RC status. |
|
|
 |
Andreone
Joined: 20 Jul 2007 Posts: 257 Location: Paris, France
|
Posted: Tue Oct 30, 2007 7:12 pm Post subject: |
|
|
I started a script that uses XPath before you released v2.
I have replaced XmlDoc with xpath_load (I did notice in the first place that you also changed the parameters).
Anyway, now the xpath() function always returns empty strings.
I have the latest ahk version and xpath 2.03. |
|
|
 |
Andreone
Joined: 20 Jul 2007 Posts: 257 Location: Paris, France
|
Posted: Wed Oct 31, 2007 9:14 am Post subject: |
|
|
Another quick question: does xpath support XML escape characters?
As far as I know, there are 5 escape sequences: &lt; &gt; &amp; &apos; &quot; |
|
|
 |
polyethene
Joined: 11 Aug 2004 Posts: 5248 Location: UK
|
Posted: Wed Oct 31, 2007 1:55 pm Post subject: |
|
|
| Andreone wrote: | Anyway, now the xpath() function always returns empty strings.
I have last ahk version and xpath 2.03. | Maybe your expressions are invalid, it works in the following example:
| Code: | doc = <root><element>some text</element></root>
xpath_load(xmldoc, doc)
MsgBox, % xpath(xmldoc, "/root/element/text()") |
| Andreone wrote: | | Another quick question: does xpath support xml escape characters? | In what sense? The load function can recognize and ignore entity chars and xpath should be able to write them correctly. |
|
|
 |
Andreone
Joined: 20 Jul 2007 Posts: 257 Location: Paris, France
|
Posted: Wed Oct 31, 2007 2:27 pm Post subject: |
|
|
| Quote: | | Maybe your expressions are invalid | Yes I suppose. But as they worked with xpath before v2, I wondered if you had hints about what can cause the behavior. | Code: | <?xml version="1.0" encoding="UTF-8"?>
<config>
<proxy>proxy:80</proxy>
<curl_location></curl_location>
</config> |
| Code: | xpath_load(xml, FileName)
proxy := xpath(xml, "/config/proxy/text()")
; proxy is empty |
| Quote: | | In what sense? The load function can recognize and ignore entity chars and xpath should be able to write them correctly. | Well, what I expect is that if I read an xml value that has &lt;ThisIsMyValue&gt;, I would have <ThisIsMyValue> in the ahk variable. And conversely.
What do you mean by ignore? |
|
|
 |
polyethene
Joined: 11 Aug 2004 Posts: 5248 Location: UK
|
Posted: Wed Oct 31, 2007 3:24 pm Post subject: |
|
|
You're right, I must've accidentally removed a few lines in 2.03 which resulted in files not being read. This is now fixed in 2.04.
| Andreone wrote: | | Well, what I expect is that if I read an xml value that has &lt;ThisIsMyValue&gt;, I would have <ThisIsMyValue> in the ahk variable. And conversely. | Only < is converted to &lt; in write operations so the parser doesn't see it as a node, which is what I meant when I said it is ignored. When values are being read nothing is converted. |
|
|
 |
Andreone
Joined: 20 Jul 2007 Posts: 257 Location: Paris, France
|
Posted: Wed Oct 31, 2007 4:05 pm Post subject: |
|
|
Thanks, now the example I provided works.
But I have another issue:
| Code: | <?xml version="1.0" encoding="UTF-8"?>
<config>
<application>
<apptitle>AutoHotkey</apptitle>
</application>
<application>
<apptitle>Mirror</apptitle>
</application>
</config> |
| Code: | xpath_load(xml, FileName)
AppsNode := xpath(xml, "/config/application")
Loop, Parse, AppsNode, `,
{
AppNode := A_LoopField
temp := xpath(AppNode, "apptitle/text()")
msgbox % temp
} |
The load is good. The first call to xpath is good too. Then "apptitle/text()" returns nothing again. I don't see what is wrong with the xml, nor the code. Do you have an idea?
| Quote: | | When values are being read nothing is converted. | OK, I can work like that. Besides, is that a normal feature? (I don't know what other XML parsers do about that; yours is the first one I have used.) |
|
|
 |
polyethene
Joined: 11 Aug 2004 Posts: 5248 Location: UK
|
Posted: Wed Oct 31, 2007 4:53 pm Post subject: |
|
|
It doesn't work like that. You can only call xpath with a variable created by xpath_load (or itself), e.g.:
| Code: | xpath_load(xml, FileName)
applicationNodes := xpath(xml, "/config/application/count()")
Loop, %applicationNodes%
{
title := xpath(xml, "/config/application[" . A_Index . "]/apptitle/text()")
MsgBox, %title%
} |
| Andreone wrote: | | is that a normal feature | No, entities should be escaped. The reason it's not there now is because it reduces performance and bloats code size, but I may decide to add it in future. |
|
|
 |
Andreone
Joined: 20 Jul 2007 Posts: 257 Location: Paris, France
|
Posted: Thu Nov 01, 2007 3:54 pm Post subject: |
|
|
| Quote: | | It doesn't work like that. You can only call xpath with a variable created by xpath_load (or itself), e.g.: | I've done as you said and it works fine now. Yeah
I'm wondering: the example I provided to you was working with XPath v1. Was I just very lucky?
| Quote: | | Quote: | | is that a normal feature | No, entities should be escaped. The reason it's not there now is because it reduces performance and bloats code size, but I may decide to add it in future. | I truly hope you'll decide to add it. It would be easier to deal with entities if their management were integrated into xpath. Actually, I put HTML and regex into XML nodes, so I guess you see my point.
Besides, the way xpath currently works doesn't seem right: if you read back a value that you have just written, you can't be sure that you'll get the same value. This can lead to data corruption.
I think that a couple of "simple" functions like translateEntitesFromXml and translateEntitesToXml would be enough. Your code won't be bloated and performance would stay the same.
Anyway, thanks for your support.
Bye |
|
|
 |
polyethene
Joined: 11 Aug 2004 Posts: 5248 Location: UK
|
Posted: Thu Nov 01, 2007 4:44 pm Post subject: |
|
|
That's right, but xpath 2 is different
Here are the entity translation functions (requires grep v2):
| Code: | ; zlib license
EscapeXmlEntities(str) {
	; & must be escaped first, otherwise the entities added below would be escaped again
	StringReplace, str, str, &, &amp;, All
	StringReplace, str, str, ", &quot;, All
	StringReplace, str, str, ', &apos;, All
	StringReplace, str, str, <, &lt;, All
	StringReplace, str, str, >, &gt;, All
	If A_FormatInteger = H
		ishex = x
	d = !
	grep(str, "[\x1-\x1f\x7f-\xff]", chr, 1, 0, d) ; find characters outside the printable ASCII range
	Loop, Parse, chr, %d%
		StringReplace, str, str, %A_LoopField%, % "&#" . ishex . Asc(A_LoopField) . ";"
	Return, str
}
UnescapeXmlEntities(str) {
	StringReplace, str, str, &quot;, ", All
	StringReplace, str, str, &apos;, ', All
	StringReplace, str, str, &lt;, <, All
	StringReplace, str, str, &gt;, >, All
	; don't use this because it uses HTML's DTD for named entities:
	; Transform, str, HTML, %str%
	d = ! ; explicit delimiter, as in EscapeXmlEntities
	grep(str, "i)&#x?[\da-f]{1,4};", ref, 1, 0, d) ; find numeric character references
	Loop, Parse, ref, %d%
	{
		n := SubStr(A_LoopField, 3, -1) ; strip the leading &# and the trailing ;
		If (InStr(n, "x") == 1) {
			StringTrimLeft, n, n, 1
			n = 0x%n%
		}
		If (n > 1 and n < 256)
			StringReplace, str, str, %A_LoopField%, % Chr(n)
	}
	; &amp; must be unescaped last so that e.g. &amp;lt; ends up as &lt; and not <
	StringReplace, str, str, &amp;, &, All
	Return, str
} |
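For scripts that only need the five predefined entities and would rather not depend on grep, the same idea can be sketched with plain StringReplace calls. This is only an illustration in AutoHotkey v1 syntax: the names XmlEscapeBasic and XmlUnescapeBasic are made up here and are not part of xpath, and numeric character references are not handled. The thing to watch is the ordering: & is escaped first and unescaped last, otherwise entities that were just produced get converted a second time.

```autohotkey
; Round-trip check for the five predefined XML entities (AutoHotkey v1).
; XmlEscapeBasic / XmlUnescapeBasic are hypothetical helper names.
original = <a href="x">Tom & Jerry's</a>
escaped := XmlEscapeBasic(original)
restored := XmlUnescapeBasic(escaped)
MsgBox, % "escaped: " . escaped . "`nround trip intact: " . (restored == original ? "yes" : "no")
Return

XmlEscapeBasic(str) {
    StringReplace, str, str, &, &amp;, All   ; first, so the entities added below stay untouched
    StringReplace, str, str, <, &lt;, All
    StringReplace, str, str, >, &gt;, All
    StringReplace, str, str, ", &quot;, All
    StringReplace, str, str, ', &apos;, All
    Return str
}

XmlUnescapeBasic(str) {
    StringReplace, str, str, &lt;, <, All
    StringReplace, str, str, &gt;, >, All
    StringReplace, str, str, &quot;, ", All
    StringReplace, str, str, &apos;, ', All
    StringReplace, str, str, &amp;, &, All   ; last, so &amp;lt; becomes &lt; and not <
    Return str
}
```

Escaping a value before it is written through xpath and unescaping it after it is read back is one way to address the round-trip concern raised earlier in the thread.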
|
|
 |
ErichN Guest
|
Posted: Wed Nov 07, 2007 1:09 pm Post subject: Try to get text from XML file |
|
|
Hello,
I try to get text from this XML file.
| Code: |
<?xml version="1.0" encoding="utf-8"?>
<partsmanagement count="2" length-unit="mm" type="" build="53" database="">
<part P_ARTICLE_PARTTYPE="1" P_ARTICLE_PARTNR="30002571">
<freeproperty P_ARTICLE_FREE_DATA_DESCRIPTION="de_DE@Alpha;" pos="1" P_ARTICLE_FREE_DATA_VALUE="??_??@30002571;"/>
<variant P_ARTICLE_VARIANT="1" P_ARTICLE_ADJUSTRANGE="0" P_ARTICLE_DOORDEPTH="0" P_ARTICLE_DOORHEIGHT="0" P_ARTICLE_DOORMOUNTINGSPACE="0" P_ARTICLE_DOORWIDTH="0" P_ARTICLE_FLOW="0" P_ARTICLE_INTRINSICSAFETY="0" P_ARTICLE_PANELDEPTH="0" P_ARTICLE_PANELHEIGHT="0" P_ARTICLE_PANELMOUNTINGSPACE="0" P_ARTICLE_PANELWIDTH="0" P_ARTICLE_PRESSURE="0" P_ARTICLE_SHORTCIRCUITRESISTANT="0" P_ARTICLE_WIRECROSSSECTION_UNIT="0">
<functiontemplate connectionDesignation="CES" functiondefcategory="1303" functiondefgroup="1" functiondefid="1" intrinsicsafety="0" pos="1"/>
</variant>
</part>
</partsmanagement>
|
I tried some code to get the text of P_ARTICLE_PARTNR,
but it doesn't work and I don't know how to get the text.
My Code:
| Code: |
#Include xpath.ahk ;
File =test.xml
xpath_load(xml,File) ; load an XML document
data := XPath(xml , "/partsmanagement/part[1]/text()") ;
msgbox, %data%
|
It only shows me the text in variant and freeproperty sections.
How can I get the text? |
|
|
 |
polyethene
Joined: 11 Aug 2004 Posts: 5248 Location: UK
|
Posted: Wed Nov 07, 2007 2:54 pm Post subject: |
|
|
Use the expression: /partsmanagement/part[1]/@P_ARTICLE_PARTNR/text()
You need to download the latest update v2.05 first. |
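Put together with the script from the question, the suggestion would look roughly like the sketch below. It assumes xpath.ahk v2.05 or later sits next to the script and that test.xml is the file posted above; the variable name partnr is arbitrary.

```autohotkey
#Include xpath.ahk
File = test.xml
xpath_load(xml, File) ; load the XML document from the question
; @P_ARTICLE_PARTNR addresses the attribute on the first <part> node,
; and text() asks for its value rather than the node itself
partnr := xpath(xml, "/partsmanagement/part[1]/@P_ARTICLE_PARTNR/text()")
MsgBox, %partnr%
```

The original expression, /partsmanagement/part[1]/text(), asks for the content of the part element itself, whereas the part number is stored as an attribute on it, hence the @ step.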
|
|
 |
Andreone
Joined: 20 Jul 2007 Posts: 257 Location: Paris, France
|
Posted: Wed Nov 07, 2007 3:10 pm Post subject: |
|
|
@Titan: could you post a message when you release a new version, so we don't miss it? (And maybe briefly indicate what changed since the previous release.)
Thank you |
|