randallf
Joined: 06 Jul 2009 Posts: 678
|
Posted: Mon Feb 01, 2010 2:25 pm Post subject: Looking for: Parse rows from HTML table |
|
|
Hello, I have data in HTML table format that I'd like to turn back into plain rows. Does anyone have something that will parse it back out for me?
| Code: | <TABLE>
<TR>
<TD><tr valign="top" bgcolor="#E0F1FF"><td nowrap><img src="/icons/ecblank.gif" border="0" height="16" width="1" alt=""></td><td colspan="5" nowrap><a href="/survey_t3hwebsite.nsf/*1-SurveysGroupTrendHC?OpenPage&Start=1&Count=30&Collapse=1.1" target="_self"><img src="/icons/collapse.gif" border="0" height="16" width="16" alt="Hide details for T3h Customer / 901094"></a><font size="2" color="#0000ff" face="Arial">T3h Customer / 901094</font></td><td nowrap align="center"><b><font size="2" color="#000000" face="Arial">1403</font></b></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.7</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap><font size="2" face="Arial"></font></td><td><img src="/icons/ecblank.gif" border="0" height="16" width="1" alt=""></td></tr></TD>
</TR>
</TABLE> |
|
|
jethrow
Joined: 24 May 2009 Posts: 1907 Location: Iowa, USA
|
Posted: Mon Feb 01, 2010 5:40 pm Post subject: |
|
|
I'm not sure exactly what you're wanting to do, but here is a script that will parse all the elements that have innerText (AHKL & COM_L): | Code: | html = ; put html from above here
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % doc.all.length
If ( text:=doc.all[ A_Index-1 ].innerText ) <> ""
MsgBox, %text% |
_________________
- in case I forgot to smile
Basic Webpage Controls
COM Object Reference |
|
randallf
Joined: 06 Jul 2009 Posts: 678
|
Posted: Mon Feb 01, 2010 5:54 pm Post subject: |
|
|
Thank you for the reply,
I shall clarify, what I am looking for is something that will turn the above, which is a single table row, into a tab delimited row of the same data. |
|
jethrow
Joined: 24 May 2009 Posts: 1907 Location: Iowa, USA
|
Posted: Tue Feb 02, 2010 3:27 am Post subject: |
|
|
Hmmm, seems your question is half HTML, half AHK. Since this is an AHK forum, could you provide the HTML, or an example of the HTML, you want it changed to? Then someone could probably provide a script that would change it. |
|
randallf
Joined: 06 Jul 2009 Posts: 678
|
Posted: Tue Feb 02, 2010 2:03 pm Post subject: |
|
|
| jethrow wrote: | Hmmm, seems your question is half HTML, half AHK. Since this is an AHK forum, could you provide the HTML, or an example of the HTML, you want it changed to? Then someone could probably provide a script that would change it. |
I am looking to turn it into plain text. |
|
Guest
|
Posted: Tue Feb 02, 2010 2:33 pm Post subject: |
|
|
Something like this? | Code: | html =
(
; put html from above here
)
row := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( td:=doc.all.tags( "td" ) ).length
If ( text:=td[ A_Index-1 ].innerText ) <> ""
row .= text "`t"
MsgBox, % RTrim( row ) |
|
|
randallf
Joined: 06 Jul 2009 Posts: 678
|
Posted: Tue Feb 02, 2010 4:07 pm Post subject: |
|
|
| Code: | #include E:\FSROOT\FILES\SCRIPTS\AutoHotKey\MODULES\com.ahk
html =
(
<TABLE>
<TR>
<TD><tr valign="top" bgcolor="#E0F1FF"><td nowrap><img src="/icons/ecblank.gif" border="0" height="16" width="1" alt=""></td><td colspan="5" nowrap><a href="/survey_t3hwebsite.nsf/*1-SurveysGroupTrendHC?OpenPage&Start=1&Count=30&Collapse=1.1" target="_self"><img src="/icons/collapse.gif" border="0" height="16" width="16" alt="Hide details for T3h Customer / 901094"></a><font size="2" color="#0000ff" face="Arial">T3h Customer / 901094</font></td><td nowrap align="center"><b><font size="2" color="#000000" face="Arial">1403</font></b></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.7</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap><font size="2" face="Arial"></font></td><td><img src="/icons/ecblank.gif" border="0" height="16" width="1" alt=""></td></tr></TD>
</TR>
</TABLE>
)
row := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( td:=doc.all.tags( "td" ) ).length
If ( text:=td[ A_Index-1 ].innerText ) <> ""
row .= text "`t"
MsgBox, % RTrim( row ) |
Doesn't work, I seem to get COM errors...
| Code: | Function Name: "write"
ERROR: No COM Dispatch object!
()
Will Continue?
[YES] [NO] |
| Code: |
Function Name: "all"
ERROR: No COM Dispatch object!
()
Will Continue?
[YES] [NO] |
|
|
tank
Joined: 21 Dec 2007 Posts: 3700 Location: Louisville KY USA
|
Posted: Tue Feb 02, 2010 9:23 pm Post subject: |
|
|
 _________________
We are troubled on every side, yet not distressed; we are perplexed,
but not in despair; Persecuted, but not forsaken; cast down, but not destroyed; |
|
randallf
Joined: 06 Jul 2009 Posts: 678
|
Posted: Tue Feb 02, 2010 10:44 pm Post subject: |
|
|
Thank you! But sorry, Tank, all I knew to add was that one line. Unfortunately, I don't have a working COM script that I'm plugging this into at this point...
| Code: | com_init()
row := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( td:=doc.all.tags( "td" ) ).length
If ( text:=td[ A_Index-1 ].innerText ) <> ""
row .= text "`t"
MsgBox, % RTrim( row ) |
|
|
jethrow
Joined: 24 May 2009 Posts: 1907 Location: Iowa, USA
|
Posted: Tue Feb 02, 2010 11:49 pm Post subject: |
|
|
| jethrow wrote: | | I'm not sure exactly what you're wanting to do, but here is a script that will parse all the elements that have innerText (AHKL & COM_L) |
Are you using AHKL & COM_L? |
|
tank
Joined: 21 Dec 2007 Posts: 3700 Location: Louisville KY USA
|
Posted: Wed Feb 03, 2010 1:10 am Post subject: |
|
|
| jethrow wrote: | | jethrow wrote: | | I'm not sure exactly what you're wanting to do, but here is a script that will parse all the elements that have innerText (AHKL & COM_L) |
Are you using AHKL & COM_L? | Whoops, I didn't even look at the syntax he was using. You're right, jethrow, he shouldn't need COM_Init(). I'm betting he has the wrong version of COM_L. |
|
Guest
|
Posted: Wed Feb 03, 2010 5:21 am Post subject: |
|
|
| I will search and update and try again tomorrow. |
|
randallf
Joined: 06 Jul 2009 Posts: 678
|
Posted: Mon Feb 08, 2010 4:17 pm Post subject: |
|
|
This is working really awesome, thank you!
Is there a way to make it preserve the table lines?
For example right now I get output that is all one line, such as
| Code: | | Person1 10 20 30 40 50 Person2 10 20 30 40 50 Person3 10 20 30 40 50 etc |
Is there a way to make it return it with the rows intact?
| Code: | Person1 10 20 30 40 50
Person2 10 20 30 40 50
Person3 10 20 30 40 50
etc |
I can possibly parse the data out based on types in this case but for the future it'd be really handy to have it preserve the rows.
This does not work:
| Code: | row := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( td:=doc.all.tags( "td" ) ).length
If ( text:=td[ A_Index-1 ].innerText ) <> ""
row .= text "`t"
If ( text:=td[ A_Index-1 ].innerText ) </TR> ""
row .= text "`n"
MsgBox, % RTrim( row ) |
Thank you again!!!! |
|
sinkfaze
Joined: 18 Mar 2008 Posts: 5043 Location: the tunnel(?=light)
|
Posted: Mon Feb 08, 2010 4:30 pm Post subject: |
|
|
You'll have to change it based upon your parsing criteria, but in this case:
| Code: | row := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( td:=doc.all.tags( "td" ) ).length
If ( text:=td[ A_Index-1 ].innerText ) <> ""
row .= (!row ? "" : InStr(text,"Person") ? "`n" : "`t") text
MsgBox, % RTrim( row ) |
_________________ Try Quick Search for Autohotkey or see the tutorial for newbies. |
|
randallf
Joined: 06 Jul 2009 Posts: 678
|
Posted: Mon Feb 08, 2010 4:36 pm Post subject: |
|
|
| sinkfaze wrote: | You'll have to change it based upon your parsing criteria, but in this case:
| Code: | row := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( td:=doc.all.tags( "td" ) ).length
If ( text:=td[ A_Index-1 ].innerText ) <> ""
row .= (!row ? "" : InStr(text,"Person") ? "`n" : "`t") text
MsgBox, % RTrim( row ) |
|
Thank you for the example, but the data isn't really structured like that: Person1, Person2, etc. could be any name. Is there any way to make it just insert an `n whenever it finds </TR>?
Sorry to ask for so much, but I'm struggling to understand what's going on with this code. |
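One possible way to get a line break per </TR>, as an untested sketch: instead of looping over every td in the document, loop over the tr elements and, for each one, over that row's own cells collection, then append `n once the row is done. This assumes AHKL & COM_L as in the scripts above and that the variable html already holds the table markup. The names rows, cells, line and out are just illustrative. It has not been checked against the nested sample at the top of the thread, where a row sits inside a cell, so the outer row may need to be filtered out there.

```autohotkey
; Sketch only: assumes AHKL & COM_L, and that html already contains the table markup.
out := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( rows := doc.all.tags( "tr" ) ).length
{
   line  := ""
   cells := rows[ A_Index-1 ].cells          ; only the cells that belong to this row
   Loop, % cells.length                      ; inside this loop A_Index counts cells, not rows
      If ( ( text := cells[ A_Index-1 ].innerText ) <> "" )
         line .= text "`t"                   ; tab between the cells of one row
   If ( line <> "" )
      out .= RTrim( line, "`t" ) "`n"        ; newline where the row ends
}
MsgBox, % RTrim( out, "`n" )
```

The row is stored in cells before the inner loop starts because A_Index changes meaning inside the nested Loop. Rows with no text at all are skipped, which matches how the earlier scripts skip empty cells.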
|