Looking for: Parse rows from HTML table
AutoHotkey Community Forum Index -> Ask for Help
randallf



Joined: 06 Jul 2009
Posts: 678

Posted: Mon Feb 01, 2010 2:25 pm    Post subject: Looking for: Parse rows from HTML table

Hello, I have data in an HTML table format that I'd like to turn back into rows. Does anyone have anything that will parse this back out for me?

Code:
<TABLE>
<TR>
  <TD><tr valign="top" bgcolor="#E0F1FF"><td nowrap><img src="/icons/ecblank.gif" border="0" height="16" width="1" alt=""></td><td colspan="5" nowrap><a href="/survey_t3hwebsite.nsf/*1-SurveysGroupTrendHC?OpenPage&Start=1&Count=30&Collapse=1.1" target="_self"><img src="/icons/collapse.gif" border="0" height="16" width="16" alt="Hide details for T3h Customer / 901094"></a><font size="2" color="#0000ff" face="Arial">T3h Customer / 901094</font></td><td nowrap align="center"><b><font size="2" color="#000000" face="Arial">1403</font></b></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.7</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap><font size="2" face="Arial"></font></td><td><img src="/icons/ecblank.gif" border="0" height="16" width="1" alt=""></td></tr></TD>
</TR>

</TABLE>
jethrow



Joined: 24 May 2009
Posts: 1907
Location: Iowa, USA

Posted: Mon Feb 01, 2010 5:40 pm

I'm not sure exactly what you're wanting to do, but here is a script that will parse all the elements that have innerText (AHKL & COM_L):
Code:
html = ; put html from above here

doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % doc.all.length
   If ( text:=doc.all[ A_Index-1 ].innerText ) <> ""
      MsgBox, %text%

_________________
:D - in case I forgot to smile
Basic Webpage Controls
COM Object Reference
randallf



Joined: 06 Jul 2009
Posts: 678

Posted: Mon Feb 01, 2010 5:54 pm

Thank you for the reply,

To clarify: what I am looking for is something that will turn the above, which is a single table row, into a tab-delimited row of the same data.
jethrow



Joined: 24 May 2009
Posts: 1907
Location: Iowa, USA

Posted: Tue Feb 02, 2010 3:27 am

Hmmm, seems your question is half HTML, half AHK. Being that this is an AHK forum, could you provide the html, or an example of the html, you want it changed to? Then someone could probably provide a script that would change it ;)
randallf



Joined: 06 Jul 2009
Posts: 678

Posted: Tue Feb 02, 2010 2:03 pm

jethrow wrote:
Hmmm, seems your question is half HTML, half AHK. Being that this is an AHK forum, could you provide the html, or an example of the html, you want it changed to? Then someone could probably provide a script that would change it ;)


I am looking to turn it into plain text.
Guest






Posted: Tue Feb 02, 2010 2:33 pm

Something like this?
Code:
html =
(
; put html from above here
)

row := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( td:=doc.all.tags( "td" ) ).length
   If ( text:=td[ A_Index-1 ].innerText ) <> ""
      row .= text "`t"
MsgBox, % RTrim( row )
randallf



Joined: 06 Jul 2009
Posts: 678

Posted: Tue Feb 02, 2010 4:07 pm

Code:
#include E:\FSROOT\FILES\SCRIPTS\AutoHotKey\MODULES\com.ahk

html =
(
<TABLE>
<TR>
  <TD><tr valign="top" bgcolor="#E0F1FF"><td nowrap><img src="/icons/ecblank.gif" border="0" height="16" width="1" alt=""></td><td colspan="5" nowrap><a href="/survey_t3hwebsite.nsf/*1-SurveysGroupTrendHC?OpenPage&Start=1&Count=30&Collapse=1.1" target="_self"><img src="/icons/collapse.gif" border="0" height="16" width="16" alt="Hide details for T3h Customer / 901094"></a><font size="2" color="#0000ff" face="Arial">T3h Customer / 901094</font></td><td nowrap align="center"><b><font size="2" color="#000000" face="Arial">1403</font></b></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.7</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">4.6</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap align="center"><font size="2" color="#000000" face="Arial">0.0</font></td><td nowrap><font size="2" face="Arial"></font></td><td><img src="/icons/ecblank.gif" border="0" height="16" width="1" alt=""></td></tr></TD>
</TR>

</TABLE>
)

row := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( td:=doc.all.tags( "td" ) ).length
   If ( text:=td[ A_Index-1 ].innerText ) <> ""
      row .= text "`t"
MsgBox, % RTrim( row )


Doesn't work, I seem to get COM errors...

Code:
Function Name:    "write"
ERROR: No COM Dispatch object!
   ()

Will Continue?
[YES] [NO]

Code:

Function Name:    "all"
ERROR: No COM Dispatch object!
   ()

Will Continue?
[YES] [NO]
tank



Joined: 21 Dec 2007
Posts: 3700
Location: Louisville KY USA

Posted: Tue Feb 02, 2010 9:23 pm

Code:
com_init()
;)
_________________

We are troubled on every side, yet not distressed; we are perplexed,
but not in despair; Persecuted, but not forsaken; cast down, but not destroyed;
randallf



Joined: 06 Jul 2009
Posts: 678

Posted: Tue Feb 02, 2010 10:44 pm

Thank you! But sorry, tank, all I knew to add was that one line D: Unfortunately, I don't have a working COM script to plug this into at this point...

Code:
com_init()
row := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( td:=doc.all.tags( "td" ) ).length
   If ( text:=td[ A_Index-1 ].innerText ) <> ""
      row .= text "`t"
MsgBox, % RTrim( row )
jethrow



Joined: 24 May 2009
Posts: 1907
Location: Iowa, USA

Posted: Tue Feb 02, 2010 11:49 pm

jethrow wrote:
I'm not sure exactly what you're wanting to do, but here is a script that will parse all the elements that have innerText (AHKL & COM_L)

Are you using AHKL & COM_L?
tank



Joined: 21 Dec 2007
Posts: 3700
Location: Louisville KY USA

Posted: Wed Feb 03, 2010 1:10 am

jethrow wrote:
jethrow wrote:
I'm not sure exactly what you're wanting to do, but here is a script that will parse all the elements that have innerText (AHKL & COM_L)

Are you using AHKL & COM_L?

Whoops, I didn't even look at the syntax he was using. You're right, jethrow, he shouldn't need COM_Init(). I am betting he has the wrong version of COM_L.
Guest






Posted: Wed Feb 03, 2010 5:21 am

I will search and update and try again tomorrow.
randallf



Joined: 06 Jul 2009
Posts: 678

Posted: Mon Feb 08, 2010 4:17 pm

This is working really awesome, thank you!

Is there a way to make it preserve the table lines?

For example, right now I get output that is all on one line, such as:

Code:
Person1 10 20 30 40 50 Person2 10 20 30 40 50 Person3 10 20 30 40 50 etc


Is there a way to make it return it with the rows intact?

Code:
Person1 10 20 30 40 50
Person2 10 20 30 40 50
Person3 10 20 30 40 50
etc


I can possibly parse the data out based on types in this case, but for the future it'd be really handy to have it preserve the rows.

This does not work:

Code:
row := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( td:=doc.all.tags( "td" ) ).length
  If ( text:=td[ A_Index-1 ].innerText ) <> ""
      row .= text "`t"
  If ( text:=td[ A_Index-1 ].innerText ) </TR> ""
      row .= text "`n"
MsgBox, % RTrim( row )


Thank you again!!!!
sinkfaze



Joined: 18 Mar 2008
Posts: 5043
Location: the tunnel(?=light)

Posted: Mon Feb 08, 2010 4:30 pm

You'll have to change it based upon your parsing criteria, but in this case:

Code:
row := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( td:=doc.all.tags( "td" ) ).length
   If ( text:=td[ A_Index-1 ].innerText ) <> ""
      row .= (!row ? "" : InStr(text,"Person") ? "`n" : "`t") text
MsgBox, % RTrim( row )

_________________
Try Quick Search for Autohotkey or see the tutorial for newbies.
randallf



Joined: 06 Jul 2009
Posts: 678

Posted: Mon Feb 08, 2010 4:36 pm

sinkfaze wrote:
You'll have to change it based upon your parsing criteria, but in this case:

Code:
row := ""
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )
Loop, % ( td:=doc.all.tags( "td" ) ).length
   If ( text:=td[ A_Index-1 ].innerText ) <> ""
      row .= (!row ? "" : InStr(text,"Person") ? "`n" : "`t") text
MsgBox, % RTrim( row )


Thank you for the example; however, the data isn't really structured like that: Person1, Person2, etc. could be any name. Is there any way to make it just insert an `n whenever it finds </TR>?

Sorry to ask for so much, but I'm struggling to understand what's going on with this code.
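
One possible way to approach the question above, offered as a minimal sketch rather than a tested answer: instead of looking for </TR> in the text, loop over the tr elements and read each row's own cells collection, so the row boundary comes from the parsed document itself. This assumes AHKL & COM_L as in the earlier posts, and that html already holds a well-formed table in which every tr directly contains its td cells. Markup like the sample in the first post, where a tr sits inside a TD, may be parsed differently. The variable names out, trs, cells and line are made up for illustration.

```autohotkey
; Sketch only: assumes AutoHotkey_L with COM_L (as in the posts above)
; and that the variable html already contains a well-formed table.
doc := COM_CreateObject( "HTMLfile" )
doc.write( html )

out := ""
trs := doc.all.tags( "tr" )                 ; every row element in the document
Loop, % trs.length
{
   cells := trs[ A_Index-1 ].cells          ; only the cells that belong to this row
   line := ""
   Loop, % cells.length
      If ( text:=cells[ A_Index-1 ].innerText ) <> ""
         line .= ( line = "" ? "" : "`t" ) text   ; tab between cells, none in front
   If ( line <> "" )
      out .= line "`n"                      ; newline marks the end of the row
}
MsgBox, % RTrim( out, "`n" )
```

Inside the inner Loop, A_Index refers to the inner loop's counter, which is why the row's cells are fetched into a variable before that loop starts. Empty cells are skipped, as in the earlier scripts; drop the innerText check if the column positions need to stay aligned.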