Simple HTML Table Parse

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
mattjgale
27 Oct 2017, 16:13

I'm trying to scrape some locally saved HTML files and output the results to a spreadsheet, but I'm having some difficulties, so I'm hoping I can get a bit of support.

For debugging and learning purposes I've simply been trying to scrape the file and display the information; the plan was then to work through an array and save the data to a spreadsheet. Before getting to that point, though, I need a working scrape, which I've been having difficulty with. I have several iterations of code, but I have mainly been experimenting with a ComObj. After some research I understand that RegEx may work well, but I don't fully understand how it works or whether it would be the best solution. My most recent code is:

Code:

wb := ComObjCreate( "InternetExplorer.Application" )
	wb.Navigate("C:\Users\Matt\OneDrive\AHK\a.html")	
	wb.Visible := True

sleep 500

	table := IE.document.getElementById("MsoNormalTable").Rows
	rows  := table.Length
	msgbox %rows%
	cols  := table[0].getElementsByTagName("TD").Length
	msgbox %cols%
	msg =
	Loop, % rows
	{
		msg .= (i := A_Index-1) ? "`n" : ""
		Loop, % cols
			msg .= (A_Index = 1 ? "" : "`t") table[i].getElementsByTagName("TD")[A_Index-1].innerText
	}
	MsgBox, % msg
This opens the HTML file I have saved locally but doesn't parse any information; the message box is blank.

For reference, the HTML file is:

Code:

<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
xmlns:m="http://schemas.microsoft.com/office/2004/12/omml"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 15">
<meta name=Originator content="Microsoft Word 15">
<link rel=File-List href="b_files/filelist.xml">

<link rel=dataStoreItem href="b_files/item0001.xml"
target="b_files/props002.xml">
<link rel=themeData href="b_files/themedata.thmx">
<link rel=colorSchemeMapping href="b_files/colorschememapping.xml">
</style>
</head>

<body lang=EN-US link=blue vlink=purple style='tab-interval:.5in'>

<div class=WordSection1>

<p class=MsoNormal><b style='mso-bidi-font-weight:normal'><span
style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-font-family:"Times New Roman"'><o:p>&nbsp;</o:p></span></b></p>

<table class=MsoNormalTable border=1 cellspacing=0 cellpadding=0 width=721
 style='border-collapse:collapse;mso-table-layout-alt:fixed;border:none;
 mso-border-alt:solid windowtext .5pt;mso-padding-alt:0in 5.4pt 0in 5.4pt;
 mso-border-insideh:.5pt solid windowtext;mso-border-insidev:.5pt solid windowtext'>
 <tr style='mso-yfti-irow:0;mso-yfti-firstrow:yes;height:.2in;mso-height-rule:
  exactly'>
  <td width=721 colspan=9 style='width:541.1pt;border:solid windowtext 1.0pt;
  border-bottom:solid #BFBFBF 1.0pt;mso-border-alt:solid windowtext .5pt;
  mso-border-bottom-alt:solid #BFBFBF .5pt;background:black;padding:0in 5.4pt 0in 5.4pt;
  height:.2in;mso-height-rule:exactly'>
  <h3><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;color:#F79646'>Tracking
  Info <o:p></o:p></span></h3>
  </td>
 </tr>
 <tr style='mso-yfti-irow:1;height:26.5pt'>
  <td width=81 valign=bottom style='width:61.0pt;border:solid #BFBFBF 1.0pt;
  border-top:none;mso-border-top-alt:solid #BFBFBF .5pt;mso-border-alt:solid #BFBFBF .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:26.5pt'>
  <p class=MsoBodyText style='line-height:115%'><b style='mso-bidi-font-weight:
  normal'><span style='font-size:11.0pt;line-height:115%;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-font-family:
  "Times New Roman"'>JOB ORDER #: </span></b><b style='mso-bidi-font-weight:
  normal'><span style='font-size:11.0pt;line-height:115%;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin'><o:p></o:p></span></b></p>
  </td>
  <td width=93 valign=bottom style='width:70.05pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:26.5pt'>
  <p class=MsoBodyText style='line-height:115%'><span style='font-size:11.0pt;
  line-height:115%;font-family:"Calibri",sans-serif;mso-ascii-theme-font:minor-latin;
  mso-hansi-theme-font:minor-latin'>MG81711<o:p></o:p></span></p>
  </td>
  <td width=114 colspan=2 valign=bottom style='width:85.5pt;border-top:none;
  border-left:none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:26.5pt'>
  <p class=MsoBodyText style='line-height:115%'><b style='mso-bidi-font-weight:
  normal'><span style='font-size:11.0pt;line-height:115%;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-font-family:
  "Times New Roman"'>TIER RATING: </span></b><b style='mso-bidi-font-weight:
  normal'><span style='font-size:11.0pt;line-height:115%;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin'><o:p></o:p></span></b></p>
  </td>
  <td width=132 colspan=2 valign=bottom style='width:99.0pt;border-top:none;
  border-left:none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:26.5pt'>
  <p class=MsoBodyText style='line-height:115%'><span style='font-size:11.0pt;
  line-height:115%;font-family:"Calibri",sans-serif;mso-ascii-theme-font:minor-latin;
  mso-hansi-theme-font:minor-latin'>1<o:p></o:p></span></p>
  </td>
  <td width=234 colspan=2 valign=bottom style='width:175.5pt;border-top:none;
  border-left:none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:26.5pt'>
  <p class=MsoBodyText style='line-height:115%'><b style='mso-bidi-font-weight:
  normal'><span style='font-size:11.0pt;line-height:115%;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-font-family:
  "Times New Roman"'>AD COPY REQUESTED: Y/N: </span></b><b style='mso-bidi-font-weight:
  normal'><span style='font-size:11.0pt;line-height:115%;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin'><o:p></o:p></span></b></p>
  </td>
  <td width=67 valign=bottom style='width:50.05pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:26.5pt'>
  <p class=MsoBodyText style='line-height:115%'><span style='font-size:11.0pt;
  line-height:115%;font-family:"Calibri",sans-serif;mso-ascii-theme-font:minor-latin;
  mso-hansi-theme-font:minor-latin'>N<o:p></o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:2;height:.2in;mso-height-rule:exactly'>
  <td width=721 colspan=9 style='width:541.1pt;border-top:none;border-left:
  solid windowtext 1.0pt;border-bottom:solid #BFBFBF 1.0pt;border-right:solid windowtext 1.0pt;
  mso-border-top-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;
  mso-border-bottom-alt:solid #BFBFBF .5pt;background:black;padding:0in 5.4pt 0in 5.4pt;
  height:.2in;mso-height-rule:exactly'>
  <h3><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;color:#F79646'>Engagement
  Info <o:p></o:p></span></h3>
  </td>
 </tr>
 <tr style='mso-yfti-irow:3;height:.4in'>
  <td width=205 colspan=3 style='width:153.55pt;border:solid #BFBFBF 1.0pt;
  border-top:none;mso-border-top-alt:solid #BFBFBF .5pt;mso-border-alt:solid #BFBFBF .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:.4in'>
  <p class=MsoBodyText><b style='mso-bidi-font-weight:normal'><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin'>OPEN ORDER <o:p></o:p></span></b></p>
  </td>
  <td width=108 colspan=2 style='width:81.0pt;border-top:none;border-left:none;
  border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:.4in'>
  <p class=FieldText><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;font-weight:
  normal'>No<o:p></o:p></span></p>
  </td>
  <td width=192 colspan=2 style='width:2.0in;border-top:none;border-left:none;
  border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:.4in'>
  <p class=MsoBodyText><b style='mso-bidi-font-weight:normal'><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin'>CONTACT RECRUITER 1ST<o:p></o:p></span></b></p>
  </td>
  <td width=217 colspan=2 style='width:162.55pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:.4in'>
  <p class=FieldText><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;font-weight:
  normal'>Yes<o:p></o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:4;height:.2in;mso-height-rule:exactly'>
  <td width=721 colspan=9 style='width:541.1pt;border-top:none;border-left:
  solid windowtext 1.0pt;border-bottom:solid #BFBFBF 1.0pt;border-right:solid windowtext 1.0pt;
  mso-border-top-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;
  mso-border-bottom-alt:solid #BFBFBF .5pt;background:black;padding:0in 5.4pt 0in 5.4pt;
  height:.2in;mso-height-rule:exactly'>
  <h3><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;color:#F79646'>Company
  Info <o:p></o:p></span></h3>
  </td>
 </tr>
 <tr style='mso-yfti-irow:5;height:21.9pt'>
  <td width=205 colspan=3 style='width:153.55pt;border:solid #BFBFBF 1.0pt;
  border-top:none;mso-border-top-alt:solid #BFBFBF .5pt;mso-border-alt:solid #BFBFBF .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoBodyText><b style='mso-bidi-font-weight:normal'><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin'>JOB TITLE<o:p></o:p></span></b></p>
  </td>
  <td width=517 colspan=6 style='width:387.55pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-font-family:
  "Times New Roman"'>Example Job Title<o:p></o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:6;height:21.9pt'>
  <td width=205 colspan=3 style='width:153.55pt;border:solid #BFBFBF 1.0pt;
  border-top:none;mso-border-top-alt:solid #BFBFBF .5pt;mso-border-alt:solid #BFBFBF .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoBodyText><b style='mso-bidi-font-weight:normal'><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin'>COMPANY NAME<o:p></o:p></span></b></p>
  </td>
  <td width=517 colspan=6 style='width:387.55pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin'>Example
  Company Name<o:p></o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:7;height:21.9pt'>
  <td width=205 colspan=3 style='width:153.55pt;border:solid #BFBFBF 1.0pt;
  border-top:none;mso-border-top-alt:solid #BFBFBF .5pt;mso-border-alt:solid #BFBFBF .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoBodyText><b style='mso-bidi-font-weight:normal'><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin'>ADDRESS<o:p></o:p></span></b></p>
  </td>
  <td width=517 colspan=6 style='width:387.55pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>Example
  Address</span><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin'><o:p></o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:8;height:21.9pt'>
  <td width=205 colspan=3 style='width:153.55pt;border:solid #BFBFBF 1.0pt;
  border-top:none;mso-border-top-alt:solid #BFBFBF .5pt;mso-border-alt:solid #BFBFBF .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoBodyText><b style='mso-bidi-font-weight:normal'><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin'>PHONE NUMBER<o:p></o:p></span></b></p>
  </td>
  <td width=517 colspan=6 style='width:387.55pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-fareast-font-family:Calibri'>(123) 456-7896 Direct Test Phone<o:p></o:p></span></p>
  <p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-fareast-font-family:Calibri'>(123) 456-7896 Mobile Test Phone<o:p></o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:9;height:21.9pt'>
  <td width=205 colspan=3 style='width:153.55pt;border:solid #BFBFBF 1.0pt;
  border-top:none;mso-border-top-alt:solid #BFBFBF .5pt;mso-border-alt:solid #BFBFBF .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoBodyText><b style='mso-bidi-font-weight:normal'><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin'>EMPLOYER NAME/TITLE <o:p></o:p></span></b></p>
  </td>
  <td width=517 colspan=6 style='width:387.55pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=FieldText><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-fareast-font-family:Calibri;font-weight:normal'>Example Employer Name, CEO</span><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-fareast-font-family:Calibri;mso-hansi-theme-font:minor-latin;
  font-weight:normal'><o:p></o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:10;height:21.9pt'>
  <td width=205 colspan=3 style='width:153.55pt;border:solid #BFBFBF 1.0pt;
  border-top:none;mso-border-top-alt:solid #BFBFBF .5pt;mso-border-alt:solid #BFBFBF .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoBodyText><b style='mso-bidi-font-weight:normal'><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin'>EMPLOYER CONTACT INFO<o:p></o:p></span></b></p>
  </td>
  <td width=517 colspan=6 style='width:387.55pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoNormal><a href="mailto:[email protected]"><span style='font-size:
  11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:minor-latin;
  mso-hansi-theme-font:minor-latin;mso-bidi-font-family:"Times New Roman"'>example</span></a><span
  class=MsoHyperlink><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-font-family:
  "Times New Roman"'>@email.com</span></span><span style='font-size:11.0pt;
  font-family:"Calibri",sans-serif;mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:
  minor-latin;mso-bidi-font-family:"Times New Roman"'><o:p></o:p></span></p>
  <p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-font-family:
  "Times New Roman"'><o:p>&nbsp;</o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:11;height:21.9pt'>
  <td width=205 colspan=3 style='width:153.55pt;border:solid #BFBFBF 1.0pt;
  border-top:none;mso-border-top-alt:solid #BFBFBF .5pt;mso-border-alt:solid #BFBFBF .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoBodyText><b style='mso-bidi-font-weight:normal'><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin'>COMPANY DESCRIPTION<o:p></o:p></span></b></p>
  </td>
  <td width=517 colspan=6 style='width:387.55pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-font-family:
  "Times New Roman";color:#373737'>This is an example company description.<o:p></o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:12;height:21.9pt'>
  <td width=205 colspan=3 style='width:153.55pt;border:solid #BFBFBF 1.0pt;
  border-top:none;mso-border-top-alt:solid #BFBFBF .5pt;mso-border-alt:solid #BFBFBF .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoBodyText><b style='mso-bidi-font-weight:normal'><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin'># OF EMPLOYEES/LOCATIONS<o:p></o:p></span></b></p>
  </td>
  <td width=517 colspan=6 style='width:387.55pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=FieldText><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;font-weight:
  normal'>16, growing<o:p></o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:13;height:21.9pt'>
  <td width=205 colspan=3 style='width:153.55pt;border:solid #BFBFBF 1.0pt;
  border-top:none;mso-border-top-alt:solid #BFBFBF .5pt;mso-border-alt:solid #BFBFBF .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=MsoBodyText><b style='mso-bidi-font-weight:normal'><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin'>CULTURE<o:p></o:p></span></b></p>
  </td>
  <td width=517 colspan=6 style='width:387.55pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:21.9pt'>
  <p class=FieldText><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  font-weight:normal'>Creative, passion for team growth</span><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin;font-weight:normal'><o:p></o:p></span></p>
  </td>
 </tr>
 <tr style='mso-yfti-irow:14;height:.2in;mso-height-rule:exactly'>
  <td width=721 colspan=9 style='width:541.1pt;border-top:none;border-left:
  solid windowtext 1.0pt;border-bottom:solid #BFBFBF 1.0pt;border-right:solid windowtext 1.0pt;
  mso-border-top-alt:solid windowtext .5pt;mso-border-alt:solid windowtext .5pt;
  mso-border-bottom-alt:solid #BFBFBF .5pt;background:black;padding:0in 5.4pt 0in 5.4pt;
  height:.2in;mso-height-rule:exactly'>
  <h3><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;color:#E36C0A'>Job
  Description<o:p></o:p></span></h3>
  </td>
 </tr>
 <tr style='mso-yfti-irow:15;mso-yfti-lastrow:yes;height:27.85pt'>
  <td width=205 colspan=3 style='width:153.55pt;border:solid #BFBFBF 1.0pt;
  border-top:none;mso-border-top-alt:solid #BFBFBF .5pt;mso-border-alt:solid #BFBFBF .5pt;
  padding:0in 5.4pt 0in 5.4pt;height:27.85pt'>
  <p class=MsoBodyText><b style='mso-bidi-font-weight:normal'><span
  style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
  minor-latin;mso-hansi-theme-font:minor-latin'>JOB DESCRIPTION<o:p></o:p></span></b></p>
  </td>
  <td width=517 colspan=6 style='width:387.55pt;border-top:none;border-left:
  none;border-bottom:solid #BFBFBF 1.0pt;border-right:solid #BFBFBF 1.0pt;
  mso-border-top-alt:solid #BFBFBF .5pt;mso-border-left-alt:solid #BFBFBF .5pt;
  mso-border-alt:solid #BFBFBF .5pt;padding:0in 5.4pt 0in 5.4pt;height:27.85pt'>
  <p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;
  mso-ascii-theme-font:minor-latin;mso-hansi-theme-font:minor-latin;mso-bidi-font-family:
  "Times New Roman"'>Example Job Description, Example Job Description, Example
  Job Description Example Job Description Example Job Description Example Job
  Description Example Job Description Example Job Description Example Job
  Description Example Job Description Example Job Description Example Job
  Description Example Job Description Example Job Description Example Job
  Description Example Job Description Example Job Description Example Job
  Description<o:p></o:p></span></p>
  </td>
 </tr>
 <![if !supportMisalignedColumns]>
 <tr height=0>
  <td width=81 style='border:none'></td>
  <td width=93 style='border:none'></td>
  <td width=30 style='border:none'></td>
  <td width=84 style='border:none'></td>
  <td width=24 style='border:none'></td>
  <td width=108 style='border:none'></td>
  <td width=84 style='border:none'></td>
  <td width=150 style='border:none'></td>
  <td width=67 style='border:none'></td>
 </tr>
 <![endif]>
</table>

<p class=MsoNormal><b style='mso-bidi-font-weight:normal'><span
style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-ascii-theme-font:
minor-latin;mso-hansi-theme-font:minor-latin'><o:p>&nbsp;</o:p></span></b></p>

<p class=MsoNormal align=center style='margin-bottom:1.0pt;text-align:center'><b
style='mso-bidi-font-weight:normal'><span style='font-size:11.0pt;font-family:
"Calibri",sans-serif;mso-ascii-theme-font:minor-latin;mso-fareast-font-family:
Calibri;mso-hansi-theme-font:minor-latin;mso-bidi-font-family:"Times New Roman"'><o:p>&nbsp;</o:p></span></b></p>

</div>

</body>

</html>


Could someone either help me solve this or point me in the right direction on what I'm missing? I greatly appreciate any help provided!
teadrinker
27 Oct 2017, 18:01

Hi, mattjgale,

The Internet Explorer object has no access to local files for security reasons. You could use the htmlfile object instead.
mattjgale wrote:

Code:

table := IE.document.getElementById("MsoNormalTable").Rows
Your HTML has no element with that id; the table has "MsoNormalTable" as its class instead. (Note also that the script stores the browser object in wb, so IE is never assigned.)

Code:

<table class=MsoNormalTable border=1 cellspacing=0 cellpadding=0 width=721 ...
A class is not the same as an id. Try this code:

Code:

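; Read the saved page into a variable instead of loading it in a browser.
FileRead, html, C:\Users\Matt\OneDrive\AHK\a.html
oDoc := ComObjCreate("htmlfile")
; Switch the document to IE9 mode so that getElementsByClassName is available.
oDoc.write("<meta http-equiv=""X-UA-Compatible"" content=""IE=9"">")
oDoc.write(html)
; Select the table by its class name; the markup has no id attribute.
table := oDoc.getElementsByClassName("MsoNormalTable")[0]
rows := table.rows
Loop % rows.length  {
   cells := rows[A_Index - 1].cells   ; DOM collections are zero-based
   Loop % cells.length
      MsgBox, % cells[A_Index - 1].innerText
}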
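
Because getElementsByClassName returns a collection, a file that lacks the class yields a collection of length 0, and reading .rows from its first item then fails. For files in a different format, a defensive check might look like the sketch below. The message texts are placeholders, and the path is the one from the question.

```autohotkey
; Sketch only: confirm the file was read and the table exists before parsing.
FileRead, html, C:\Users\Matt\OneDrive\AHK\a.html
if ErrorLevel   ; FileRead sets ErrorLevel when the file cannot be read
{
   MsgBox, Could not read the file.
   ExitApp
}
oDoc := ComObjCreate("htmlfile")
oDoc.write("<meta http-equiv=""X-UA-Compatible"" content=""IE=9"">")
oDoc.write(html)
tables := oDoc.getElementsByClassName("MsoNormalTable")
if (tables.length = 0)
   MsgBox, No table with class MsoNormalTable was found.
else
   MsgBox, % "Found " . tables.length . " table(s); the first has " . tables[0].rows.length . " rows."
```

The same length test can be applied to any other class name that the differently formatted files may use.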
mattjgale
28 Oct 2017, 17:10

This is fantastic, thank you so much!

I'm going through and noticing that this is pulling the cells on the right and not the "headers" on the left. Some of the files I have are in a somewhat different format, so I was planning to loop through and check for headers to assign variables, so that I can keep the same structure when I export to an Excel file.

Any pointers on which part of the code would need to be tweaked to cycle through all fields instead of only the ones on the right?

Once again, I really appreciate the help so far!
teadrinker
29 Oct 2017, 06:14

Glad to help! :)
mattjgale wrote:
I'm going through and noticing that this is pulling the cells on the right and not the "headers" on the left.
I'm not sure what you mean by that. Your table has no such items as "headers"; all items are cells. My code visits all cells consecutively, from left to right and from top to bottom, and no cell is missed.

The script returns:
Tracking Info —> JOB ORDER #: —> MG81711 —> TIER RATING: —> 1 —> AD COPY REQUESTED: Y/N: —> N —> Engagement Info —> OPEN ORDER —> No —> CONTACT RECRUITER 1ST —> Yes —> Company Info —> JOB TITLE —> Example Job Title ...
and so on.
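
If the aim is to keep each label next to its value when exporting, one possible approach is to treat the cells of each row as label/value pairs. The sketch below assumes the layout of the a.html shown in the question (section titles are single-cell rows, data rows hold an even number of cells). The file name out.tsv and the variable names are hypothetical, and files in another format would likely need different rules.

```autohotkey
; Sketch only: pair each label cell with the value cell that follows it.
; Assumes the a.html layout from the question; out.tsv is a hypothetical output file.
FileRead, html, C:\Users\Matt\OneDrive\AHK\a.html
oDoc := ComObjCreate("htmlfile")
oDoc.write("<meta http-equiv=""X-UA-Compatible"" content=""IE=9"">")
oDoc.write(html)
rows := oDoc.getElementsByClassName("MsoNormalTable")[0].rows

out := ""
Loop % rows.length
{
   cells := rows[A_Index - 1].cells
   count := cells.length
   ; Skip section titles such as "Tracking Info" (one cell spanning the row) and
   ; any row with an odd cell count, such as the sizing row of empty cells at the
   ; end of the table, if the parser keeps it.
   if (count < 2 || Mod(count, 2))
      continue
   Loop % count // 2
   {
      label := Trim(cells[2 * A_Index - 2].innerText, " `t`r`n")
      value := Trim(cells[2 * A_Index - 1].innerText, " `t`r`n")
      ; Cells with several paragraphs (e.g. PHONE NUMBER) contain line breaks; flatten them.
      value := RegExReplace(value, "\R+", " | ")
      out .= label . "`t" . value . "`n"
   }
}
FileAppend, %out%, %A_ScriptDir%\out.tsv
```

FileAppend adds to an existing file, so out.tsv should be deleted between runs. Tab-separated text of this kind can then be opened or imported in Excel.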
