Obtain information from html

Posted: 17 Sep 2020, 05:10
by songdg
I know that each block I need begins with "<tr align=middle >" and ends with "</tr>", and I want to extract every value inside those blocks.
The first snippet below is how I get the HTML; the second is the HTML itself:

Code: Select all

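wb := ComObjCreate("InternetExplorer.Application")
wb.Navigate(URL)    ; URL must already hold the page address
while wb.busy || wb.readyState != 4    ; wait until the page has finished loading
	Sleep, 10
HTML := wb.document.documentElement.innerHTML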

Code: Select all

<html>
<head>
    <title>xyzxyzxyzxyz</title>
    <meta http-equiv="Content-Type" content="text/html; charset=GBK">
    <link rel="stylesheet" href="/aic/skins/1//style.css" type="text/css">
</head>
<body>
<table cellspacing=0 cellpadding=0 width="750" border=0 align=center>
    <tr height="28" class="HeadMainThead">
        <td align="center">xyzxyzxyzxyz</td>
    </tr>
    <tr>
        <td valign=top align=middle>
            <table class="main_table" border="1" cellpadding="3" cellspacing="0" align=center width="750">
                <col width="15%">
                <col width="5%">
                <col width="15%">
                <col width="5%">
                <tr class="main_datalist_thead_sub">
                    <td align="center">xyzxyzxyzxyz</td>
                    <td align="center">xyzxyzxyzxyz</td>
                    <td align="center">xyzxyzxyzxyz</td>
                    <td align="center">xyzxyzxyzxyz</td>
                </tr>
                
                <tr align=middle >
                    <td align=center>xyzxyzxyzxyz</td>
                    <td align=center>2018-07-19&nbsp;</td>
                    <td align=center>&nbsp;</td>
                    <td align=center>&nbsp;</td>
                </tr>
                
                <tr align=middle >
                    <td align=center>xyzxyzxyzxyz</td>
                    <td align=center>2019-08-08&nbsp;</td>
                    <td align=center>&nbsp;</td>
                    <td align=center>&nbsp;</td>
                </tr>
                
                <tr align=middle >
                    <td align=center>xyzxyzxyzxyz</td>
                    <td align=center>2020-07-15&nbsp;</td>
                    <td align=center>&nbsp;</td>
                    <td align=center>&nbsp;</td>
                </tr>
                



            </table>
        </td>
    </tr>
</table>


</body>
</html>

Re: Obtain information in javascript

Posted: 17 Sep 2020, 08:53
by TheDewd
This is HTML, not JavaScript.

Code: Select all

#SingleInstance, Force

HTML =
(
<html>
	<head>
		<title>xyzxyzxyzxyz</title>
		<meta http-equiv="Content-Type" content="text/html; charset=GBK">
		<link rel="stylesheet" href="/aic/skins/1//style.css" type="text/css">
	</head>
	<body>
		<table cellspacing=0 cellpadding=0 width="750" border=0 align=center>
			<tr height="28" class="HeadMainThead">
				<td align="center">xyzxyzxyzxyz</td>
			</tr>
			<tr>
				<td valign=top align=middle>
					<table class="main_table" border="1" cellpadding="3" cellspacing="0" align=center width="750">
						<col width="15`%">
						<col width="5`%">
						<col width="15`%">
						<col width="5`%">
						<tr class="main_datalist_thead_sub">
							<td align="center">xyzxyzxyzxyz</td>
							<td align="center">xyzxyzxyzxyz</td>
							<td align="center">xyzxyzxyzxyz</td>
							<td align="center">xyzxyzxyzxyz</td>
						</tr>
						<tr align=middle >
							<td align=center>xyzxyzxyzxyz</td>
							<td align=center>2018-07-19&nbsp;</td>
							<td align=center>&nbsp;</td>
							<td align=center>&nbsp;</td>
						</tr>
						<tr align=middle >
							<td align=center>xyzxyzxyzxyz</td>
							<td align=center>2019-08-08&nbsp;</td>
							<td align=center>&nbsp;</td>
							<td align=center>&nbsp;</td>
						</tr>
						<tr align=middle >
							<td align=center>xyzxyzxyzxyz</td>
							<td align=center>2020-07-15&nbsp;</td>
							<td align=center>&nbsp;</td>
							<td align=center>&nbsp;</td>
						</tr>
					</table>
				</td>
			</tr>
		</table>
	</body>
</html>
)

Pos1 := 0
Pos2 := 0

While (Pos1 := RegExMatch(RegexReplace(HTML, "^\s+|\s+$"), "s)<tr align=middle >(.*?)<\/tr>", MatchA, Pos1 + 1)) {
	While (Pos2 := RegExMatch(RegexReplace(MatchA1, "^\s+|\s+$"), "s)<td align=center>(.*?)<\/td>", MatchB, Pos2 + 1)) {
		MsgBox, % MatchB1
	}
}
The RegExReplace is not really necessary, but I added it anyway. There is probably an easier way to do this, but this is one way...

Re: Obtain information in javascript

Posted: 17 Sep 2020, 11:23
by Yakshongas
Like this maybe?

Code: Select all

SetWorkingDir, %A_ScriptDir%
File := "ScanFile.html"

F1::
FileRead, FullFile, % File
Loop, Parse, FullFile, `n
{
    Trimedfile .= Trim(A_LoopField)
}
RegExMatch(Trimedfile, "<td (.+)<\/td>", Info)
MsgBox, % Info
Trimedfile := "", Info := ""
Return

Re: Obtain information in javascript

Posted: 18 Sep 2020, 00:03
by songdg
TheDewd wrote:
17 Sep 2020, 08:53
The RegExReplace is not really necessary, but I added it anyway. There is probably an easier way to do this, but this is one way...
Thank you. What if the HTML comes from a variable, say html_code? I changed the assignment to HTML = (%html_code%), but it doesn't work.
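
One possible approach to the variable question, offered as a minimal sketch rather than a tested fix: RegExMatch accepts any variable as its haystack, so html_code can be passed to it directly and no continuation section is needed. The sample markup, the names RowPos, CellPos, Row and Cell, and the tolerant patterns are assumptions for illustration. The quotes around the attribute values are made optional and matching is case-insensitive because markup read back through innerHTML may be serialized differently from the original page source.

```autohotkey
#NoEnv

; Hypothetical sample; in practice html_code would already hold the page
; source, e.g. html_code := wb.document.documentElement.innerHTML
html_code =
(
<table>
<tr align=middle >
<td align=center>xyzxyzxyzxyz</td>
<td align=center>2018-07-19&nbsp;</td>
</tr>
</table>
)

RowPos := 1
; Options "is)": case-insensitive, and the dot also matches newlines.
While (RowPos := RegExMatch(html_code, "is)<tr\s+align=""?middle""?\s*>(.*?)</tr>", Row, RowPos)) {
    CellPos := 1
    While (CellPos := RegExMatch(Row1, "is)<td\s+align=""?center""?\s*>(.*?)</td>", Cell, CellPos)) {
        MsgBox, % StrReplace(Cell1, "&nbsp;")   ; drop the non-breaking-space entity
        CellPos += StrLen(Cell)                 ; continue after this cell
    }
    RowPos += StrLen(Row)                       ; continue after this row
}
```

Each message box shows the text of one cell; cells that contain only &nbsp; show up empty.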

Re: Obtain information in javascript  Topic is solved

Posted: 18 Sep 2020, 01:52
by Xtra

Code: Select all

#NoEnv

html_code =
(%
<html>
<head>
    <title>xyzxyzxyzxyz</title>
    <meta http-equiv="Content-Type" content="text/html; charset=GBK">
    <link rel="stylesheet" href="/aic/skins/1//style.css" type="text/css">
</head>
<body>
<table cellspacing=0 cellpadding=0 width="750" border=0 align=center>
    <tr height="28" class="HeadMainThead">
        <td align="center">xyzxyzxyzxyz</td>
    </tr>
    <tr>
        <td valign=top align=middle>
            <table class="main_table" border="1" cellpadding="3" cellspacing="0" align=center width="750">
                <col width="15%">
                <col width="5%">
                <col width="15%">
                <col width="5%">
                <tr class="main_datalist_thead_sub">
                    <td align="center">xyzxyzxyzxyz</td>
                    <td align="center">xyzxyzxyzxyz</td>
                    <td align="center">xyzxyzxyzxyz</td>
                    <td align="center">xyzxyzxyzxyz</td>
                </tr>
                
                <tr align=middle >
                    <td align=center>xyzxyzxyzxyz</td>
                    <td align=center>2018-07-19&nbsp;</td>
                    <td align=center>&nbsp;</td>
                    <td align=center>&nbsp;</td>
                </tr>
                
                <tr align=middle >
                    <td align=center>xyzxyzxyzxyz</td>
                    <td align=center>2019-08-08&nbsp;</td>
                    <td align=center>&nbsp;</td>
                    <td align=center>&nbsp;</td>
                </tr>
                
                <tr align=middle >
                    <td align=center>xyzxyzxyzxyz</td>
                    <td align=center>2020-07-15&nbsp;</td>
                    <td align=center>&nbsp;</td>
                    <td align=center>&nbsp;</td>
                </tr>
                



            </table>
        </td>
    </tr>
</table>


</body>
</html>
)

if !FileExist(A_ScriptDir . "\testhtml.html")
    FileAppend, % html_code, % A_ScriptDir . "\testhtml.html"

IE := ComObjCreate("InternetExplorer.Application")
IE.Navigate(A_ScriptDir . "\testhtml.html")
while IE.busy || IE.readyState!=4 || IE.document.readyState!="complete"
	Sleep, 10

trElements := IE.document.querySelector(".main_table").querySelectorAll("[align=middle]")
Loop, % trElements.Length
{
    tdElements := trElements[A_Index-1].querySelectorAll("td")
	Loop, % tdElements.Length
	    MsgBox % tdElements[A_Index-1].InnerText
}

IE.Quit()
ExitApp

Re: Obtain information in javascript

Posted: 19 Sep 2020, 05:24
by songdg
TheDewd wrote:
17 Sep 2020, 08:53
This is HTML, not JavaScript.
Thanks for your help and correction. :thumbup:

Re: Obtain information in javascript

Posted: 19 Sep 2020, 05:25
by songdg
Yakshongas wrote:
17 Sep 2020, 11:23
Like this maybe?
Thank you very much.

Re: Obtain information in javascript

Posted: 19 Sep 2020, 05:28
by songdg
Xtra wrote:
18 Sep 2020, 01:52

Code: Select all

#NoEnv

if !FileExist(A_ScriptDir . "\testhtml.html")
    FileAppend, % html_code, % A_ScriptDir . "\testhtml.html"

IE := ComObjCreate("InternetExplorer.Application")
IE.Navigate(A_ScriptDir . "\testhtml.html")
while IE.busy || IE.readyState!=4 || IE.document.readyState!="complete"
	Sleep, 10

trElements := IE.document.querySelector(".main_table").querySelectorAll("[align=middle]")
Loop, % trElements.Length
{
    tdElements := trElements[A_Index-1].querySelectorAll("td")
	Loop, % tdElements.Length
	    MsgBox % tdElements[A_Index-1].InnerText
}

IE.Quit()
ExitApp
Problem solved, Thanks a lot! :bravo:

Re: Obtain information from html

Posted: 19 Sep 2020, 08:29
by teadrinker
Hm, using IE for this purpose is not the best choice; htmlfile is much faster:

Code: Select all

#NoEnv

html_code =
(
<html>
<head>
    <title>xyzxyzxyzxyz</title>
    <meta http-equiv="Content-Type" content="text/html; charset=GBK">
    <link rel="stylesheet" href="/aic/skins/1//style.css" type="text/css">
</head>
<body>
<table cellspacing=0 cellpadding=0 width="750" border=0 align=center>
    <tr height="28" class="HeadMainThead">
        <td align="center">xyzxyzxyzxyz</td>
    </tr>
    <tr>
        <td valign=top align=middle>
            <table class="main_table" border="1" cellpadding="3" cellspacing="0" align=center width="750">
                <col width="15`%">
                <col width="5`%">
                <col width="15`%">
                <col width="5`%">
                <tr class="main_datalist_thead_sub">
                    <td align="center">xyzxyzxyzxyz</td>
                    <td align="center">xyzxyzxyzxyz</td>
                    <td align="center">xyzxyzxyzxyz</td>
                    <td align="center">xyzxyzxyzxyz</td>
                </tr>
                
                <tr align=middle >
                    <td align=center>xyzxyzxyzxyz</td>
                    <td align=center>2018-07-19&nbsp;</td>
                    <td align=center>&nbsp;</td>
                    <td align=center>&nbsp;</td>
                </tr>
                
                <tr align=middle >
                    <td align=center>xyzxyzxyzxyz</td>
                    <td align=center>2019-08-08&nbsp;</td>
                    <td align=center>&nbsp;</td>
                    <td align=center>&nbsp;</td>
                </tr>
                
                <tr align=middle >
                    <td align=center>xyzxyzxyzxyz</td>
                    <td align=center>2020-07-15&nbsp;</td>
                    <td align=center>&nbsp;</td>
                    <td align=center>&nbsp;</td>
                </tr>
            </table>
        </td>
    </tr>
</table>
</body>
</html>
)
Doc := ComObjCreate("htmlfile")
Doc.write("<meta http-equiv=""X-UA-Compatible"" content=""IE=9"">")
Doc.write(html_code)
trElements := Doc.querySelector(".main_table").querySelectorAll("[align=middle]")
Loop, % trElements.Length
{
   tdElements := trElements[A_Index-1].querySelectorAll("td")
   Loop, % tdElements.Length
      MsgBox % tdElements[A_Index-1].InnerText
}
But using RegEx is preferred.