Help with IE COM – loop for correct tag

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
menteith
Posts: 51
Joined: 04 Feb 2016, 12:22

01 Jun 2016, 10:07

Hi all,

Thanks to the answer to my last post, I was able to write another piece of my program. I'm stuck again, however.

Code: Select all

<article class="article-detail-description">
                <h1 class="page-heading">
                    Normatywne i empiryczne teorie demokracji
                    <br /><small>Normative and empirical theories of democracy</small>
                </h1>
                 
                <div>
                    <strong>Author(s): </strong>Jane Doe<br /><strong>Subject(s): </strong>Politics / Political Sciences<br />
                    <strong>Published by: </strong>KSIĘGARNIA AKADEMICKA Sp. z o.o.<br/>
                    <strong>Keywords: </strong>democracy; norms; evaluation; explanation<br/>
                </div>
I'd like to get the text that follows some of the <strong> ... </strong> tags, e.g. Jane Doe.

Here's my attempt. First I collect every strong tag and then look for the one containing "Author(s): ". But I cannot get what comes after it.

Edit: I'd like to avoid using RegEx.

Code: Select all

IE := ComObjCreate("InternetExplorer.Application")
IE.Visible := false
IE.Navigate("https://www.ceeol.com/search/article-detail?id=343886")
while IE.readyState != 4 || IE.document.readyState != "complete" || IE.busy
Sleep 10

   script_class := IE.document.querySelectorAll("strong") ;get all classes on page
while A_Index < script_class.length
	if InStr(script_class[A_Index].outerHtml,"Author(s): ")
		MsgBox % script_class[A_Index].innerText
ExitApp
Last edited by menteith on 01 Jun 2016, 10:44, edited 1 time in total.
Capn Odin
Posts: 1352
Joined: 23 Feb 2016, 19:45
Location: Denmark
Contact:

Re: Help with IE COM – loop for correct tag

01 Jun 2016, 10:39

Try this, though I'm not sure it's the most efficient way possible.

Code: Select all

OnExit("KillIE")

IE := ComObjCreate("InternetExplorer.Application")
IE.Visible := false
IE.Navigate("https://www.ceeol.com/search/article-detail?id=343886")

while IE.readyState != 4 || IE.document.readyState != "complete" || IE.busy
	Sleep 10

detail := IE.document.getElementsByClassName("article-detail-description")
div := detail[0].getElementsByTagName("div")

MsgBox, % div[0].InnerHTML

RegExMatch(div[0].InnerHTML	, "(?<=</strong>)(\w|\s)*(?=<br>)", Author)
MsgBox, % Author

KillIE(){
	Global IE
	IE.Quit()
}
Please excuse my spelling I am dyslexic.
menteith
Posts: 51
Joined: 04 Feb 2016, 12:22

Re: Help with IE COM – loop for correct tag

01 Jun 2016, 10:43

Thanks. I didn't mention that I'd like to avoid using RegEx:)
An ordinary user who needs some help with developing own programs for his own use.
Capn Odin
Posts: 1352
Joined: 23 Feb 2016, 19:45
Location: Denmark
Contact:

Re: Help with IE COM – loop for correct tag

01 Jun 2016, 10:52

menteith wrote:Thanks. I didn't mention that I'd like to avoid using RegEx:)
Is this way OK?

Code: Select all

OnExit("KillIE")

IE := ComObjCreate("InternetExplorer.Application")
IE.Visible := false
IE.Navigate("https://www.ceeol.com/search/article-detail?id=343886")

while IE.readyState != 4 || IE.document.readyState != "complete" || IE.busy
	Sleep 10

detail := IE.document.getElementsByClassName("article-detail-description")
div := detail[0].getElementsByTagName("div")

MsgBox, % div[0].InnerHTML

str := StrSplit(div[0].InnerHTML, "<br>")[1]

Author := SubStr(str, InStr(str, "</strong>") + StrLen("</strong>"))

MsgBox, % Author

KillIE(){
	Global IE
	IE.Quit()
}
menteith
Posts: 51
Joined: 04 Feb 2016, 12:22

Re: Help with IE COM – loop for correct tag

01 Jun 2016, 11:08

Capn Odin

Very nice! Thank you for your swift reply. May I make a request? I'd like to have the values after, say, "Author(s): " and "Keywords: " assigned to variables. For example: x = Andrzej Antoszewski.

Many thanks again!
ameyrick
Posts: 122
Joined: 20 Apr 2014, 18:12

Re: Help with IE COM – loop for correct tag

01 Jun 2016, 11:14

You almost had it with your first try.
A_Index starts at 1, not 0, so your loop skips element 0 of the zero-based collection.

Code: Select all

IE := ComObjCreate("InternetExplorer.Application")
IE.Visible := false
IE.Navigate("https://www.ceeol.com/search/article-detail?id=343886")
while IE.readyState != 4 || IE.document.readyState != "complete" || IE.busy
	Sleep 10

script_class := IE.document.querySelectorAll("strong") ; get all <strong> elements on the page
loop, % script_class.length ; A_Index runs 1..length, the collection is indexed 0..length-1
	if InStr(script_class[(A_Index-1)].outerHtml,"Author(s): ")
		MsgBox % script_class[(A_Index-1)].innerText
IE.Quit() ; close the hidden IE instance so it does not linger after the script exits
ExitApp
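
To read the text that follows the matched tag rather than the tag's own text, one option is the DOM's nextSibling property: the name after </strong> is a text node sitting right after the <strong> element. Below is a minimal sketch of that idea, without RegEx. It assumes an in-memory HTMLFile document fed with a shortened copy of the markup from the first post, so it runs without opening the site; on the live page the same loop would go over IE.document instead, and what the text nodes contain there has not been checked.

```autohotkey
; Shortened copy of the markup from the first post (assumed shape).
html := "<div><strong>Author(s): </strong>Jane Doe<br>"
html .= "<strong>Keywords: </strong>democracy; norms<br></div>"

; In-memory document used as a stand-in for IE.document.
doc := ComObjCreate("HTMLFile")
doc.write(html)
doc.close()

tags := doc.getElementsByTagName("strong")
Loop, % tags.length
{
	tag := tags[A_Index - 1]              ; collection is zero-based
	if InStr(tag.innerText, "Author(s):")
	{
		next := tag.nextSibling           ; node right after </strong>
		if (next.nodeType = 3)            ; 3 = text node
			Authors := Trim(next.nodeValue, " `t`r`n")
		break
	}
}
MsgBox, % Authors
```

The nodeType check is there because the next sibling could just as well be another element (for example a <br>) when a label has no value.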
Capn Odin
Posts: 1352
Joined: 23 Feb 2016, 19:45
Location: Denmark
Contact:

Re: Help with IE COM – loop for correct tag

01 Jun 2016, 11:17

menteith wrote:Capn Odin

Very nice! Thank you for your swift reply. May I have a request? I'd like to have, say, values after "Author(s): " and "Keywords: " assigned to a variable. For example: x = Andrzej Antoszewski.

Many thanks again!

Code: Select all

OnExit("KillIE")

IE := ComObjCreate("InternetExplorer.Application")
IE.Visible := false
IE.Navigate("https://www.ceeol.com/search/article-detail?id=343886")

while IE.readyState != 4 || IE.document.readyState != "complete" || IE.busy
	Sleep 10

detail := IE.document.getElementsByClassName("article-detail-description")
div := detail[0].getElementsByTagName("div")

MsgBox, % div[0].InnerHTML

str := StrSplit(div[0].InnerHTML, "<br>")

Authors := SubStr(str[1], InStr(str[1], "</strong>") + StrLen("</strong>"))

for index, val in str{
	if(InStr(val, "Keywords: ")){
		Keywords := StrReplace(val, "<strong>Keywords: </strong>")
		Break
	}
}

MsgBox, % Authors "`n" Keywords

KillIE(){
	Global IE
	IE.Quit()
}
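
The same InStr/SubStr idea can probably be wrapped in one small function, so that each label does not need its own loop. This is only a sketch: GetField is a made-up helper name, not part of AutoHotkey, and the sample string assumes InnerHTML comes back with <br> separators as in the scripts above.

```autohotkey
; Sample standing in for div[0].InnerHTML (assumed shape, shortened).
html := "`n`t<strong>Author(s): </strong>Jane Doe<br><strong>Subject(s): </strong>Politics / Political Sciences<br>`n`t<strong>Keywords: </strong>democracy; norms; evaluation; explanation<br>`n"

Authors  := GetField(html, "Author(s): ")
Keywords := GetField(html, "Keywords: ")
MsgBox, % Authors "`n" Keywords

; GetField is a hypothetical helper: it returns the text between
; <strong>label</strong> and the next "<br", or "" if the label is absent.
GetField(html, label) {
	marker := "<strong>" label "</strong>"
	pos := InStr(html, marker)
	if (!pos)
		return ""
	start := pos + StrLen(marker)
	stop := InStr(html, "<br", false, start)   ; search from start onwards
	value := stop ? SubStr(html, start, stop - start) : SubStr(html, start)
	return Trim(value, " `t`r`n")              ; drop surrounding whitespace
}
```

Searching for "<br" rather than "<br>" is a deliberate choice so that "<br/>" and "<br />" are matched as well.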
menteith
Posts: 51
Joined: 04 Feb 2016, 12:22

Re: Help with IE COM – loop for correct tag

01 Jun 2016, 11:33

Thanks! I did some tests and it works:) One strange thing is that when I use:

Code: Select all

for index, val in str{
	if(InStr(val, "Author(s): ")){
		Authors := StrReplace(val, "<strong>Author(s): </strong>")
		Break
	}
}
the resulting string starts with a tab or two.
The following gives the expected result:

Code: Select all

for index, val in str{
	if(InStr(val, "Published by: ")){
		pub := StrReplace(val, "<strong>Published by: </strong>")
		Break
	}
}
menteith
Posts: 51
Joined: 04 Feb 2016, 12:22

Re: Help with IE COM – loop for correct tag

02 Jun 2016, 06:47

ameyrick wrote:You almost had it with your first try
A_Index starts at 1 not 0

Thank you. I have seen a lot of your posts about COM, and I even use your modified version of IEGet ;)

The thing is that my code works, but I'd like to get what comes after "Author(s): ", not "Author(s): " itself. You can check the reply by Capn Odin. One strange thing is bugging me, though. Care to tell me why the following code gives the correct result, but with one or two spaces (or maybe tabs) before it?

Code: Select all

for index, val in str{
	if(InStr(val, "Author(s): ")){
		Authors := StrReplace(val, "<strong>Author(s): </strong>")
		Break
	}
}
Thanks!
ameyrick
Posts: 122
Joined: 20 Apr 2014, 18:12

Re: Help with IE COM – loop for correct tag

02 Jun 2016, 07:26

I'm on my phone right now; I'll look at it again later.

As for the tabs and spaces: if they exist in the HTML source, they will be in the result string, since they are valid characters. You will have to test for them and remove them.
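
For leading or trailing whitespace, the built-in Trim() function is probably enough. The stray tabs most likely come from the indentation in front of the first <strong> in the page source: StrReplace removes the tag and the label but leaves whatever preceded them. A minimal sketch, assuming val looks like the first piece produced by StrSplit:

```autohotkey
; Assumed shape of the first piece of StrSplit(div[0].InnerHTML, "<br>"):
; a line break and indentation, then the label and the value.
val := "`n`t`t<strong>Author(s): </strong>Jane Doe"

Authors := StrReplace(val, "<strong>Author(s): </strong>")
Authors := Trim(Authors, " `t`r`n")   ; strip spaces, tabs, CR and LF from both ends

MsgBox, % "[" Authors "]"             ; brackets make leftover whitespace visible
```

Trim only touches the two ends of the string, so spaces inside the value (between first name and surname, for instance) are kept.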
ameyrick
Posts: 122
Joined: 20 Apr 2014, 18:12

Re: Help with IE COM – loop for correct tag

02 Jun 2016, 19:11

no_whitespace()

Code: Select all

OnExit("KillIE")
 
IE := ComObjCreate("InternetExplorer.Application")
IE.Visible := false
IE.Navigate("https://www.ceeol.com/search/article-detail?id=343886")
 
while IE.readyState != 4 || IE.document.readyState != "complete" || IE.busy
	Sleep 10
 
detail := IE.document.getElementsByClassName("article-detail-description")
div := detail[0].getElementsByTagName("div")
 
MsgBox, % div[0].InnerHTML

MsgBox, % no_whitespace(div[0].InnerHTML)
 
str := StrSplit(div[0].InnerHTML, "<br>")

for index, val in str{
	if(InStr(val, "Author(s): ")){
		Authors := StrReplace(val, "<strong>Author(s): </strong>")
		Authors := no_whitespace(Authors)
		Break
	}
}

for index, val in str{
	if(InStr(val, "Keywords: ")){
		Keywords := StrReplace(val, "<strong>Keywords: </strong>")
		Break
	}
}
 
MsgBox, % Authors "`n" Keywords

no_whitespace(s){
	x:="", r:="", i:=0, spc:=0
	MatchList := Chr(10) "," Chr(13) "," Chr(32) "," A_Space "," A_Tab
	StringSplit, ch, s
	loop % ch0
	{
		if (i < 1)
			if ch%A_Index% in %MatchList%
				continue
		i+=1
		if ch%A_Index% in %MatchList%
			spc+=1
		if (spc > 1){
			i-=1
			continue 
		} else 
			spc:=0		
		x.= ch%A_Index%
	}
	global ch:=
	StringSplit, ch, x
	a:=ch0
	i:=0
	loop, % ch0
	{
		if (a = 0)
			break
		if (i < 1)
			if ch%a% in %MatchList%
			{
				a-=1
				continue 
			}
		i+=1
		r:= ch%a% r
		a-=1
	}
	global ch:=
	return r
} ;written by ameyrick

KillIE(){
	Global IE
	IE.Quit()
}
menteith
Posts: 51
Joined: 04 Feb 2016, 12:22

Re: Help with IE COM – loop for correct tag

03 Jun 2016, 03:13

ameyrick wrote:no_whitespace()

Thank you! Much appreciated!