How to Get Text From Web Page Element with Javascript / COM?

3ggg
Posts: 31
Joined: 11 Nov 2019, 19:52

11 Nov 2019, 21:45

I was reading this forum post https://autohotkey.com/board/topic/47052-basic-webpage-controls-with-javascript-com-tutorial/ and it said that the way to get text from a web page element with COM is

Code: Select all

text := wb.document.documentElement.innerText
This is the page HTML I am trying to obtain the text from:

Code: Select all

<h1 class="title inline-block">
             Google
            </h1>
I am new to both JavaScript and AutoHotkey, so I'm unsure how to adapt that code to target this element through COM. How would I do this?
boiler
Posts: 17397
Joined: 21 Dec 2014, 02:44

Re: How to Get Text From Web Page Element with Javascript / COM?

12 Nov 2019, 00:08

If you want to get the text from a single element, you need to use something other than wb.document.documentElement.innerText, which returns all the text on the page. You could grab all the text (or the HTML, by using innerHTML) and parse it to find what you want, but you're usually better off finding specific elements directly. See this example for getting the first "coloured" username (such as an admin) on the AHK forum page:

Code: Select all

wb := ComObjCreate("InternetExplorer.Application")
wb.Visible := True
wb.Navigate("www.autohotkey.com/boards/")
WaitForLoad(wb)
Text := wb.document.getElementsByClassName("username-coloured")[0].innerText
MsgBox, % Text
return

WaitForLoad(wb)
{
	while wb.busy or wb.ReadyState != 4
		Sleep, 10
}
So you could substitute your URL and the name of the class you're looking for ("title inline-block"), and it will return "Google" if it's the first one of that class on the page. If not, increment the index on the array from 0 until you find it.
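Rather than bumping the index by hand, the collection can be stepped through in a loop. This is only a minimal sketch: it assumes wb is the loaded InternetExplorer.Application object from the script above, and the variable names matches and el are just illustrative. The class name and the "Google" search text come from the question.

```autohotkey
; Assumes wb has already navigated and WaitForLoad(wb) has returned.
matches := wb.document.getElementsByClassName("title inline-block")
Loop % matches.length
{
	el := matches[A_Index - 1]	; the collection is 0-based, A_Index starts at 1
	if InStr(el.innerText, "Google")
	{
		; Trim strips any whitespace around the heading text.
		MsgBox, % "Found at index " (A_Index - 1) ": " Trim(el.innerText, " `t`r`n")
		break
	}
}
```

If nothing on the page has that class, matches.length is 0 and the loop body never runs.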
3ggg
Posts: 31
Joined: 11 Nov 2019, 19:52

Re: How to Get Text From Web Page Element with Javascript / COM?

12 Nov 2019, 00:56

Awesome, thanks so much. This is very informative and very helpful. A couple of follow-up questions:

I assume the code below needs a different strategy than getting the element by class name, since there are several hundred "ember-view" elements on the page. I may be clueless here, but I guess the identifier would be

Code: Select all

data-control-name="topcard_employees" 
I assume this would be a get element by ( blank ) but I don't know what that blank is

Code: Select all

<a data-control-name="topcard_employees" href="/sales/search/people/list/employees-for-account/23521" id="ember971" class="ember-view">              46,036 employees
</a>
Second question: if I want the link, I just replace .innerText with .href, right?
3ggg
Posts: 31
Joined: 11 Nov 2019, 19:52

Re: How to Get Text From Web Page Element with Javascript / COM?

12 Nov 2019, 00:59

Thanks, this is super helpful and super informative. A couple of follow-up questions based on another piece of code:

Code: Select all

<a data-control-name="topcard_employees" href="/sales/search/people/list/employees-for-account/10667" id="ember971" class="ember-view">              46,036 employees
</a>
I assume we can't use getelementbyclassID and you have to use getelementby something else. I assume data-control-name="topcard_employees" is the unique piece here but I'm unsure what element type this is?

Also, if I wanted to get that link, I would just replace .innerText with .href, right?
boiler
Posts: 17397
Joined: 21 Dec 2014, 02:44

Re: How to Get Text From Web Page Element with Javascript / COM?

12 Nov 2019, 08:08

3ggg wrote: I assume for the code below you'd have to do a different strategy than get element by class name since there's multiple hundred "ember-views" in the page. May be clueless here but I guess the identifier here would be

Code: Select all

data-control-name="topcard_employees"
I assume this would be a get element by ( blank ) but I don't know what that blank is

Code: Select all

<a data-control-name="topcard_employees" href="/sales/search/people/list/employees-for-account/23521" id="ember971" class="ember-view">              46,036 employees
</a>
Actually, data-... is a custom attribute. You could identify that node with something like wb.document.querySelectorAll("[data-control-name=""topcard_employees""]"), which returns a collection of every matching element, but I don't have a lot of experience with it, and I find it difficult to get the data I want that way.
3ggg wrote: 2nd question, if I want the link i just replace .innerText with .href right?
Yes, generally that will give you the link URL if there is one associated with it.
3ggg wrote:
12 Nov 2019, 00:59

Code: Select all

<a data-control-name="topcard_employees" href="/sales/search/people/list/employees-for-account/10667" id="ember971" class="ember-view">              46,036 employees
</a>
I assume we can't use getelementbyclassID and you have to use getelementby something else. I assume data-control-name="topcard_employees" is the unique piece here but I'm unsure what element type this is?
Since it has an id, that is a unique identifier which can be selected by getElementById. Note that since it returns a single element, there is no array indexing. So it looks like this:

Code: Select all

text := wb.document.getElementById("ember971").innerText
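For the anchor quoted above, here is a rough sketch of both lookups, assuming wb is already loaded as in the earlier script. The id value ember971 and the attribute data-control-name="topcard_employees" are taken from the quoted markup. Ember normally generates ids like this when the page renders, so the attribute may turn out to be the more stable handle. querySelector is assumed to be available, which in Internet Explorer means the page must render in IE8 document mode or later. The variable name link is only illustrative.

```autohotkey
; By id: getElementById returns a single element, so there is no [0] index.
text := wb.document.getElementById("ember971").innerText

; By custom attribute: querySelector returns only the first matching element.
; Single quotes inside the selector avoid doubling the AHK string quotes.
link := wb.document.querySelector("[data-control-name='topcard_employees']")

; Show the trimmed link text and the link URL on separate lines.
MsgBox, % Trim(link.innerText, " `t`r`n") "`n" link.href
```

Swapping .innerText for .href is the only change needed to read the URL instead of the text.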
