Basic Webpage Controls with JavaScript / COM - Tutorial


320 replies to this topic
jethrow
  • Moderators
  • 2810 posts
  • Last active:
  • Joined: 24 May 2009

*IMPORTANT: please ask coding questions in the Ask For Help forum

 

Basic Webpage Controls with Javascript / COM
This tutorial requires AutoHotkey(_L) with built-in COM Support.
 

Purpose
The purpose of this tutorial is to teach the intermediate AHK user how to start using COM to control webpages. My goal is to provide methods for controlling webpages, similar to how the AHK Control Commands can control Windows Applications. This tutorial is going to be high level, but will provide links to those who want to dig deeper into these concepts. You don't need to have much programming experience, but I will assume you feel comfortable writing and executing AHK scripts. We will be covering the following three topics:

  • The HTML DOM - (Document Object Model) A basic understanding of the HTML DOM is essential for controlling webpages, because it is a "map" or "hierarchy" for accessing parts of the webpage. The HTML DOM is not language dependent; it is the model for how the webpage document is constructed.
  • Javascript - We will cover some basic Javascript because it is the scripting language of the web, and is supported by most web browsers. Controlling webpages using Javascript is not the primary aim of this tutorial, but it will prove valuable because you should be able to find plenty of useful Javascript examples online. This will be helpful because in my opinion, the simplest way to start using COM is to learn some basic Javascript, and then "translate" that code to use with COM in AHK.
  • COM - (Component Object Model) All you really need to know about COM for this tutorial is that Internet Explorer is a COM object - which means we can use COM to manipulate it. Using COM is the most effective way to control an Internet Explorer Webpage with AHK.

"Let me say it a different way: COM is the steering wheel - the HTML DOM is a road map - AHK is the car" ~tank
 

Disclaimers
For this tutorial, I will show you how to control webpages using Javascript, along with the AHK COM "translations" - also, note Sean's post below. Using COM requires Internet Explorer, but Javascript can be used with most Web Browsers. That being said, the DOM examples in this tutorial were written using Internet Explorer 9. The DOM examples may not be exactly the same in other browsers. In each code example, Javascript will be on the first line with the corresponding AHK code on the second line.
 
Some other tutorials that helped me out, and inspired this tutorial can be found here:

  • COM Tutorial by tank - Study the basics from this; I will not be covering them.
  • W3Schools - Many links listed below will take you to this website, which is an excellent resource for learning Javascript in depth.

Thank You:

  • Chris Mallett for creating AutoHotkey
  • Lexikos for AutoHotkey v1.1+ (_L)
  • Sean and fincs for helping with Native COM Support
  • Sean for creating the COM Standard Library
  • tank for the tutorial listed above, and daonlyfreez
  • tank, sinkfaze, and jaco0646 for reviewing this tutorial

Terms - The following terms will be used throughout this tutorial. You may use these links for a more in-depth description of each item.

HTML DOM, Javascript, COM, Methods, document, value, element, form, name, ID, Input, Tag, selectedIndex, checked, innerText, innerHTML

 
Methods - The following Methods will be used throughout this tutorial. (comparable to built-in functions for AHK)

alert(), getElementById(), getElementsByName(), getElementsByTagName(), focus(), click()

 
Before we begin, this tutorial will be using examples that start with javascript: - which you will feed through the URL Address bar. These examples will be based on (and can be used with) the Search Forum webpage. Our first example will use the alert() method to pop-up a message box that says Hello World! Simply put this javascript in your URL Address bar and hit enter:

javascript: alert('Hello World!')
wb.Navigate("javascript: alert('Hello World!')")

Note - some browsers may remove the "javascript: " if you paste the javascript line in the url address bar. You may have to manually type it in.
 

Accessing WebPage Contents - HTML DOM
To understand how to use Javascript to control a webpage, you need a general understanding of the HTML DOM. It is similar to a "map" or "hierarchy" of the Webpage, as shown here:
*Image is from this website, which is another great resource for learning Javascript.
[Image: domhierarchy.gif (DOM hierarchy diagram)]
In the image above, note the document object - you will be using this object quite often. To access the Webpage, you will need to navigate through the HTML DOM. Here are some simple ways to do this:

  • Object Name and Index

Say you want to get the value of the 3rd element in the 2nd form, which will be the Find words Input Box. The path would look like this (a collection of objects starts at 0):

document.forms[1].elements[2].value

Now if you want to show the value of the element in a pop-up, simply put this javascript in your URL Address bar and hit enter:

javascript: alert(document.forms[1].elements[2].value)
MsgBox % wb.document.forms[1].elements[2].value
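
The index path above only works if the page really has a second form with at least three elements. Here is a minimal JavaScript sketch of a guarded lookup. Note that getFieldValue is a hypothetical helper name (it is not part of the DOM), and doc is whatever document object you pass in, such as document in the browser.

```javascript
// Hypothetical helper: read the value of an element by form index and element index.
// Both collections start at 0, so (1, 2) means "3rd element of the 2nd form".
function getFieldValue(doc, formIndex, elementIndex) {
  var form = doc.forms[formIndex];
  if (!form) {
    return null; // the page has fewer forms than expected
  }
  var element = form.elements[elementIndex];
  if (!element) {
    return null; // the form has fewer elements than expected
  }
  return element.value;
}
```

Once the function is defined, getFieldValue(document, 1, 2) reads the same element as the path shown above, and returns null instead of raising an error when the form or element is missing.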
  • Object Name / ID Attribute

You can also use the object's name or ID. For example, in the 2nd form, the id of the Find words Input Box is query. The following JavaScript fed through the address bar would produce the same results:

javascript: alert(document.forms[1].query.value)
MsgBox % wb.document.forms[1].query.value

... or if you only know that the element's id is query, you could display the value of that element using all, which references all the elements on the webpage:

javascript: alert(document.all.query.value)
MsgBox % wb.document.all.query.value
  • getElement Methods

If you want to get an element(s) based on limited criteria, you can use the following 3 methods:

 

- getElementById(id) - returns a reference to the first object with the specified ID

- getElementsByName(name) - Returns a collection of objects with the specified name

- getElementsByTagName(tagname) - Returns a collection of objects with the specified tagname

 

The following example will display the tip under the Find words Input Box, which is the 6th element on the webpage with a SPAN Tag:
(Note - the item number may be dynamic)

javascript: alert(document.getElementsByTagName('span')[5].innerText)
MsgBox % wb.document.getElementsByTagName("span")[5].innerText
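
Because the item number may change, it can help to list what each index currently holds before hard-coding one. The sketch below uses only getElementsByTagName() and innerText from above. The name listTagText is a hypothetical helper, not a built-in method.

```javascript
// Hypothetical helper: return the innerText of every element with the given tag,
// in collection order, so each array position matches the index used with
// getElementsByTagName(tagName)[index].
function listTagText(doc, tagName) {
  var elements = doc.getElementsByTagName(tagName);
  var texts = [];
  for (var i = 0; i < elements.length; i++) {
    texts.push(elements[i].innerText);
  }
  return texts;
}
```

With this in place, listTagText(document, 'span')[5] would be the same text as in the example above, and the full array shows which index to use if the tip has moved.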

Controlling the WebPage
So far we have just retrieved information from the webpage. Now let's start controlling the webpage. Note - if the Javascript doesn't end with a Method, use void 0.

  • Focus on a Webpage Element - Focus()

Sets the focus to the Find words Input Box:

javascript: document.all.query.focus()
wb.document.all.query.focus()
  • Click on a Webpage Element - Click()

Clicks the Search Now button:

javascript: document.all.submit.click()
wb.document.all.submit.click()
  • Set Value of an Input Field - Value

Sets the value of the Find words Input Box:

javascript: document.all.query.value = 'Input Value'; void 0
wb.document.all.query.value := "Input Value"
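
A note on why void 0 is needed here: when the code in a javascript: URL evaluates to something other than undefined, browsers such as Internet Explorer replace the page with that value. An assignment evaluates to the assigned value, so it needs the void 0 suffix. The sketch below builds such a line from a piece of code. The name toAddressBarScript is a hypothetical helper used only for illustration.

```javascript
// Hypothetical helper: wrap a JavaScript statement so it can be fed through
// the address bar without the result replacing the page.
// "void 0" always evaluates to undefined, whatever the statement before it returned.
function toAddressBarScript(code) {
  return "javascript: " + code + "; void 0";
}
```

Appending void 0 is harmless when the code already ends with a method such as alert(), so the suffix can be added unconditionally.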
  • Dropdown Box Selection - selectedIndex
<select name="search_content" id="search_content">
	<option value="both">Search title and content</option>
	<option value="titles">Only search in titles</option>
	<option value="content">Only search in content</option>
</select>

This is the HTML for the Match Dropdown. The following will set the Dropdown to Only search in titles:

javascript: document.all.search_content.selectedIndex = 1;  void 0 // .value = 'titles'
wb.document.all.search_content.selectedIndex := 1  ;// .value := "titles"
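
Setting selectedIndex depends on the order of the options, which a site can change. Selecting by the option's value (the commented alternative above) can also be written as an explicit loop, which makes it possible to tell whether the option was found at all. This is only a sketch, and selectByValue is a hypothetical helper name.

```javascript
// Hypothetical helper: select the option whose value matches, and report success.
// `select` is a dropdown element such as document.all.search_content.
function selectByValue(select, value) {
  for (var i = 0; i < select.options.length; i++) {
    if (select.options[i].value === value) {
      select.selectedIndex = i;
      return true;
    }
  }
  return false; // no option with that value; the selection is left unchanged
}
```

For the HTML shown above, selectByValue(document.all.search_content, 'titles') would select Only search in titles.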
  • Radio / Checkbox Selection - Checked

Sets the Search in section radio to Members:

javascript: document.all.radio_members.checked = true; void 0
wb.document.all.radio_members.checked := true
  • Get Text from a WebPage Element - innerText

Say you want to get the text from the Navigation options at the top of the page (innerHTML will give you all the HTML):

text := wb.document.all.primary_nav.innerText

... or if you want all the text (or html) from the page:

text := wb.document.documentElement.innerText
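
The individual controls above can be combined into one routine. The following sketch assumes the element names/ids used in this tutorial (query, search_content, radio_members, submit) and a hypothetical function name runSearch. It is an illustration of the idea rather than a tested script for any particular page.

```javascript
// Hypothetical routine combining the controls above; `doc` is the page's document object.
// The element names/ids are the ones used throughout this tutorial.
function runSearch(doc, words) {
  doc.all.query.focus();                     // focus the Find words input box
  doc.all.query.value = words;               // set its value
  doc.all.search_content.selectedIndex = 1;  // 2nd option: Only search in titles
  doc.all.radio_members.checked = true;      // Search in section: Members
  doc.all.submit.click();                    // click the Search Now button
}
```

Each line has a direct AHK "translation" following the same pattern as the examples above: prefix the path with wb. and use := for assignments.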

There you have it! These techniques should help get you started. Next, I would recommend the following:

  • Try these Controls out on some of your favorite webpages.
  • Find some more Javascript examples, and then try "translating" them to COM. (Javascript is well documented online)
  • Learn additional ways to access the HTML DOM.

You may be wondering, "How do I find information about the element so I can access it?" Good question! The following tools can help you with that.
 

 

Helpful Tools

  • iWebBrowser2 Learner - This program will give you information about IE webpage elements as you hover over them.
  • IE HTML Element Spy - This program will show you the information and source code of each element as you drag the cursor over the webpage.

 

 

Frequently Asked Questions

  • What is a wb?

An AHK object that contains the web browser object (Internet Explorer). Here is a simple script for creating one:

wb := ComObjCreate("InternetExplorer.Application")  ;// Create an IE object
wb.Visible := true                                  ;// Make the IE object visible
wb.Navigate("www.AutoHotkey.com")                   ;// Navigate to a webpage
  • How to access an existing IE object?

Access an IE object by WinTitle and Internet Explorer_Server Number:

WBGet(WinTitle="ahk_class IEFrame", Svr#=1) {               ;// based on ComObjQuery docs
   static msg := DllCall("RegisterWindowMessage", "str", "WM_HTML_GETOBJECT")
        , IID := "{0002DF05-0000-0000-C000-000000000046}"   ;// IID_IWebBrowserApp
;//     , IID := "{332C4427-26CB-11D0-B483-00C04FD90119}"   ;// IID_IHTMLWindow2
   SendMessage msg, 0, 0, Internet Explorer_Server%Svr#%, %WinTitle%
   if (ErrorLevel != "FAIL") {
      lResult:=ErrorLevel, VarSetCapacity(GUID,16,0)
      if DllCall("ole32\CLSIDFromString", "wstr","{332C4425-26CB-11D0-B483-00C04FD90119}", "ptr",&GUID) >= 0 {
         DllCall("oleacc\ObjectFromLresult", "ptr",lResult, "ptr",&GUID, "ptr",0, "ptr*",pdoc)
         return ComObj(9,ComObjQuery(pdoc,IID,IID),1), ObjRelease(pdoc)
      }
   }
}

Access an IE object by Window/Tab Name:

IEGet(name="") {
   IfEqual, Name,, WinGetTitle, Name, ahk_class IEFrame     ;// Get active window if no parameter
   Name := (Name="New Tab - Windows Internet Explorer")? "about:Tabs":RegExReplace(Name, " - (Windows|Microsoft) Internet Explorer")
   for wb in ComObjCreate("Shell.Application").Windows
      if wb.LocationName=Name and InStr(wb.FullName, "iexplore.exe")
         return wb
}

*For information on how these functions differ, see this post.

  • How to know when the webpage is done loading?

You can use the Busy and/or ReadyState property, which should work in most scenarios:

wb.Navigate("www.AutoHotkey.com")
while wb.busy or wb.ReadyState != 4
   Sleep 10
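
The value 4 in the loop above is READYSTATE_COMPLETE. The ReadyState property runs from 0 (uninitialized) through 1 (loading), 2 (loaded) and 3 (interactive) to 4 (complete). The same condition can be written as a small predicate. In this sketch isDoneLoading is a hypothetical name, and wb is assumed to be any object exposing Busy and ReadyState, such as an InternetExplorer.Application object created from JScript.

```javascript
// READYSTATE_COMPLETE from the READYSTATE enumeration (0 through 4).
var READYSTATE_COMPLETE = 4;

// Hypothetical predicate: true once the browser is neither busy nor still loading.
// This is the inverse of the while-condition used in the AHK loop above.
function isDoneLoading(wb) {
  return !wb.Busy && wb.ReadyState === READYSTATE_COMPLETE;
}
```

Checking both properties mirrors the AHK loop: the loop keeps sleeping while either one says the page is still loading.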

... or here is an example using the DocumentComplete event:

ComObjConnect(wb, "IE_")   ;// Connect the webbrowser object
loading := true            ;// Set the variable "loading" as TRUE
wb.Navigate("www.AutoHotkey.com")
while loading
   Sleep 10
MsgBox DONE!
ComObjConnect(wb)          ;// Disconnect the webbrowser object (or just wb := "")

IE_DocumentComplete() {    ;// "IE_" prefix corresponds to the 2nd param in ComObjConnect()
   global loading := false ;// Break the While-Loop
}
  • What if all this stuff is too confusing for me?

Try this Tutorial written by Mickers which is designed to help non-coders (n00bz) grasp the basic concepts of AutoHotkey_L COM: Basic Ahk_L COM Tutorial for Webpages



tank
  • Moderators
  • 4242 posts
  • Last active: Yesterday, 10:35 PM
  • Joined: 21 Dec 2007
BRAVO :!: :!: :idea:

Thanks for this. My threads are out of date and I just don't have the kind of patience and grammar to produce something like this.

Users should be directed to this thread in the future instead of my IE based COM threads. This work supersedes all of mine.

Sean
  • Members
  • 2462 posts
  • Last active: Feb 07 2012 04:00 AM
  • Joined: 12 Feb 2007
Very nice. IMO, however, it would've been even more clear if done using COM_L for AHK_L than COM for the old AHK. COM_Invoke being explicit may distract users from the main point.

jethrow
  • Moderators
  • 2810 posts
  • Last active:
  • Joined: 24 May 2009
Thanks for the feedback guys - I was hoping this would prove beneficial to the community. :D

IMO, however, it would've been even more clear if done using COM_L for AHK_L than COM for the old AHK. COM_Invoke being explicit may distract users from the main point.

I agree 100% :D In fact, that distraction was one of the main reasons I had originally decided to write this tutorial - to help bridge the gap in users' understanding. I had just finished writing this when Lexikos released AHK_L with object support. Then, when you released COM_L, I thought this tutorial would become rapidly outdated (which I still think). However, who knows when Chris will implement object support into the regular AHK release? Once that happens, I'll probably update this tutorial.

Sean
  • Members
  • 2462 posts
  • Last active: Feb 07 2012 04:00 AM
  • Joined: 12 Feb 2007
It's a DOM tutorial, so there is a lot to clarify other than just syntax. When using JavaScript, the top/base object is the Window object, not the WebBrowser object. In fact, the WebBrowser object is neither part of nor essential for DOM manipulation. So, in a strict sense,
javascript: document.all[11].click()
is actually equivalent to
COM_Invoke(window, "document.all[11].click()")
not to
COM_Invoke(pwb, "document.all[11].click()")
However, this tutorial uses pwb in all of the examples, which might lead readers to believe that using the WebBrowser object is required for DOM manipulation, rather than a mere convenience.

jethrow
  • Moderators
  • 2810 posts
  • Last active:
  • Joined: 24 May 2009

When using JavaScript, the top/base object is the Window object, not the WebBrowser object.

Thank you for pointing this out Sean. I hadn't even thought about, nor fully understood this. In your example, this would be technically accurate, right?
window := COM_Invoke(pwb, "contentWindow")

EDIT - :oops: and these are the ignorant questions I end up asking if I don't study & test the code before I ask questions. See tank's answer in the next post.

tank
  • Moderators
  • 4242 posts
  • Last active: Yesterday, 10:35 PM
  • Joined: 21 Dec 2007
contentWindow is not accessible from the browser object, but from a frame object.

Getting a window object from a pwb is done like this:
window:=com_invoke(pwb,"document.parentwindow")
window:=COM_QueryService(pwb,	"{332C4427-26CB-11D0-B483-00C04FD90119}",	"{332C4427-26CB-11D0-B483-00C04FD90119}")
contentWindow is typically used to acquire a window handle from a frame
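
In JavaScript terms, the step from a document back up to its window looks like the sketch below. parentWindow is the Internet Explorer property used above, and defaultView is the standard equivalent in other browsers. The name getWindowOf is a hypothetical helper.

```javascript
// Hypothetical helper: climb from a document object to the window that owns it.
// parentWindow is the IE-specific property; defaultView is the standard one.
function getWindowOf(doc) {
  return doc.parentWindow || doc.defaultView || null;
}
```

From script running inside a page, getWindowOf(document) is the same object as window. From the browser object the path starts one level higher, at its document property.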

tank
  • Moderators
  • 4242 posts
  • Last active: Yesterday, 10:35 PM
  • Joined: 21 Dec 2007
Even more confusing because Document must be accessed with COM prior to accessing Window.

I generally don't climb back up to or query the DOM for the window unless I need to use the execScript method.

For sheer convenience I am going to post some JScript and VBS that as far as I can tell isn’t easily found elsewhere.
The main thing to know is the browser encompasses the toolbars and address bar while the window object refers only to the content area
JScript that creates a new browser and navigates
<script language="jscript">
var  pwb = new ActiveXObject("InternetExplorer.Application");
pwb.visible=true;
pwb.Navigate("www.AutoHotkey.com");
</script>
Same as above but in VBS
<script language="vbscript">
set pwb = CreateObject("InternetExplorer.Application")
pwb.visible=true
pwb.Navigate("www.AutoHotkey.com")
set pwb =  nothing ' releases the variable
</script>

Replicates most of the functionality of the IEGet function posted above in AHK.
<script language="vbscript">
Function IEGet(Title)
   ''Gets window by the title
   ''Title is case sensitive
   Set oShellApplication = CreateObject( "Shell.Application" )
   Set oWindows = oShellApplication.Windows
   For Each oWindow In oWindows
      if   InStr(1,oWindow.LocationName,Title,1) > 0 and InStr(1,oWindow.FullName,"iexplore.exe",1) > 0 then
         set IEGet=oWindow '' create an object variable
         exit for
      end if
   next
   Set oWindows = nothing   
   Set oShellApplication = nothing   
End Function
</script>
Obviously this can be done in JScript as well, but I will not bother. From here we can begin to see the similarities in the code structure.

majkinetor
  • Moderators
  • 4512 posts
  • Last active: Oct 02 2013 02:33 PM
  • Joined: 24 May 2006
I guess it's possible to create a script and framework from the scripts around the forum that:

1. Loads given HTML into the IE control.
2. Creates the tree presentation of DOM or all automatable objects with user filter.
3. Lets the user associate some code with the object; some hardcoded operations would be nice for a start, with a plugin architecture to add more; click, send, size for the start. Dynamic function calls could be used as plugins for new operations if desired.
4. Lets the user save such a project and run it.

tank
  • Moderators
  • 4242 posts
  • Last active: Yesterday, 10:35 PM
  • Joined: 21 Dec 2007
Are you referring to an AHK browser in a GUI that functions much the same as HTA applications?
If so, of course. IMO triggering an AHK script to run from an HTML page is super easy if it's a trusted source.


You can use the FileRun function with an onclick event to trigger scripts, and even pass command line parameters. The code below is copied from my own proprietary library. I use this in a call center environment with great reliability.
If not obvious, the below is VBS:
Set oShell = CreateObject( "WScript.Shell" )
Set objFSO = CreateObject("Scripting.FileSystemObject")
Function FileRun(strFile,filename,async)
'' strFile - full run command line example format """C:\Program Files\AutoHotkey\AutoHotkey.exe"" /ErrorStdOut ""C:\Documents and Settings\nbk64jq\Desktop\MyHots.ahk"" CMDArg"""
'' filename - path and filename
	FileRun=false
'' async - wait for file to exit before continuing accepts true false
	e=FileExists(filename)
	if	e then
		r=isRunning(filename)
		if	not r then
			oShell.Run strFile, 6, async 
			FileRun=true
			
		end if
	end if
End Function

Function isRunning(CmdTitle)
	isRunning=false
	For Each strProcess In GetObject("winmgmts:").InstancesOf("win32_process")
		if	InStr(1,strProcess.CommandLine,CmdTitle,1) > 2 then '' compare against a number, not the string "2"
			isRunning=true
		end if
	next

End Function


Function FileExists(strFile)
	FileExists = objFSO.FileExists(strFile)
	
End Function

Edit: I have tried using events from AHK for this, but in non-standard deployments of IE these fail, and in the past they caused the GUI to crash if too many events were triggered.

sinkfaze
  • Moderators
  • 6365 posts
  • Last active:
  • Joined: 18 Mar 2008
So given the discussion in this thread about "recursing" through frames to a page element, is it possible to amend some of the current iWeb functions or create a new iWeb function(s) to accommodate frame recursion? Or is it just something worth mentioning in the course of this tutorial, for example?

BTW - Excellent tutorial jethrow, glad I could help in whatever small way I could. :wink:

tank
  • Moderators
  • 4242 posts
  • Last active: Yesterday, 10:35 PM
  • Joined: 21 Dec 2007
Edit: I suppose you're referring to this library:
iWeb functions
I have considered recursing into frames for some time now, but unfortunately the options and depth of recursion seem a bit complex to make reliable. I might get around to it sometime in the future, but it really seems that frames might be something we want to leave to users more versed in DOM and using Invoke exclusively.

sinkfaze
  • Moderators
  • 6365 posts
  • Last active:
  • Joined: 18 Mar 2008
Would something like this be feasible? Obviously the user will be responsible for obtaining the frames themselves:

iWeb_clickDomObj(pwb,obj,frame="") { ; frame is comma delimited list of frames by name/id/number

  If pWin:=iWeb_DomWin(pwb) {
    if frame {
      Loop, Parse, frame, `,
        framePath.="document.all[" A_LoopField "].contentwindow."
      COM_Invoke(pWin,framePath . "document.all.item[" obj "].click")
    }
    else
      COM_Invoke(pWin,"Document.all.item[" obj "].click")
    d=1
    COM_Release(pWin), VarSetCapacity(framePath,0)
  }
  }
  Return d
}

; example usage
iWeb_clickDomObj(pwb,"username","_sweclient,_swecontent,_sweview,_svf1")

Probably very sloppy code by your standards but you get the general idea.
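
The frame path idea above can be sketched in JavaScript as a loop that steps down one frame per name. This is only an illustration under assumptions: resolveFrameWindow is a hypothetical name, each entry is assumed to be the name or id of a frame element reachable through document.all, and frames loaded from another domain are typically not accessible this way.

```javascript
// Hypothetical helper: walk a comma delimited list of frame names/ids,
// starting from a window object, and return the innermost frame's window.
// Returns null as soon as one level of the path cannot be found.
function resolveFrameWindow(win, framePath) {
  var names = framePath ? framePath.split(",") : [];
  var current = win;
  for (var i = 0; i < names.length; i++) {
    var frame = current.document.all[names[i]];
    if (!frame || !frame.contentWindow) {
      return null; // frame not found at this level
    }
    current = frame.contentWindow;
  }
  return current;
}
```

With the window in hand, the click itself is the same call at any depth, for example resolveFrameWindow(window, "_sweclient,_swecontent").document.all.username.click(). The null check lets a script report which page layout it actually found instead of failing part way down the path.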

Carcophan
  • Members
  • 1578 posts
  • Last active: Nov 27 2013 06:46 PM
  • Joined: 24 Dec 2008
Does this mean I have to change my Sig now? :(

tank
  • Moderators
  • 4242 posts
  • Last active: Yesterday, 10:35 PM
  • Joined: 21 Dec 2007
Very astute my young padawan :D
I suppose if you want to update the other functions (I haven't time right now), I'll update the zip file with it, or you can wait months (perhaps years :oops: ) till I get around to it.


@Carcophan no, probably not. The vast majority still won't follow this tutorial since they have no interest whatsoever in actually learning JavaScript or DOM. This tutorial is still aimed at folks who know at least that much, and is meant to help them convert similar JavaScript to compatible COM.AHK

Most still won't understand COM, DOM, or scripting, so I think the quote still applies.