Updated: 11/10/11
After a long break I've finally decided to cover what I believe to be a valuable skill when working with IE: looping through web pages.
The advantages of this method include dealing with dynamically created elements, checking whether a specific element exists, and using elements without knowing their names.
Most of the information used to make this guide was either provided by members of the community or taken from MSDN.
While working through this tutorial, understand that it's only as complicated as you make it. Because these concepts are a little harder to wrap one's mind around, I decided to use the forums for testing instead of good ole Google.
For starters let's get a handle to an open IE window:
Code:
For IE in ComObjCreate("Shell.Application").Windows ; for each open window
    If InStr(IE.FullName, "iexplore.exe") ; check if it's an ie window
        break ; keep that window's handle
; this assumes an ie window is available. it won't work if not
This snippet is based on sean's IEGet() Function.
If you'd like to see more on the Shell Application and the methods available, check out this post by jethrow.
The reason I chose to use this method is so that you don't have to constantly open up new browsers while testing.

Moving on. Now that we have established our handle we can access the elements inside.
Code:
IE.Navigate("http://www.autohotkey.com/forum/") ; go to the ahk forum page
While IE.Busy ; simple wait for page to load
    Sleep, 100 ; 1/10 second
For testing purposes we want to make sure that we always start in the same place.
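A common variation on the wait loop also checks the ReadyState property (4 means complete) and gives up after a timeout so the script can't hang forever. The sketch below is only one way to do it; the function name WaitForLoad and the 30 second default are my own choices for illustration, not something from the original snippet.

```autohotkey
; hypothetical helper: waits until IE reports the page as loaded or the timeout runs out
WaitForLoad(IE, TimeoutMs := 30000) {
    Start := A_TickCount
    While (IE.Busy || IE.ReadyState != 4) { ; 4 = READYSTATE_COMPLETE
        If (A_TickCount - Start > TimeoutMs)
            return false ; gave up waiting
        Sleep, 100 ; 1/10 second
    }
    return true ; page finished loading
}

For IE in ComObjCreate("Shell.Application").Windows ; same handle lookup as before
    If InStr(IE.FullName, "iexplore.exe")
        break
IE.Navigate("http://www.autohotkey.com/forum/")
If (!WaitForLoad(IE))
    MsgBox The page did not finish loading in time.
```

Like the earlier snippet, this assumes an IE window is already open.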
Now this page doesn't have many forms or buttons like we had in the previous section. Instead we're going to access the Links (or Hyperlinks) collection.
Code:
Links := IE.Document.Links ; collection of hyperlinks on the page
Now that we have all the links we can loop through them. First let's see what the collection contains.
Code:
Loop % Links.Length ; check each link
    If ((Link := Links[A_Index-1].InnerText) != "") ; if the link is not blank
        Msg .= A_Index-1 . " " . Link . "`n" ; add item to the list
MsgBox % Msg ; display the list
What we've done here is use the .Length property to see how many items are in the collection. Then we looped through it and, using the .InnerText property, stored the text of each link that isn't empty in our variable so we can see everything the collection holds. The number to the left of each item is its index within the collection. From this we can determine which of the links are blank, because their index numbers will have been skipped.
Now that we can retrieve the links from a webpage, let's do something with them. We're going to search the page for your favorite tutorial writer's link and go to his profile.
Code:
ComObjError(false)
Links := IE.Document.Links
Loop % Links.Length
    If (Links[A_Index-1].InnerText = "Mickers") { ; replace my name with whatever text you'd rather search for
        ; or If InStr(Links[A_Index-1].InnerText, "Mickers")
        Links[A_Index-1].Click()
        break
    }
Ok, now we're getting somewhere. We've looped through a page and chosen a link based on its display text.
Now let's search for a specific post within General Chat.
Now, we could use the same method of parsing through the links until we find the one we want, but sometimes just knowing the link's InnerText isn't good enough. For example, say we want to select a link only when a specific username is present within the same row of text.
To do this we are going to look at a new member of the Document.All object. The Tags() method returns a collection of all elements with the tag name you pass between the parentheses. Now what's a tag, you say? It's the HTML term for a building block of a web page. We are going to use two kinds: tables and rows. Test this snippet of code to see what I mean:
Code:
MsgBox % IE.Document.All.Tags("table").Length
Place it right after the wait loop that follows the navigation to the forum in the first part of this section.
The MsgBox should have given you a 9. This is the total number of tables on that specific page. Now, tables by themselves aren't going to be of much use to us in this tutorial, so let's take it one step further.

We're now going to access the Rows property of the tables collection.
Code:
MsgBox % IE.Document.All.Tags("table").Rows.Length
Uh oh, what happened? It says Rows is an unknown name. What gives? Sorry I led you astray, but you have just learned something very important that will save you frustration in the future: you can't access a property that belongs to the items of a collection without specifying a single item of that collection.
Let me explain. Say you have a collection of, say... girls. You want a specific girl from the collection of all the girls in your school. To choose one you want to know things like what hair color they have, whether they're a cheerleader, etc. But you can't simply say "give me a girl that's a cheerleader and a redhead." Why? Because you have a COLLECTION, and a collection has no hair color of its own; you have to look at its members one at a time. Let me show you via some pseudo code:
Code:
Girls := ComObjCreate("Girls.Everywhere") ; here's our handle to the object
CollegeGirls := Girls.Document.All.Tags("College") ; this is our tables example
Loop % (Cheerleaders := CollegeGirls[Cheerleaders].Length) ; cheerleaders is our row example
    If ((DreamGirl := Cheerleaders.Girl[A_Index]) = "Redhead") ; here the girl property lets us look at each girl individually
        break ; break the loop once we find one
MsgBox % DreamGirl.Name ; use the name property to see the name of your future girlfriend
; if you're female, replace girls with boys and cheerleaders with football players
To save any unnecessary confusion: this is a fake object and none of these properties apply to IE.
Now we know that to access the text of a table we have to specify which table then look at each row one at a time like so:
Code:
Rows := IE.Document.All.Tags("table")[4].Rows ; table 4 holds all the post information
Loop % Rows.Length ; for each post
    Msg .= Rows[A_Index-1].InnerText . "`n" ; save the post data (the collection is 0-based, so subtract 1)
MsgBox % Msg
Now we can see the text inside each row. At this point you can use simple Regex or InStr functions to look for specific data.
Let's search for a specific post:
Code:
Rows := IE.Document.All.Tags("table")[4].Rows
Loop % Rows.Length
    If InStr(Rows[(Row := A_Index-1)].InnerText, "What's on your mind?") and InStr(Rows[Row].InnerText, "tomoe_uehara") ; Row keeps the 0-based index of the match
        break
MsgBox % Rows[Row].InnerText
I chose to search for this particular post because it's the most popular one on the site and is always on the first page.
Now that we've proven that we can locate a post based on the name and author, let's try to navigate to that post. I haven't had any luck using the Click() method to follow the hyperlink, so instead we're going to use a workaround:
Code:
URL := IE.LocationURL ; grab the current url
Rows := IE.Document.All.Tags("table")[4].Rows ; table 4 holds all the post information
Loop % Rows.Length
    If InStr(Rows[A_Index-1].InnerText, "What's on your mind?") and InStr(Rows[A_Index-1].InnerText, "tomoe_uehara") { ; if post title and author match
        HTML := Rows[A_Index-1].Cells[1].InnerHTML ; pull the html off the page
        Needle := "viewtopic.php?t=" ; string to search for
        StringGetPos, Pos, HTML, %Needle% ; get starting position of search string (0-based)
        Pos += 17 ; skip past the 16 character search string (StringMid counts from 1)
        StringMid, Post, HTML, Pos, 5 ; pull the 5 digit post number out of the html
        StringTrimRight, URL, URL, 13 ; cut the last 13 characters (forum.php?f= plus the forum number)
        URL .= "topic.php?t=" . Post ; set to topic + unique post number
        IE.Navigate(URL) ; navigate to the post
        break
    }
Here is what the actual InnerHTML looks like:
Quote:
<SPAN class=topictitle><A class=topictitle href="viewtopic.php?t=57174">« What's on your mind? »</A></SPAN><SPAN class=gensmall><BR>[ <IMG title="Goto page" alt="Goto page" src="templates/subSilver/images/icon_minipost.gif">Goto page: <A href="viewtopic.php?t=57174&start=0">1</A> ... <A href="viewtopic.php?t=57174&start=885">60</A>, <A href="viewtopic.php?t=57174&start=900">61</A>, <A href="viewtopic.php?t=57174&start=915">62</A> ] </SPAN>
You can see that the post's unique number is in there multiple times. I spent several hours trying to get the Click() method to work for this example but sometimes it's just easier to stick with what you know.
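If the fixed offsets above feel fragile, a regular expression is one alternative. This is only a minimal sketch, not what the script above does. It assumes the row's InnerHTML looks like the quote above and that the forum URL ends in viewforum.php?f=5, which is my guess based on the 13 characters trimmed earlier. The HTML and URL strings are hard-coded stand-ins so the snippet runs without IE.

```autohotkey
; stand-in for Rows[A_Index-1].Cells[1].InnerHTML (shortened from the quote above)
HTML := "<A class=topictitle href=""viewtopic.php?t=57174"">What's on your mind?</A>"
; stand-in for IE.LocationURL (assumed form of the General Chat url)
URL := "http://www.autohotkey.com/forum/viewforum.php?f=5"

If (RegExMatch(HTML, "viewtopic\.php\?t=(\d+)", Match)) { ; Match1 receives the digits after t=
    URL := RegExReplace(URL, "viewforum\.php.*$", "viewtopic.php?t=" . Match1) ; swap the page part of the url
    MsgBox % URL ; in the real script you would call IE.Navigate(URL) here
}
```

Because the pattern grabs however many digits follow t=, it doesn't depend on the post number being exactly 5 digits long.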

For the last part of this tutorial we are going to factor in one last thing. What if the post we're searching for is not on the first page? I have set up a post on a later page of General Chat for us to search for.
Code:
ComObjError(false) ; if post is not found it will cause a com error
Found := false ; initialize to false
URL := IE.LocationURL
Loop {
    Rows := IE.Document.All.Tags("table")[4].Rows
    Loop % Rows.Length
        If InStr(Rows[A_Index-1].InnerText, "IE Test Post") and InStr(Rows[A_Index-1].InnerText, "Mickers") { ; this is the test post i put together
            HTML := Rows[A_Index-1].Cells[1].InnerHTML
            Needle := "viewtopic.php?t="
            StringGetPos, Pos, HTML, %Needle%
            Pos += 17
            StringMid, Post, HTML, Pos, 5
            StringTrimRight, URL, URL, 13
            URL .= "topic.php?t=" . Post
            IE.Navigate(URL)
            Found := true ; set to found
            break
        }
    If (Found = false) { ; if post wasn't located
        Links := IE.Document.Links
        Loop % Links.Length
            If (Links[A_Index-1].InnerText = "Next") { ; find the "next" link
                Links[A_Index-1].Click() ; and click it
                break
            }
        Sleep, 500 ; give ie a moment to start loading
        While IE.Busy ; wait for the next page before searching it
            Sleep, 100
    }
} until (Found = true) ; until post is located
All we've done here is add a check to see whether we located our post. If it isn't found on the current page, the script clicks the "Next" link and searches the following page, repeating until the post is found.
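One thing to watch for: if the post isn't on any page, the last page has no "Next" link and the loop never ends. A possible guard is to wrap the link search in a function that reports whether it clicked anything. The name ClickLinkByText is hypothetical, something I made up for this sketch rather than a built-in.

```autohotkey
; hypothetical helper: clicks the first link whose text equals Text
; returns true if a link was clicked, false if no link matched
ClickLinkByText(IE, Text) {
    Links := IE.Document.Links
    Loop % Links.Length
        If (Links[A_Index-1].InnerText = Text) { ; same comparison used in the loops above
            Links[A_Index-1].Click()
            return true
        }
    return false
}
```

Inside the paging loop you could then test If (!ClickLinkByText(IE, "Next")) and break out of the outer loop when it fails, since that means there are no more pages to search.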
That concludes the new section of my guide. I know this was a little more complex than simply filling in forms and clicking buttons that you know exist on a web page. However, if you want to move on to more complicated IE COM scripts, being able to search for items and test whether they exist on the page is very useful.
Here is the complete script, which should run standalone:
Code:
#SingleInstance force
ComObjError(false)
Found := false
For IE in ComObjCreate("Shell.Application").Windows
    If InStr(IE.FullName, "iexplore.exe")
        break
IE.Navigate("http://www.autohotkey.com/forum/")
While IE.Busy
    Sleep, 100
Links := IE.Document.Links
Loop % Links.Length
    If (Links[(A_Index - 1)].InnerText = "General Chat") {
        Links[(A_Index - 1)].Click()
        break
    }
While IE.Busy
    Sleep, 100
URL := IE.LocationURL
Loop {
    Rows := IE.Document.All.Tags("table")[4].Rows
    Loop % Rows.Length
        If InStr(Rows[(A_Index - 1)].InnerText, "IE Test Post") and InStr(Rows[(A_Index - 1)].InnerText, "Mickers") {
            HTML := Rows[(A_Index - 1)].Cells[1].InnerHTML
            Needle := "viewtopic.php?t="
            StringGetPos, Pos, HTML, %Needle%
            Pos += 17
            StringMid, Post, HTML, Pos, 5
            StringTrimRight, URL, URL, 13
            URL .= "topic.php?t=" . Post
            IE.Navigate(URL)
            Found := true
            break
        }
    If (Found = false) {
        Links := IE.Document.Links
        Loop % Links.Length
            If (Links[(A_Index - 1)].InnerText = "Next") {
                Links[(A_Index - 1)].Click()
                break
            }
        Sleep, 500 ; give ie a moment to start loading
        While IE.Busy ; wait for the next page before searching it
            Sleep, 100
    }
} until (Found = true)
ExitApp
Please leave me some feedback or questions you may have. Thanks for reading!
