Web Scraping with AutoHotkey & COM Tutorial- GUI syntax writer and demo videos

Helpful script writing tricks and HowTo's
Joe Glines
Posts: 770
Joined: 30 Sep 2013, 20:49
Location: Dallas

24 May 2015, 20:12

I created an AutoHotkey script that helps write AutoHotkey syntax for web scraping.

YouTube Demonstration videos:
1) Intro- Pointer, Get values and Page Navigation
1.5) Intro- Troubleshooting & Getting correct content from page
2) Intro- Set values & clicks / Buttons
3) Intermediate- Isolating area and leveraging DOM/HTML
4) Advanced- Dealing with Frames
5) Intro- Troubleshooting tips
6) Intermediate- Loop over pages & extract data
7) Intermediate- Web scraping using ClassName
8) Intermediate- Web scraping using QuerySelector and QuerySelectorAll
9) Intro- Webinar on Intro to Web Scraping :superhappy:
10) Intro- Update to Web Scraping syntax writer
11) Intermediate- EventListeners & Triggering Events
12) Intermediate-Saving files / Pictures from a URL / Hyperlink
13) Intro-Review of Web scraping tools
14) Intermediate- Passing Method or Property to COM in a function
15) Intermediate- Extracting data from a table by walking the DOM


Also check out my other tutorials on using Selenium with AutoHotkey (this allows for scraping with Chrome, FireFox, IE, etc.) and using Chrome.ahk from GeekDude to Get and Set values.

Manipulating the Document Object Model in Javascript, from O'Reilly, is a good video that talks through the DOM.

I highly recommend using Fiddler to help monitor network traffic. Check out this page where I have some videos showing how to use Fiddler to monitor network traffic.

Here is where you can get the source code as well as compiled version of my AutoHotkey Web Scraping Syntax Writer

Videos and scripts to Login to Websites:
  1. Login to Facebook This first video is much longer & in-depth! I cover many of the reasons why I pick one method over another. I also have HellBent sit-in and ask questions so it should be a great starting point for noobs to Web Scraping.
  2. Login to Amazon
  3. Login to LinkedIn
  4. Login to Gmail / Google / YouTube
  5. Login to Pinterest
  6. Login to Twitter
  7. Login to Reddit

Examples of work automated via Web Scraping with AutoHotkey
  1. Submit StumbleUpon submissions
  2. Transfer data from one website/system to another
  3. How I exported over 4 million contacts from Lexis Nexis
  4. Extract status from SharePoint and email colleagues
  5. Select “x” number of items on website form
  6. Obtain Behavioral Targeting Data from your own Web Site
  7. Determine House status on Real Estate site
  8. Extract meta data about videos from website
  9. Automating saving invoices on Amazon for taxes
  10. Waiting for an element to be visible before clicking (this is using FindText instead of COM)
Comparison of Web Scraping to API calls


If you're new to web scraping, API calls, and http protocol, this is a great discussion you need to see!
Web scraping, APIS and HTTP Traffic
If you're trying to use URLdownload to Var, WinHTTPRequests, or automate Chrome, this is a great overview of the high-level principles of what you're doing and different approaches.
Last edited by Joe Glines on 03 Apr 2021, 08:11, edited 37 times in total.
Sign-up for the 🅰️HK Newsletter

ImageImageImageImage:clap:
AHK Tutorials:Web Scraping | | Webservice APIs | AHK and Excel | Chrome | RegEx | Functions
Training: AHK Webinars Courses on AutoHotkey :ugeek:
YouTube

:thumbup: Quick Access Popup, the powerful Windows folders, apps and documents launcher!
jethrow
Posts: 188
Joined: 30 Sep 2013, 19:52
Location: Iowa

Re: Intro to WebScraping and COM

25 May 2015, 01:37

Nice - should help make IE Com stuff way easy for beginners. Plus videos are good - & I felt way famous after watching ...

... a couple things...
  • The "M" in COM & DOM stands for Model
  • False=0 and 0!=-1, meaning -1 evaluates as True
  • I prefix raw pointers with "p" to signify it's a pointer (pwb, pdoc, etc.) - outside of the raw pointers in the WBGet() function, you aren't using any raw pointers in your script - only wrapped COM objects. Not that you have to follow my naming conventions, just sayin ...
  • iWB2 Learner FRAME.# should be interpreted as FRAME.DEPTH
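
To illustrate the True/False point in the list above, here is a minimal sketch in AutoHotkey v1. It uses `Scripting.Dictionary` purely as a convenient COM object (a stand-in chosen for illustration, not something from the videos) whose `Exists()` method returns a COM boolean. COM represents true as -1, and AHK v1 treats any non-zero number as true.

```autohotkey
#NoEnv
; Sketch only: Scripting.Dictionary is a stand-in COM object chosen for illustration.
dict := ComObjCreate("Scripting.Dictionary")
dict.Add("key", "value")

found   := dict.Exists("key")    ; COM boolean true (VARIANT_TRUE), typically shown as -1
missing := dict.Exists("other")  ; COM boolean false, 0

MsgBox % "found = " found "`nmissing = " missing

; Any non-zero number counts as true, so this check passes for -1.
if (found)
    MsgBox % "found is treated as true"

; The built-in constant True is 1, so comparing a COM boolean to it directly
; is not expected to match.
if (found = True)
    MsgBox % "this box is not expected to appear"
```

The practical takeaway is to test COM booleans with `if (value)` rather than comparing them to `True`.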
I'm interested to see your next videos - if they're good, I'll likely link them in my tutorial.
Joe Glines

Re: Intro to WebScraping and COM

25 May 2015, 07:06

Thanks for pointing out my inaccuracies! :)

And you, Mickers, Tank, Blackholyman, Sinkafaze, Lexikos, Sean (and I'm sure many others) ARE famous in my eyes as you have all greatly helped me and countless others!
Last edited by Joe Glines on 26 Nov 2015, 09:42, edited 1 time in total.
AmirOulad
Posts: 3
Joined: 05 Jun 2015, 17:44

Re: WebScraping and COM- GUI syntax writer and demo videos

07 Jun 2015, 12:49

Never mind,

Stupid computer with a .dll error.
Last edited by AmirOulad on 07 Jun 2015, 15:12, edited 2 times in total.
jethrow

Re: WebScraping and COM- GUI syntax writer and demo videos

07 Jun 2015, 14:14

AHK has some syntax designs that don't translate well into other languages. A good example is in AHK, the following 2 calls are the same:

Code: Select all

object.key
object["key"]
That being said, if you are focusing on web-scraping & tutorials, I'd highly recommend making your code easily translatable to jscript/javascript.

In your video you use:

Code: Select all

parentWindow.frames.2.0.document.location.href
This does not work in javascript:

Code: Select all

;// incorrect:
javascript: alert(window.frames.2.0.document.location.href)
;// correct:
javascript: alert(window.frames[2][0].document.location.href)
Note that the correct javascript syntax also works in AHK:

Code: Select all

parentWindow.frames[2][0].document.location.href
Another situation that has been frustrating for me when going to other languages is that AHK will allow you to call COM methods without using the parentheses:

Code: Select all

shell := ComObjCreate("Shell.Application")
;// windows method w/ parenthesis - arguably more proper
MsgBox % shell.windows().count
;// windows method w/o parenthesis - still works
MsgBox % shell.windows.count
Note the difference in jscript:

Code: Select all

var shell = new ActiveXObject("Shell.Application")
;// windows method w/ parenthesis
WScript.echo( shell.windows().count )
;// windows method w/o parenthesis - Error: Object doesn't support this property or method
WScript.echo( shell.windows.count )
Again, with your personal coding, you can of course do whatever works. But, if you're creating a code creation tool & tutorials, I'd highly recommend doing object member syntax so it works in other comparable languages as well.
Joe Glines

Re: WebScraping and COM- GUI syntax writer and demo videos

07 Jun 2015, 15:44

Thanks for the edification jethrow! While I can fumble through things, I'm definitely not the right person to be giving advice on best practices!
Soft
Posts: 174
Joined: 07 Jan 2015, 13:18
Location: Seoul

Re: WebScraping and COM- GUI syntax writer and demo videos

12 Sep 2015, 17:17

Very useful for me XD
AutoHotkey & AutoHotkey_H v1.1.22.07
Joe Glines

Re: WebScraping and COM- GUI syntax writer and demo videos

12 Sep 2015, 18:20

Thanks! I'd stopped doing more as this didn't get the traffic that I was hoping for. :(
boris321
Posts: 1
Joined: 16 Sep 2015, 15:53

Re: WebScraping and COM- GUI syntax writer and demo videos

17 Sep 2015, 14:08

Joe_Glines_Joetazz wrote: I created an AutoHotKey script that helps writing AutoHotKey syntax for WebScraping.

I've also created a demo video talking through how to use it. Right now I'm thinking I'll have at least 3 videos but we'll see how bored I get...

I found them useful. I have been wanting to know how to do this for years. Thank you!

Just a quick question, the inclusion of:

#Persistent
#SingleInstance Force
#NoEnv

Do they need to go in a specific folder to make the .ahk script functional?

Thank you!
Boris
subodhjoshi
Posts: 5
Joined: 26 Nov 2015, 07:06

Re: WebScraping and COM- GUI syntax writer and demo videos

26 Nov 2015, 08:53

Joe,
I use tabs to navigate page elements but, obviously, that is severely restricted. The methods you use will make it much easier and far more powerful. Quick question - what extension do you use in your SciTE editor to get the control+left click menu that you use so extensively? (Actually, it looks like you have written a script for it, per your first line! Can you share it? Thx.)
Joe Glines

Re: WebScraping and COM- GUI syntax writer and demo videos

26 Nov 2015, 09:12

That isn't a "SciTE" thing; that is my AutoHotkey script, which writes my AutoHotkey syntax (yes, that sounds confusing). If you run the script writer, you'll then be able to Control+left-click and the menus will appear. I noticed that Win10 removed a few of the icons, so the script will not run as-is. If you're on Win10 it will take some tweaks (or simply comment out the lines where it says it cannot find the icons).
Joe Glines

Re: WebScraping and COM- GUI syntax writer and demo videos

26 Nov 2015, 09:25

One more thing- while using COM has a learning curve it is light-years ahead of sending tabs! Once you get the hang of it, it is pretty easy and much, much more reliable! If you haven't done so already I highly recommend working through Jethrow's tutorial.

Another good one is on BlackHolyman's site regarding Logging into a website
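
To give a feel for what replaces the tab-sending approach, here is a minimal AutoHotkey v1 sketch that sets a field and clicks a button through COM. The URL and the element IDs (`username`, `submit`) are made up for illustration; on a real page you would look them up in the page source or with iWB2 Learner.

```autohotkey
#NoEnv
; Sketch only: the URL and element IDs below are hypothetical.
wb := ComObjCreate("InternetExplorer.Application")
wb.Visible := true
wb.Navigate("https://example.com/login")

; Wait until the page has finished loading.
while wb.Busy || wb.ReadyState != 4   ; 4 = READYSTATE_COMPLETE
    Sleep 100

doc := wb.document
doc.getElementById("username").value := "myUser"   ; set a value directly, no tabbing
doc.getElementById("submit").click()               ; click the button by reference
```

Because each element is addressed directly, the script does not depend on focus order the way a sequence of Tab keystrokes does.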
wolf_II
Posts: 2688
Joined: 08 Feb 2015, 20:55

Re: WebScraping and COM- GUI syntax writer and demo videos

26 Nov 2015, 10:23

@Joe_Glines_Joetazz
Please, is there an MD5 checksum for iWB2Learner.exe available?
Or a known download location?

I might have a corrupted copy. :(
Joe Glines

Re: WebScraping and COM- GUI syntax writer and demo videos

26 Nov 2015, 11:52

I'm not sure what you mean by md5 but you can download the files from here
wolf_II

Re: WebScraping and COM- GUI syntax writer and demo videos

26 Nov 2015, 12:24

Joe_Glines_Joetazz wrote: I'm not sure what you mean by md5 but you can download the files from here
@Joe_Glines_Joetazz:

Yes, that's where I got it from. (Links to http://www.autohotkey.net/~rbrtryn/Appl ... earner.zip)
But I get a virus warning from Avira, which is a first for me. Avira is usually very good with AHK exe's.
I wonder whether autohotkey.net could have been corrupted, or maybe just the zip file?

Anyway, MD5 is a commonly used checksum, and I got this:

Code: Select all

iWB2Learner.zip     c68647261aaefbc264bf29ffcf8c26e2
iWB2 Learner.exe    609e65a6e56eb45e95c4f1930fd24704
Can anybody please confirm that this is a valid file to use?
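
For anyone who wants to compute the same kind of checksum on their own copy, here is a minimal AutoHotkey v1 sketch. It shells out to `certutil`, which ships with Windows, rather than hashing in AHK itself; the file path is an assumption and should point at your downloaded file.

```autohotkey
#NoEnv
; Sketch only: the path is an assumption; point it at your downloaded copy.
file := A_ScriptDir "\iWB2Learner.zip"

; certutil -hashfile <file> MD5 prints the MD5 hash of the file.
shell := ComObjCreate("WScript.Shell")
exec  := shell.Exec(ComSpec " /c certutil -hashfile """ file """ MD5")

; Show whatever certutil wrote to standard output (hash or an error message).
MsgBox % exec.StdOut.ReadAll()
```

Note that a matching checksum only shows that two copies are identical; it says nothing about whether the file itself is safe.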
Joe Glines

Re: WebScraping and COM- GUI syntax writer and demo videos

26 Nov 2015, 13:13

It has been reported several times before as a false positive
wolf_II

Re: WebScraping and COM- GUI syntax writer and demo videos

26 Nov 2015, 13:19

@Joe_Glines_Joetazz: Thank you very much.
Joe Glines

Re: WebScraping and COM- GUI syntax writer and demo videos

26 Nov 2015, 14:21

I updated my source code above to remove icons that are not in Win10 and incorporate the use of getElementsByClassName which was introduced and explained to me by BlackHolyman. A lot of pages frequently have ClassNames and they are my "go to" method call now! :dance:
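
As a rough idea of what the ClassName approach looks like, here is a minimal AutoHotkey v1 sketch, not the code the syntax writer generates. The URL and the class name `price` are hypothetical, and `getElementsByClassName` needs the page to render in a reasonably modern IE document mode.

```autohotkey
#NoEnv
; Sketch only: the URL and the class name "price" are hypothetical.
wb := ComObjCreate("InternetExplorer.Application")
wb.Visible := true
wb.Navigate("https://example.com/products")

; Wait until the page has finished loading.
while wb.Busy || wb.ReadyState != 4   ; 4 = READYSTATE_COMPLETE
    Sleep 100

; Collect every element carrying the class and gather its text.
items := wb.document.getElementsByClassName("price")
out := ""
Loop % items.length
    out .= items[A_Index - 1].innerText "`n"   ; the collection is zero-based
MsgBox % out
```

The call returns a collection, so even when only one element matches you still index into it with `[0]`.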
subodhjoshi

Re: WebScraping and COM- GUI syntax writer and demo videos

28 Nov 2015, 08:12

@Joe - thx for the link to Jethrow's tutorial. It seems like you are producing a video version of the tutorial. That's very helpful - thx again for your effort. So far, I have managed to check out page elements. Next I need to see how I can manipulate a web page, feed values and click buttons. That's what I am after, not just scraping data from a rendered page. Eager to try out the further videos above.

One problem - iWB2Learner does not work as it does in your video. It seems to 'skew' page elements when it outlines them, it just misses them, etc. I have IE 11 and I see the same problem with the iWB2Learner downloaded from the link below as well as the one on Jethrow's page. But I can get page element names from the page source, so while it would have been very convenient, it's not a showstopper.

@Wolf_II - I downloaded iWB2Learner from sourceforge - http://sourceforge.net/projects/ahkcn/f ... 20Learner/
This seems to be newer version compared to one from Jethrow's page.
Joe Glines

Re: WebScraping and COM- GUI syntax writer and demo videos

28 Nov 2015, 08:52

Change IE to the 100% zoom level. It does this to me as well, and setting the zoom to 100% should fix the issue.