Webscraping .ASPX pages - variable problem

Posts: 1736
Joined: 22 Jan 2017, 19:37

09 Feb 2017, 13:54

Hello, and thanks in advance to anyone who can help me with this problem. If I can get past this one hurdle, I believe it will help me tremendously.
My AHK toolset is still extremely limited, as you can probably tell.
I'm scraping a public website with no scraping restrictions. The working code is below.
1) I can use startVar to increment a counter here, I think because it never gets converted to a string during processing (not sure).
2) What I'd like to do (not only for this but for lots of other pages that use these ListView repeaters) is replace
"_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_1_" with a variable that will also get processed.

So far I've tried every escape sequence, double-quote and other trick I can think of, but the variable never gets processed.

Code: Select all

#SingleInstance, Force

pwb := WBGet() ; WBGet() is not defined in this snippet; it has to be included in the script separately
URL = [URL omitted, but works fine]
pwb.Navigate(URL) ; Navigate to URL

while pwb.busy or pwb.ReadyState != 4 ; Wait for page to load
	Sleep, 100

startVar := "0"
while startVar <= 26 {
	; Build each element ID from the fixed prefix plus the row counter
	jName := pwb.document.getElementById("_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_1_"startVar).innerText
	jTCAA := pwb.document.getElementById("_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_5_"startVar).innerText
	jPhone := pwb.document.getElementById("_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_6_"startVar).innerText
	jInfo = %jName%|%jPhone%|%jTCAA% `r
	FileAppend, %jInfo%, c:\test\judgescrape2.txt
	startVar++ ; Move on to the next row
}

I'm sure it's something very simple. I've read all the variable and expression pages but haven't found the answer. I bet it's leaping out from the page at me, but I just can't find it.
I've tried things like:
FF = `"_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_1_`"
FF := ""_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_1_""
FF = `_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_1_

The replacement would read something like "jName := pwb.document.GetElementByID(FFstartVar).InnerText". I hope I'm getting my concept across, although I'm afraid maybe not.
In the long run, I can keep on using the variables like they are, as they work fine, but I'd love to simplify them.

Stuff n Things
Posts: 18
Joined: 23 Jan 2017, 12:42

Re: Webscraping .ASPX pages - variable problem

09 Feb 2017, 16:32

Have you tried this?

Code: Select all

jName := pwb.document.GetElementByID("_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_1_" . startVar).InnerText
Posts: 1736
Joined: 22 Jan 2017, 19:37

Re: Webscraping .ASPX pages - variable problem

09 Feb 2017, 17:55

Hi, thanks for your response. The code I've listed works (no "." needed).
What I wanted to do was replace "_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_1_" with another variable, say "preFix", doesn't matter, and concatenate it with "startVar".
Capn Odin
Posts: 1352
Joined: 23 Feb 2016, 19:45
Location: Denmark

Re: Webscraping .ASPX pages - variable problem

09 Feb 2017, 18:26

I do believe you need a space or dot.

Code: Select all

FF := "_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_1_"
jName := pwb.document.GetElementByID(FF startVar).InnerText
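To illustrate why the space or dot matters, here is a minimal sketch in AutoHotkey v1 syntax. The short value "prefix_" and the names id1, id2, id3 and legacy are made up for the example and simply stand in for the long ListView prefix; nothing here touches the web page.

```autohotkey
; Illustrative values only: "prefix_" stands in for the long ListView prefix.
FF := "prefix_"          ; expression assignment: the quotes delimit the string
startVar := 3

id1 := FF startVar       ; a space between two values concatenates them
id2 := FF . startVar     ; explicit dot, with a space on each side
id3 := FFstartVar        ; one single, never-assigned variable, so it is empty

legacy = "prefix_"       ; legacy assignment: the quote marks become part of the value

MsgBox, % "id1=" id1 "`nid2=" id2 "`nid3=[" id3 "]`nlegacy=" legacy
```

Written without a separator, FFstartVar is read as the name of one variable, which is why GetElementByID(FFstartVar) finds nothing. Likewise, an ID built from a legacy assignment with quotes around it carries the quote characters along and will not match any element.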
Please excuse my spelling I am dyslexic.
Posts: 1736
Joined: 22 Jan 2017, 19:37

Re: Webscraping .ASPX pages - variable problem

09 Feb 2017, 18:43

Thanks, Capn Odin, but neither works. Might I need to be using AHK_H instead? Maybe I'll try that. If it works I'll post right away.

Again, thanks!
Posts: 1736
Joined: 22 Jan 2017, 19:37

Re: Webscraping .ASPX pages - variable problem

09 Feb 2017, 18:46

Thanks, Capn Odin.

I tried both solutions, but neither worked.

I will try to use AHK_H, and see if there is something in that version that might let me do what I'm trying to do.

I'll post right away if it works.

Capn Odin
Posts: 1352
Joined: 23 Feb 2016, 19:45
Location: Denmark

Re: Webscraping .ASPX pages - variable problem

09 Feb 2017, 18:58

Code: Select all

pwb := ComObjCreate("InternetExplorer.Application") ; Create an IE object
pwb.Visible := True ; Make the IE object visible
pwb.Navigate("http://www.google.com") ; Navigate to a page that contains an element with the id built below
While pwb.Busy || pwb.ReadyState != 4 ; Wait for page to load
	Sleep, 100

var1 := "sf"
var2 := "div"

; The space between var1 and var2 concatenates them into the id "sfdiv"
MsgBox, % pwb.document.getElementById(var1 var2).InnerHTML

Edit: I have confirmed that this script works and it should be similar to your problem.
Posts: 1736
Joined: 22 Jan 2017, 19:37

Re: Webscraping .ASPX pages - variable problem

09 Feb 2017, 19:35

Thanks, Capn Odin. Still no joy, but I think you've got me on the right track. It does appear to be a timing issue. Please see the code below, which works as long as I pause at the message box before the loop runs.
Oddly, I have no timing issues when I don't use the second variable (FF).

#SingleInstance, Force

pwb := WBGet() ; WBGet() is not defined in this snippet; it has to be included in the script separately
URL = https://[MyURLHere].
pwb.Navigate(URL) ; Navigate to URL
while pwb.busy or pwb.ReadyState != 4 ; Wait for page to load
	Sleep, 100

FF := "_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_1_"
startVar := "0"

; If I wait five or ten seconds before clicking out of the msgbox, the code below works, and I get all 26 records. I admit to not understanding.
msgbox Wait a while

while startVar <= 26 {
	jName := pwb.document.getElementById(FF startVar).innerText
	jTCAA := pwb.document.getElementById("_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_5_"startVar).innerText
	jPhone := pwb.document.getElementById("_2076ade26f0648268e57e8bd7e3c3bd1_ListView_ElementsRepeater_txt_6_"startVar).innerText
	jInfo = %jName%|%jPhone%|%jTCAA% `r
	FileAppend, %jInfo%, c:\test\judgescrape2.txt
	startVar++ ; Move on to the next row
}


p.s. It worked for a while, then stopped. I am stumped.
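
A possible explanation, offered as a guess rather than a diagnosis: the ListView rows may be filled in by the page's own scripts after ReadyState has already reached 4, so the elements do not exist yet when the loop starts. That would fit the observation that pausing at the message box helps. The sketch below polls for an element instead of waiting a fixed time. WaitForElement and its timeoutMs parameter are hypothetical names introduced here, and the code assumes AutoHotkey v1.1, where getElementById yields a non-object value when no element matches.

```autohotkey
; Hypothetical helper (not part of the original script): poll until an element
; with the given id exists, or give up after timeoutMs milliseconds.
WaitForElement(pwb, id, timeoutMs := 10000) {
	start := A_TickCount
	Loop {
		el := pwb.document.getElementById(id)
		if IsObject(el)
			return el                 ; the element is present, hand it back
		if (A_TickCount - start >= timeoutMs)
			return ""                 ; timed out, nothing found
		Sleep, 100                    ; wait briefly before checking again
	}
}
```

With pwb, FF and startVar set up as in the script above, the message box could then be replaced by a check such as `if !IsObject(WaitForElement(pwb, FF startVar))` followed by an error message, and the loop would start only once the first row is actually on the page.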

