IE - web scraping wait for element function

DanRim
Posts: 85
Joined: 20 Jul 2018, 15:16

23 Oct 2020, 14:02

Hello web scraping masters,

I am trying to write a function that handles page-loading delays, but for some reason the function does nothing.
Could someone help me work out why it is not working? Or maybe someone uses something similar?

What I plan to do: while the element is not visible on the page, keep waiting for up to one minute; if the element has not appeared after one minute, exit the function.

Code: Select all

#SingleInstance, Force

^y::
; www.yahoo.com ; searching for id element Header
pwb := WBGet()

waitForIdElement("Header") ; call function, 

waitForIdElement(htmlIdElement)
{
   while(!Element := pwb.document.getElementByID(%htmlIdElement%)) ; Make sure element exist before moving forward
	Sleep, 50
	MsgBox, what kind part it is?

   while(Element.offsetWidth=0) AND (Element.offsetHeight=0) ; if height = 0 and width = 0 then not visible
	Sleep, 50
   MsgBox, this is second msg box
}
return

; connection to IE
WBGet(WinTitle="ahk_class IEFrame", Svr#=1) {               ;// based on ComObjQuery docs
   static msg := DllCall("RegisterWindowMessage", "str", "WM_HTML_GETOBJECT")
        , IID := "{0002DF05-0000-0000-C000-000000000046}"   ;// IID_IWebBrowserApp
;//     , IID := "{332C4427-26CB-11D0-B483-00C04FD90119}"   ;// IID_IHTMLWindow2
   SendMessage msg, 0, 0, Internet Explorer_Server%Svr#%, %WinTitle%
   if (ErrorLevel != "FAIL") {
      lResult:=ErrorLevel, VarSetCapacity(GUID,16,0)
      if DllCall("ole32\CLSIDFromString", "wstr","{332C4425-26CB-11D0-B483-00C04FD90119}", "ptr",&GUID) >= 0 {
         DllCall("oleacc\ObjectFromLresult", "ptr",lResult, "ptr",&GUID, "ptr",0, "ptr*",pdoc)
         return ComObj(9,ComObjQuery(pdoc,IID,IID),1), ObjRelease(pdoc)
      }
   }
}

^+x::ExitApp
rommmcek
Posts: 1219
Joined: 15 Aug 2014, 15:18

23 Oct 2020, 15:48

Try: getElementByID(htmlIdElement) (removed % signs).
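
As a rough illustration of why the % signs matter (AutoHotkey v1; the variable contents below are made up for the example): inside an expression, %var% is a double dereference, so it reads the variable whose name is stored in var rather than var itself.

```autohotkey
; AutoHotkey v1 sketch; the values are hypothetical and only illustrate the lookup.
htmlIdElement := "Header"
Header := "value stored in the variable named Header"

plain  := htmlIdElement      ; the string "Header"
double := %htmlIdElement%    ; contents of the variable Header, not the string "Header"

MsgBox % "plain: " plain "`ndouble: " double
```

In the function from the first post there is no variable named Header, so getElementByID(%htmlIdElement%) would most likely receive an empty value instead of "Header".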
DanRim
Posts: 85
Joined: 20 Jul 2018, 15:16

24 Oct 2020, 03:36

@rommmcek removing the %% signs did not help; I had tried that before as well. Maybe it is just not possible to make this work inside a function and pass the parameter correctly.
Jim Dunn
Posts: 378
Joined: 01 Sep 2020, 20:21
Location: NSW

24 Oct 2020, 03:58

Does the page you are trying to load actually have an element with ID="Header"?

(also, I agree, the %s will probably need to be removed for the code to work - there are some other issues - but let's clarify exactly what you want it to wait for first... ;) )

Where does the code below ; connection to IE come from? (I'm assuming you didn't write it yourself?)
Last edited by Jim Dunn on 24 Oct 2020, 04:12, edited 1 time in total.
DanRim
Posts: 85
Joined: 20 Jul 2018, 15:16

24 Oct 2020, 04:11

@Jim Dunn I will try to describe it in more detail. Let's say I am on the google.com page and press the hotkey to look for the HTML element "Header": the code starts and does nothing, which is normal. Then I switch to yahoo.com, whose main page does have an HTML element "Header". When the page is fully loaded, I want to get confirmation that the element was found and the page is fully loaded.

The problem is that the script does nothing. I removed the %% signs. I am starting to think that the function itself is fine, and that this syntax is just not valid with COM for HTML elements.

The HTML element I am looking for exists in the Yahoo HTML structure.
Attachments
header.JPG
Jim Dunn
Posts: 378
Joined: 01 Sep 2020, 20:21
Location: NSW

24 Oct 2020, 04:19

Ok.

You could try

Code: Select all

while(!(pwb.document.getElementByID(htmlIdElement).outerHtml)) ; Make sure element exist before moving forward
	Sleep, 50
Element := pwb.document.getElementByID(htmlIdElement)
(untested)

But that's a bit 'hacky' - there's probably a better way.
DanRim
Posts: 85
Joined: 20 Jul 2018, 15:16

24 Oct 2020, 04:39

@Jim Dunn it is not working. What I discovered is that if I run the code outside a function, I get an error.
I tried your code the same way as well, but it does nothing. I think it is not valid to write this as a function.

Code: Select all

^y::

; www.yahoo.com ; searching for id element Header
pwb := WBGet()

;~ waitForIdElement("Header") ; call function with 

;~ waitForIdElement(htmlIdElement)
;~ {
   while(!Element := pwb.document.getElementByID("Header")) ; Make sure element exist before moving forward
	Sleep, 50
	MsgBox, what kind part it is?

   while(Element.offsetWidth=0) AND (Element.offsetHeight=0) ; if height = 0 and width = 0 then not visible
	Sleep, 50
   MsgBox, this is second msg box
;~ }

/* ; example from forum
while(!(pwb.document.getElementByID("Header").length())) ; Make sure element exist before moving forward
	Sleep, 50
Element := pwb.document.getElementByID("Header")
*/

return
Attachments
aaaaaaaa.JPG
Jim Dunn
Posts: 378
Joined: 01 Sep 2020, 20:21
Location: NSW

24 Oct 2020, 04:48

That error message is telling you that document is not a valid member of pwb at that point - so you don't have a proper DOM loaded in pwb.
Until you do you can't search it for anything.

You probably need to do something like:

Code: Select all

while pwb.busy || pwb.ReadyState != 4
   Sleep, 25
I haven't ventured into WBGet() to see what that's trying to do, but I'm assuming you copied it exactly from somewhere and know it works, and loads a valid DOM which you can query like this...
DanRim
Posts: 85
Joined: 20 Jul 2018, 15:16

24 Oct 2020, 05:15

@Jim Dunn I did not think to show the error earlier, because it had not occurred to me to run the code outside a function :)
Adding this code (below) did not help either. I tried putting it both inside and outside the function. I think my WBGet() is probably correct, because it works with other scripts. But I am really not sure what the logic behind it is, so you never know.

Code: Select all

while pwb.busy || pwb.ReadyState != 4
   Sleep, 25
I found other code that works (but not as a function). The problem is that it creates a new IE window, and I need it to work on an already existing one. I tried to adapt the code, but I am still getting errors.

Code: Select all

^y::
; page is still the same www.yahoo.com home page, I am just trying to locate another ID
; works, but it creates a new IE window; does not work on the active window, and does not work with ComObjActive either
pwb := WBGet()
IE := ComObjCreate("InternetExplorer.Application") ; testing with ComObjActive - did not work
IE.Visible := 1
IE.Navigate("www.yahoo.com")

while IE.Busy
   Sleep 10

;if I change IE to pwb I get error
if IE.document.getElementById("darla-assets-js-top")
   MsgBox Element id="ysch" exists
else
   MsgBox Element id="ysch" does not exist
return
Attachments
pwb.JPG
Jim Dunn
Posts: 378
Joined: 01 Sep 2020, 20:21
Location: NSW

24 Oct 2020, 05:39

You need to declare pwb as global inside your function, or the function can't see it.

I should have spotted that earlier... :oops: - but I didn't notice until I tested the whole code.

This works for me with an IE window open to yahoo.com

Code: Select all

#SingleInstance, Force

^y::
; www.yahoo.com ; searching for id element Header
soundbeep
pwb := WBGet()

while pwb.busy || pwb.ReadyState != 4
   Sleep, 25

 
waitForIdElement("Header") ; call function, 

waitForIdElement(htmlIdElement)
{
	global pwb
	
	while(!pwb.document.getElementByID(htmlIdElement).outerHtml) ; Make sure element exist before moving forward
		Sleep, 50
	Element := pwb.document.getElementByID(htmlIdElement)
	
	MsgBox % Element.outerHTML
	
	MsgBox, what kind part it is?

	while(Element.offsetWidth=0) AND (Element.offsetHeight=0) ; if height = 0 and width = 0 then not visible
		Sleep, 50
	MsgBox, this is second msg box
}
return

; connection to IE
WBGet(WinTitle="ahk_class IEFrame", Svr#=1) {               ;// based on ComObjQuery docs
   static msg := DllCall("RegisterWindowMessage", "str", "WM_HTML_GETOBJECT")
        , IID := "{0002DF05-0000-0000-C000-000000000046}"   ;// IID_IWebBrowserApp
;//     , IID := "{332C4427-26CB-11D0-B483-00C04FD90119}"   ;// IID_IHTMLWindow2
   SendMessage msg, 0, 0, Internet Explorer_Server%Svr#%, %WinTitle%
   if (ErrorLevel != "FAIL") {
      lResult:=ErrorLevel, VarSetCapacity(GUID,16,0)
      if DllCall("ole32\CLSIDFromString", "wstr","{332C4425-26CB-11D0-B483-00C04FD90119}", "ptr",&GUID) >= 0 {
         DllCall("oleacc\ObjectFromLresult", "ptr",lResult, "ptr",&GUID, "ptr",0, "ptr*",pdoc)
         return ComObj(9,ComObjQuery(pdoc,IID,IID),1), ObjRelease(pdoc)
      }
   }
}

^+x::ExitApp
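
To see the scoping rule in isolation, here is a small sketch for AutoHotkey v1 (greeting, ShowLocal and ShowGlobal are made-up names, not part of the script above). Functions are assume-local by default, so a variable assigned outside a function is not visible inside it unless it is declared global there.

```autohotkey
; AutoHotkey v1 sketch; all names here are hypothetical.
greeting := "set outside any function"
ShowLocal()
ShowGlobal()
return

ShowLocal() {
    ; No declaration: "greeting" is a new, empty local variable here.
    MsgBox % "Without global: [" greeting "]"
}

ShowGlobal() {
    global greeting   ; the name now refers to the script-wide variable
    MsgBox % "With global: [" greeting "]"
}
```

Passing the browser object to the function as a parameter would be another way to avoid the problem without a global declaration.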

Last edited by Jim Dunn on 24 Oct 2020, 05:51, edited 3 times in total.
gregster
Posts: 5814
Joined: 30 Sep 2013, 06:48

24 Oct 2020, 05:40

DanRim wrote:

Code: Select all

pwb := WBGet()
IE := ComObjCreate("InternetExplorer.Application") ; testing with ComObjActive - did not work
Use pwb. instead of IE. Currently, you first get a reference to an existing Internet Explorer window ( pwb := WBGet() ), assuming there is one, but then you never use pwb; you use a newly created IE COM object instead. Replace every IE. with pwb. throughout the snippet.

And remove

Code: Select all

IE := ComObjCreate("InternetExplorer.Application") ;
Otherwise this line would overwrite the reference you got in the line above (if you used pwb for both lines), or, if you leave it as is, it would create an unused IE instance that just takes up memory.
tank
Posts: 2855
Joined: 28 Sep 2013, 22:15

24 Oct 2020, 06:14

If you look closely, I have littered the solution in a couple of places.

Code: Select all

while (try !(element := some reference) && (a_index < 200)) ; just some arbitrary max limit
sleep 500
Purposely written poorly, because I'm trying to get you to understand and apply it, not to spoon-feed you.
rommmcek
Posts: 1219
Joined: 15 Aug 2014, 15:18

24 Oct 2020, 13:21

@DanRim: I think you have shown enough initiative of your own to deserve more complete help. However, I'm an amateur too, and there is no guarantee, even if it works for me...

Code: Select all

#NoEnv
#SingleInstance, Force

Url:= "www.yahoo.com"
htmlIdElement:= "darla-assets-js-top"

^y::
    IfWinNotExist, ahk_class IEFrame
    {   ; if no Internet Explorer window is open, create a new one and navigate to the desired site
        wb:= ComObjCreate("InternetExplorer.Application")
        wb.Visible:= true
        wb.Navigate(Url)
    } else {
        WinActivate, ahk_class IEFrame
        WinGetTitle, WinTitle, ahk_class IEFrame
        wb:= pwb_Get(WinTitle)
        WinAddr:= wb.document.url
        if (WinAddr ~= "yahoo\.com") ; if a Yahoo page is open (and its tab is selected/active), reuse it and
            wb.Navigate(Url)         ; navigate to the desired site
        else   
            wb.Navigate(Url, 2048) ; otherwise open a new tab and navigate to the desired site
        while !WinActive("Yahoo") ;&& A_Index <= 40
            sleep, 50
        wb:= pwb_Get()
    }

    While wb.readyState != 4 || wb.document.readyState != "complete" || wb.busy
            sleep, 50

    While !wb.document.GetElementById(htmlIdElement) ;&& A_Index <= 20
        Sleep, 50
        
    if wb.document.getElementById(htmlIdElement)
       MsgBox Element id="%htmlIdElement%" exists
    else
       MsgBox Element id="%htmlIdElement%" does not exist
return

PWB_Get(WinTitle="A", Svr#=1) ; Jethrow - http://www.autohotkey.com/board/topic/47052-basic-webpage-controls-with-javascript-com-tutorial/
{
	Static msg := DllCall("RegisterWindowMessage", "str", "WM_HTML_GETOBJECT")
	, IID := "{0002DF05-0000-0000-C000-000000000046}" ; IID_IWebBrowserApp
	;,IID := "{332C4427-26CB-11D0-B483-00C04FD90119}" ; IID_IHTMLWindow2
	SendMessage, msg, 0, 0, Internet Explorer_Server%Svr#%, %WinTitle%
	If (ErrorLevel != "FAIL") {
		lResult := ErrorLevel, VarSetCapacity(GUID, 16, 0)
		If (DllCall("ole32\CLSIDFromString", "wstr", "{332C4425-26CB-11D0-B483-00C04FD90119}", "ptr", &GUID) >= 0) {
			DllCall("oleacc\ObjectFromLresult", "ptr", lResult, "ptr", &GUID, "ptr", 0, "ptr*", pdoc)
			Return ComObj(9, ComObjQuery(pdoc, IID, IID), 1), ObjRelease(pdoc)
		}
	}
	MsgBox, 262160, %A_ScriptName% - %A_ThisFunc%(): Error,  Unable to obtain browser object (PWB) from window:`n`n%WinTitle%
}
Note: There is probably a typo in tank's code (500 ms * 200 iterations = 100000 ms, which is more than 1.5 minutes).
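
For the one-minute limit asked about in the first post, the ideas from this thread (no % signs, try, an upper bound on waiting) could be combined roughly as follows. This is an untested sketch for AutoHotkey v1: WaitForIdElement, its wb parameter and timeoutMs are names chosen here for illustration, and WBGet() is assumed to be the function posted earlier in the thread.

```autohotkey
; Hypothetical helper, AutoHotkey v1. wb is passed as a parameter,
; so no "global pwb" declaration is needed inside the function.
; Returns the element once it exists and has a non-zero size,
; or an empty string after timeoutMs milliseconds.
WaitForIdElement(wb, id, timeoutMs := 60000) {
    start := A_TickCount
    Loop {
        try {
            el := wb.document.getElementById(id)
            ; outerHTML is assumed to be blank while the element is missing
            if (el.outerHTML != "" && (el.offsetWidth > 0 || el.offsetHeight > 0))
                return el
        }
        if (A_TickCount - start >= timeoutMs)
            return ""
        Sleep, 50
    }
}

^y::
    pwb := WBGet()  ; WBGet() as posted earlier in this thread
    el := WaitForIdElement(pwb, "Header")
    if (IsObject(el))
        MsgBox, Element found and visible
    else
        MsgBox, Gave up after one minute
return
```

Measuring elapsed time with A_TickCount, rather than counting loop iterations, keeps the limit close to one minute even if individual COM calls are slow.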
