Could not get the webpage source into a variable Topic is solved

19 Feb 2018, 08:25

Hello... I know this kind of question has been asked many times on this forum, but I have not been able to find a workaround that fits my needs. I'd like to get a webpage's source into a variable. Below are my attempts (after reviewing a lot of examples found here):

Code: Select all

; This code works:
urlDownloadToFile,, index.txt

; This is not working:

urlDownloadToVar(url, ByRef src = "")
{
    obj := ComObjCreate("WinHttp.WinHttpRequest.5.1")
    obj.Open("GET", url)
    ;obj.SetRequestHeader("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)")
    obj.Send()
    return src := obj.ResponseText
}
; The function executes, but instead of the source code of the given webpage it returns the following:
; <title>403 Forbidden</title>
; <p>You don't have permission to access /
; on this server.<br />

; And one more example (that also doesn't work):

sourceDownloadToVar(url, ByRef src = "")
{
    wb := ComObjCreate("InternetExplorer.Application")
    wb.Visible := False
    wb.Navigate(url)
    html := wb.Document.All[0].outerHTML
    ; Also tried html := wb.Document.body.innerHTML (no luck)
    ; Also tried html := wb.Document.documentElement.outerHTML (the same result)
    return src := html
}
; This code outputs the following error:
; Error:  0x80004005 - Unspecified error
; Source: (null)
; Description: (null)
; HelpFile: (null)
; HelpContext: 0
; Specifically: document
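
A note on the second attempt: error 0x80004005 on `document` can appear when the Document property is read before Internet Explorer has finished loading the page. The sketch below assumes that is the cause here. The function name GetSourceViaIE and the 100 ms polling interval are made up for illustration, and it needs Internet Explorer to be installed.

```autohotkey
; Hypothetical helper: load a page in a hidden Internet Explorer instance
; and return the HTML of the loaded document.
GetSourceViaIE(url)
{
    wb := ComObjCreate("InternetExplorer.Application")
    wb.Visible := false
    wb.Navigate(url)
    ; Poll until the browser reports the page as loaded (READYSTATE_COMPLETE = 4).
    while (wb.Busy || wb.ReadyState != 4)
        Sleep, 100
    html := wb.Document.documentElement.outerHTML
    wb.Quit()   ; close the hidden browser instance
    return html
}
```

Note that this returns the document as rendered by the browser, which may differ from the raw source that UrlDownloadToFile saves.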

Re: Could not get the webpage source into a variable

19 Feb 2018, 08:58

Hello Pleveris.

Welcome to the AutoHotkey community forums.

The following example uses HttpQuery() by DerRaphael. It is easy to use: pass the URL as the first parameter and assign the function's return value to a variable of your choice (it will contain the HTML).

Code: Select all

; exmpl.searchAHKforum.httpQuery.ahk
; Searches the forum for a given Phrase: in this case httpQuery
VarURL      := ""
VarHtml := httpQuery(URL := VarURL)
msgbox % VarHtml

; HttpQuery() By DerRaphael

; httpQuery-0-3-6.ahk
httpQuery(byref p1 = "", p2 = "", p3="", p4="")
{   ; v0.3.6 (w) Oct, 26 2010 by derRaphael / zLib-Style release
   ; currently the verbs showHeader, storeHeader, and updateSize are supported in httpQueryOps
   ; in case u need a different UserAgent, Proxy, ProxyByPass, Referrer, and AcceptType just
   ; specify them as global variables - mind the varname for referrer is httpQueryReferer [sic].
   ; Also if any special dwFlags are needed such as INTERNET_FLAG_NO_AUTO_REDIRECT or cache
   ; handling this might be set using the httpQueryDwFlags variable as global
   global httpQueryOps, httpAgent, httpProxy, httpProxyByPass, httpQueryReferer, httpQueryAcceptType
       , httpQueryDwFlags
   ; Get any missing default Values
   ; check for syntax
   if ( VarSetCapacity(p1) != 0 )
      dReturn:=true,  result := "", lpszUrl := p1, POSTDATA := p2, HEADERS  := p3
   else
      result := p1, lpszUrl := p2, POSTDATA := p3, HEADERS  := p4
   defaultOps =
   (LTrim Join|
      if StrLen(%httpOption%)=0
         %httpOption% := httpDefault

   ; Load Library
   hModule := DllCall("LoadLibrary", "Str", "WinINet.Dll")

   ; SetUpStructures for URL_COMPONENTS / needed for InternetCrackURL
   offset_name_length:= "4-lpszScheme-255|16-lpszHostName-1024|28-lpszUserName-1024|"
                  . "36-lpszPassword-1024|44-lpszUrlPath-1024|52-lpszExtrainfo-1024"
   ; Struc Size               ; Scheme Size                  ; Max Port Number
   NumPut(60,URL_COMPONENTS,0), NumPut(255,URL_COMPONENTS,12), NumPut(0xffff,URL_COMPONENTS,24)

   ; Split the given URL; extract scheme, user, pass, authority (host), port, path, and query (extrainfo)

   ; Update variables to retrieve results
   ; Import any set dwFlags
   dwFlags := httpQueryDwFlags
   ; For some reasons using a selfsigned https certificates doesnt work
   ; such as an own webmin service - even though every security is turned off
   ; https with valid certificates works when
   if (lpszScheme = "https")

   ; Check for Header and drop exception if unknown or invalid URL
   if (lpszScheme="unknown") {
      Result := "ERR: No Valid URL supplied."
      Return StrLen(Result)
   }

   ; Initialise httpQuery's use of the WinINet functions.
   hInternet := DllCall("WinINet\InternetOpenA"
                  ,(httpProxy != 0 ?  INTERNET_OPEN_TYPE_PROXY : INTERNET_OPEN_TYPE_DIRECT )

   ; Open HTTP session for the given URL
   hConnect := DllCall("WinINet\InternetConnectA"
                  ,"uInt",hInternet,"Str",lpszHostname, "Int",nPort
                  ,"Str",lpszUserName, "Str",lpszPassword,"uInt",INTERNET_SERVICE_HTTP

   ; Do we POST? If so, check for header handling and set default
   if (Strlen(POSTDATA)>0) {
      HTTPVerb:="POST"
      if StrLen(Headers)=0
         Headers:="Content-Type: application/x-www-form-urlencoded"
   } else ; otherwise mode must be GET - no header defaults needed
      HTTPVerb:="GET"

   ; Form the request with proper HTTP protocol version and create the request handle
   hRequest := DllCall("WinINet\HttpOpenRequestA"
                  ,"uInt",hConnect,"Str",HTTPVerb,"Str",lpszUrlPath . lpszExtrainfo
                  ,"Str",ProVer := "HTTP/1.1", "Str",httpQueryReferer,"Str",httpQueryAcceptTypes
                  ,"uInt",dwFlags,"uInt",Context:=0 )

   ; Send the specified request to the server
   sRequest := DllCall("WinINet\HttpSendRequestA"
                  , "uInt",hRequest,"Str",Headers, "uInt",Strlen(Headers)
                  , "Str",POSTData,"uInt",Strlen(POSTData))

   VarSetCapacity(header, 2048, 0)  ; max 2K header data for httpResponseHeader
   VarSetCapacity(header_len, 4, 0)
   ; Check for returned server response-header (works only _after_ request been sent)
   Loop, 5
     if ((headerRequest:=DllCall("WinINet\HttpQueryInfoA","uint",hRequest

   If (headerRequest=1) {
      Loop,% headerLength
         if (*(&res-1+a_index)=0) ; Change binary zero to linefeed
   } else
      res := "timeout"

   ; Get 1st Line of Full Response
      RetValue := A_LoopField
   ; No Connection established - drop exception
   If (RetValue="timeout") {
      html := "Error: timeout"
      return -1
   }
   ; Strip protocol version from return value
   RetValue := RegExReplace(RetValue,"HTTP/1\.[01]\s+")
    ; List taken from
   HttpRetCodes := "100=Continue|101=Switching Protocols|102=Processing (WebDAV) (RFC 2518)|"
              . "200=OK|201=Created|202=Accepted|203=Non-Authoritative Information|204=No"
              . " Content|205=Reset Content|206=Partial Content|207=Multi-Status (WebDAV)"
              . "|300=Multiple Choices|301=Moved Permanently|302=Found|303=See Other|304="
              . "Not Modified|305=Use Proxy|306=Switch Proxy|307=Temporary Redirect|400=B"
              . "ad Request|401=Unauthorized|402=Payment Required|403=Forbidden|404=Not F"
              . "ound|405=Method Not Allowed|406=Not Acceptable|407=Proxy Authentication "
              . "Required|408=Request Timeout|409=Conflict|410=Gone|411=Length Required|4"
              . "12=Precondition Failed|413=Request Entity Too Large|414=Request-URI Too "
              . "Long|415=Unsupported Media Type|416=Requested Range Not Satisfiable|417="
              . "Expectation Failed|418=I'm a teapot (RFC 2324)|422=Unprocessable Entity "
              . "(WebDAV) (RFC 4918)|423=Locked (WebDAV) (RFC 4918)|424=Failed Dependency"
              . " (WebDAV) (RFC 4918)|425=Unordered Collection (RFC 3648)|426=Upgrade Req"
              . "uired (RFC 2817)|449=Retry With|500=Internal Server Error|501=Not Implem"
              . "ented|502=Bad Gateway|503=Service Unavailable|504=Gateway Timeout|505=HT"
              . "TP Version Not Supported|506=Variant Also Negotiates (RFC 2295)|507=Insu"
              . "fficient Storage (WebDAV) (RFC 4918)|509=Bandwidth Limit Exceeded|510=No"
              . "t Extended (RFC 2774)"
   ; Gather numeric response value
   RetValue := SubStr(RetValue,1,3)
   ; Parse through return codes and set according informations
      HttpReturnCode := SubStr(A_LoopField,1,3)    ; Numeric return value see above
      HttpReturnMsg  := SubStr(A_LoopField,5)      ; link for additional information
      if (RetValue=HttpReturnCode) {
         RetMsg := HttpReturnMsg

   ; Global HttpQueryOps handling
   if strlen(HTTPQueryOps)>0 {
      ; Show full Header response (useful for debugging)
      if (instr(HTTPQueryOps,"showHeader"))
         MsgBox % res
      ; Save the full Header response in a global Variable
      if (instr(HTTPQueryOps,"storeHeader"))
         global HttpQueryHeader := res
      ; Check for size updates to export to a global Var
      if (instr(HTTPQueryOps,"updateSize")) {
            If RegExMatch(A_LoopField,"Content-Length:\s+?(?P<Size>\d+)",full) {
               global HttpQueryFullSize := fullSize
         if (fullSize+0=0)
            HttpQueryFullSize := "size unavailable"

   ; Check for valid codes and drop exception if suspicious
   if !(InStr("100 200 201 202 302",RetValue)) {
      Result := RetValue " " RetMsg
      return StrLen(Result)
   }

   fsize := 0
   Loop            ; the receiver loop - rewritten in the need to enable
   {               ; support for larger file downloads
      bc := A_Index
      VarSetCapacity(buffer%bc%,1024,0) ; setup new chunk for this receive round
      ReadFile := DllCall("wininet\InternetReadFile"
      ReadBytes := NumGet(BytesRead)    ; how many bytes were received?
      If ((ReadFile!=0)&&(!ReadBytes))  ; we have had no error yet and received no more bytes
         break                         ; we must be done! so lets break the receiver loop
      Else {
         fsize += ReadBytes            ; sum up all chunk sizes for correct return size
         sizeArray .= ReadBytes "|"
      if (instr(HTTPQueryOps,"updateSize"))
         Global HttpQueryCurrentSize := fsize
   sizeArray := SubStr(sizeArray,1,-1)   ; trim last PipeChar
   VarSetCapacity( ( dReturn == true ) ? result : p1 ,fSize+1,0)      ; reconstruct the result from the chunk blocks generated above
   Dest := ( dReturn == true ) ? &result : &p1                 ; into our ByRef result variable
      , Dest += A_LoopField
   DllCall("WinINet\InternetCloseHandle", "uInt", hRequest)   ; close all opened
   DllCall("WinINet\InternetCloseHandle", "uInt", hInternet)
   DllCall("WinINet\InternetCloseHandle", "uInt", hConnect)
   DllCall("FreeLibrary", "UInt", hModule)                    ; unload the library
   if ( dReturn == true ) {
      VarSetCapacity( result, -1 )
      ErrorLevel := fSize
      return Result
   } else
      return fSize                      ; return the size - strings need update via VarSetCapacity(res,-1)
}

You can also use the second example from the docs for URLDownloadToFile if you prefer absolute simplicity in retrieving the HTML over a bulkier function.

Code: Select all

whr := ComObjCreate("WinHttp.WinHttpRequest.5.1")
whr.Open("GET", "", true)
whr.Send()
; Using 'true' above and the call below allows the script to remain responsive.
whr.WaitForResponse()
version := whr.ResponseText
MsgBox % version

Re: Could not get the webpage source into a variable

19 Feb 2018, 10:38

Hello Gio, this doesn't quite work for me. Your first example returns "-1"; as far as I understand from the function, that means something like an HTML timeout. As for the second one, as I said in my first message, it outputs:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<title>403 Forbidden</title>
<p>You don't have permission to access /
on this server.<br />
</body></html>

Maybe I need to configure something on that server for the WinHTTP object's ResponseText property to work correctly...

Re: Could not get the webpage source into a variable

19 Feb 2018, 10:55

On a second test, it seems that the HttpQuery() code I posted only works in the ANSI version of AutoHotkey (which was my current setting when I ran it). This is to be expected, I suppose, since the code is from 2008, but I cannot find a workaround for using this function in Unicode versions at the moment.
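
For anyone unsure which build they are running, here is a minimal sketch, assuming AutoHotkey v1.1, that reports it through the built-in variables A_IsUnicode, A_PtrSize and A_AhkVersion. A likely reason for the ANSI-only behaviour is that the function calls the ANSI WinINet entry points (InternetOpenA, InternetConnectA and so on) with the "Str" type, which passes UTF-16 strings on Unicode builds.

```autohotkey
; A_IsUnicode is 1 on Unicode builds and blank on ANSI builds.
build := A_IsUnicode ? "Unicode" : "ANSI"
; A_PtrSize is 4 on 32-bit builds and 8 on 64-bit builds.
MsgBox % "AutoHotkey " A_AhkVersion " (" build ", " (A_PtrSize * 8) "-bit)"
```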

Regarding the second example, I got that forbidden message once, but after running it again, the HTML was captured correctly. It even worked on Unicode versions here, but since it is such a short piece of code, I am not sure how reliable it is. If you still get no response after multiple attempts, it may also have something to do with your firewall settings.
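
If the 403 keeps coming back, one thing that may be worth ruling out is the request's User-Agent header, since some servers reject clients they do not recognise. The following is only a sketch under that assumption: the function name FetchWithUserAgent, the User-Agent string and the example URL are made up for illustration. It also checks the HTTP status before trusting ResponseText.

```autohotkey
; Hypothetical helper: fetch a page with WinHTTP, sending a browser-like User-Agent.
FetchWithUserAgent(url, userAgent := "Mozilla/5.0 (Windows NT 10.0; Win64; x64)")
{
    whr := ComObjCreate("WinHttp.WinHttpRequest.5.1")
    whr.Open("GET", url, false)                     ; false = synchronous request
    whr.SetRequestHeader("User-Agent", userAgent)   ; set after Open and before Send
    whr.Send()
    if (whr.Status != 200)                          ; e.g. 403 when the server refuses the request
        return "HTTP " whr.Status " " whr.StatusText
    return whr.ResponseText
}

; Placeholder URL, assumed for illustration only.
MsgBox % FetchWithUserAgent("https://www.example.com/")
```

Returning the status line on failure makes it easier to tell a server-side refusal apart from an empty page.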

Re: Could not get the webpage source into a variable

19 Feb 2018, 11:45

Hmm, I tried with the firewall and the antivirus disabled, and the result is the same. However, if I change the URL to a different website, it works successfully...

Re: Could not get the webpage source into a variable  Topic is solved

19 Feb 2018, 12:59

You can try this; it downloads the same text as UrlDownloadToFile. The code was posted by jNizM.

Code: Select all

;urlDownloadToFile,, index.txt

;see jNizM

msgbox % downloadtostring( "")

DownloadToString(url, encoding = "utf-8")
{
    static a := "AutoHotkey/" A_AhkVersion   ; user agent string sent with the request
    if (!DllCall("LoadLibrary", "str", "wininet") || !(h := DllCall("wininet\InternetOpen", "str", a, "uint", 1, "ptr", 0, "ptr", 0, "uint", 0, "ptr")))
        return 0
    c := s := 0, o := ""
    if (f := DllCall("wininet\InternetOpenUrl", "ptr", h, "str", url, "ptr", 0, "uint", 0, "uint", 0x80003000, "ptr", 0, "ptr"))
    {
        ; Read the response in chunks for as long as data is available.
        while (DllCall("wininet\InternetQueryDataAvailable", "ptr", f, "uint*", s, "uint", 0, "ptr", 0) && s > 0)
        {
            VarSetCapacity(b, s, 0)
            DllCall("wininet\InternetReadFile", "ptr", f, "ptr", &b, "uint", s, "uint*", r)
            ; For UTF-16 the character count is half the byte count, hence the shift.
            o .= StrGet(&b, r >> (encoding = "utf-16" || encoding = "cp1200"), encoding)
        }
        DllCall("wininet\InternetCloseHandle", "ptr", f)
    }
    DllCall("wininet\InternetCloseHandle", "ptr", h)
    return o
}
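
The value 0x80003000 passed to InternetOpenUrl is a combination of WinINet flags. As a small sketch (the variable names simply mirror the constant names from wininet.h), it can be decomposed like this:

```autohotkey
; WinINet flag values as defined in wininet.h.
INTERNET_FLAG_RELOAD                   := 0x80000000  ; fetch from the server, not the cache
INTERNET_FLAG_IGNORE_CERT_DATE_INVALID := 0x00002000  ; tolerate expired certificates
INTERNET_FLAG_IGNORE_CERT_CN_INVALID   := 0x00001000  ; tolerate host name mismatches

flags := INTERNET_FLAG_RELOAD | INTERNET_FLAG_IGNORE_CERT_DATE_INVALID | INTERNET_FLAG_IGNORE_CERT_CN_INVALID
MsgBox % Format("0x{:X}", flags)   ; shows 0x80003000
```

So the call bypasses the cache and tolerates certificates with a mismatched host name or an invalid date, which may be part of why it behaves like UrlDownloadToFile here.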
