Best way to scrape?

zuzu_kuc
Posts: 35
Joined: 30 Mar 2016, 12:36

Best way to scrape?

14 Jun 2020, 09:58

What is the best way to scrape this webpage?
https://www.zbozi.cz/vyrobek/dell-p2418ht/
I have to click "Další obchody (63)" ("More shops") to see the rest of the prices, and then I have to scrape all the prices from the page.
I have 100+ models (100+ pages to scrape), so what do you think?
Thank you.
malcev
Posts: 826
Joined: 12 Aug 2014, 12:37

Re: Best way to scrape?

14 Jun 2020, 11:14

WinHttpRequest.
zuzu_kuc
Posts: 35
Joined: 30 Mar 2016, 12:36

Re: Best way to scrape?

14 Jun 2020, 15:40

Unfortunately, it doesn't work.
malcev
Posts: 826
Joined: 12 Aug 2014, 12:37

Re: Best way to scrape?

14 Jun 2020, 15:42

Search the forum for examples.
teadrinker
Posts: 2052
Joined: 29 Mar 2015, 09:41

Re: Best way to scrape?

14 Jun 2020, 17:40

I'm not sure whether the page contents can be retrieved through WinHttpRequest, but I see that this site has an API, which you can use like this:

url := "https://www.zbozi.cz/api/v1/product/dell-p2418ht?availability=&limit=100&normalizedNameExt=&region="

whr := ComObjCreate("WinHttp.WinHttpRequest.5.1")
whr.Open("GET", url, false)
whr.Send()
status := whr.status
if (status != 200)
   throw Exception("Failed to download JSON, status: " . status)

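; responseBody is a SAFEARRAY of bytes; read its pvData pointer and decode the bytes as UTF-8
arr := whr.responseBody
pData := NumGet(ComObjValue(arr) + 8 + A_PtrSize)
length := arr.MaxIndex() + 1   ; the array is zero-based, so the byte count is MaxIndex + 1
json := StrGet(pData, length, "utf-8")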

obj := JsonToAHK(json)

for k, v in obj.products[1].offers
   info .= (k = 1 ? "" : "`r`n") . v.shop.displayName . "   " . v.price//100
MsgBox, % Clipboard := info

JsonToAHK(json, rec := false) {
   static doc := ComObjCreate("htmlfile")
         , __ := doc.write("<meta http-equiv=""X-UA-Compatible"" content=""IE=9"">")
         , JS := doc.parentWindow
   if !rec
      obj := %A_ThisFunc%(JS.eval("(" . json . ")"), true)
   else if !IsObject(json)
      obj := json
   else if JS.Object.prototype.toString.call(json) == "[object Array]" {
      obj := []
      Loop % json.length
         obj.Push( %A_ThisFunc%(json[A_Index - 1], true) )
   }
   else {
      obj := {}
      keys := JS.Object.keys(json)
      Loop % keys.length {
         k := keys[A_Index - 1]
         obj[k] := %A_ThisFunc%(json[k], true)
      }
   }
   Return obj
}
Last edited by teadrinker on 06 Jul 2020, 09:07, edited 1 time in total.
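
For the 100+ models mentioned in the question, the same request could be wrapped in a function and called in a loop. This is only a minimal sketch under several assumptions: it relies on the JsonToAHK function from the post above being in the same script, it assumes every product uses the same API URL pattern with its own slug, and the names slugs, GetOffers and report, as well as the 500 ms pause, are made up for illustration. Only "dell-p2418ht" comes from this thread.

```autohotkey
; Hypothetical list of product slugs; only "dell-p2418ht" appears in the thread.
slugs := ["dell-p2418ht"]

report := ""
for i, slug in slugs {
   try
      report .= slug . "`r`n" . GetOffers(slug) . "`r`n`r`n"
   catch e
      report .= slug . ": " . (IsObject(e) ? e.Message : e) . "`r`n`r`n"
   Sleep, 500   ; arbitrary pause between requests
}
MsgBox, % Clipboard := report

; Downloads the offers for one slug and returns "shop   price" lines.
; Assumes the JSON layout used in the post above (products[1].offers).
GetOffers(slug) {
   url := "https://www.zbozi.cz/api/v1/product/" . slug . "?availability=&limit=100&normalizedNameExt=&region="
   whr := ComObjCreate("WinHttp.WinHttpRequest.5.1")
   whr.Open("GET", url, false)
   whr.Send()
   if (whr.status != 200)
      throw Exception("Failed to download JSON, status: " . whr.status)
   arr := whr.responseBody
   pData := NumGet(ComObjValue(arr) + 8 + A_PtrSize)
   json := StrGet(pData, arr.MaxIndex() + 1, "utf-8")
   obj := JsonToAHK(json)
   info := ""
   for k, v in obj.products[1].offers
      info .= (k = 1 ? "" : "`r`n") . v.shop.displayName . "   " . v.price//100
   return info
}
```

A failed download for one slug is recorded in the report instead of stopping the whole run, so the remaining models are still processed.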
malcev
Posts: 826
Joined: 12 Aug 2014, 12:37

Re: Best way to scrape?

14 Jun 2020, 18:58

teadrinker wrote: I'm not sure if it is possible to get the page contents through WinHttpRequest
But you did get it with WinHttpRequest.
teadrinker
Posts: 2052
Joined: 29 Mar 2015, 09:41

Re: Best way to scrape?

15 Jun 2020, 05:08

I thought you knew how to download the page contents as they appear after the button has been clicked.
