How to extract only first paragraph from Wikipedia using API Calls? Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
Sabestian Caine
Posts: 528
Joined: 12 Apr 2015, 03:53

How to extract only first paragraph from Wikipedia using API Calls?

19 Jul 2019, 11:14

Hello friends,

I am trying to learn API calls. I want to grab only the first paragraph of this page and show it in a MsgBox: https://en.wikipedia.org/wiki/AutoHotkey
Please look at this image:
[Attachment: 19_07_19 @9_37_02.PNG]
In the image above, the paragraph I want is marked with a red box. I want to grab it and show it in a MsgBox using API calls.

I have this code:

Code: Select all

url := "https://en.wikipedia.org/wiki/AutoHotkey"
whr := ComObjCreate("WinHttp.WinHttpRequest.5.1")
whr.Open("GET", url, false)  ; fetching a page only needs GET; false = synchronous
whr.Send()
html := whr.ResponseText
MsgBox, % html
When I run this code it shows the entire HTML of the page, but I want to grab only the first paragraph. Please tell me how to do that.
I know it can be done easily with web scraping, for example browser automation using Selenium, but I want to do it using API calls.

Please help and guide me.

Thanks a lot.
I don't normally code as I don't code normally.
TheDewd
Posts: 1513
Joined: 19 Dec 2013, 11:16
Location: USA

Re: How to extract only first paragraph from Wikipedia using API Calls?

19 Jul 2019, 12:42

Code: Select all

url := "https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1&titles=AutoHotkey"
whr := ComObjCreate("WinHttp.WinHttpRequest.5.1")
whr.Open("GET", url, false)  ; a read-only API query only needs GET
whr.Send()
html := whr.ResponseText

RegExMatch(html, """extract"":""(.*?)""}", Match)

MsgBox, % Match1

Code: Select all

url := "https://en.wikipedia.org/w/api.php?format=xml&action=query&prop=extracts&exintro&explaintext&redirects=1&titles=AutoHotkey"
whr := ComObjCreate("WinHttp.WinHttpRequest.5.1")
whr.Open("GET", url, false)  ; a read-only API query only needs GET
whr.Send()
html := whr.ResponseText

RegExMatch(html, "<extract.*?>(.*?)<\/extract>", Match)

MsgBox, % Match1
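
A note on the JSON version: the captured text is still a raw JSON string, so line breaks and quotes can show up as \n and \" escapes. Below is a minimal sketch of undoing the most common ones. JsonUnescapeBasic is a hypothetical helper (not part of AutoHotkey), it assumes the text never contains a Chr(1) character, it ignores \uXXXX sequences, and it needs StrReplace() from AutoHotkey v1.1.21 or later.

```autohotkey
; Hypothetical helper: undoes only the most common JSON string escapes.
; Assumption: the input never contains Chr(1), which is used as a placeholder.
JsonUnescapeBasic(s) {
    s := StrReplace(s, "\\", Chr(1))   ; protect escaped backslashes first
    s := StrReplace(s, "\n", "`n")     ; JSON newline escape -> real line feed
    s := StrReplace(s, "\""", """")    ; \" -> "
    s := StrReplace(s, "\/", "/")      ; \/ -> /
    s := StrReplace(s, Chr(1), "\")    ; restore the protected backslashes
    return s
}

MsgBox, % JsonUnescapeBasic("Line one\nLine two")
```

With the JSON code above it could be called as MsgBox, % JsonUnescapeBasic(Match1). A real JSON parser is the more robust choice if one is available.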
Sabestian Caine
Posts: 528
Joined: 12 Apr 2015, 03:53

Re: How to extract only first paragraph from Wikipedia using API Calls?

20 Jul 2019, 11:43

Thanks, dear TheDewd, for this great code.

Sir, please tell me where you got this URL from: https://en.wikipedia.org/w/api.php?format=xml&action=query&prop=extracts&exintro&explaintext&redirects=1&titles=AutoHotkey

Normally, when we go to Wikipedia and search for AutoHotkey, it shows this URL: https://en.wikipedia.org/wiki/AutoHotkey

So I request that you tell me something about it.

Secondly, when we make an API call the server returns the data as XML or JSON, but in the code above it is returning the data as HTML. Please tell me why that is.

Please help and guide me.

Thanks a lot, sir.
Sabestian Caine
Posts: 528
Joined: 12 Apr 2015, 03:53

Re: How to extract only first paragraph from Wikipedia using API Calls?

21 Jul 2019, 08:15

Please help me with this.
tmplinshi
Posts: 1604
Joined: 01 Oct 2013, 14:57

Re: How to extract only first paragraph from Wikipedia using API Calls?

21 Jul 2019, 08:41

Open https://en.wikipedia.org and scroll to the bottom of the page, where there is a Developers link. Open it, and on the Developers page there is a Web APIs link. :beer:
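
To make the URL from TheDewd's code less mysterious, here is a small sketch that assembles it one parameter at a time. BuildExtractUrl is a hypothetical helper written only for illustration, and the comments are my reading of the MediaWiki API documentation, so check them against the Web APIs pages. It also assumes the title needs no URL encoding.

```autohotkey
; Hypothetical helper: builds the same query URL used earlier in this thread.
BuildExtractUrl(title, fmt := "json") {
    url := "https://en.wikipedia.org/w/api.php"
    url .= "?format=" . fmt     ; response format: json or xml
    url .= "&action=query"      ; ask the API for information about pages
    url .= "&prop=extracts"     ; request the page text (TextExtracts extension)
    url .= "&exintro"           ; only the content before the first section heading
    url .= "&explaintext"       ; plain text instead of HTML
    url .= "&redirects=1"       ; follow redirects to the target page
    url .= "&titles=" . title   ; page title; assumed to need no URL encoding
    return url
}

MsgBox, % BuildExtractUrl("AutoHotkey")
MsgBox, % BuildExtractUrl("AutoHotkey", "xml")
```

Note that exintro returns the whole introduction, which can span more than one paragraph, so it is not always exactly the first paragraph.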
Sabestian Caine
Posts: 528
Joined: 12 Apr 2015, 03:53

Re: How to extract only first paragraph from Wikipedia using API Calls?

21 Jul 2019, 11:33

Thanks, dear tmplinshi, for your kind reply.

Could you please tell me one more thing?
When we make an API call the server returns the data as XML or JSON, but in the code above it is returning the data as HTML. Please tell me why that is.
Please help and guide me.

Thanks.
tmplinshi
Posts: 1604
Joined: 01 Oct 2013, 14:57

Re: How to extract only first paragraph from Wikipedia using API Calls?

21 Jul 2019, 11:55

Sabestian Caine wrote:but in the above codes it is returning the data in html, please tell me why so?
I don't understand your question. The first code from TheDewd returns JSON and the second returns XML; both are correct, as expected.
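
If the confusion comes from the variable being named html, one way to check what actually came back is to look at the Content-Type response header and at the first character of the body. This is only a sketch: GuessFormat is a hypothetical helper, and its first-character test is a rough heuristic rather than real format detection.

```autohotkey
; Hypothetical helper: rough guess at the format from the first non-blank character.
GuessFormat(body) {
    first := SubStr(LTrim(body, " `t`r`n"), 1, 1)
    if (first = "{" || first = "[")
        return "json"
    if (first = "<")
        return "xml or html"
    return "unknown"
}

url := "https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1&titles=AutoHotkey"
whr := ComObjCreate("WinHttp.WinHttpRequest.5.1")
whr.Open("GET", url, false)
whr.Send()

; The variable name does not decide the format; the server reports it in this header.
MsgBox, % "Content-Type: " . whr.GetResponseHeader("Content-Type")
MsgBox, % "Body looks like: " . GuessFormat(whr.ResponseText)
```

Changing format=json to format=xml in the URL should change both results accordingly.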
Sabestian Caine
Posts: 528
Joined: 12 Apr 2015, 03:53

Re: How to extract only first paragraph from Wikipedia using API Calls?  Topic is solved

22 Jul 2019, 00:02

Oh! I didn't notice that; I got confused. Thanks a lot, dear tmplinshi. You really helped me a lot. :bravo:
Thank you so much, sir. :salute:
teadrinker
Posts: 4330
Joined: 29 Mar 2015, 09:41
Contact:

Re: How to extract only first paragraph from Wikipedia using API Calls?

22 Jul 2019, 04:54

Sabestian Caine wrote: in the above image you can see the paragraph marked in red box, i want to grab it and show it into msgbox using API calls.
It's not clear what you meant by «API calls». @TheDewd gave you an example of how to do it using the MediaWiki API (action=query with prop=extracts).
However, I suppose you wanted to ask how to do it using COM objects. If so:

Code: Select all

url := "https://en.wikipedia.org/wiki/AutoHotkey"
whr := ComObjCreate("WinHttp.WinHttpRequest.5.1")
whr.Open("GET", url, false)  ; fetching a page only needs GET
whr.Send()
html := whr.ResponseText

doc := ComObjCreate("htmlfile")
doc.write("<meta http-equiv=""X-UA-Compatible"" content=""IE=9"">")
doc.write(html)

; text content of the first <p> tag
MsgBox, % doc.getElementsByTagName("p").(0).innerText
; or
MsgBox, % doc.querySelector("p").innerText
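
One caveat with taking the first <p> tag: on some pages the first paragraph element can be empty, so the MsgBox may come up blank. Here is a minimal sketch that skips empty paragraphs. FirstNonEmptyParagraph is a hypothetical helper, and the small HTML string (including the mw-empty-elt class) is made up so the example runs without a network request.

```autohotkey
; Hypothetical helper: returns the text of the first <p> that is not blank.
FirstNonEmptyParagraph(doc) {
    paragraphs := doc.getElementsByTagName("p")
    Loop, % paragraphs.length
    {
        ; item() is zero-based, A_Index is one-based
        text := Trim(paragraphs.item(A_Index - 1).innerText, " `t`r`n")
        if (text != "")
            return text
    }
    return ""
}

; Made-up sample markup standing in for the downloaded page.
html := "<p class=""mw-empty-elt""></p><p>First real paragraph.</p><p>Second paragraph.</p>"

doc := ComObjCreate("htmlfile")
doc.write("<meta http-equiv=""X-UA-Compatible"" content=""IE=9"">")
doc.write(html)

MsgBox, % FirstNonEmptyParagraph(doc)
```

With the real page, pass the doc object built from whr.ResponseText instead of the sample string.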
Sabestian Caine
Posts: 528
Joined: 12 Apr 2015, 03:53

Re: How to extract only first paragraph from Wikipedia using API Calls?

23 Jul 2019, 12:39

You are great, sir. :salute:

You guys are really awesome. :clap:

I wonder how you do that! :roll:

As you know, I am also trying to learn coding.

Now I have several ways to extract that text.

Thanks a lot, dear teadrinker. :bravo: :clap: :superhappy:
