Check if website changed?

Ask gaming related questions (AHK v1.1 and older)
Nwb
Posts: 444
Joined: 29 Nov 2016, 08:56

Check if website changed?

25 Feb 2018, 00:04

Heya, does anybody know how to approach this (checking whether a website has changed)? Keep in mind this should run in the background. I thought about UrlDownloadToFile, but that doesn't work for me, or I must have been doing it wrong. I'm a newbie and I'm clueless. :crazy:

The idea is that the slightest change should trigger the script and, for instance, open a MsgBox saying "Website changed!". Otherwise it should keep looping the check.
Preferably it would check every 10 seconds. I don't mind if it's a big performance eater; AHK doesn't need that much anyway.
I am your average ahk newbie. Just.. a tat more cute. ;)
gregster
Posts: 9021
Joined: 30 Sep 2013, 06:48

Re: Check if website changed?

25 Feb 2018, 00:19

If the change is reflected in the HTML, UrlDownloadToFile or one of the user-written UrlDownloadToVar functions should be fine, so that would be the first thing to determine. Whether you then compare the whole variable/HTML contents or just parts of it depends on your goal. The COM interface of Internet Explorer is another option, especially if you are checking for specific details (via the DOM).

But I guess there could be some webpages with JavaScript that cannot be handled that way. The solution might have to be customized; perhaps even OCR is needed.

I would recommend telling us the URL or, if that is not an option for you, picking an example URL that has a similar HTML structure.

Btw, there are websites that don't like constant reloading. Every 10 seconds might be OK, but there is no way to be sure. I know websites where you will run into Cloudflare protection pages if you reload them several times a minute for a while. But sometimes there are other solutions, like APIs that might be provided by the site owner.
Nwb
Posts: 444
Joined: 29 Nov 2016, 08:56

Re: Check if website changed?

25 Feb 2018, 01:00

Here's the website https://www.growtopiagame.com/forums/fo ... ouncements
I tried UrlDownloadToFile, but it told me that the website had changed every single time it compared the variables.
I might have done something wrong in the code itself, so here's the code that I used.

Code: Select all

url := "https://www.growtopiagame.com/forums/forumdisplay.php?43-Announcements"

; Save the page as it looked when the script was launched
; (the variable and file names still say "ip", but they hold the page's HTML).
UrlDownloadToFile, %url%, myip.txt
FileRead, originalip, myip.txt

SetTimer, checkip, 10000	; check every 10 seconds, anticipating a change
return

checkip:
UrlDownloadToFile, %url%, myip.txt
FileRead, newip, myip.txt	; the page as it looks now

if (newip = originalip)
	return			; nothing changed, wait for the next timer tick

SetTimer, checkip, Off		; stop checking after the first change
MsgBox Website Updated!
return				; or ExitApp

Esc::ExitApp
OCR sounds interesting. I play some text-based games, so that could be useful. Can you give me links for this if you know any?
I'm not sure about all the things you mentioned, probably just some of their meanings, but you get the point: I don't have much AHK knowledge, let alone computer knowledge.

I didn't consider the fact that the website might not like what I'm doing, thanks for bringing it up. Maybe every minute instead of every 10 seconds. Please tell me about the API though, that seems like a good option.

I'll write a private message to a moderator to see if it's okay with them. They're kind, so they will give me an answer.
Nwb
Posts: 444
Joined: 29 Nov 2016, 08:56

Re: Check if website changed?

25 Feb 2018, 01:23

Oh was this thread moved?
gregster
Posts: 9021
Joined: 30 Sep 2013, 06:48

Re: Check if website changed?

25 Feb 2018, 01:52

This website doesn't have an API, I'm afraid. But it is not surprising that its source code is constantly changing. The line "There are currently x users browsing this forum." alone probably changes a lot.
Perhaps it would be enough to compare just a subset (the beginning?) of the source code to check for changes. Let's say you cut away the end with the users-online line (via string functions), but keep enough of the topic-related HTML code to check for a change. But that would have to be tested... there might also be changes in the HTML that are not visible in the normal browser view.

If that doesn't help, it gets more complicated. So, you probably want to check whether the newest (non-sticky?) thread changed?
That means you have to analyze and compare the source code using string operations or, more sophisticated, analyze the DOM structure and access it via Internet Explorer's COM interface. Joe Glines' website has a lot of information about web scraping that could help with understanding how it works (http://the-automator.com/web-scraping-with-autohotkey/). There are also a lot of forum threads about these techniques. If I have the time, I will try to take a closer look at the website today, but I can't promise.

It has been too long since I used (some basic) OCR, but there should be some topics on the forum. That said, it is not the easiest topic and most probably not suitable for this problem (working with HTML should be much more reliable). For some games (like text-based games) it could be helpful, I guess, depending on multiple factors like the speed and accuracy needed.
Last edited by gregster on 25 Feb 2018, 01:56, edited 1 time in total.
Nwb
Posts: 444
Joined: 29 Nov 2016, 08:56

Re: Check if website changed?

25 Feb 2018, 01:55

Okay, thanks a lot!
Can anybody else help me with gregster's idea if they have experience with this, or add to the suggestion? You see, when things become complex I get dumbfounded. All I know so far is how to use commands. I don't know anything about using Internet Explorer COM etc. :P

Thanks again for the link, I will check that out when I have time.
gregster
Posts: 9021
Joined: 30 Sep 2013, 06:48

Re: Check if website changed?

26 Feb 2018, 04:52

Here is the simple approach I talked about. Take two url snapshots at different times, cut away the 'users online' part and compare the rest. Seems to work here - but surely won't work everywhere...

Code: Select all

url := "https://www.growtopiagame.com/forums/forumdisplay.php?43-Announcements"

html := URLDownloadToVar(url)
orig_array := StrSplit(html, "There are currently")	; cut away the 'users online' part
orig := orig_array[1]					; keep only the part before it

;SetTimer, checkip, 60000				; 60 secs
return

#a::							; Win+a runs one check manually (for testing)
checkip:
html := URLDownloadToVar(url)
if (html = "")						; download failed, so don't report a change
	return
new_array := StrSplit(html, "There are currently")
latest := new_array[1]

if (orig != latest)
{
	SetTimer, checkip, Off
	MsgBox Website Updated!
}
else
	TrayTip, Changed?, No!
orig := latest				; needed if you keep the timer running instead of ending it after the first change
return ; or ExitApp

;-------------------------------------------------------------------------
Esc::ExitApp
;-------------------------------------------------------------------------
; Downloads url and returns the response body (also stored in the optional ByRef variable).
; If the request fails, the try block swallows the error and an empty string is returned.
URLDownloadToVar(url, ByRef variable="")
{
	try
	{
		hObject := ComObjCreate("WinHttp.WinHttpRequest.5.1")
		hObject.Open("GET", url)
		hObject.Send()
		variable := hObject.ResponseText
		return variable
	}
}
I used a user-defined URLDownloadToVar() function so that you don't have to save the HTML to disk and read it back. Also, I commented out the timer and used the hotkey Win+a instead for testing (just un-comment the SetTimer line if you want). I also added a TrayTip for testing, to indicate that there was no change. If you compared the untrimmed HTML source you would probably see a lot of changes, like you experienced with your code, but the rest of the page doesn't seem to change unless there is a new post.
Hope that helps!

Btw, one-minute updates still seem a bit frequent for this announcement page, as there is not much going on. I would use one hour or more...

Edit: I am not sure if it really works, because there were no new posts. When I test it on another subforum it sometimes reports changes although I don't see anything obvious. So... this approach might be too simplistic after all :think:
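
One more caveat with comparing downloaded HTML: a failed download or an error/protection page would also look like a "change". A minimal sketch of a stricter download helper follows, assuming that anything other than HTTP status 200 should be skipped. FetchHtml is a hypothetical name, not something from the script above or from AutoHotkey itself; Status and ResponseText are properties of the WinHttpRequest COM object.

```autohotkey
#NoEnv

url := "https://www.growtopiagame.com/forums/forumdisplay.php?43-Announcements"
html := FetchHtml(url)
if (html = "")
	MsgBox % "Download failed or blocked; skipping this check."
else
	MsgBox % "Got " . StrLen(html) . " characters to compare."
return

; FetchHtml is a hypothetical helper name (an assumption, not a built-in).
; Returns the page source, or "" if the request failed or the status was not 200.
FetchHtml(url)
{
	try
	{
		req := ComObjCreate("WinHttp.WinHttpRequest.5.1")
		req.Open("GET", url)
		req.Send()
		if (req.Status != 200)		; e.g. an error or protection page
			return ""
		return req.ResponseText
	}
	catch
	{
		return ""			; network error, timeout, invalid URL, ...
	}
}
```

In the timer routine, an empty return value would then simply mean "try again at the next tick" instead of triggering the message box.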
Last edited by gregster on 26 Feb 2018, 05:13, edited 2 times in total.
gregster
Posts: 9021
Joined: 30 Sep 2013, 06:48

Re: Check if website changed?

26 Feb 2018, 05:11

I guess it is the reported view counts that change often in the main forums, so no luck. You will have to look into the more sophisticated approaches ;) (or do a lot more string operations)
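
To give an idea of what "more string operations" could look like: a rough sketch that blanks out the volatile parts of the HTML before comparing. NormalizeHtml is a hypothetical helper, and both patterns are assumptions about how the page words its view counter and users-online line; they would have to be checked against the real page source.

```autohotkey
#NoEnv

; Made-up sample snippets standing in for two downloads of the same page
sample1 := "Views: 1,234 ... There are currently 12 users browsing this forum."
sample2 := "Views: 1,240 ... There are currently 7 users browsing this forum."

if (NormalizeHtml(sample1) = NormalizeHtml(sample2))
	MsgBox % "Same after normalizing"
else
	MsgBox % "Different"
return

; Hypothetical helper: replaces numbers that change on every page view with fixed text.
NormalizeHtml(html)
{
	; assumed wording of the view counter, e.g. "Views: 1,234"
	html := RegExReplace(html, "Views:\s*[\d,.]+", "Views: ")
	; assumed wording of the users-online line
	html := RegExReplace(html, "There are currently \d+ users", "There are currently N users")
	return html
}
```

With the two sample strings this reports "Same after normalizing", because only the numbers differ. On the real page there may be more volatile parts than these two.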
Nwb
Posts: 444
Joined: 29 Nov 2016, 08:56

Re: Check if website changed?

26 Feb 2018, 05:37

gregster wrote:I guess, there are the reported views that change often in the main forums - so no luck. You will have to look into the more sophisticated approaches ;) (or do a lot more string operations)
Oh no, I'm not interested in the main forum. Just this subforum for announcements: https://www.growtopiagame.com/forums/fo ... ouncements.

What's the shortest SetTimer interval that you would recommend? On one hand, the moderator said he forwarded my message and it might take time to get a reply, and connecting too often could get me flagged by their server. On the other hand, I want to know as soon as possible when there is a new post.

There is one day every month when I anticipate a new post. So literally, if I could have the check loop run every millisecond, I would, hehe.

Thanks a lot for all the help, greg. :wave:


OH MY GOD IT WORKS LIKE A MIRACLE CAN'T THANK YOU ENOUGH GREG! :dance:
gregster
Posts: 9021
Joined: 30 Sep 2013, 06:48

Re: Check if website changed?

26 Feb 2018, 05:48

I am afraid, the problem with the views most probably also applies to the announcements forum. That means, as soon as someone views any of these posts, you will probably get a "false positive" with the script above. But I might have an idea to adapt that script a little... I will take a look later.
gregster
Posts: 9021
Joined: 30 Sep 2013, 06:48

Re: Check if website changed?

26 Feb 2018, 06:01

Just for clarification... the post you are looking for will be a completely new topic and not an updated one?
Nwb
Posts: 444
Joined: 29 Nov 2016, 08:56

Re: Check if website changed?

26 Feb 2018, 06:16

gregster wrote:Just for clarification... the post you are looking for will be a completely new topic and not an updated one?
Either of them. Both new topics and updates should warn me.

Oh I see now.. forgot about the views..
gregster
Posts: 9021
Joined: 30 Sep 2013, 06:48

Re: Check if website changed?

26 Feb 2018, 09:30

OK, now the IE COM/DOM approach:
This script looks at the creation times of posts (the 10 topmost ones) and their order. Then it again compares two snapshots (saved in arrays).
If a new post or topic was added, at least one time should have changed. This script also looks at the sticky threads. For testing, it again uses a hotkey instead of a timer, and another subforum where there is a little more action!
Tested only superficially!
In rare cases, a new post could accidentally get an "old" time and therefore go unnoticed. Not very likely, but possible. But this should get you started if you want to add improvements ;)

Code: Select all

#NoEnv  ; Recommended for performance and compatibility with future AutoHotkey releases.
SendMode Input  ; Recommended for new scripts due to its superior speed and reliability.
SetWorkingDir %A_ScriptDir%  ; Ensures a consistent starting directory.
#SingleInstance, Force

arrOldItems := []								; snapshot from the previous check (empty at first)
wb := ComObjCreate("InternetExplorer.Application")	; create an IE instance
wb.Visible := false							; make Internet Explorer run in the background
url := "https://www.growtopiagame.com/forums/forumdisplay.php?44-Update-Discussions"	; "https://www.growtopiagame.com/forums/forumdisplay.php?43-Announcements"
gosub, check								; get some start data
; SetTimer, check, 60000					; 60s
return
;--------------------------------------------------------------------------------------------

#a::										; Win+a runs one check manually (for testing)
check:
arrItems := []								; snapshot for this check
wb.Navigate(url)							; load URL
while (wb.Busy || wb.ReadyState != 4)		; wait until the page has finished loading
	Sleep, 50

elements := wb.document.getElementsByClassName("time")
; collect up to 10 post creation times (fewer if the page has fewer)
count := elements.length < 10 ? elements.length : 10
Loop % count
{
	item := elements.item[A_Index-1].innerText
	arrItems.Push(item)						; add item to array
}
; compare times with the previous snapshot
if (arrOldItems.Length() > 0)
{
	Loop % arrItems.Length()
	{
		if (arrItems[A_Index] != arrOldItems[A_Index])
		{
			MsgBox New post or topic!
			arrOldItems := arrItems.Clone()
			return							; or ExitApp?
		}
	}
	TrayTip, New?, No!
}
else
	TrayTip, First turn, Nothing to compare
arrOldItems := arrItems.Clone()				; copy ("clone") array object for the next check
return

;--------------------------------------------------------------------------------------------
Esc::
wb.Quit()									; close the hidden IE instance, otherwise it keeps running
wb := ""
ExitApp
Edit: removed a little bug. Find the other bugs :D
Edits 2+3: small improvements: once it found a difference it wouldn't look anymore...
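
The snapshot comparison described above (two arrays of post times, compared position by position) can also be pulled out into a small function. This is only a sketch: ArraysDiffer is a hypothetical helper name and the times are made-up sample data. Unlike a fixed loop count, it also reports a change when the two snapshots have different lengths.

```autohotkey
#NoEnv

oldTimes := ["10:01", "09:30", "08:15"]		; made-up sample data
newTimes := ["10:05", "10:01", "09:30"]		; a new post pushed the others down

if ArraysDiffer(oldTimes, newTimes)
	MsgBox % "New post or topic!"
else
	MsgBox % "No change"
return

; Hypothetical helper: true if the arrays differ in length or in any position.
ArraysDiffer(a, b)
{
	if (a.Length() != b.Length())
		return true
	for index, value in a
	{
		if (value != b[index])
			return true
	}
	return false
}
```

With the sample data this shows "New post or topic!", since the first entries already differ.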
Nwb
Posts: 444
Joined: 29 Nov 2016, 08:56

Re: Check if website changed?

26 Feb 2018, 11:22

Thanks a lot dude :dance: .. You're a dang genius. :lol:
gregster
Posts: 9021
Joined: 30 Sep 2013, 06:48

Re: Check if website changed?

26 Feb 2018, 13:58

I wish I was. But remember: no warranty here! :D
But I have it running, and I try it from time to time (via Win+a) and compare with the actual page. The latest version seems to work. Of course, the HTML code needs to be downloaded and analyzed, and that takes a bit of time.
Just remember that you need to un-comment the timer if you want to automate it. But it is very hard to say which interval will be OK for the forum owners. It depends on their server and what they think is reasonable. I have seen totally different policies in this regard.
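
For the timer itself, a minimal sketch with the interval kept in one variable so it is easy to change later. The 60 minutes and the names intervalMinutes and CheckSite are assumptions for illustration; the body of the routine is only a placeholder for the real check.

```autohotkey
#NoEnv
#Persistent							; keep the script running between timer ticks

intervalMinutes := 60				; assumed value, adjust to what the site owners tolerate
SetTimer, CheckSite, % intervalMinutes * 60 * 1000	; SetTimer expects milliseconds
return

CheckSite:
	; placeholder: the real check would go here
	TrayTip, Checker, Checked at %A_Hour%:%A_Min%
return

Esc::ExitApp
```

Keeping the interval in minutes avoids counting zeros: 60 minutes is 3600000 ms.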
Bamsi-73
Posts: 1
Joined: 12 Apr 2021, 09:05

Re: Check if website changed?

12 Apr 2021, 09:10

It works very well. :bravo: Thanks :dance:
