AutoHotkey Community

 Post subject: Search for URLs
Posted: November 2nd, 2004, 2:27 am

Joined: June 24th, 2004, 1:00 am
Posts: 114
Location: Malta
Hi all!
Well, let me just say this first.
I go to sea for 15-day periods, and when I come back to see what's new with AHK, it's a joy every time.
So many things to try out.

And now to my problem.
I have been trying to find a way to select links in a text file for some time but have not been very successful.
I want to prune a text file and leave only the URLs and links. Finding the start is not so difficult: I search for http, www, ftp, etc. The end of the string, however, can be anything, from .htm to just a normal word, as when the link simply points to a folder.
Any ideas?

Thanks all
Regards
CG


 Post subject:
Posted: November 2nd, 2004, 4:21 am

Joined: October 27th, 2004, 1:22 am
Posts: 64
Location: GA
http://www.autohotkey.com/forum/viewtopic.php?t=1183

This topic might be able to help you out. I was trying to click on a link, and Bobo's script helped me out: it uses the %MyWord% variable to find the matching URL.

Here is Bobo's script:

Quote:
URLDownloadToFile, http://www.webster.com/cgi-bin/dictiona ... ionary&va=%MyWord%, Webster.txt
if ErrorLevel = 0  ; Continue only if the download succeeded.
   Loop, Read, Webster.txt
   {
      IfInString, A_LoopReadLine, .wav=%MyWord%
      {
         StringGetPos, OutputVar, A_LoopReadLine, .wav
         OutputVar -= 7
         StringMid, SoundFile, A_LoopReadLine, %OutputVar%, 8  ; The 8 characters before ".wav".
         Run, "C:\Program Files\Internet Explorer\IEXPLORE.EXE" "http://www.webster.com/cgi-bin/audio.pl?%SoundFile%.wav=%MyWord%",,
         Break
      }
   }


 Post subject:
Posted: November 2nd, 2004, 9:15 am
A sample of your source (the text file you want to parse for links) would be helpful.

Parsing is (IMHO) a search for patterns, so I guess the key criterion for identifying a link is that it starts with one of the prefixes you mentioned (http, ftp, ...) and ends with ... a space? A delimiter (comma, semicolon, ...)?
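
To make the pattern idea concrete, here is a minimal sketch. It assumes an AutoHotkey build that has RegExMatch() and while-loops (for example AutoHotkey_L). The sample text in Line and the choice of terminators (whitespace, double quote, < and >) are assumptions for illustration, not taken from the original poster's file.

```autohotkey
; Hypothetical sample line; not taken from the thread.
Line = See http://www.autohotkey.com/docs/ and <ftp://example.com/file.zip>, or www.example.org today.

Pos := 1
Found := ""
; Match a known prefix, then everything up to whitespace, a double quote, or an angle bracket.
while Pos := RegExMatch(Line, "i)(?:https?://|ftp://|www\.)[^\s""<>]+", Match, Pos)
{
   Found .= Match . "`n"   ; Collect each match on its own line.
   Pos += StrLen(Match)    ; Resume the search just past this match.
}
MsgBox % Found
```

Punctuation that directly follows a URL, such as a trailing comma or period, would still be captured by this pattern, so real input may need extra trimming.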

Do you create the source yourself? Is it a downloaded web page? Do you like chicken? :lol: :lol: :lol:

8)


 Post subject:
Posted: November 2nd, 2004, 1:54 pm

Joined: March 2nd, 2004, 3:36 pm
Posts: 10720
Quote:
I want to prune a text file and leave only the urls and links.
This has not been thoroughly tested but appears to work. There are probably better utilities out there to extract links from a web page:
Code:
; Example #2: A working script that attempts to extract all FTP and HTTP
; URLs from a text or HTML file:
FileSelectFile, SourceFile, 3,, Pick a text or HTML file to analyze.
if SourceFile =
   return  ; This will exit in this case.

SplitPath, SourceFile,, SourceFilePath,, SourceFileNoExt
DestFile = %SourceFilePath%\%SourceFileNoExt% Extracted Links.txt

IfExist, %DestFile%
{
   MsgBox, 4,, Overwrite the existing links file? Press No to append to it.`n`nFILE: %DestFile%
   IfMsgBox, Yes
      FileDelete, %DestFile%
}

LinkCount = 0
Loop, read, %SourceFile%, %DestFile%
{
   URLSearchString = %A_LoopReadLine%
   Gosub, URLSearch
}
MsgBox %LinkCount% links were found and written to "%DestFile%".
return


URLSearch:
; It's done this particular way because some URLs have other URLs embedded inside them:
StringGetPos, URLStart1, URLSearchString, http://
StringGetPos, URLStart2, URLSearchString, ftp://
StringGetPos, URLStart3, URLSearchString, www.

; Find the left-most starting position:
URLStart = %URLStart1%  ; Set starting default.
Loop
{
   ; It helps performance (at least in a script with many variables) to resolve
   ; "URLStart%A_Index%" only once:
   StringTrimLeft, ArrayElement, URLStart%A_Index%, 0
   if ArrayElement =  ; End of the array has been reached.
      break
   if ArrayElement = -1  ; This element is disqualified.
      continue
   if URLStart = -1
      URLStart = %ArrayElement%
   else ; URLStart has a valid position in it, so compare it with ArrayElement.
   {
      if ArrayElement <> -1
         if ArrayElement < %URLStart%
            URLStart = %ArrayElement%
   }
}

if URLStart = -1  ; No URLs exist in URLSearchString.
   return

; Otherwise, extract this URL, then find its ending space or tab:
StringTrimLeft, URL, URLSearchString, %URLStart%  ; Omit the beginning/irrelevant part.
Loop, parse, URL, %A_Tab%%A_Space%<>  ; Find the first space, tab, or angle (if any).
{
   URL = %A_LoopField%
   break  ; i.e. perform only one loop iteration to fetch the first "field".
}
; If the above loop had zero iterations because there are no spaces or tabs,
; leave the contents of the URL var untouched.

; If the URL ends in a double quote, remove it.  For now, StringReplace is used, but
; note that it seems that double quotes can legitimately exist inside URLs, so this
; might damage them:
StringReplace, URLCleansed, URL, ",, All
FileAppend, %URLCleansed%`n
LinkCount += 1

; See if there are any other URLs in this line:
StringLen, CharactersToOmit, URL
CharactersToOmit += %URLStart%
StringTrimLeft, URLSearchString, URLSearchString, %CharactersToOmit%
Gosub, URLSearch  ; Recursive call to self.
return

Edit: Made file selectable via FileSelectFile.
Edit2: Added detection of < and > characters as terminators of a URL. Possibly some other improvements.


Last edited by Chris on September 23rd, 2005, 5:17 pm, edited 5 times in total.

 Post subject:
Posted: November 2nd, 2004, 2:02 pm

Joined: July 22nd, 2004, 6:33 am
Posts: 193
Location: cedar city UT
Thanks for that example, Chris. Glad I came by the forum; I was always wondering how that was done. Hope it helps, bahri. :lol: 8)

_________________
^sleepy^


 Post subject: Search for URLs
Posted: November 2nd, 2004, 2:41 pm

Joined: June 24th, 2004, 1:00 am
Posts: 114
Location: Malta
Hi!
Thanks, Chris and all. I will try things out and see how it goes, as Chris's code needs some munching (not chicken legs).
Regards
CG


 Post subject:
Posted: November 2nd, 2004, 4:33 pm

Joined: October 27th, 2004, 1:22 am
Posts: 64
Location: GA
Sorry, I didn't copy the entire script:
SetBatchLines -1
MyWord = bumblebee

URLDownloadToFile, http://www.webster.com/cgi-bin/dictiona ... ionary&va=%MyWord%, Webster.txt
if ErrorLevel = 0  ; Continue only if the download succeeded.
   Loop, Read, Webster.txt
   {
      IfInString, A_LoopReadLine, .wav=%MyWord%
      {
         StringGetPos, OutputVar, A_LoopReadLine, .wav
         OutputVar -= 7
         StringMid, SoundFile, A_LoopReadLine, %OutputVar%, 8  ; The 8 characters before ".wav".
         Run, "C:\Program Files\Internet Explorer\IEXPLORE.EXE" "http://www.webster.com/cgi-bin/audio.pl?%SoundFile%.wav=%MyWord%",, Hide
         Break
      }
   }
Exit

The line IfInString, A_LoopReadLine, .wav=%MyWord% searches for the wave-file link that matches MyWord, which is defined earlier by MyWord = bumblebee, and the script then opens that link. I used this script inside my own, which pasted the word from the clipboard into MyWord (MyWord = %clipboard%).
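
For anyone puzzled by the "-= 7" step, here is a small sketch of the position arithmetic on its own. The HTML line and the file name bumble01 are made up for illustration; the real page layout is an assumption.

```autohotkey
; Hypothetical line of HTML; the real page may look different.
MyWord = bumblebee
Line = <a href="/cgi-bin/audio.pl?bumble01.wav=%MyWord%">listen</a>

StringGetPos, Pos, Line, .wav          ; Zero-based offset of the first ".wav".
Pos -= 7                               ; StringMid counts from 1, so this start lies 8 characters before ".wav".
StringMid, SoundFile, Line, %Pos%, 8   ; Take those 8 characters.
MsgBox, Sound file name: %SoundFile%
```

StringGetPos reports a zero-based offset while StringMid expects a one-based start, which is why subtracting 7 rather than 8 yields the eight characters immediately before ".wav". The approach only works when the file name is exactly eight characters long.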

There is some more stuff in this autohotkey topic:
http://www.autohotkey.com/forum/viewtopic.php?t=1183


 Post subject:
Posted: August 31st, 2011, 9:42 pm

Joined: June 1st, 2007, 2:00 pm
Posts: 180
tpatel5 wrote:
http://www.autohotkey.com/forum/viewtopic.php?t=1183

This topic might be able to help you out. I was trying to click on a link, and Bobo's script helped me out: it uses the %MyWord% variable to find the matching URL.

Here is Bobo's script:

Quote:
URLDownloadToFile, http://www.webster.com/cgi-bin/dictiona ... ionary&va=%MyWord%, Webster.txt
if ErrorLevel = 0  ; Continue only if the download succeeded.
   Loop, Read, Webster.txt
   {
      IfInString, A_LoopReadLine, .wav=%MyWord%
      {
         StringGetPos, OutputVar, A_LoopReadLine, .wav
         OutputVar -= 7
         StringMid, SoundFile, A_LoopReadLine, %OutputVar%, 8  ; The 8 characters before ".wav".
         Run, "C:\Program Files\Internet Explorer\IEXPLORE.EXE" "http://www.webster.com/cgi-bin/audio.pl?%SoundFile%.wav=%MyWord%",,
         Break
      }
   }


I would like to be able to process a group of files, not just one.

Best Regards


 Post subject:
Posted: August 31st, 2011, 10:01 pm

Joined: December 26th, 2010, 7:40 pm
Posts: 4172
Location: Awesometown, USA
You can Loop through a file pattern: see Loop (Files and folders).
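
As a rough sketch of what that looks like, the file-loop below visits every .txt file in a folder and runs a read-loop on each one. The folder name and the simple "http://" check are placeholders of mine; the per-file work could just as well be the link extraction posted earlier in this thread.

```autohotkey
; Hypothetical folder and pattern; adjust to your own files.
SourceFolder = C:\My Files

Loop, %SourceFolder%\*.txt
{
   SourceFile = %A_LoopFileFullPath%  ; Path and name of the file found by this iteration.
   HitCount = 0
   Loop, Read, %SourceFile%
   {
      IfInString, A_LoopReadLine, http://
         HitCount += 1
   }
   MsgBox, %A_LoopFileName%: %HitCount% line(s) contain "http://".
}
```

Saving the path in SourceFile before the inner loop keeps the per-file code independent of the file-loop's built-in variables, which makes it easier to swap in a Gosub to an existing routine.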

_________________
Autofire, AutoClick, Toggle, SpamWindow Control Tools
Recommended: AutoHotkey_L

