AutoHotkey Community Let's help each other out
Payam
Joined: 07 Apr 2004 Posts: 59
Posted: Wed Sep 29, 2004 9:17 am Post subject: Parsing html files...an easier way?
1) Do you guys know of a method or command line tool that will strip a file of its HTML tags?
Parsing is a bit tedious to code (i.e. StringRight, StringGetPos, StringTrimLeft, etc.), but add HTML into the mix and it's truly an Amazon jungle.
2) On another note, do you have any ideas on how programs "submit" data/forms behind the scenes (i.e. not by clicking buttons in a browser but by sending and retrieving data programmatically)? I do know and use this method, which works on MOST things: build a URL such as http://somedomain.com?&query=xxx&sort=33 and just launch a browser or do a UrlDownloadToFile (I find the parameters by looking at the HTML source code).
My question in essence applies to forms where there is a Session ID, for example. Or, how do auction sniping programs work, and is it possible to do something similar using AHK (i.e. log in in the background, fill in the appropriate forms, etc.)?
Thanks for your input.
Payam
Pallie
Joined: 05 Jul 2004 Posts: 57 Location: London
Payam
Joined: 07 Apr 2004 Posts: 59
Posted: Wed Sep 29, 2004 6:21 pm Post subject:
Thanks, although I am familiar with your posts and those methods. What I am looking for is behind-the-scenes data transfer/manipulation (so no browser needed). As an example, I mean a method similar to that of desktop programs that will BID for you on eBay (I just don't know HOW they do it): the program never opens a browser, yet it logs in for you and bids.
I appreciate any input.
jordis
Joined: 30 Jul 2004 Posts: 79
Posted: Wed Sep 29, 2004 9:18 pm Post subject:
I think this utility would probably help you...
I guess you could be interested in the POST features...
[quoted from another post]
BoBo wrote: As it's not a big thing to connect external apps to AHK through the command line, have a look at cURL.
Quote: Curl is a command line tool for transferring files with URL syntax, supporting FTP, FTPS, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE and LDAP. Curl supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploading, kerberos, HTTP form based upload, proxies, cookies, user+password authentication, file transfer resume, http proxy tunneling and a busload of other useful tricks.
Curl is free and open software that compiles under a wide variety of operating systems. Curl exists thanks to efforts from many authors, with Daniel Stenberg being primary author and project maintainer. Curl is the result of many spare time hours of programming. Voluntary contributions are vital. We greatly appreciate your help!
[more...]
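To make the POST idea concrete, here is a minimal Python sketch of what such tools do behind the scenes. Python is only an illustration language here, not something the thread or cURL prescribes. It encodes the form fields, sends them as an HTTP POST, and keeps any cookies so that a session ID carries over to later requests. The helper names `make_session`, `build_post` and `submit` are made up for this example, and real URLs and field names have to be read from the form's HTML source.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return (opener, jar); the opener stores and resends cookies such as a session ID."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_post(url, fields):
    """Encode form fields as a browser would and wrap them in a POST request."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")


def submit(opener, url, fields):
    """Send the form and return the response text (this performs a network call)."""
    with opener.open(build_post(url, fields)) as response:
        return response.read().decode("utf-8", errors="replace")
```

A login followed by a second form would then be two `submit` calls on the same opener, so the cookie set by the first response is sent along with the second request. Whether a given site accepts this depends on what else its forms expect (hidden fields, tokens, and so on).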
Payam
Joined: 07 Apr 2004 Posts: 59
Posted: Thu Sep 30, 2004 4:14 am Post subject:
Thanks! I will take a hefty look at this.
Also, regarding my earlier question #1, is there any way to easily parse HTML files (or remove the tags)?
James Guest
Posted: Sat Mar 19, 2005 1:06 am Post subject: Easy way to strip HTML tags
I'd use Perl to do the strip. For example, if you wanted to remove all the tags as a filter, you could do this:
command1 | perl -p -e "s/\<.+?\>//g" | command2
The -p option tells Perl to run the expression on each line of input and print the line after it is processed. On each line, the substitution replaces every tag (the angle brackets and everything between them) with nothing.
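For comparison, here is a rough Python sketch of the same idea (Python is my substitution, the post itself only uses Perl). `strip_tags_line` applies the same non-greedy substitution to one line, so like the one-liner it will miss a tag that is split across two lines. `strip_tags_document` is an alternative that feeds the whole text to the standard library's HTML parser and keeps only the text between tags. Both function names and the `TextOnly` class are made up for illustration.

```python
import re
from html.parser import HTMLParser

# Same pattern as the Perl one-liner: "<", one or more characters, first ">".
TAG = re.compile(r"<.+?>")


def strip_tags_line(line):
    """Remove every <...> that starts and ends on this line."""
    return TAG.sub("", line)


class TextOnly(HTMLParser):
    """Collect the text between tags and ignore the tags themselves."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags_document(html):
    """Strip tags from a whole document, including tags that span lines."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

The parser-based version also turns entities such as `&amp;` back into plain characters, but it still returns the contents of script and style elements as text, so those would need extra handling if they matter.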
You can get a copy of Perl (open source, freeware) from www.perl.org, or you can download a freeware Windows binary with installer from www.activestate.com
BTW, Perl has a number of scripts and modules specifically designed to interact with web sites without the need for a browser.
Hope this helps ...
- James