AutoHotkey Homepage AutoHotkey Community
Let's help each other out
 

Parsing html files...an easier way?

 
Payam



Joined: 07 Apr 2004
Posts: 59

Posted: Wed Sep 29, 2004 9:17 am    Post subject: Parsing html files...an easier way?

1) Do you guys know of a method or command-line tool that will strip a file of its HTML tags?
Parsing is a bit tedious to code (i.e. StringRight, StringGetPos, StringTrimLeft, etc. :) ) but add HTML into the mix and it's truly an Amazon jungle.


2) On another note, do you have any ideas on how programs "submit" data/forms behind the scenes (i.e. not by clicking buttons in a browser but by sending and retrieving data programmatically)? I do know and use this method, which works on MOST things: http://somedomain.com?&query=xxx&sort=33 and then I just launch a browser or do a UrlDownloadToFile (I find the parameters by looking at the HTML source code).
My question, in essence, applies to forms where there is a Session ID, for example. Or, how do auction sniping programs work, and is it possible to do something similar using AHK (i.e. log in in the background, fill in the appropriate forms, etc.)?
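
For reference, the query-string method described above can be sketched in Python (not AutoHotkey). This is only an illustration: the function name build_query_url and the trailing slash after the domain are assumptions, and the parameter names are simply the ones from the example URL.

```python
from urllib.parse import urlencode


def build_query_url(base_url, params):
    """Return base_url with params appended as a URL-encoded query string.

    build_query_url is a hypothetical helper name used for illustration.
    """
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Parameter names taken from the example URL in the post.
    print(build_query_url("http://somedomain.com/", {"query": "xxx", "sort": 33}))
```

The resulting URL can then be opened in a browser or downloaded, the same way as with UrlDownloadToFile. urlencode also escapes spaces and special characters in the values.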

Thanks for your input.
Payam
Pallie



Joined: 05 Jul 2004
Posts: 57
Location: London

Posted: Wed Sep 29, 2004 9:43 am    Post subject:

Depending on what it is you want to do, you may want to take a look at this post.
http://www.autohotkey.com/forum/viewtopic.php?t=923
or this one
http://www.autohotkey.com/forum/viewtopic.php?t=911.
I have had some success entering JavaScript in the address bar to get information about the page's HTML and manipulate the forms it contains. If you give me some more details on what you are trying to achieve maybe I can help.

Mike
Payam



Joined: 07 Apr 2004
Posts: 59

Posted: Wed Sep 29, 2004 6:21 pm    Post subject:

Pallie wrote:
Depending on what it is you want to do, you may want to take a look at this post.
http://www.autohotkey.com/forum/viewtopic.php?t=923
or this one
http://www.autohotkey.com/forum/viewtopic.php?t=911.
I have had some success entering JavaScript in the address bar to get information about the page's HTML and manipulate the forms it contains. If you give me some more details on what you are trying to achieve maybe I can help.

Mike


Thanks, although I am familiar with your posts and those methods. What I am looking for is behind-the-scenes data transfer and manipulation, with no browser needed. As an example, there are desktop programs that will bid for you on eBay: they never open a browser, yet they log in for you and place the bid. I just don't know HOW they do it.

I appreciate any input.
jordis



Joined: 30 Jul 2004
Posts: 79

Posted: Wed Sep 29, 2004 9:18 pm    Post subject:

I think this utility would probably help you...
I guess you could be interested in the POST features...
[quoted from another post]

BoBo wrote:
As it's not a big thing to connect external apps to AHK through the command line have a look at cURL

Quote:
Curl is a command line tool for transferring files with URL syntax, supporting FTP, FTPS, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE and LDAP. Curl supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploading, kerberos, HTTP form based upload, proxies, cookies, user+password authentication, file transfer resume, http proxy tunneling and a busload of other useful tricks.

Curl is free and open software that compiles under a wide variety of operating systems. Curl exists thanks to efforts from many authors, with Daniel Stenberg being primary author and project maintainer. Curl is the result of many spare time hours of programming. Voluntary contributions are vital. We greatly appreciate your help!


[more...]
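
To give a rough idea of what the POST and cookie features mentioned above are for, here is a minimal sketch of a form login without a browser. It uses Python's standard library rather than cURL itself, and it is not how any particular sniping program works. The helper names, URLs and form field names are all hypothetical; real field names would have to be read from the HTML source of the login page.

```python
import urllib.request
from http.cookiejar import CookieJar
from urllib.parse import urlencode


def make_session():
    """Create an opener that stores cookies (e.g. a session ID) between requests."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_form_post(url, fields):
    """Build a POST request whose body is the URL-encoded form fields."""
    data = urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=data, method="POST")


def login_and_fetch(opener, login_url, fields, target_url):
    """Submit the login form, then fetch a page using the same cookies."""
    with opener.open(build_form_post(login_url, fields)) as resp:
        resp.read()  # the server's Set-Cookie headers are captured by the jar
    with opener.open(target_url) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Hypothetical URL and field names, for illustration only; nothing is sent here.
    req = build_form_post("http://example.com/login", {"user": "me", "pass": "secret"})
    print(req.get_method(), req.data)
```

The key point is that the same opener is used for every request, so a session cookie set at login is sent back automatically on later requests. Sites that carry the Session ID in a hidden form field instead would need that value extracted from the page and included in the posted fields.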
Payam



Joined: 07 Apr 2004
Posts: 59

Posted: Thu Sep 30, 2004 4:14 am    Post subject:

Thanks! I will take a hefty look at this.

Also, regarding my earlier question #1, is there any way to easily parse HTML files (or remove the tags)?
James
Guest





Posted: Sat Mar 19, 2005 1:06 am    Post subject: Easy way to strip HTML tags

I'd use Perl to do the stripping. For example, if you wanted to remove all the tags as a filter, you could do this:

command1 | perl -p -e "s/\<.+?\>//g" | command2

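The -p option says to process each line of the input and print it out after it is processed. On each line, anything inside <> is replaced with nothing. Because the input is handled one line at a time, a tag that is split across several lines will not be removed.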
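
As an alternative sketch of the same idea, here is a tag stripper that uses an HTML parser instead of a regular expression, so tags split across lines are handled too. It is written in Python rather than Perl, and the names TagStripper and strip_tags are made up for this example.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect only the text between tags (hypothetical helper class)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for text outside tags. Note that the contents of
        # <script> and <style> elements also arrive here and are kept.
        self.chunks.append(data)

    def text(self):
        return "".join(self.chunks)


def strip_tags(html):
    """Return html with all tags removed."""
    stripper = TagStripper()
    stripper.feed(html)
    stripper.close()  # flush any text still buffered by the parser
    return stripper.text()


if __name__ == "__main__":
    print(strip_tags("<p>Hello <b>world</b></p>"))
```

Like the one-liner, this can be used as a filter by reading standard input and printing the result of strip_tags.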

You can get a copy of Perl (open source, freeware) from www.perl.org, or you can download a freeware Windows binary with installer from www.activestate.com.

BTW, Perl has a number of scripts and modules specifically designed to interact with web sites without the need for a browser.

Hope this helps ...

- James
All times are GMT