URL Encoding and Decoding of Special Characters

 
ESUHSD
Guest





Posted: Thu May 03, 2007 6:16 pm    Post subject: URL Encoding and Decoding of Special Characters

I am trying to write a simple function to encode and decode special characters in URLs.

For example, given the following URL:

http://www.someplace.com/a%20folder/2nd%2Dfolder/

The function would return:

http://www.someplace.com/a folder/2nd-folder/

or vice-versa.

Here's what I have so far:
Code:
AutoTrim, Off
url_temp_clip = %clipboard%
Send, ^c
clipboard := RegExReplace(clipboard, "\%([0-9A-F]{2})" , hex_to_dec("0x$1"))
Send, ^v
clipboard = %url_temp_clip%
AutoTrim, On

hex_to_dec(x)
{
   Loop
      If RegExMatch(x, "i)(.*)(0x[a-f\d]*)(.*)", y)
         x := y1 . y2+0 . y3           ; convert hex numbers to decimal
      Else Break
   return x
}


In Perl the code would be something like:
Code:

$str =~ s/([^A-Za-z0-9])/sprintf("%%%02X", ord($1))/seg; # Encode string
$str =~ s/\%([A-Fa-f0-9]{2})/pack('C', hex($1))/seg; # Decode


But I can't seem to get the same thing done in AHK. Any help would be much appreciated.
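
One likely reason the attempt above does not decode anything: RegExReplace() takes its replacement as an ordinary string, so hex_to_dec("0x$1") appears to run once, before any matching, on the literal text 0x$1. A minimal sketch of a loop-based alternative follows, assuming AutoHotkey v1.0.48 or later (for While) and single-byte escapes only. UriDecodeSketch is a made-up name, not a built-in.

```autohotkey
; Sketch only: UriDecodeSketch is a hypothetical helper name.
; Decodes each %XX escape exactly once, scanning left to right.
; Assumes every escape stands for one character (no UTF-8 reassembly).
UriDecodeSketch(str) {
   pos := 1
   While (pos := RegExMatch(str, "i)%([0-9a-f]{2})", m, pos))
   {
      ; Splice the decoded character in place of the three-character escape.
      str := SubStr(str, 1, pos - 1) . Chr("0x" . m1) . SubStr(str, pos + 3)
      ; Step past the decoded character so it is never decoded a second time.
      pos += 1
   }
   Return str
}

decoded := UriDecodeSketch("http://www.someplace.com/a%20folder/2nd%2Dfolder/")
MsgBox, % decoded
```

Because the search resumes after each decoded character, an input such as %2541 comes out as %41 and is not decoded again.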
engunneer



Joined: 30 Aug 2005
Posts: 8255
Location: Maywood, IL

Posted: Thu May 03, 2007 6:17 pm

I have seen scripts for this - did you search the forum?
ESUHSD
Guest





Posted: Thu May 03, 2007 6:20 pm

Yes. I tried searching for things like:

URL encode
URL decode
Hex decode
convert URL

Found some things that were similar but not what I was looking for.
polyethene



Joined: 11 Aug 2004
Posts: 5244
Location: UK

Posted: Thu May 03, 2007 6:41 pm    Post subject: URI/URL Encode and Decode functions

Here are two functions:

Code:
uriDecode(str) {
   Loop
      If RegExMatch(str, "i)(?<=%)[\da-f]{1,2}", hex)
         StringReplace, str, str, `%%hex%, % Chr("0x" . hex), All
      Else Break
   Return, str
}

uriEncode(str) {
   f = %A_FormatInteger%
   SetFormat, Integer, Hex
   If RegExMatch(str, "^\w+:/{0,2}", pr)
      StringTrimLeft, str, str, StrLen(pr)
   StringReplace, str, str, `%, `%25, All
   Loop
      If RegExMatch(str, "i)[^\w\.~%]", char)
         StringReplace, str, str, %char%, % "%" . Asc(char), All
      Else Break
   SetFormat, Integer, %f%
   Return, pr . str
}

e.g. MsgBox, % uriDecode("http://www.someplace.com/a%20folder/2nd%2Dfolder/")
ESUHSD
Guest





Posted: Thu May 03, 2007 6:50 pm    Post subject: Thanks!

Exactly what I'm looking for. Thanks for the quick response!
SKAN



Joined: 26 Dec 2005
Posts: 8688

Posted: Thu May 03, 2007 6:59 pm

@Titan. Nice pair of useful functions. :)
ESUHSD
Guest





Posted: Fri May 04, 2007 9:09 pm

I made some modifications to get the shortcut I was trying to create. Here is the code in case anyone is interested.
Code:
^!+5::
   AutoTrim, Off
   url_temp_clip = %clipboard%
   Send, ^c
   IfInString, clipboard, `%
   {
      clipboard := uriDecode(clipboard)
   }
   else
   {
      clipboard := uriEncode(clipboard)
   }
   Send, ^v
   clipboard = %url_temp_clip%
   AutoTrim, On
return

uriDecode(str)
{
   ; Find uri encoded characters such as %20 (space) and replace with ascii character

   pos = 1
   Loop
      If pos := RegExMatch(str, "i)(?<=%)[\da-f]{2}", hex, pos++)
         StringReplace, str, str, `%%hex%, % Chr("0x" . hex), All
      Else Break
   Return, str
}

uriEncode(str)
{
   ; Replace characters with uri encoded version except for letters, numbers,
   ; and the following: /.~:&=-

   f = %A_FormatInteger%
   SetFormat, Integer, Hex
   pos = 1
   Loop
      If pos := RegExMatch(str, "i)[^\/\w\.~`:%&=-]", char, pos++)
         StringReplace, str, str, %char%, % "%" . Asc(char), All
      Else Break
   SetFormat, Integer, %f%
   StringReplace, str, str, 0x, , All
   Return, str
}
Ageless



Joined: 15 May 2007
Posts: 2

Posted: Fri Jun 15, 2007 2:27 am    Post subject: Odd...

Strange: when I ran the uriEncode() function that Titan posted, I hit the following problems:

1) It was encoding characters that didn't need to be encoded. In my case, these were colons (:) and forward slashes (/).

2) It was outputting hex character numbers with a 0x at the beginning, which is consistent with what the SetFormat documentation says regarding hexadecimal format:

Quote:
Hexadecimal numbers all start with the prefix 0x (e.g. 0xA9).


So I modified the function slightly. Here is the result:

Code:
uriEncode(str) {
   f = %A_FormatInteger%
   SetFormat, Integer, Hex
   If RegExMatch(str, "^\w+:/{0,2}", pr)
      StringTrimLeft, str, str, StrLen(pr)
   StringReplace, str, str, `%, `%25, All
   Loop
      If RegExMatch(str, "i)[^\w\.~%/:]", char)
         StringReplace, str, str, %char%, % "%" . SubStr(Asc(char),3), All
      Else Break
   SetFormat, Integer, %f%
   Return, pr . str
}


The second RegExMatch() now ignores colons and forward slashes, and the Asc() call is wrapped in SubStr(), which strips the leading 0x prefix.

Please note that I made no attempt at a universal fix: I just modified the parts that were causing problems in my specific and very limited usage of this function.
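
A side note on the SubStr() workaround, as a minimal sketch: stripping the 0x prefix leaves only one hex digit for character codes below 16 (tab, CR, LF), whereas percent-encoding expects two. Assuming AutoHotkey v1.1.17 or later, Format() can zero-pad the value regardless of SetFormat. The variable names below are purely illustrative.

```autohotkey
; Sketch: two ways to turn a character code into hex digits.
; Assumes AutoHotkey v1.1.17+ for Format(); names are illustrative only.

f := A_FormatInteger
SetFormat, Integer, Hex
; Same approach as the modified function: drop the first two characters ("0x").
; For a line feed (code 10) only a single hex digit remains.
viaSetFormat := SubStr(Asc("`n"), 3)
SetFormat, Integer, %f%

; Format() pads to two digits and does not depend on SetFormat.
viaFormat := Format("{:02X}", Asc("`n"))

MsgBox, % "%" . viaSetFormat . "  vs  %" . viaFormat
```

For the characters in the original report (space, colon, slash) both approaches give two digits, so the difference only shows up with control characters.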

Thanks for posting the original code, Titan!

PS: I'm an AutoHotkey noob, so I'm probably missing something obvious that would explain why I was having problems with the function in the first place. I'm just posting this in case someone else runs into the same problems.
JoeSchmoe as guest
Guest





Posted: Wed Apr 21, 2010 5:22 pm    Post subject: Figuring out Titan's regex

Hello,

I'm having trouble figuring out one of the two regular expressions that Titan used in his code. I'd like to modify his code somewhat, and I need to figure out what is going on first.

The code in question is at the heart of the encoding function:
Code:
If RegExMatch(str, "i)[^\w\.~%]", char)
   StringReplace, str, str, %char%, % "%" . Asc(char), All


It looks to me like that regular expression would match any single character that isn't whitespace, a period, a ~, or a %. Wouldn't this catch every single alphanumeric character?

It seems like better code would be to insert an escape character before the caret to remove its special meaning:
Code:
If RegExMatch(str, "i)[\^\w\.~%]", char)
   StringReplace, str, str, %char%, % "%" . Asc(char), All


Am I missing something? Was Titan's code for an older version of the PCRE lib in which carets didn't have a special meaning?
sinkfaze



Joined: 18 Mar 2008
Posts: 5010
Location: the tunnel(?=light)

Posted: Wed Apr 21, 2010 5:42 pm    Post subject: Re: Figuring out Titan's regex

JoeSchmoe wrote:
It looks to me like that regular expression would match any single character that isn't whitespace, a period, a ~, or a %.


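That should be any single character that isn't a word character (a letter, digit, or underscore), a literal period, a ~, or a %. The leading ^ inside the brackets negates the whole class, so alphanumeric characters are skipped rather than caught; \w means "word character", not whitespace.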
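
A quick way to check this reading of the class, as a sketch assuming AutoHotkey v1.1 or later (for arrays and For loops). The sample characters are arbitrary.

```autohotkey
; Sketch: see which sample characters the negated class [^\w\.~%] matches.
; Assumes AutoHotkey v1.1+; the sample list is made up for illustration.
pattern := "[^\w\.~%]"
samples := ["a", "7", "_", ".", "~", "%", " ", "/", ":"]

report := ""
For index, ch in samples
{
   ; RegExMatch returns 0 when the character is inside the excluded set.
   verdict := RegExMatch(ch, pattern) ? "would be encoded" : "left alone"
   report .= "[" . ch . "] " . verdict . "`n"
}
MsgBox, % report
```

Letters, digits, the underscore, the period, ~ and % are left alone; the space, slash and colon match the class, which is also why the original function encoded colons and slashes.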
iamattamai



Joined: 06 Nov 2010
Posts: 3
Location: Atlanta, GA, USA

Posted: Sat Nov 06, 2010 1:40 pm    Post subject: Stuck on this one -- post not working

I am trying to convert a string of CSV text to URL format so it can be posted, using uriEncode and httpQuery together.
I am able to post simple strings using the code below, but not an example like the one shown. I suspect it might be the % signs?
I'm admittedly a newbie to AHK, so any assistance is much appreciated.

Code:

#noenv
uriEncode(str)
{
   ; Replace characters with uri encoded version except for letters, numbers,
   ; and the following: /.~:&=-

   f = %A_FormatInteger%
   SetFormat, Integer, Hex
   pos = 1
   Loop
      If pos := RegExMatch(str, "i)[^\/\w\.~`:%&=-]", char, pos++)
         StringReplace, str, str, %char%, % "%" . Asc(char), All
      Else Break
   SetFormat, Integer, %f%
   StringReplace, str, str, 0x, , All
   Return, str
}

estring = 2010,11,5,18,0,55,"ROC177262","CPSOS - Search - Search",""2010,11,5,18,1,0,"ROC177262","Logon status",""2010,11,5,18,1,6,"ROC177262","AT&T U-Verse CRM Customer Interaction Manager : Release 14 - csrPG4cmem105",""2010,11,5,18,1,9,"ROC177262","AT&T Wireline - Synchronoss Technologies, Inc. - Windows Inter - \\Remote, 128-bit SSL/TLS.",""

newstring = % uriEncode(estring)
msgbox, %newstring% ;valid conversion confirmed here

html     := ""
URL      := "http://www.mysite.com/act_raw_upload.cfm"
POSTData := "raw_data= %newstring%"

length := httpQuery(html,URL,POSTdata)
varSetCapacity(html,-1)
   
#include httpQuery.ahk
[VxE]



Joined: 07 Oct 2006
Posts: 3234
Location: Simi Valley, CA

Posted: Sat Nov 06, 2010 5:25 pm

Instead of
Code:
POSTData := "raw_data= %newstring%"

try
Code:
POSTData := "raw_data=" newstring


and take a look at FAQ: When exactly are variable names enclosed in percent signs?
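
To illustrate the difference, a small sketch (the sample value of newstring is made up): inside a quoted string in an expression, percent signs are literal text, so the server would receive %newstring% itself instead of the encoded data.

```autohotkey
; Sketch: percent signs in expressions versus legacy assignments.
; The value of newstring is a stand-in for the real encoded text.
newstring := "a%20b"

; Expression with the variable name inside the quotes: nothing is substituted.
wrong := "raw_data= %newstring%"

; Expression with concatenation: the variable's contents are appended.
right := "raw_data=" . newstring

; Legacy assignment (single = sign): here percent signs do dereference.
legacy = raw_data=%newstring%

MsgBox, % wrong . "`n" . right . "`n" . legacy
```

Note that the original line also had a space after the equals sign, which would be sent as part of the value.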
iamattamai



Joined: 06 Nov 2010
Posts: 3
Location: Atlanta, GA, USA

Posted: Sun Nov 07, 2010 12:45 am    Post subject: What about CR/LF encoding?

Many thanks, VxE.
I also found that switching to Ageless's later modification of uriEncode sealed the deal.

One more humble request: how would you modify the Ageless code below to encode CR/LF characters in the source string? I'm a bit challenged by code as terse as this.

Code:

uriEncode(str) {
   f = %A_FormatInteger%
   SetFormat, Integer, Hex
   If RegExMatch(str, "^\w+:/{0,2}", pr)
      StringTrimLeft, str, str, StrLen(pr)
   StringReplace, str, str, `%, `%25, All
   Loop
      If RegExMatch(str, "i)[^\w\.~%/:]", char)
         StringReplace, str, str, %char%, % "%" . SubStr(Asc(char),3), All
      Else Break
   SetFormat, Integer, %f%
   Return, pr . str
}
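
A hedged sketch for the CR/LF question: in the function above, CR and LF are already caught by the negated class, but SubStr(Asc(char),3) appears to leave a single hex digit for them (%D and %A instead of %0D and %0A). One way around this, assuming AutoHotkey v1.1.17 or later (for Format) and plain ASCII input, is to walk the string one character at a time. UriEncodeSketch is a hypothetical name, and the set of characters left unencoded is an assumption borrowed from the class above.

```autohotkey
; Sketch, not a drop-in replacement: UriEncodeSketch is a hypothetical name.
; Assumes AutoHotkey v1.1.17+ (for Format) and plain ASCII input.
UriEncodeSketch(str) {
   out := ""
   ; With no delimiters given, Loop Parse yields one character per pass.
   Loop, Parse, str
   {
      If RegExMatch(A_LoopField, "^[A-Za-z0-9_.~/:]$")
         out .= A_LoopField
      Else
         ; Two zero-padded hex digits, so CR becomes %0D and LF becomes %0A.
         out .= "%" . Format("{:02X}", Asc(A_LoopField))
   }
   Return out
}

encoded := UriEncodeSketch("line one`r`nline two")
MsgBox, % encoded
```

Because every character is visited exactly once, a literal % in the input becomes %25 without a separate StringReplace pass, and the protocol prefix survives since colons and slashes are copied through.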