AutoHotkey Community Let's help each other out

ESUHSD Guest
Posted: Thu May 03, 2007 6:16 pm Post subject: URL Encoding and Decoding of Special Characters

I am trying to write a simple function to encode and decode special characters in URLs.
For example if given the following URL:
http://www.someplace.com/a%20folder/2nd%2Dfolder/
The function would return:
http://www.someplace.com/a folder/2nd-folder/
or vice-versa.
Here's what I have so far:
Code:
AutoTrim, Off
url_temp_clip = %clipboard%
Send, ^c
clipboard := RegExReplace(clipboard, "\%([0-9A-F]{2})" , hex_to_dec("0x$1"))
Send, ^v
clipboard = %url_temp_clip%
AutoTrim, On

hex_to_dec(x)
{
    Loop
        If RegExMatch(x, "i)(.*)(0x[a-f\d]*)(.*)", y)
            x := y1 . y2+0 . y3 ; convert hex numbers to decimal
        Else Break
    return x
}
In Perl the code would be something like:
Code:
$str =~ s/([^A-Za-z0-9])/sprintf("%%%02X", ord($1))/seg; # Encode string
$str =~ s/\%([A-Fa-f0-9]{2})/pack('C', hex($1))/seg; # Decode
But I can't seem to get the same thing done in AHK. Any help would be much appreciated.

engunneer
Joined: 30 Aug 2005 Posts: 8255 Location: Maywood, IL
Posted: Thu May 03, 2007 6:17 pm

I have seen scripts for this. Did you search the forum?

ESUHSD Guest
Posted: Thu May 03, 2007 6:20 pm

Yes. I tried searching for things like:
URL encode
URL decode
Hex decode
convert URL
Found some things that were similar but not what I was looking for.

polyethene
Joined: 11 Aug 2004 Posts: 5244 Location: UK
Posted: Thu May 03, 2007 6:41 pm Post subject: URI/URL Encode and Decode functions

Here are two functions:
Code:
uriDecode(str) {
    Loop
        If RegExMatch(str, "i)(?<=%)[\da-f]{1,2}", hex)
            StringReplace, str, str, `%%hex%, % Chr("0x" . hex), All
        Else Break
    Return, str
}

uriEncode(str) {
    f = %A_FormatInteger%
    SetFormat, Integer, Hex
    If RegExMatch(str, "^\w+:/{0,2}", pr)
        StringTrimLeft, str, str, StrLen(pr)
    StringReplace, str, str, `%, `%25, All
    Loop
        If RegExMatch(str, "i)[^\w\.~%]", char)
            StringReplace, str, str, %char%, % "%" . Asc(char), All
        Else Break
    SetFormat, Integer, %f%
    Return, pr . str
}
e.g. MsgBox, % uriDecode("http://www.someplace.com/a%20folder/2nd%2Dfolder/")

ESUHSD Guest
Posted: Thu May 03, 2007 6:50 pm Post subject: Thanks!

Exactly what I'm looking for. Thanks for the quick response!

SKAN
Joined: 26 Dec 2005 Posts: 8688
Posted: Thu May 03, 2007 6:59 pm

@Titan. Nice pair of useful functions.

ESUHSD Guest
Posted: Fri May 04, 2007 9:09 pm

I made some modifications to get the shortcut I was trying to create. Here is the code in case anyone is interested.
Code:
^!+5::
AutoTrim, Off
url_temp_clip = %clipboard%
Send, ^c
IfInString, clipboard, `%
{
    clipboard := uriDecode(clipboard)
}
else
{
    clipboard := uriEncode(clipboard)
}
Send, ^v
clipboard = %url_temp_clip%
AutoTrim, On
return

uriDecode(str)
{
    ; Find uri encoded characters such as %20 (space) and replace with ascii character
    pos = 1
    Loop
        If pos := RegExMatch(str, "i)(?<=%)[\da-f]{2}", hex, pos++)
            StringReplace, str, str, `%%hex%, % Chr("0x" . hex), All
        Else Break
    Return, str
}

uriEncode(str)
{
    ; Replace characters with uri encoded version except for letters, numbers,
    ; and the following: /.~:&=-
    f = %A_FormatInteger%
    SetFormat, Integer, Hex
    pos = 1
    Loop
        If pos := RegExMatch(str, "i)[^\/\w\.~`:%&=-]", char, pos++)
            StringReplace, str, str, %char%, % "%" . Asc(char), All
        Else Break
    SetFormat, Integer, %f%
    StringReplace, str, str, 0x, , All
    Return, str
}
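
A decoder that rescans the string after each replacement can decode some input twice: with the first uriDecode() posted above, %2541 becomes %41 and then A. A minimal single-pass sketch that avoids this, assuming AutoHotkey v1.1 and escapes that each stand for one single-byte (ASCII) character. The name uriDecodeSketch is hypothetical and not one of the functions posted in this thread:

```autohotkey
; Hypothetical helper, not from the thread. Assumes AutoHotkey v1.1
; and that every %XX escape stands for a single-byte (ASCII) character.
uriDecodeSketch(str)
{
    out := ""
    pos := 1
    ; Walk left to right; text that has been decoded is never scanned again.
    while (found := RegExMatch(str, "i)%([\da-f]{2})", m, pos))
    {
        ; Copy the untouched text before the escape, then the decoded character.
        out .= SubStr(str, pos, found - pos) . Chr("0x" . m1)
        pos := found + 3    ; skip past the three characters of %XX
    }
    ; Append whatever follows the last escape.
    return out . SubStr(str, pos)
}

MsgBox, % uriDecodeSketch("http://www.someplace.com/a%20folder/2nd%2Dfolder/")
```

With the URL from the first post, this should display http://www.someplace.com/a folder/2nd-folder/.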

Ageless
Joined: 15 May 2007 Posts: 2
Posted: Fri Jun 15, 2007 2:27 am Post subject: Odd...

Strange: when I ran the uriEncode() function that Titan posted, I had the following problems:
1) It was encoding characters that didn't need to be encoded. In my case, these were colons (:) and forward slashes (/).
2) It was outputting hex character numbers with a 0x at the beginning, which is consistent with what the SetFormat documentation says regarding hexadecimal format:
Quote:
Hexadecimal numbers all start with the prefix 0x (e.g. 0xA9).
So I modified the function slightly. Here is the result:
Code:
uriEncode(str) {
    f = %A_FormatInteger%
    SetFormat, Integer, Hex
    If RegExMatch(str, "^\w+:/{0,2}", pr)
        StringTrimLeft, str, str, StrLen(pr)
    StringReplace, str, str, `%, `%25, All
    Loop
        If RegExMatch(str, "i)[^\w\.~%/:]", char)
            StringReplace, str, str, %char%, % "%" . SubStr(Asc(char),3), All
        Else Break
    SetFormat, Integer, %f%
    Return, pr . str
}
The second RegExMatch() now ignores colons and forward slashes, and the Asc() call is now wrapped in a SubStr() that strips the 0x prefix.
Please note that I made no attempt at a universal fix: I just modified the parts that were causing problems in my specific and very limited usage of this function.
Thanks for posting the original code, Titan!
PS: I'm an AutoHotkey noob, so I'm probably missing something obvious that would explain why I was having problems with the function in the first place. I'm just posting this in case someone else has the same problems that I was having.

JoeSchmoe as guest Guest
Posted: Wed Apr 21, 2010 5:22 pm Post subject: Figuring out Titan's regex

Hello,
I'm having trouble figuring out one of the two regular expressions that Titan used in his code. I'd like to modify his code somewhat, and I need to figure out what is going on first.
The code in question is at the heart of the encoding function:
Code:
If RegExMatch(str, "i)[^\w\.~%]", char)
    StringReplace, str, str, %char%, % "%" . Asc(char), All
It looks to me like that regular expression would match any single character that isn't whitespace, a period, a ~, or a %. Wouldn't this catch every single alphanumeric character?
It seems like better code would be to insert an escape character before the caret to remove its special meaning:
Code:
If RegExMatch(str, "i)[\^\w\.~%]", char)
    StringReplace, str, str, %char%, % "%" . Asc(char), All
Am I missing something? Was Titan's code for an older version of the PCRE lib in which carets didn't have a special meaning?

sinkfaze
Joined: 18 Mar 2008 Posts: 5010 Location: the tunnel(?=light)
Posted: Wed Apr 21, 2010 5:42 pm Post subject: Re: Figuring out Titan's regex

JoeSchmoe wrote:
It looks to me like that regular expression would match any single character that isn't whitespace, a period, a ~, or a %.
That should be any single character that isn't a word character (a letter, digit, or underscore), a literal period, a ~ or a %.
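
To see which characters the negated class actually picks out, a small sketch (assuming AutoHotkey v1.1; the sample text is made up) that collects every match of [^\w\.~%] in a string:

```autohotkey
; Assumes AutoHotkey v1.1. Lists each character of the sample text that the
; negated class [^\w\.~%] matches, i.e. the ones uriEncode() would replace.
sample := "a folder/2nd-folder_1.0~"
matched := ""
pos := 1
while (pos := RegExMatch(sample, "[^\w\.~%]", char, pos))
{
    matched .= "[" . char . "]"
    pos += 1    ; continue searching just after this match
}
MsgBox, %matched%    ; [ ][/][-]
```

Letters, digits, the underscore, the period and the tilde are all skipped; only the space, the slash and the hyphen are matched.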

iamattamai
Joined: 06 Nov 2010 Posts: 3 Location: Atlanta, GA, USA
Posted: Sat Nov 06, 2010 1:40 pm Post subject: Stuck on this one -- post not working

I am trying to convert a string of CSV text to URL format so it can be posted using uriEncode and httpQuery together.
I am able to post simple strings using the code below, but not an example like the one shown. I suspect it might be the % signs?
I'm admittedly a newbie to AHK, and any assistance is much appreciated.
Code:
#noenv
uriEncode(str)
{
    ; Replace characters with uri encoded version except for letters, numbers,
    ; and the following: /.~:&=-
    f = %A_FormatInteger%
    SetFormat, Integer, Hex
    pos = 1
    Loop
        If pos := RegExMatch(str, "i)[^\/\w\.~`:%&=-]", char, pos++)
            StringReplace, str, str, %char%, % "%" . Asc(char), All
        Else Break
    SetFormat, Integer, %f%
    StringReplace, str, str, 0x, , All
    Return, str
}
estring = 2010,11,5,18,0,55,"ROC177262","CPSOS - Search - Search",""2010,11,5,18,1,0,"ROC177262","Logon
status",""2010,11,5,18,1,6,"ROC177262","AT&T U-Verse CRM Customer Interaction Manager : Release 14 -
csrPG4cmem105",""2010,11,5,18,1,9,"ROC177262","AT&T Wireline - Synchronoss Technologies, Inc. - Windows Inter - \\Remote, 128-bit SSL/TLS.",""
newstring = % uriEncode(estring)
msgbox, %newstring% ;valid conversion confirmed here
html := ""
URL := "http://www.mysite.com/act_raw_upload.cfm"
POSTData := "raw_data= %newstring%"
length := httpQuery(html,URL,POSTdata)
varSetCapacity(html,-1)
#include httpQuery.ahk
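
One detail worth checking in the script above: POSTData is assigned with :=, and in an AutoHotkey v1 expression the text inside double quotes is literal, so %newstring% is not replaced by the variable's contents. A small sketch of the difference, assuming AutoHotkey v1.1 and a made-up value for newstring:

```autohotkey
; Assumes AutoHotkey v1.1. The value of newstring is made up for illustration.
newstring := "a%2Cb"
literal := "raw_data= %newstring%"    ; expression: the percent signs stay literal
joined := "raw_data=" . newstring     ; concatenation inserts the variable's value
MsgBox, %literal%`n%joined%
```

The first line of the message box shows the text %newstring% itself, while the second shows raw_data=a%2Cb. Whether this is the only problem with the POST is not something the thread confirms.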

[VxE]
Joined: 07 Oct 2006 Posts: 3234 Location: Simi Valley, CA

iamattamai
Joined: 06 Nov 2010 Posts: 3 Location: Atlanta, GA, USA
Posted: Sun Nov 07, 2010 12:45 am Post subject: What about CR/LF encoding?

Many thanks VxE.
I also found that switching to the later post/mod of the uriEncoder by Ageless sealed the deal.
One more humble request: how would you modify the Ageless code below to encode CR/LF characters in the source string? I'm challenged by code as terse as this.
Code:
uriEncode(str) {
    f = %A_FormatInteger%
    SetFormat, Integer, Hex
    If RegExMatch(str, "^\w+:/{0,2}", pr)
        StringTrimLeft, str, str, StrLen(pr)
    StringReplace, str, str, `%, `%25, All
    Loop
        If RegExMatch(str, "i)[^\w\.~%/:]", char)
            StringReplace, str, str, %char%, % "%" . SubStr(Asc(char),3), All
        Else Break
    SetFormat, Integer, %f%
    Return, pr . str
}
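
On the CR/LF question: the negated class in the code above already matches carriage returns and line feeds, but their character codes (0xD and 0xA) have a single hex digit, so SubStr(Asc(char),3) would produce a one-digit escape where percent-encoding expects two (%0D and %0A). A minimal sketch that pads every escape to two digits, assuming AutoHotkey v1.1.17 or later (for Format) and ASCII-only input. The name uriEncodeSketch and the choice to leave the hyphen unencoded are assumptions, not part of the thread's code:

```autohotkey
; Hypothetical helper, not from the thread. Assumes AutoHotkey v1.1.17+ (Format)
; and ASCII-only input; non-ASCII text would need converting to UTF-8 bytes first.
uriEncodeSketch(str)
{
    out := ""
    Loop, Parse, str    ; no delimiters given, so each character is one field
    {
        ; Keep letters, digits, underscore and . ~ / : - unchanged.
        if RegExMatch(A_LoopField, "^[\w.~/:-]$")
            out .= A_LoopField
        else
            out .= "%" . Format("{:02X}", Asc(A_LoopField))    ; always two hex digits
    }
    return out
}

MsgBox, % uriEncodeSketch("line one`r`nline two")    ; line%20one%0D%0Aline%20two
```

Because each character is visited exactly once, there is no need to pre-escape % or to switch the integer format with SetFormat. Like the code above, this leaves / and : alone, so it suits whole URLs rather than individual form values.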