Remove illegal chars from a string .,<>:;'"/|\(){


9 replies to this topic
Veovis
  • Members
  • 389 posts
  • Last active: Mar 17 2009 12:24 AM
  • Joined: 13 Feb 2006
One of the coding techniques that I use a lot is taking a dynamic string (that I have no control over) and turning it into a variable name. I quickly ran into the well-known:

Error: This variable or function name contains an illegal character.


One example is "C:\Documents and Settings\": the colon, backslashes, and spaces are all illegal.

So I threw together a function that solved my problem.

It makes a string usable as a variable name by replacing all spaces with _ and simply removing all (I think) other illegal chars.

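varize(var)
{
   ; Replace every space with an underscore.
   stringreplace,var,var,%A_space%,_,a
   ; Characters to strip; `% is an escaped literal percent sign.
   chars = .,<>:;'"/|\(){}=-+!`%^&*~
   ; An empty delimiter list makes the loop visit one character at a time.
   loop, parse, chars,
      stringreplace,var,var,%A_loopfield%,,a
   return var
}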

Unfortunately this might cause ambiguity if someone has similar inputs, like "file,ext" and "file.ext" (I can't think of a better example...), since both become "fileext".

Also, I'm not sure that this covers all of the illegal chars. As far as I can tell, these are all the illegal ones that are on my keyboard.

I was also wondering if there is something built in to AHK that already does this. It seems kind of useful to have a function that removes all illegal chars from a string.
"Power can be given overnight, but responsibility must be taught. Long years go into its making."

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005
Another way is to verify that a character is a legal variable char:
Varize(_var, _bReplaceChars=false)
{
	local var

	Loop Parse, _var
	{
		If A_LoopField is alnum
		{
			Gosub Varize_AddChar
			Continue
		}
		If A_LoopField in #,_,@,$,?,[,]	; MatchList items must be comma-separated
		{
			Gosub Varize_AddChar
			Continue
		}
		If (Asc(A_LoopField) > 127)	; any non-ASCII character
		{
			Gosub Varize_AddChar
			Continue
		}
		If (_bReplaceChars)
			var := var "?" Asc(A_LoopField)
	}
	Return var

Varize_AddChar:
	var = %var%%A_LoopField%
Return
}

x := Varize("E:\Dev\AutoHotkey\Docs & Infos")
%x% = Foo
MsgBox %x%
x := Varize("€:\Dév\« ÂùtöH°tkéÿ™ »\¡Dœ©s & Inf¤ß!")
%x% = Foo
MsgBox %x%

x := Varize("E:\Dev\AutoHotkey\Docs & Infos", true)
%x% = Foo
MsgBox %x%
x := Varize("€:\Dév\« ÂùtöH°tkéÿ™ »\¡Dœ©s & Inf¤ß!", true)
%x% = Foo
MsgBox %x%
The boolean option resolves your dilemma about "file,ext" vs. "file.ext"...
To be complete, we should also truncate the string to 256 chars, but such a long string is quite unlikely (unless you are trying to convert the result of a FileRead...).
vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2")

Veovis
  • Members
  • 389 posts
  • Last active: Mar 17 2009 12:24 AM
  • Joined: 13 Feb 2006
Amazing PhiLho!

E?58?92Dev?92AutoHotkey?92Docs?32?38?32Infos

I wouldn't have thought about replacing the illegal chars with their ASCII codes. You can almost change it back to normal text this way! All you would have to do is also turn "?" into its ASCII code before you do any others; then you can parse the string. Not that that is important, as I don't need to convert anything back at this point. But still, that is pretty smart! Thank you!

PhiLho
  • Moderators
  • 6850 posts
  • Last active: Jan 02 2012 10:09 PM
  • Joined: 27 Dec 2005
Well, I took inspiration from URL encoding...
To be able to do the reverse conversion, it would be better to use 2-digit hex codes, so we are sure that a 3rd digit after ? is really a plain digit.
And we should remove ? from the OK list, so that it gets encoded too and there is no ambiguity.
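A minimal sketch of that reversible scheme, assuming AutoHotkey v1.1.17 or later (for Format()). The names VarizeHex and UnVarizeHex are made up for this illustration and are not part of the function above. Only illegal ASCII characters are encoded, so every code is exactly two hex digits:

```autohotkey
VarizeHex(str)   ; hypothetical name: encode illegal chars as ? plus 2 hex digits
{
	out := ""
	Loop Parse, str
	{
		c := A_LoopField
		If c is alnum
			out .= c
		Else If (InStr("#_@$[]", c) or Asc(c) > 127)   ; note: ? is no longer in the OK list
			out .= c
		Else
			out .= "?" . Format("{:02X}", Asc(c))      ; e.g. ":" becomes ?3A
	}
	Return out
}

UnVarizeHex(name)   ; hypothetical name: reverse of VarizeHex
{
	out := ""
	pos := 1
	While (pos <= StrLen(name))
	{
		c := SubStr(name, pos, 1)
		If (c = "?")
		{
			out .= Chr("0x" . SubStr(name, pos + 1, 2))   ; the two hex digits after ?
			pos += 3
		}
		Else
		{
			out .= c
			pos += 1
		}
	}
	Return out
}

x := VarizeHex("E:\Dev")        ; E?3A?5CDev
MsgBox % x "`n" UnVarizeHex(x)
```

Like the original, this does not truncate over-long results.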

polyethene
  • Members
  • 5519 posts
  • Last active: May 17 2015 06:39 AM
  • Joined: 26 Oct 2012
Building on PhiLho's function, this is an enhanced version:
varize(var, autofix = true) {
	nv := "", er := 0
	Loop, Parse, var
	{
		c := A_LoopField
		x := Asc(c)
		; Legal characters: 0-9, A-Z, a-z and the symbols # _ @ $ ? [ ]
		If (x > 47 and x < 58) or (x > 64 and x < 91) or (x > 96 and x < 123)
			or c = "#" or c = "_" or c = "@" or c = "$" or c = "?" or c = "[" or c = "]"
			nv .= c
		Else
			er++	; count each illegal character
	}
	If (StrLen(var) > 254)
		er++	; an over-long name counts as one more error
	; autofix: return the cleaned, truncated name; otherwise return the error count
	Return autofix ? SubStr(nv, 1, 254) : er
}

Specify false for the autofix parameter to return how many errors there were in the variable instead of automatically correcting it. Example:
var = €:\Dév\« ÂùtöH°tkéÿ™ »\¡Dœ©s & Inf¤ß!
MsgBox, % "var:`t" . var . "`nautofix:`t" . varize(var) . "`nerrors:`t" varize(var, false)



Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005
Or, you can use the simple Hexify function adapted from here
MsgBox % Hexify("")
MsgBox % Hexify("a")
MsgBox % Hexify("`r`nÿ`t%.;'")

Hexify(x)         ; Convert a string to a huge hex number starting with X
{
   format = %A_FormatInteger%
   SetFormat Integer, H
   hex = X
   Loop Parse, x
      hex := hex  0x100+Asc(A_LoopField)
   StringReplace hex, hex, 0x1,,All
   SetFormat Integer, %format%
   Return hex
}
It is invertible, that is, you never get the same hex stream for different input strings.
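To illustrate the invertibility, here is a sketch of the reverse conversion. The name Unhexify is made up for this example, and it assumes every character was encoded as exactly two hex digits, which is the case for character codes up to 255:

```autohotkey
Unhexify(hex)     ; hypothetical inverse of Hexify: X followed by hex digits -> string
{
   str := ""
   hex := SubStr(hex, 2)                                 ; drop the leading X
   Loop % StrLen(hex) // 2                               ; two hex digits per character
      str .= Chr("0x" . SubStr(hex, 2*A_Index - 1, 2))
   Return str
}

MsgBox % Unhexify("X61")   ; Hexify("a") returns X61, so this shows: a
```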

Morpheus
  • Members
  • 475 posts
  • Last active: Oct 21 2014 11:08 AM
  • Joined: 31 Jul 2008
Carrying on from Laszlo's function, I had the idea to set the contents of the NEW variable to the OLD variable name, for 2 reasons.

1. If the New Variable name was longer than 255 characters, it would have to be shortened, and the ability to retrieve the old name from the new name would be lost.

2. It would provide a simple way to retrieve the old name from the new name.

This is what I came up with:

VarName = This,.<>:'"/|\(){}=-+!^& *~;Name
NewName := Hex(VarName) , %NewName% := VarName
MsgBox % NewName "`n" %NewName%
Hex(x)         ; by Laszlo - http://www.autohotkey.com/forum/viewtopic.php?p=63624#63624
{
   format = %A_FormatInteger%
   StringLeft, y, x, 127
   SetFormat Integer, H
   Hex = X
   Loop Parse, y
      Hex := Hex  0x100+Asc(A_LoopField)
   StringReplace Hex, Hex, 0x1,,All
   SetFormat Integer, %format%
   Return Hex
}

This works OK, but it would be better if the function itself could both return the new name and set the content of the 'new name' variable to the old name. I could not figure out how to do that myself.

Can anybody show me how it would be done?

nimda
  • Members
  • 4368 posts
  • Last active: Aug 09 2015 02:36 AM
  • Joined: 26 Dec 2010
It is a lot easier with AHK_L objects:
o := {}
o["C:\Users\nimda\"] := 42
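A slightly fuller sketch of the same idea, assuming AutoHotkey_L (v1.1+). The extra keys and values are made up for illustration:

```autohotkey
o := {}
o["C:\Users\nimda\"] := 42         ; any string works as a key; nothing has to be removed
o["file,ext"] := "comma"
o["file.ext"] := "dot"             ; distinct from "file,ext", so no ambiguity

MsgBox % o["C:\Users\nimda\"]      ; read a value back by its original string
for key, value in o                ; enumerate every stored string and its value
    MsgBox % key " = " value
```

Because the original string is the key itself, there is nothing to convert back.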


Morpheus
  • Members
  • 475 posts
  • Last active: Oct 21 2014 11:08 AM
  • Joined: 31 Jul 2008
I appreciate the reply. It is good to know that the solution is not too simple, considering how much time I spent trying to get it to work.

MacTwistie
  • Members
  • 1 posts
  • Last active: Nov 25 2013 08:13 PM
  • Joined: 24 Nov 2013

This has been my solution. I am happy for feedback on how to make this faster and easier to read, but it does have all the steps.

 

;SAVE THESE 4 LINES OF DATA AS A TEXT FILE. OMIT THIS LINE.

 
;21:21:34: Exile's Den has fallen!
;21:21:07: [Ender] has looted [Coldwater Seabug]!
;21:20:59: Loc: 4223 1224 2051
;21:21:07: [Ender] has looted [Coldwater Seabug]!
 
LogLoc = FullLogFileHereWith.TXT
Chars1 = [  ;used to parse lines ONLY without this Character


FileRead, Tlog, %LogLoc% ;Read the complete file into a variable.  This includes `r`n at the end of each line.


Tlog := RegExReplace(Tlog,"`r`n","|#")  ;Replace each line break (`r`n) with |#.  The reason for using two characters is explained below.
Tlog := RegExReplace(Tlog,":","")       ;Remove : from the variable.  "" means the character is deleted, leaving no space behind.
Tlog := RegExReplace(Tlog,"!","")       ;Remove ! from the variable.
Tlog := RegExReplace(Tlog,"'","")       ;Remove ' from the variable.


Loop, Parse, Tlog, | ;Parse the log file; each time it sees | it starts a new line. (Written without quotes, since quote marks in the delimiter list would count as delimiters too.)
IfNotInstring A_LoopField , %Chars1%  ; If [ is not in the string, then process the line into a new variable.

  {
  Tlog2 .= A_LoopField                                ; .= means it adds all the lines together as they are parsed. Interesting that the | that is used to parse the lines is stripped. Leaving only the # as shown above.
  }

Loop, parse, Tlog2, #,                                ; Open and read our log file from memory - now use # to start a new line


IfInstring A_LoopField , Loc                          ;check the string for text Loc. if it exists action it.
 {
 StringSplit, word_array, A_LoopField , %A_Space%, .  ; Omits periods.  Splits the line into variables based on spaces.
 BaseX := word_array3                                 ; set variable to the 3rd item - StringSplit items start at 1; word_array0 holds the item count
 BaseY := word_array5                                 ; set variable to the 5th item
 Tooltip, X= %BaseX%`nY= %BaseY%                      ; Show under the mouse what is returned.
 }
MsgBox, X=%BaseX%`nY=%BaseY%

Exitapp

F2::exitapp

 

So basically, here is what's happening:

 

Setting a variable to the character "[" allows commands to use it via the variable.  You often cannot use these characters directly.

Remove a few other characters in the string

Read line by line, and create a new variable that only contains lines without "[" in them.

Check the variable for the text "Loc"

Split Loc into parts, and grab the bits you want.
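For comparison, the same extraction can be sketched with a single RegExMatch call. This is an alternative, not the script above: the log text is inlined here instead of being read with FileRead, and it picks the first Loc line rather than the last one.

```autohotkey
; Sample text standing in for the contents of the log file.
log := "21:21:34: Exile's Den has fallen!`r`n"
log .= "21:21:07: [Ender] has looted [Coldwater Seabug]!`r`n"
log .= "21:20:59: Loc: 4223 1224 2051`r`n"

; m1, m2 and m3 receive the three numbers that follow "Loc:".
if RegExMatch(log, "Loc:\s*(\d+)\s+(\d+)\s+(\d+)", m)
    MsgBox % "X=" m1 "`nY=" m3     ; same fields as word_array3 and word_array5
```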

 

Cheers