[Solved] Parsing strings with Pilcrow inside

Peter2
Posts: 244
Joined: 21 Sep 2014, 14:38
Location: CH

[Solved] Parsing strings with Pilcrow inside

31 Oct 2015, 18:04

I have a text file which consists of ASCII characters 32 - 230, and it also contains the "Pilcrow" sign (ASCII 182, Unicode U+00B6 pilcrow sign, HTML entity ¶).

Code:

S'??QÖ!O}P&a?ÚºÎ??4^?¸h׶,"Ó#|W­?lJjU?¡[B?Ü®¦Ç|Ý:+¶9{yiesÊÃ?
Now I use the crypt/decrypt script, made by PhiLho,
http://autohotkey.com/board/topic/17939 ... ript-file/
and get different results. I suppose that the problem does not come from crypting, maybe it comes from the standard command

Code:

   Loop Parse, _string
   {...
which has trouble distinguishing between pilcrows inside the string and a pilcrow at the end of the string.

Is there a simple way to handle this, or is the answer simply "don't do it!"?
Are there other characters (between 32 and 230) that disturb the parsing of strings?

Thanks, and enjoy your Sunday.
Last edited by Peter2 on 01 Nov 2015, 15:39, edited 1 time in total.
Peter (AHK Beginner) / Win 7 x64, AHK Version v1.1.22.xx
lexikos
Posts: 6668
Joined: 30 Sep 2013, 04:07
GitHub: Lexikos

Re: Parsing strings with Pilcrow inside

31 Oct 2015, 18:09

ASCII is 7-bit (0-127). So what encoding does the text file use? CP1252? UTF-8?

If you read the file correctly, the standard commands (loop parse etc.) will work just fine. If you're reading the file incorrectly (e.g. if it's UTF-8 without a byte order mark but you're reading it as CP1252 or whatever your system's default code page is), the standard commands will still work just fine - but they'll be working on data which may have already been corrupted.
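
One way to check the read side is sketched below. It assumes AutoHotkey v1.1 Unicode, a file that really is CP1252, and a hypothetical file name input.txt. The *P1252 option tells FileRead which code page to decode with, and if the text was read correctly every pilcrow arrives in Loop, Parse as character code 182:

```autohotkey
; Sketch only: input.txt is a hypothetical file name, CP1252 is an assumption.
FileRead, text, *P1252 %A_ScriptDir%\input.txt

count := 0
Loop, Parse, text
{
    ; 182 is the pilcrow in both CP1252 and Unicode (U+00B6)
    if (Asc(A_LoopField) = 182)
        count++
}
MsgBox, Pilcrows found: %count%
```

If the count does not match what an editor shows for the same file, the mismatch most likely happened while reading or writing the file, not in the parsing loop.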

Peter2 wrote: I suppose that the problem does not come from crypting,
Why?

The function was written for old AutoHotkey (32-bit ANSI, where strings are 8 bits per character). If you are using a Unicode version of AutoHotkey (where strings are 16 bits per character), it may not work.
Peter2

Re: Parsing strings with Pilcrow inside

31 Oct 2015, 18:25

lexikos wrote: ASCII is 7-bit (0-127). So what encoding does the text file use? ...
I use the current AutoHotkeyU32.exe, and the file is created with "Random, code, 32, 230", "neu = % Chr(code) neu" and FileAppend. UltraEdit shows the file type as "1252".
lexikos wrote: ...The function was written for old AutoHotkey (32-bit ANSI, where strings are 8 bits per character). If you are using a Unicode version of AutoHotkey (where strings are 16 bits per character), it may not work.
So do I have to replace it with one of the newer crypt scripts?
Or will it be enough to avoid the pilcrow character?
lexikos

Re: Parsing strings with Pilcrow inside

31 Oct 2015, 21:14

Your code is generating text with Unicode characters 32 .. 230, and then writing it as ANSI. There are some Unicode characters in the 128 .. 230 range which do not exist in your system's default ANSI code page, and are therefore replaced with a placeholder (?). However, given what Ultraedit shows, your default is probably 1252, the same as mine, and Pilcrow works just fine for me -- it is 182 in both CP1252 and UTF-16.

Either use a file encoding which preserves all characters (i.e. UTF-8 or UTF-16) or generate ASCII text (chars 1 .. 127 only). FileEncoding sets the default encoding.
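
A minimal round-trip sketch of the first option, assuming AutoHotkey v1.1 Unicode and a hypothetical file name random.txt in the script's folder. It writes random characters 32 .. 230 as UTF-8, reads them back, and compares the two strings:

```autohotkey
; Sketch only: random.txt is a hypothetical file name.
FileEncoding, UTF-8            ; default encoding for FileAppend and FileRead
path := A_ScriptDir . "\random.txt"

text := ""
Loop, 50
{
    Random, code, 32, 230
    text .= Chr(code)          ; append one random character
}

FileDelete, %path%             ; start from an empty file
FileAppend, %text%, %path%
FileRead, back, %path%

; == compares case-sensitively, so a replaced character is detected
MsgBox % (back == text) ? "Round trip OK" : "Text changed on disk"
```

Removing the FileEncoding line should make the comparison fail whenever one of the random characters has no equivalent in the system's ANSI code page.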

If you want to support Unicode characters > 255, you must adjust PhiLho's algorithm or find a different one.

As a matter of style, you should use := to assign expressions, not the undocumented = %.
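
For example, the assignment quoted earlier in the thread could be written like this (a sketch that keeps the variable names code and neu from that post):

```autohotkey
Random, code, 32, 230
; Instead of:  neu = % Chr(code) neu
neu := Chr(code) . neu   ; := evaluates the right-hand side as an expression
```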
Peter2

Re: Parsing strings with Pilcrow inside

01 Nov 2015, 10:45

Thanks for your advice.

I tried some things - unsatisfying results ....

I have ...
1) an AHK script with the linked script of PhiLho (see above)
2) a script which
a) reads a small text file
b) adds random characters
c) encrypts it (for use)
d) decrypts it (for checking)
3) a script which reads c) and decrypts it

2) and 3) use the same "included" file 1)
Files b) (before encryption) and d) (after decryption) are exactly the same

- In all my scripts I set the line FileEncoding, UTF-8, and later FileEncoding, CP1252. UltraEdit showed me that the files are encoded accordingly, but the result still has problems.
- Then I removed the FileEncoding line and limited the random characters to Random, code, 32, 127 (pure ASCII), but there are still differences:

Encrypted line:

Code:

36442A5D25471F4F720D3F105F7C422946446174756D3D3230313531313330556F673F7F2E733F7268236C2657485F366F347369774A5F6F475F4B6E
Decryption with step 2d)

Code:

Z-P8K=z=\lW{_|B)FDatum=20151130Uog?.s?rh#l&WH_6o4siwJ_oG_Kn
Decryption with step 3

Code:

p(O%V3~=RY-#\2+"51C\Z[151130Uog?.s?rh#l&WH_6o4siwJ_oG_Kn
At the moment I don't know why script 1) returns different "decryptions" for some lines from the same source (file 2c).
Peter2

Re: [Solved] Parsing strings with Pilcrow inside

01 Nov 2015, 15:41

Solved - sorry for the confusion:

The crypt script mentioned above uses the value of A_ScriptName, and of course different script names produce different results.
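
To illustrate the effect, here is a toy sketch. It is not PhiLho's algorithm: ToyXorHex is a hypothetical helper made up for this example, and the two file names are stand-ins for what A_ScriptName would contain in two different scripts. The same text XORed with two different names gives two different results:

```autohotkey
; Toy illustration only, assuming AutoHotkey v1.1.17+ (for Format).
; This is NOT PhiLho's algorithm; ToyXorHex is a made-up helper.
plain := "Datum=20151130"
MsgBox % ToyXorHex(plain, "script2.ahk") "`n" ToyXorHex(plain, "script3.ahk")

ToyXorHex(text, key)
{
    out := ""
    keyLen := StrLen(key)
    Loop, Parse, text
    {
        ; Pick the key character for this position, cycling through the key
        keyChar := SubStr(key, Mod(A_Index - 1, keyLen) + 1, 1)
        ; XOR the two character codes and append the result as hex digits
        out .= Format("{:02X}", Asc(A_LoopField) ^ Asc(keyChar))
    }
    return out
}
```

In this toy version, passing the same fixed key string from both scripts instead of a value derived from the script name makes the two results agree.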