[Resolved] How to determine a file's encoding?

zhanglei1371

07 Aug 2020, 11:07

I want to know whether a .txt file's encoding is UTF-8 or ANSI. How should I write the code?
Thanks very much for any reply!
Last edited by zhanglei1371 on 08 Aug 2020, 23:04, edited 1 time in total.
Smile_

07 Aug 2020, 12:56

I have attached a compiled executable named _GetEncoding.exe, built from a C++ source file. Place it in the same folder as your AHK script, set the path of the text file whose encoding you want to check, and run the script.
The tool returns the value of the file's first byte: 239 (0xEF, the first byte of the UTF-8 byte order mark) for a UTF-8 text file saved with a BOM, or 97 (the letter "a") for an ANSI text file that begins with "a".
The script:

TxtFile := "Example.txt"
; Quote the path so that a file name containing spaces is passed as one argument
RunWait, _GetEncoding.exe "%TxtFile%",, Hide
RVal := ErrorLevel  ; RunWait stores the program's exit code in ErrorLevel
If (RVal = 239)
    Msgbox % " UTF-8 (Return value: " RVal ")"
Else If (RVal = 97)
    Msgbox % " ANSI (Return value: " RVal ")"
Else
    Msgbox % " Unlisted encoding or the text file was not found or it is empty (Return value: " RVal ")"
If the executable doesn't work, try compiling it yourself.
Here is the C++ source code:

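#include <stdio.h>

/* Exits with the value of the first byte of the file named in argv[1].
   Exits with -1 if no path was given or the file cannot be opened,
   and with EOF if the file is empty. */
int main(int argc, char *argv[])
{
    FILE *fp = NULL;
    int c = -1;    /* defined result even when nothing is read */

    if (argc > 1)
        fp = fopen(argv[1], "rb");
    if (fp != NULL)
    {
        c = fgetc(fp);    /* first byte, or EOF for an empty file */
        printf("%d", c);
        fclose(fp);
    }
    return c;
}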
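
A single byte is a weak signal, since 97 only means the file starts with the letter "a". A slightly more robust variant, sketched below in C under the assumption that the caller has already read the first few bytes of the file into a buffer, compares them against the full byte order marks. The function name bom_name is made up for this illustration and is not part of the attached tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: names the byte order mark found at the start of buf,
   or returns "none" when the first bytes are not a BOM known to this sketch.
   UTF-32 BOMs are not distinguished here. */
const char *bom_name(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len >= 3 && memcmp(buf, "\xEF\xBB\xBF", 3) == 0)
        return "UTF-8";
    if (len >= 2 && memcmp(buf, "\xFF\xFE", 2) == 0)
        return "UTF-16LE";
    if (len >= 2 && memcmp(buf, "\xFE\xFF", 2) == 0)
        return "UTF-16BE";
    return "none";
}
```

A result of "none" does not prove the file is ANSI: UTF-8 files are often saved without a BOM, and a BOM check cannot classify those.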
Attachments
_GetEncoding.zip
Get File Encoding
(16.66 KiB) Downloaded 31 times
zhanglei1371

07 Aug 2020, 17:38

Thanks for the reply, but I tried several files and the results seem wrong. The test files are attached below.
Attachments
testfile.rar
ANSI and UTF8 test file
(7.33 KiB) Downloaded 35 times
Smile_

08 Aug 2020, 10:23

I assumed you were checking only .txt files, but in any case that approach doesn't seem to be an option for now.
I looked a little further and found a tool called File. It is basically a Linux tool, but a Windows version has been made. It determines a file's type (including its encoding, if it has one). Read more here: http://gnuwin32.sourceforge.net/packages/file.htm

Testing script:

Folder := "testfiles"
RunWait, %ComSpec% /c bin\File.exe "%Folder%\*" | Clip,, Hide
Gui, Add, Edit, -vScroll +ReadOnly, % ClipBoard
Gui, Show,, Files Infos
Return

GuiClose:
ExitApp
I attached the necessary files.
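
For files that have no BOM, one common heuristic (not necessarily what File does internally) is to check whether the bytes form well-formed UTF-8. The C sketch below assumes the whole file has already been read into a buffer; the name is_valid_utf8 is made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if buf[0..len) is well-formed UTF-8. */
bool is_valid_utf8(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t i = 0;
    while (i < len) {
        unsigned char b = buf[i];
        size_t extra;         /* number of continuation bytes expected */
        unsigned long min;    /* smallest code point allowed at this length */
        unsigned long cp;     /* decoded code point */

        if (b < 0x80)                { i++; continue; }   /* plain ASCII */
        else if ((b & 0xE0) == 0xC0) { extra = 1; min = 0x80;    cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; min = 0x800;   cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; min = 0x10000; cp = b & 0x07; }
        else return false;    /* stray continuation byte or invalid lead byte */

        if (len - i <= extra)
            return false;     /* sequence is cut off at the end of the buffer */
        for (size_t k = 1; k <= extra; k++) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80)
                return false; /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }
        /* Reject overlong forms, UTF-16 surrogates, and values past U+10FFFF. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += extra + 1;
    }
    return true;
}
```

A file made only of ASCII bytes passes this check and is also valid ANSI, so the test is informative only when the file contains bytes of 0x80 or above.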

Note (read more: https://docs.microsoft.com/en-us/windows/win32/intl/code-pages):
Originally, Windows code page 1252, the code page commonly used for English and other Western European languages, was based on an American National Standards Institute (ANSI) draft. That draft eventually became ISO 8859-1, but Windows code page 1252 was implemented before the standard became final, and is not exactly the same as ISO 8859-1.
So ISO 8859-1 is similar to what Windows calls ANSI (code page 1252 in this case), but it is not the same!
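
The two encodings differ in the byte range 0x80 to 0x9F: ISO 8859-1 assigns those values to control codes, while code page 1252 uses most of them for printable characters, such as the euro sign at 0x80. As a rough sketch in C, assuming the file is known to be in a single-byte encoding and has been read into a buffer, counting bytes in that range hints at which of the two it is. The name count_c1_range_bytes is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the bytes of buf that fall in 0x80..0x9F.
   A non-zero count in single-byte text suggests code page 1252 rather
   than strict ISO 8859-1, where these values are control codes. */
size_t count_c1_range_bytes(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] >= 0x80 && buf[i] <= 0x9F)
            count++;
    }
    return count;
}
```

A count of zero leaves the question open, because text that avoids this range reads the same in both encodings.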
Attachments
TestScript.zip
Testing Script
(456.42 KiB) Downloaded 28 times
zhanglei1371

08 Aug 2020, 23:03

Thanks very much, Smile_!
I tested file.exe with the -i parameter, and it reports the correct charset for the text files.
Here is the result:

testfiles\[ANSI]abc.txt;   text/plain; charset=iso-8859-1
testfiles\[ANSI]Class1.cs; text/x-c++; charset=iso-8859-1
testfiles\[uft8]abc.txt;   text/plain; charset=utf-8
testfiles\ANSI.txt;        text/plain; charset=iso-8859-1
testfiles\Class1.cs;       text/x-c++; charset=iso-8859-1
testfiles\UTF8 - 复件.txt; application/xml; charset=iso-8859-1
testfiles\UTF8.txt;        text/plain; charset=utf-8
testfiles\UTF8.xml;        application/xml; charset=utf-8
Now I can use the command-line tool to determine a file's encoding and then perform other tasks according to the detected encoding. :D
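
To act on that output in a program, the charset value has to be pulled out of each line. The C sketch below assumes each line has the shape shown above, with a single "charset=" field at the end; the name extract_charset is made up for this illustration, and a file name that itself contains "charset=" would confuse it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the value that follows "charset=" in one
   line of output into out. Returns 1 on success, or 0 if the field is
   missing, empty, or does not fit into out. */
int extract_charset(const char *line, char *out, size_t out_size)
{
    assert(line != NULL && out != NULL);
    static const char key[] = "charset=";
    const char *p = strstr(line, key);
    if (p == NULL || out_size == 0)
        return 0;
    p += sizeof key - 1;                     /* skip past "charset=" */
    size_t n = strcspn(p, " \t\r\n;");       /* value ends at whitespace or ';' */
    if (n == 0 || n >= out_size)
        return 0;
    memcpy(out, p, n);
    out[n] = '\0';
    return 1;
}
```

The caller can then compare the extracted value with strcmp, for example against "utf-8" or "iso-8859-1", to choose how to read the file.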
Smile_

09 Aug 2020, 00:11

Good luck, glad to hear it helped!
