Can someone help me call a C++ DLL (detect encoding/code page for search)? (Topic is solved)

Post by viv » 23 Nov 2022, 02:45

This library detects the encoding of a file:
https://github.com/BYVoid/uchardet

I don't know C++, but it probably reads part of the file and checks its encoding.

What I need is the functionality in the section below:
pass in a file path and get back the file's encoding.

https://github.com/BYVoid/uchardet/blob/4e685757780cb3c652fc6c9ec759f62888969ec9/src/tools/uchardet.cpp#L52

void detect(FILE * fp)
{
    uchardet_t handle = uchardet_new();

    while (!feof(fp))
    {
        size_t len = fread(buffer, 1, BUFFER_SIZE, fp);
        int retval = uchardet_handle_data(handle, buffer, len);
        if (retval != 0)
        {
            fprintf(stderr, "Handle data error.\n");
            exit(1);
        }
    }
    uchardet_data_end(handle);

    const char * charset = uchardet_get_charset(handle);
    if (*charset)
        printf("%s\n", charset);
    else
        printf("unknown\n");

    uchardet_delete(handle);
}

The public functions are all in this part:
https://github.com/BYVoid/uchardet/blob/4e685757780cb3c652fc6c9ec759f62888969ec9/src/uchardet.cpp#L82
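
A side note on the read loop quoted above: `while (!feof(fp))` only stops after a read has already hit end of file, so it can hand the detector one extra zero-length chunk, and it can spin forever if a read error occurs. Below is a minimal sketch of the same loop driven by the return value of fread instead. The names feed_stream, chunk_handler, and count_bytes are hypothetical and not part of uchardet; with the real library the handler would simply forward each chunk to uchardet_handle_data.

```c
#include <stdio.h>
#include <stddef.h>

#define CHUNK_SIZE 65536

/* Hypothetical callback type: receives one chunk and returns non-zero on
   failure (the same convention uchardet_handle_data uses). */
typedef int (*chunk_handler)(void *ctx, const char *data, size_t len);

/* Reads fp to the end in CHUNK_SIZE pieces and passes each piece to handle.
   Returns 0 on success, -1 on a read error, or the handler's non-zero code. */
int feed_stream(FILE *fp, chunk_handler handle, void *ctx)
{
    static char buffer[CHUNK_SIZE]; /* shared buffer, so not reentrant */
    size_t len;

    /* fread returns 0 both at end of file and on error, so the loop ends
       in either case and never passes an empty chunk to the handler. */
    while ((len = fread(buffer, 1, sizeof buffer, fp)) > 0)
    {
        int retval = handle(ctx, buffer, len);
        if (retval != 0)
            return retval;
    }
    return ferror(fp) ? -1 : 0;
}

/* Stand-in handler for demonstration only: adds up the bytes it is given. */
int count_bytes(void *ctx, const char *data, size_t len)
{
    (void)data;
    *(size_t *)ctx += len;
    return 0;
}
```

Calling feed_stream(fp, count_bytes, &total) with a size_t total initialised to 0 leaves the file size in total; swapping in a handler that wraps the detector gives the same flow as the original detect().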

Attached are the compiled exe, the DLL, and my unsuccessful AHK script.
Running "uchardet.exe txt.txt" directly in CMD returns GB18030.
uchardet.zip (192.91 KiB)

Post by swagfag » 23 Nov 2022, 05:42

class UCharDet
{
	#DllLoad 'libuchardet.dll'
	
	/**
	 * Create an encoding detector.
	 * @return an instance of uchardet_t.
	 *
	 * UCHARDET_INTERFACE uchardet_t uchardet_new(void);
	 */
	Ptr := DllCall('libuchardet\uchardet_new', 'CDecl Ptr')

	/**
	 * Delete an encoding detector.
	 * @param ud [in] the uchardet_t handle to delete.
	 *
	 * UCHARDET_INTERFACE void uchardet_delete(uchardet_t ud);
	 */
	__Delete() => this.Ptr && DllCall('libuchardet\uchardet_delete', 'Ptr', this, 'CDecl')

	/**
	 * Feed data to an encoding detector.
	 * The detector is able to shortcut processing when it reaches certainty
	 * for an encoding, so you should not worry about limiting input data.
	 * As far as you should be concerned: the more the better.
	 *
	 * @param ud [in] handle of an instance of uchardet
	 * @param data [in] data
	 * @param len [in] number of byte of data
	 * @return non-zero number on failure.
	 *
	 * UCHARDET_INTERFACE int uchardet_handle_data(uchardet_t ud, const char * data, size_t len);
	 */
	HandleData(pBytes, cBytes) => DllCall('libuchardet\uchardet_handle_data', 'Ptr', this, 'Ptr', pBytes, 'Ptr', cBytes, 'CDecl Int')

	/**
	 * Notify an end of data to an encoding detector.
	 * @param ud [in] handle of an instance of uchardet
	 * 
	 * UCHARDET_INTERFACE void uchardet_data_end(uchardet_t ud);
	 */
	DataEnd() => DllCall('libuchardet\uchardet_data_end', 'Ptr', this, 'CDecl')

	/**
	 * Reset an encoding detector.
	 * @param ud [in] handle of an instance of uchardet
	 * 
	 * UCHARDET_INTERFACE void uchardet_reset(uchardet_t ud);
	 */
	Reset() => DllCall('libuchardet\uchardet_reset', 'Ptr', this, 'CDecl')

	/**
	 * Get an iconv-compatible name of the encoding that was detected.
	 * @param ud [in] handle of an instance of uchardet
	 * @return name of charset on success and "" on failure.
	 * 
	 * UCHARDET_INTERFACE const char * uchardet_get_charset(uchardet_t ud);
	 */
	GetCharset() => DllCall('libuchardet\uchardet_get_charset', 'Ptr', this, 'CDecl AStr')

	DetectBuffer(Buffer) {
		this.Reset()

		if res := this.HandleData(Buffer.Ptr, Buffer.Size)
			throw Error("Internal 'libuchardet' error, uchardet_handle_data() returned " res)

		this.DataEnd()
		charset := this.GetCharset()
		this.Reset()

		return charset
	}
}

UCD := UCharDet()
MsgBox UCD.DetectBuffer(FileRead('txt.txt', 'RAW')) ; GB18030
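
The detector reports an iconv-style name such as GB18030, whereas Windows APIs and AutoHotkey's "CPnnn" encoding strings expect a numeric code page. Below is a minimal sketch of a lookup from one to the other, written in C to match the library. charset_to_codepage is a hypothetical helper, the table is deliberately small, and the exact spellings uchardet emits are an assumption, so names are compared case-insensitively.

```c
#include <stddef.h>
#include <ctype.h>

struct charset_entry
{
    const char *name;   /* iconv-style charset name */
    unsigned codepage;  /* Windows code page identifier */
};

/* A few common pairs only; extend as needed. */
static const struct charset_entry CHARSET_TABLE[] = {
    { "UTF-8",        65001 },
    { "GB18030",      54936 },
    { "BIG5",           950 },
    { "SHIFT_JIS",      932 },
    { "EUC-KR",       51949 },
    { "WINDOWS-1251",  1251 },
    { "WINDOWS-1252",  1252 },
    { "KOI8-R",       20866 },
    { "ISO-8859-1",   28591 },
};

/* Returns 1 if a and b are equal ignoring ASCII case, otherwise 0. */
static int equal_ignore_case(const char *a, const char *b)
{
    while (*a && *b)
    {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b; /* both strings must end at the same position */
}

/* Returns the code page for name, or 0 if the name is NULL or not listed. */
unsigned charset_to_codepage(const char *name)
{
    size_t i;

    if (name == NULL)
        return 0;
    for (i = 0; i < sizeof CHARSET_TABLE / sizeof CHARSET_TABLE[0]; i++)
        if (equal_ignore_case(name, CHARSET_TABLE[i].name))
            return CHARSET_TABLE[i].codepage;
    return 0;
}
```

For the file in this thread, the name GB18030 maps to 54936, which on the AutoHotkey side would be written as the encoding string "CP54936". A return value of 0 means the name is not in the table and the caller has to fall back to something else.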

Post by viv » 23 Nov 2022, 06:35

@swagfag

Thank you very much, I wish you a happy life!

Post by Cubex » 26 Jan 2023, 03:23

Nice. Is there a version for AutoHotkey v1, please?

Post by swagfag » 27 Jan 2023, 04:38

#NoEnv
#Warn ClassOverwrite
#Requires AutoHotkey v1.1.36.02

class UCharDet
{
	__DllLoad() {
		static _ := UCharDet.__DllLoad()

		if hModule := DllCall("LoadLibrary", "Str", "libuchardet.dll", "Ptr")
			DllCall("GetModuleHandleEx", "UInt", 1, "Str", "libuchardet.dll", "Ptr*", 0) ; GET_MODULE_HANDLE_EX_FLAG_PIN
		else
			throw Exception("Failed to load 'libuchardet.dll'.", -1, A_LastError)
	}
	
	/**
	 * Create an encoding detector.
	 * @return an instance of uchardet_t.
	 *
	 * UCHARDET_INTERFACE uchardet_t uchardet_new(void);
	 */
	Ptr := DllCall("libuchardet\uchardet_new", "CDecl Ptr")

	/**
	 * Delete an encoding detector.
	 * @param ud [in] the uchardet_t handle to delete.
	 *
	 * UCHARDET_INTERFACE void uchardet_delete(uchardet_t ud);
	 */
	__Delete() {
		if this.Ptr
			DllCall("libuchardet\uchardet_delete", "Ptr", this.Ptr, "CDecl")
	}

	/**
	 * Feed data to an encoding detector.
	 * The detector is able to shortcut processing when it reaches certainty
	 * for an encoding, so you should not worry about limiting input data.
	 * As far as you should be concerned: the more the better.
	 *
	 * @param ud [in] handle of an instance of uchardet
	 * @param data [in] data
	 * @param len [in] number of byte of data
	 * @return non-zero number on failure.
	 *
	 * UCHARDET_INTERFACE int uchardet_handle_data(uchardet_t ud, const char * data, size_t len);
	 */
	HandleData(pBytes, cBytes) {
		return DllCall("libuchardet\uchardet_handle_data", "Ptr", this.Ptr, "Ptr", pBytes, "Ptr", cBytes, "CDecl Int")
	}

	/**
	 * Notify an end of data to an encoding detector.
	 * @param ud [in] handle of an instance of uchardet
	 * 
	 * UCHARDET_INTERFACE void uchardet_data_end(uchardet_t ud);
	 */
	DataEnd() {
		return DllCall("libuchardet\uchardet_data_end", "Ptr", this.Ptr, "CDecl")
	}

	/**
	 * Reset an encoding detector.
	 * @param ud [in] handle of an instance of uchardet
	 * 
	 * UCHARDET_INTERFACE void uchardet_reset(uchardet_t ud);
	 */
	Reset() {
		return DllCall("libuchardet\uchardet_reset", "Ptr", this.Ptr, "CDecl")
	}

	/**
	 * Get an iconv-compatible name of the encoding that was detected.
	 * @param ud [in] handle of an instance of uchardet
	 * @return name of charset on success and "" on failure.
	 * 
	 * UCHARDET_INTERFACE const char * uchardet_get_charset(uchardet_t ud);
	 */
	GetCharset() {
		return DllCall("libuchardet\uchardet_get_charset", "Ptr", this.Ptr, "CDecl AStr")
	}

	DetectBytes(pBytes, cBytes) {
		this.Reset()

		if res := this.HandleData(pBytes, cBytes)
			throw Exception("Internal 'libuchardet' error, uchardet_handle_data() returned " res)

		this.DataEnd()
		charset := this.GetCharset()
		this.Reset()

		return charset
	}
}

UCD := new UCharDet()
f := FileOpen("txt.txt", "r")
f.Pos := 0
cBytes := f.RawRead(Bytes, f.Length)
MsgBox % UCD.DetectBytes(&Bytes, cBytes) ; GB18030

Post by Cubex » 29 Jan 2023, 04:17

@swagfag
Thank you very much. The attached build is 64-bit, and I need a 32-bit version of uchardet. There is already a newer release, uchardet 0.0.8, but I cannot compile it into a 32-bit executable. A pity.
Thank you so much. :clap: :bravo:

Post by tuzi » 09 Dec 2023, 01:51

Based on swagfag's code, I created a library, ahk-chardet.

Post by vmech » 09 Dec 2023, 04:45

@viv, where did you get these binaries?

@tuzi, is there a version for v2?
Please post your script code inside [code] ... [/code] block. Thank you.
