[Idea] Hashing as built-in function

Discuss the future of the AutoHotkey language

Post by jNizM » 04 Jan 2023, 10:04

I've been experimenting a bit with hashing, with a view to eventually including it (if desired) as a built-in feature in AHK.

Currently it only works with 64-bit AHK.
As a non-programmer, I find it a little harder to get into C++. I am always open to suggestions for improvement or pull requests.


What it might look like:

Code: Select all

Hash := HashString(String [, Algorithm, HMAC])
Hash := HashFile(Filename [, Algorithm])

Algorithm
	- MD2
	- MD4
	- MD5
	- SHA1 (default)
	- SHA256
	- SHA384
	- SHA512

What the function call in AHK could look like:

Code: Select all

MsgBox HashString("The quick brown fox jumps over the lazy dog", "SHA256")   ; -> D7A8FBB307D7809469CA9ABCB0082E4F8D5651E46D3CDB762D02D0BF37C9E592
MsgBox HashString("The quick brown fox jumps over the lazy dog", "MD5", "Secret")   ; -> 5B9F81435A19FDFB0B23F68ED45979E4
MsgBox HashString("𤽜")   ; -> 4701F6DEA7CFB096998E720DF86137837B63AD0F
MsgBox HashFile("test.txt", "SHA1")   ; -> FC62C3F7F7838F4818BD45ED32ADD47EA20E32F6
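
For anyone who wants to cross-check the example values outside AHK, here is a minimal Python sketch. It is not the implementation from the branch: `hash_string` is a hypothetical stand-in, and it assumes that the string is hashed as UTF-8 bytes and that the third argument is an HMAC key, also encoded as UTF-8. Depending on the OpenSSL build, Python's hashlib may not offer MD2 or MD4.

```python
import hashlib
import hmac


def hash_string(text, algorithm="SHA1", hmac_key=None):
    """Hypothetical stand-in for the proposed HashString (cross-checking only).

    Assumptions: the text is hashed as UTF-8 bytes, and hmac_key, when given,
    is a UTF-8 encoded HMAC key.
    """
    data = text.encode("utf-8")
    name = algorithm.lower()  # hashlib expects lowercase names such as "sha256"
    if hmac_key is not None:
        digest = hmac.new(hmac_key.encode("utf-8"), data, name).hexdigest()
    else:
        digest = hashlib.new(name, data).hexdigest()
    return digest.upper()  # the examples in the post use uppercase hex


fox = "The quick brown fox jumps over the lazy dog"
print(hash_string(fox, "SHA256"))         # plain SHA-256 of the UTF-8 bytes
print(hash_string(fox, "MD5", "Secret"))  # HMAC-MD5 under the assumptions above
```

The first printed line should equal the SHA256 value quoted in the example above. If the other values differ, the most likely cause is a different byte encoding of the string or the key.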

Branch: https://github.com/jNizM/AutoHotkey/tree/hashing
Commits: https://github.com/jNizM/AutoHotkey/commits/hashing
C++ File: https://github.com/jNizM/AutoHotkey/blob/hashing/source/lib/crypt.cpp
[AHK] v2.0.5 | [WIN] 11 Pro (Version 22H2) | [GitHub] Profile

Post by guest3456 » 04 Jan 2023, 11:02

For unit tests of file hashing, you can cross-check against:

cmd:

Code: Select all

certutil -hashfile FILE_NAME SHA256

PowerShell:

Code: Select all

Get-FileHash -a sha256 FILE_NAME

I would prefer SHA256 to be the default.
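
A third, scriptable cross-check could look like the following Python sketch. It reads the file in fixed-size chunks, so large files never have to fit in memory. The function name `hash_file` and the 64 KiB chunk size are illustrative choices, not something taken from the branch.

```python
import hashlib


def hash_file(path, algorithm="sha256", chunk_size=64 * 1024):
    """Hash a file chunk by chunk and return the digest as uppercase hex."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty read means end of file
                break
            hasher.update(chunk)
    return hasher.hexdigest().upper()


# Example usage (the file name is a placeholder):
# print(hash_file("test.txt", "sha1"))
```

When comparing against other tools, compare the hex strings case-insensitively, since tools differ in the letter case they print.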


Post by lexikos » 04 Jan 2023, 18:59

The UX scripts include a HashFile function with similar usage but different algorithms assigned to the numbers, based on a v1 script by Deo. If this is to be built-in, I would want the second parameter to accept algorithm names. Arbitrarily assigned numbers are too cryptic.

Is there some reason that you use the heap functions rather than malloc/free?

The BYTE arrays allocated with new BYTE[] are never deleted.

Copying the input/secret with pbInput[i] = (BYTE)aInput[i]; will truncate all character ordinal values over 255. Hashing works over binary data. If you are hashing a string, you need to decide whether to hash the exact bits of the original string (which is always UTF-16 in v2) or transcode it to something else first, like UTF-8. Alternatively, replace "HashString" with "HashBuffer" or similar and let the caller transcode the string if they wish.
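
To make the difference concrete, here is a small Python sketch that hashes the same text three ways. `truncated_bytes` is a hypothetical helper that imitates the `(BYTE)aInput[i]` cast by keeping only the low byte of each UTF-16 code unit; the other two variants hash the exact UTF-16LE bytes and the UTF-8 transcoding. SHA-1 is used only because it is the default in the proposal.

```python
import hashlib


def truncated_bytes(text):
    """Imitate casting each UTF-16 code unit to a byte (keeps the low 8 bits)."""
    utf16 = text.encode("utf-16-le")
    return utf16[0::2]  # low byte of each little-endian 16-bit code unit


def sha1_hex(data):
    return hashlib.sha1(data).hexdigest().upper()


for text in ("A", "\u0141", "\U00024F5C"):  # "A", "Ł" (U+0141), "𤽜" (U+24F5C)
    variants = {
        "truncated": truncated_bytes(text),
        "UTF-16LE": text.encode("utf-16-le"),
        "UTF-8": text.encode("utf-8"),
    }
    print(ascii(text))
    for label, data in variants.items():
        print(f"  {label:9} bytes={data.hex()} sha1={sha1_hex(data)}")
```

With truncation, "Ł" collapses to the same single byte as "A", so both get the same hash, and "𤽜" (a surrogate pair in UTF-16) collapses to two unrelated ASCII bytes. The UTF-16LE and UTF-8 variants keep all the information but produce different hashes from each other, which is why the choice has to be made deliberately.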

Don't use std::ifstream; this is likely to increase code size measurably, as it is not used anywhere else. It appears you're only using it to read a fixed number of bytes into a buffer repeatedly, which could be done with some simple calls to CreateFile, ReadFile and CloseHandle.

For HashFile, adding LONG_OPERATION_UPDATE into the loop would allow the script to respond to hotkeys, tray icon, etc. if the function takes a long time. It needs LONG_OPERATION_INIT outside the loop. You can search the project for these macros to see examples.

Post by jNizM » 05 Jan 2023, 01:58

> The UX scripts include a HashFile function with similar usage but different algorithms assigned to the numbers, based on a v1 script by Deo.
MD4 is missing (CALG_MD4 = 0x00008002).

> I would want the second parameter to accept algorithm names.
I'll try.

> Alternatively, replace "HashString" with "HashBuffer" or similar and let the caller transcode the string if they wish.
What about HashString(), HashFile(), and HashBuffer() or HashBinary()?

> Is there some reason that you use the heap functions rather than malloc/free? [... and below]
Having only recently started trying to use and understand C++, I went with what searches and other people's code (e.g. GitHub repos) showed me.
But I will try to implement your suggestions with the means at my disposal.

The official example from MSDN is also of limited help.
(https://learn.microsoft.com/en-us/windows/win32/seccng/creating-a-hash-with-cng)

Post by jNizM » 05 Jan 2023, 05:10

Changelog so far (I'm still learning):
- Changed heap functions to malloc
- Changed std::ifstream to CreateFile / ReadFile
- Added LONG_OPERATION_INIT and LONG_OPERATION_UPDATE
- Added UTF-16 to UTF-8 transcoding
- Added delete[] for the BYTE arrays
- Changed algorithm IDs to names

Todo:
- Hash from buffer (HashBuffer)


UTF-16 characters such as 𤽜 now also produce correct hash values.
Top post adjusted.

Post by iseahound » 06 Jan 2023, 18:30

I'm not sure about this, as generally one expects:
  • Fast hash functions - for designing bloom filters, hash maps, etc.
  • Cryptographic hash functions - for secure signing
The lack of SHA-3 is also troublesome, and BLAKE2 and BLAKE3 are preferred for speed. I think some of the older algorithms can satisfy both requirements above without excelling at either. But they would be fine for simple automated tasks like checksum verification.

Post by jNizM » 07 Jan 2023, 11:06

As soon as Microsoft implements more hash algorithms, it will be easy to customize and add them. But currently only the following are supported by MS directly:
https://learn.microsoft.com/en-us/windows/win32/seccng/cng-algorithm-identifiers