Search optimisation Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
pneumatic
Posts: 338
Joined: 05 Dec 2016, 01:51

Search optimisation

Post by pneumatic » 04 May 2021, 20:58

Hello

I am trying to search a 7MB binary file for a few byte patterns, similar to using a search function in a typical hex editor program.

I am able to implement this in AHK with a naive search algorithm ('linear search') in conjunction with FileRead *C and NumGet UChar.

Being linear search it is obviously quite slow though.

For example, a typical hex editor can find all occurrences of a 24 byte pattern in a 7MB binary in just a fraction of a second -- they are obviously not using linear search and iterating through each byte like my AHK implementation.

Therefore, I am interested in implementing a smarter search algorithm in AHK.

Questions:

1. Which algorithm should I try to implement in AHK? Should I sort the data first and then use something like binary search / divide and conquer / hash tables? Wouldn't it take a very long time to sort the binary data? Hex editors can open and search files immediately, so I'm not sure they are sorting the binary data first.

2. Is there already a third party AHK implementation available?

3. Are there Windows API functions that implement various search algorithms on unsorted data?

Thanks

User avatar
boiler
Posts: 16772
Joined: 21 Dec 2014, 02:44

Re: Search optimisation

Post by boiler » 04 May 2021, 21:20

It would be a relatively straightforward C++ function, which would be compiled into machine code and called by the AHK script. Even a linear algorithm would be much faster. It has been done often with AHK for graphics manipulations and searches, such as this one. You would just need to pass it the addresses and lengths of the needle and haystack. The online MCode compiler that I used seems to be down, but there are other compilers available as discussed here.

pneumatic
Posts: 338
Joined: 05 Dec 2016, 01:51

Re: Search optimisation

Post by pneumatic » 04 May 2021, 22:08

I'm still not confident that linear search implemented with mcode would be similarly fast to hex editors such as HxD, simply because of how CPU hungry linear search is.

Perhaps there is some other optimisation technique using bitwise operations?

One idea is that since a byte can only have 256 different values, create 256 buckets with each bucket containing offsets of all occurrences of that byte. Then when you want to search for a particular byte, you only need to search that one bucket. The bucket creation could be done at file readin using different memory ranges corresponding to buckets so it probably needn't take any extra processing time.

edit: it seems this bucket sorting might also facilitate binary search? Because if we're reading the file in sequentially, the offsets in the buckets will also be sorted sequentially. So we'd end up with sequentially sorted buckets with each bucket containing sequentially sorted offsets.

But then the file would have to be read in 1 byte at a time, which I suspect is going to bog it down (in AHK it does ).

So maybe I need an mcode implementation to efficiently read the file into buckets.

TAC109
Posts: 1099
Joined: 02 Oct 2013, 19:41
Location: New Zealand

Re: Search optimisation  Topic is solved

Post by TAC109 » 04 May 2021, 23:56

@pneumatic
I believe you’re over-thinking it. If you extract the InBuf function from BinMod this should satisfy your requirements. It does an in-memory search (you should load the file into a variable first), and is very fast as it works in machine code. It was originally written by w0xx0m and published some time ago in the old forum (see the link in the function if interested). The parameters are Haystack address, Haystack size, Needle address, Needle size, and Offset in haystack to start the search. It returns the offset in haystack where the needle was found, or a negative number if the needle was not found.

Cheers

Edit: Machine code is 32 bit only.
Last edited by TAC109 on 05 May 2021, 16:41, edited 1 time in total.
My scripts:-
XRef - Produces Cross Reference lists for scripts
ReClip - A Text Reformatting and Clip Management utility
ScriptGuard - Protects Compiled Scripts from Decompilation
I also maintain Ahk2Exe

User avatar
boiler
Posts: 16772
Joined: 21 Dec 2014, 02:44

Re: Search optimisation

Post by boiler » 05 May 2021, 05:08

@TAC109 - That’s interesting and useful. Thanks. I went to w0xx0m’s thread to see his C source and found it was written in assembly language. Impressive, and surely about as fast as possible. I see he has a function for searching binary file contents directly as well, which also could come in handy.

TAC109
Posts: 1099
Joined: 02 Oct 2013, 19:41
Location: New Zealand

Re: Search optimisation

Post by TAC109 » 05 May 2021, 13:20

I should have mentioned that this machine code only works with a 32 bit version of AutoHotkey.
My scripts:-
XRef - Produces Cross Reference lists for scripts
ReClip - A Text Reformatting and Clip Management utility
ScriptGuard - Protects Compiled Scripts from Decompilation
I also maintain Ahk2Exe

teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41
Contact:

Re: Search optimisation

Post by teadrinker » 05 May 2021, 17:09

Tried to implement without machine code:

Code: Select all

SetBatchLines, -1

filePath := "D:\Downloads\YouTube\Rammstein - Radio (Official Video).mkv"
bytes := "71 90 ef 8e"

size := ReadFile(filePath, buff)
len := CryptStringToBinary(bytes, needle, "CRYPT_STRING_HEX")
start := A_TickCount
offsets := SearchInBuff(&buff, size, &needle, len)
MsgBox,, % A_TickCount - start . "ms", % offsets

SearchInBuff(pHaystack, hstkSize, pNeedle, needleSize) {
   arr := GetMax(pNeedle, needleSize)
   maxByte := arr.max, offset := arr.maxIdx
   prevByte := arr.prevMax, prevByteShift := arr.prevMaxIdx - offset
   bool := needleSize > 1
   found := "", addr := pHaystack + offset - 1, maxAddr := addr + hstkSize - needleSize + 1
   while addr := DllCall("msvcrt\memchr", "Ptr", addr + 1, "Int", maxByte, "Ptr", maxAddr - addr, "Cdecl Ptr") {
      ( (bool ? NumGet(addr + prevByteShift, "UChar") = prevByte : true)
        && DllCall("msvcrt\memcmp", "Ptr", addr - offset, "Ptr", pNeedle, "Ptr", needleSize, "Cdecl") = 0
        && found .= Format("{:#x}", addr - offset - pHaystack) . "`n" )
   }
   Return RTrim(found, "`n")
}

ReadFile(filePath, ByRef buff) {
   File := FileOpen(filePath, "r")
   File.Pos := 0
   File.RawRead(buff, size := File.Length)
   File.Close()
   Return size
}

CryptStringToBinary(string, ByRef outData, formatName := "CRYPT_STRING_BASE64")
{
   static formats := { CRYPT_STRING_BASE64: 0x1
                     , CRYPT_STRING_HEX:    0x4
                     , CRYPT_STRING_HEXRAW: 0xC }
   fmt := formats[formatName]
   chars := StrLen(string)
   if !DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
                                            , "Ptr", 0, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
      throw "CryptStringToBinary failed. LastError: " . A_LastError
   VarSetCapacity(outData, bytes)
   DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
                                        , "Str", outData, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
   Return bytes
}

GetMax(pData, len) {
   VarSetCapacity(sortedData, len, 0)
   DllCall("RtlMoveMemory", "Ptr", &sortedData, "Ptr", pData, "Ptr", len)
   pCmp := RegisterCallback("cmp", "C F", 2)
   DllCall("msvcrt\qsort", "Ptr", &sortedData, "Ptr", len, "Ptr", 1, "Ptr", pCmp, "Cdecl")
   max := NumGet(sortedData, "UChar"), (len > 1 && prevMax := NumGet(sortedData, 1, "UChar"))
   Loop % len {
      i := A_Index - 1
      res := NumGet(pData + i, "UChar")
     (res = max && maxIdx := i)
     (res = prevMax && prevMaxIdx := i)
   }
   Return {max: max, maxIdx: maxIdx, prevMax: prevMax, prevMaxIdx: prevMaxIdx}
}

cmp(a, b) {
   a := NumGet(a + 0, "UChar"), b := NumGet(b + 0, "UChar")
   Return a < b ? 1 : a > b ? -1 : 0
}
For me it works quite fast.
Last edited by teadrinker on 06 May 2021, 13:06, edited 3 times in total.

pneumatic
Posts: 338
Joined: 05 Dec 2016, 01:51

Re: Search optimisation

Post by pneumatic » 05 May 2021, 19:08

Incredible solutions, thank you very much.

It's a close call, but I would have to favour teadrinker's solution as it supports 64-bit and is a bit more orthodox.

Apart from that the mcode solution is very cool; AHK is amazing!

teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41
Contact:

Re: Search optimisation

Post by teadrinker » 05 May 2021, 19:28

I edited my post, simplified the code a bit. :)

pneumatic
Posts: 338
Joined: 05 Dec 2016, 01:51

Re: Search optimisation

Post by pneumatic » 05 May 2021, 19:37

teadrinker wrote:
05 May 2021, 19:28
I edited my post, simplified the code a bit. :)
It's significantly faster, but worse for noobs like me who want to do other stuff inside the while loop like update a progress bar or something.

User avatar
rommmcek
Posts: 1470
Joined: 15 Aug 2014, 15:18

Re: Search optimisation

Post by rommmcek » 05 May 2021, 22:51

@teadrinker: To retrieve offset of the last four bytes I need to increase size by one i.e. size:= ReadFile(filePath, buff)+1... Why?

teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41
Contact:

Re: Search optimisation

Post by teadrinker » 05 May 2021, 23:44

Search area was incorrectly calculated, must be one byte more. Should work now, fixed.

User avatar
rommmcek
Posts: 1470
Joined: 15 Aug 2014, 15:18

Re: Search optimisation

Post by rommmcek » 06 May 2021, 00:25

Works fine, thanks! Approx. 4 times faster as my while loop.

User avatar
SKAN
Posts: 1551
Joined: 29 Sep 2013, 16:58

Re: Search optimisation

Post by SKAN » 06 May 2021, 00:46

@teadrinker

Code: Select all

   while addr := DllCall("msvcrt\memchr", "Ptr", addr + 1, "Int", maxByte, "Ptr", maxAddr - addr) {
      ( DllCall("msvcrt\memcmp", "Ptr", addr - offset, "Ptr", pNeedle, "Ptr", needleSize) = 0
        && found .= Format("{:#x}", addr - offset - pHaystack) . "`n" )
Nice!
A suggestion: When memchr() succeeds, check whether the last byte of needle is a match (with NumGet(.."UChar")) and then call memcmp()

pneumatic
Posts: 338
Joined: 05 Dec 2016, 01:51

Re: Search optimisation

Post by pneumatic » 06 May 2021, 01:52

[deleted]

Reason for deletion:

I couldn't understand teadrinker's code and was asking for help, then realised I miscalculated maxByte which is actually the index of the largest byte in the needle, then it all made sense.

I presume also that searching for maxByte speeds up the search if it corresponds to the leftmost bit which I presume is the first bit to be compared at the machine level, causing a short circuit boolean evaluation and less comparisons?

teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41
Contact:

Re: Search optimisation

Post by teadrinker » 06 May 2021, 05:51

pneumatic wrote: I presume also that searching for maxByte speeds up the search if it corresponds to the leftmost bit which I presume is the first bit to be compared at the machine level, causing a short circuit boolean evaluation and less comparisons?
Yeah, I assumed that the byte with max value is generally encountered less frequently. At least it can be non-zero, zero generally encountered very often.
SKAN wrote: A suggestion: When memchr() succeeds, check whether the last byte of needle is a match
Thanks! I took previous byte by value:

Code: Select all

SetBatchLines, -1

filePath := "C:\Users\mi\AppData\Roaming\XnViewMP\Thumb.db"
bytes := "74 72 61"

size := ReadFile(filePath, buff)
len := CryptStringToBinary(bytes, needle, "CRYPT_STRING_HEX")
start := A_TickCount
offsets := SearchInBuff(&buff, size, &needle, len)
MsgBox,, % A_TickCount - start . "ms", % offsets

SearchInBuff(pHaystack, hstkSize, pNeedle, needleSize) {
   arr := GetMax(pNeedle, needleSize)
   maxByte := arr.max, offset := arr.maxIdx
   prevByte := arr.prevMax, prevByteShift := arr.prevMaxIdx - offset
   bool := needleSize > 1
   found := "", addr := pHaystack + offset - 1, maxAddr := addr + hstkSize - needleSize + 1
   while addr := DllCall("msvcrt\memchr", "Ptr", addr + 1, "Int", maxByte, "Ptr", maxAddr - addr) {
      ( (bool ? NumGet(addr + prevByteShift, "UChar") = prevByte : true)
        && DllCall("msvcrt\memcmp", "Ptr", addr - offset, "Ptr", pNeedle, "Ptr", needleSize) = 0
        && found .= Format("{:#x}", addr - offset - pHaystack) . "`n" )
   }
   Return RTrim(found, "`n")
}

ReadFile(filePath, ByRef buff) {
   File := FileOpen(filePath, "r")
   File.Pos := 0
   File.RawRead(buff, size := File.Length)
   File.Close()
   Return size
}

CryptStringToBinary(string, ByRef outData, formatName := "CRYPT_STRING_BASE64")
{
   static formats := { CRYPT_STRING_BASE64: 0x1
                     , CRYPT_STRING_HEX:    0x4
                     , CRYPT_STRING_HEXRAW: 0xC }
   fmt := formats[formatName]
   chars := StrLen(string)
   if !DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
                                            , "Ptr", 0, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
      throw "CryptStringToBinary failed. LastError: " . A_LastError
   VarSetCapacity(outData, bytes)
   DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
                                        , "Str", outData, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
   Return bytes
}

GetMax(pData, len) {
   VarSetCapacity(sortedData, len, 0)
   DllCall("RtlMoveMemory", "Ptr", &sortedData, "Ptr", pData, "Ptr", len)
   pCmp := RegisterCallback("cmp", "C F", 2)
   DllCall("msvcrt\qsort", "Ptr", &sortedData, "Ptr", len, "Ptr", 1, "Ptr", pCmp)
   max := NumGet(sortedData, "UChar"), (len > 1 && prevMax := NumGet(sortedData, 1, "UChar"))
   Loop % len {
      i := A_Index - 1
      res := NumGet(pData + i, "UChar")
      (res = max && maxIdx := i), (res = prevMax && prevMaxIdx := i)
   }
   Return {max: max, maxIdx: maxIdx, prevMax: prevMax, prevMaxIdx: prevMaxIdx}
}

cmp(a, b) {
   a := NumGet(a + 0, "UChar"), b := NumGet(b + 0, "UChar")
   Return a < b ? 1 : a > b ? -1 : 0
}
This approach reduces the search time by about 1/5.

pneumatic
Posts: 338
Joined: 05 Dec 2016, 01:51

Re: Search optimisation

Post by pneumatic » 06 May 2021, 07:32

Well as it turns out, I actually need more speed that only the mcode solution can provide.

The mcode seems to be around 10x faster -- it's fricken fast!

The problem is that I also need to search the binary file for a pattern that matches some logical rules to do with their sum and what bytes are around them, and it's causing a CPU bottleneck to perform those logical checks. I am trying to workaround this by using a lookup table containing all possible byte strings that would return true for those logical checks, and searching the file for each and every one of those strings. Because mcode is so fast, this ends up being quite a lot faster.

edit: actually the speed of mcode seems to vary depending on the string being searched for. Maybe on average it's not as hugely faster as I thought? Wouldn't those C funcs in teadrinker's solution also be low level ASM code?

pneumatic
Posts: 338
Joined: 05 Dec 2016, 01:51

Re: Search optimisation

Post by pneumatic » 06 May 2021, 07:58

I've noticed on AutohotkeyU32, teadrinker's solution throws a runtime error on the msvcrt DllCalls, but ErrorLevel is 0 in the catch block, and the function still seems to work fine.

So if you are putting the DllCalls in a try block then it will fail.

AutohotkeyU64 doesn't have this issue.

Also AutohotkeyU64 with the mcode solution seems to go into an infinite loop with CPU pegged at 100% usage -- might be worth putting some protection code in there.

teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41
Contact:

Re: Search optimisation

Post by teadrinker » 06 May 2021, 08:30

pneumatic wrote: teadrinker's solution throws a runtime error on the msvcrt DllCalls
Perhaps, "Cdecl" must be added to all msvcrt functions:

Code: Select all

SetBatchLines, -1

filePath := "C:\Users\mi\AppData\Roaming\XnViewMP\Thumb.db"
bytes := "74 72 61"

size := ReadFile(filePath, buff)
len := CryptStringToBinary(bytes, needle, "CRYPT_STRING_HEX")
start := A_TickCount
offsets := SearchInBuff(&buff, size, &needle, len)
MsgBox,, % A_TickCount - start . "ms", % offsets

SearchInBuff(pHaystack, hstkSize, pNeedle, needleSize) {
   arr := GetMax(pNeedle, needleSize)
   maxByte := arr.max, offset := arr.maxIdx
   prevByte := arr.prevMax, prevByteShift := arr.prevMaxIdx - offset
   bool := needleSize > 1
   found := "", addr := pHaystack + offset - 1, maxAddr := addr + hstkSize - needleSize + 1
   while addr := DllCall("msvcrt\memchr", "Ptr", addr + 1, "Int", maxByte, "Ptr", maxAddr - addr, "Cdecl Ptr") {
      ( (bool ? NumGet(addr + prevByteShift, "UChar") = prevByte : true)
        && DllCall("msvcrt\memcmp", "Ptr", addr - offset, "Ptr", pNeedle, "Ptr", needleSize, "Cdecl") = 0
        && found .= Format("{:#x}", addr - offset - pHaystack) . "`n" )
   }
   Return RTrim(found, "`n")
}

ReadFile(filePath, ByRef buff) {
   File := FileOpen(filePath, "r")
   File.Pos := 0
   File.RawRead(buff, size := File.Length)
   File.Close()
   Return size
}

CryptStringToBinary(string, ByRef outData, formatName := "CRYPT_STRING_BASE64")
{
   static formats := { CRYPT_STRING_BASE64: 0x1
                     , CRYPT_STRING_HEX:    0x4
                     , CRYPT_STRING_HEXRAW: 0xC }
   fmt := formats[formatName]
   chars := StrLen(string)
   if !DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
                                            , "Ptr", 0, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
      throw "CryptStringToBinary failed. LastError: " . A_LastError
   VarSetCapacity(outData, bytes)
   DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
                                        , "Str", outData, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
   Return bytes
}

GetMax(pData, len) {
   VarSetCapacity(sortedData, len, 0)
   DllCall("RtlMoveMemory", "Ptr", &sortedData, "Ptr", pData, "Ptr", len)
   pCmp := RegisterCallback("cmp", "C F", 2)
   DllCall("msvcrt\qsort", "Ptr", &sortedData, "Ptr", len, "Ptr", 1, "Ptr", pCmp, "Cdecl")
   max := NumGet(sortedData, "UChar"), (len > 1 && prevMax := NumGet(sortedData, 1, "UChar"))
   Loop % len {
      i := A_Index - 1
      res := NumGet(pData + i, "UChar")
      (res = max && maxIdx := i), (res = prevMax && prevMaxIdx := i)
   }
   Return {max: max, maxIdx: maxIdx, prevMax: prevMax, prevMaxIdx: prevMaxIdx}
}

cmp(a, b) {
   a := NumGet(a + 0, "UChar"), b := NumGet(b + 0, "UChar")
   Return a < b ? 1 : a > b ? -1 : 0
}
Although, I did not have such a problem.
pneumatic wrote: Wouldn't those C funcs in teadrinker's solution also be low level ASM code?
Nope.
Last edited by teadrinker on 06 May 2021, 08:55, edited 3 times in total.

User avatar
jNizM
Posts: 3183
Joined: 30 Sep 2013, 01:33
Contact:

Re: Search optimisation

Post by jNizM » 06 May 2021, 08:36

teadrinker wrote:Perhaps, "Cdecl" must be added to all msvcrt functions:
Maybe at least a "ptr" after memchr

ref: https://autohotkey.com/board/topic/82986-ahk-l-decompiler-payload-method/page-2#entry529503
[AHK] v2.0.5 | [WIN] 11 Pro (Version 22H2) | [GitHub] Profile

Post Reply

Return to “Ask for Help (v1)”