Page 1 of 3

Search optimisation

Posted: 04 May 2021, 20:58
by pneumatic
Hello

I am trying to search a 7MB binary file for a few byte patterns, similar to using a search function in a typical hex editor program.

I am able to implement this in AHK with a naive search algorithm ('linear search') in conjunction with FileRead *C and NumGet UChar.

Being linear search it is obviously quite slow though.

For example, a typical hex editor can find all occurrences of a 24 byte pattern in a 7MB binary in just a fraction of a second -- they are obviously not using linear search and iterating through each byte like my AHK implementation.

Therefore, I am interested in implementing a smarter search algorithm in AHK.

Questions:

1. Which algorithm should I try to implement in AHK? Should I sort the data first and then use something like binary search / divide and conquer / hash tables? Wouldn't it take a very long time to sort the binary data? Hex editors can open and search files immediately, so I'm not sure they are sorting the binary data first.

2. Is there already a third party AHK implementation available?

3. Are there Windows API functions that implement various search algorithms on unsorted data?

Thanks

Re: Search optimisation

Posted: 04 May 2021, 21:20
by boiler
It would be a relatively straightforward C++ function, which would be compiled into machine code and called by the AHK script. Even a linear algorithm would be much faster. It has been done often with AHK for graphics manipulations and searches, such as this one. You would just need to pass it the addresses and lengths of the needle and haystack. The online MCode compiler that I used seems to be down, but there are other compilers available as discussed here.

Re: Search optimisation

Posted: 04 May 2021, 22:08
by pneumatic
I'm still not confident that linear search implemented with mcode would be similarly fast to hex editors such as HxD, simply because of how CPU hungry linear search is.

Perhaps there is some other optimisation technique using bitwise operations?

One idea is that since a byte can only have 256 different values, create 256 buckets with each bucket containing offsets of all occurrences of that byte. Then when you want to search for a particular byte, you only need to search that one bucket. The bucket creation could be done at file readin using different memory ranges corresponding to buckets so it probably needn't take any extra processing time.

edit: it seems this bucket sorting might also facilitate binary search? Because if we're reading the file in sequentially, the offsets in the buckets will also be sorted sequentially. So we'd end up with sequentially sorted buckets with each bucket containing sequentially sorted offsets.

But then the file would have to be read in 1 byte at a time, which I suspect is going to bog it down (in AHK it does ).

So maybe I need an mcode implementation to efficiently read the file into buckets.

Re: Search optimisation  Topic is solved

Posted: 04 May 2021, 23:56
by TAC109
@pneumatic
I believe you’re over-thinking it. If you extract the InBuf function from BinMod this should satisfy your requirements. It does an in-memory search (you should load the file into a variable first), and is very fast as it works in machine code. It was originally written by w0xx0m and published some time ago in the old forum (see the link in the function if interested). The parameters are Haystack address, Haystack size, Needle address, Needle size, and Offset in haystack to start the search. It returns the offset in haystack where the needle was found, or a negative number if the needle was not found.

Cheers

Edit: Machine code is 32 bit only.

Re: Search optimisation

Posted: 05 May 2021, 05:08
by boiler
@TAC109 - That’s interesting and useful. Thanks. I went to w0xx0m’s thread to see his C source and found it was written in assembly language. Impressive, and surely about as fast as possible. I see he has a function for searching binary file contents directly as well, which also could come in handy.

Re: Search optimisation

Posted: 05 May 2021, 13:20
by TAC109
I should have mentioned that this machine code only works with a 32 bit version of AutoHotkey.

Re: Search optimisation

Posted: 05 May 2021, 17:09
by teadrinker
Tried to implement without machine code:

Code: Select all

SetBatchLines, -1

filePath := "D:\Downloads\YouTube\Rammstein - Radio (Official Video).mkv"
bytes := "71 90 ef 8e"

size := ReadFile(filePath, buff)
len := CryptStringToBinary(bytes, needle, "CRYPT_STRING_HEX")
start := A_TickCount
offsets := SearchInBuff(&buff, size, &needle, len)
MsgBox,, % A_TickCount - start . "ms", % offsets

SearchInBuff(pHaystack, hstkSize, pNeedle, needleSize) {
   arr := GetMax(pNeedle, needleSize)
   maxByte := arr.max, offset := arr.maxIdx
   prevByte := arr.prevMax, prevByteShift := arr.prevMaxIdx - offset
   bool := needleSize > 1
   found := "", addr := pHaystack + offset - 1, maxAddr := addr + hstkSize - needleSize + 1
   while addr := DllCall("msvcrt\memchr", "Ptr", addr + 1, "Int", maxByte, "Ptr", maxAddr - addr, "Cdecl Ptr") {
      ( (bool ? NumGet(addr + prevByteShift, "UChar") = prevByte : true)
        && DllCall("msvcrt\memcmp", "Ptr", addr - offset, "Ptr", pNeedle, "Ptr", needleSize, "Cdecl") = 0
        && found .= Format("{:#x}", addr - offset - pHaystack) . "`n" )
   }
   Return RTrim(found, "`n")
}

ReadFile(filePath, ByRef buff) {
   File := FileOpen(filePath, "r")
   File.Pos := 0
   File.RawRead(buff, size := File.Length)
   File.Close()
   Return size
}

CryptStringToBinary(string, ByRef outData, formatName := "CRYPT_STRING_BASE64")
{
   static formats := { CRYPT_STRING_BASE64: 0x1
                     , CRYPT_STRING_HEX:    0x4
                     , CRYPT_STRING_HEXRAW: 0xC }
   fmt := formats[formatName]
   chars := StrLen(string)
   if !DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
                                            , "Ptr", 0, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
      throw "CryptStringToBinary failed. LastError: " . A_LastError
   VarSetCapacity(outData, bytes)
   DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
                                        , "Str", outData, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
   Return bytes
}

GetMax(pData, len) {
   VarSetCapacity(sortedData, len, 0)
   DllCall("RtlMoveMemory", "Ptr", &sortedData, "Ptr", pData, "Ptr", len)
   pCmp := RegisterCallback("cmp", "C F", 2)
   DllCall("msvcrt\qsort", "Ptr", &sortedData, "Ptr", len, "Ptr", 1, "Ptr", pCmp, "Cdecl")
   max := NumGet(sortedData, "UChar"), (len > 1 && prevMax := NumGet(sortedData, 1, "UChar"))
   Loop % len {
      i := A_Index - 1
      res := NumGet(pData + i, "UChar")
     (res = max && maxIdx := i)
     (res = prevMax && prevMaxIdx := i)
   }
   Return {max: max, maxIdx: maxIdx, prevMax: prevMax, prevMaxIdx: prevMaxIdx}
}

cmp(a, b) {
   a := NumGet(a + 0, "UChar"), b := NumGet(b + 0, "UChar")
   Return a < b ? 1 : a > b ? -1 : 0
}
For me it works quite fast.

Re: Search optimisation

Posted: 05 May 2021, 19:08
by pneumatic
Incredible solutions, thank you very much.

It's a close call, but I would have to favour teadrinker's solution as it supports 64-bit and is a bit more orthodox.

Apart from that the mcode solution is very cool; AHK is amazing!

Re: Search optimisation

Posted: 05 May 2021, 19:28
by teadrinker
I edited my post, simplified the code a bit. :)

Re: Search optimisation

Posted: 05 May 2021, 19:37
by pneumatic
teadrinker wrote:
05 May 2021, 19:28
I edited my post, simplified the code a bit. :)
It's significantly faster, but worse for noobs like me who want to do other stuff inside the while loop like update a progress bar or something.

Re: Search optimisation

Posted: 05 May 2021, 22:51
by rommmcek
@teadrinker: To retrieve offset of the last four bytes I need to increase size by one i.e. size:= ReadFile(filePath, buff)+1... Why?

Re: Search optimisation

Posted: 05 May 2021, 23:44
by teadrinker
Search area was incorrectly calculated, must be one byte more. Should work now, fixed.

Re: Search optimisation

Posted: 06 May 2021, 00:25
by rommmcek
Works fine, thanks! Approx. 4 times faster as my while loop.

Re: Search optimisation

Posted: 06 May 2021, 00:46
by SKAN
@teadrinker

Code: Select all

   while addr := DllCall("msvcrt\memchr", "Ptr", addr + 1, "Int", maxByte, "Ptr", maxAddr - addr) {
      ( DllCall("msvcrt\memcmp", "Ptr", addr - offset, "Ptr", pNeedle, "Ptr", needleSize) = 0
        && found .= Format("{:#x}", addr - offset - pHaystack) . "`n" )
Nice!
A suggestion: When memchr() succeeds, check whether the last byte of needle is a match (with NumGet(.."UChar")) and then call memcmp()

Re: Search optimisation

Posted: 06 May 2021, 01:52
by pneumatic
[deleted]

Reason for deletion:

I couldn't understand teadrinker's code and was asking for help, then realised I miscalculated maxByte which is actually the index of the largest byte in the needle, then it all made sense.

I presume also that searching for maxByte speeds up the search if it corresponds to the leftmost bit which I presume is the first bit to be compared at the machine level, causing a short circuit boolean evaluation and less comparisons?

Re: Search optimisation

Posted: 06 May 2021, 05:51
by teadrinker
pneumatic wrote: I presume also that searching for maxByte speeds up the search if it corresponds to the leftmost bit which I presume is the first bit to be compared at the machine level, causing a short circuit boolean evaluation and less comparisons?
Yeah, I assumed that the byte with max value is generally encountered less frequently. At least it can be non-zero, zero generally encountered very often.
SKAN wrote: A suggestion: When memchr() succeeds, check whether the last byte of needle is a match
Thanks! I took previous byte by value:

Code: Select all

SetBatchLines, -1

filePath := "C:\Users\mi\AppData\Roaming\XnViewMP\Thumb.db"
bytes := "74 72 61"

size := ReadFile(filePath, buff)
len := CryptStringToBinary(bytes, needle, "CRYPT_STRING_HEX")
start := A_TickCount
offsets := SearchInBuff(&buff, size, &needle, len)
MsgBox,, % A_TickCount - start . "ms", % offsets

SearchInBuff(pHaystack, hstkSize, pNeedle, needleSize) {
   arr := GetMax(pNeedle, needleSize)
   maxByte := arr.max, offset := arr.maxIdx
   prevByte := arr.prevMax, prevByteShift := arr.prevMaxIdx - offset
   bool := needleSize > 1
   found := "", addr := pHaystack + offset - 1, maxAddr := addr + hstkSize - needleSize + 1
   while addr := DllCall("msvcrt\memchr", "Ptr", addr + 1, "Int", maxByte, "Ptr", maxAddr - addr) {
      ( (bool ? NumGet(addr + prevByteShift, "UChar") = prevByte : true)
        && DllCall("msvcrt\memcmp", "Ptr", addr - offset, "Ptr", pNeedle, "Ptr", needleSize) = 0
        && found .= Format("{:#x}", addr - offset - pHaystack) . "`n" )
   }
   Return RTrim(found, "`n")
}

ReadFile(filePath, ByRef buff) {
   File := FileOpen(filePath, "r")
   File.Pos := 0
   File.RawRead(buff, size := File.Length)
   File.Close()
   Return size
}

CryptStringToBinary(string, ByRef outData, formatName := "CRYPT_STRING_BASE64")
{
   static formats := { CRYPT_STRING_BASE64: 0x1
                     , CRYPT_STRING_HEX:    0x4
                     , CRYPT_STRING_HEXRAW: 0xC }
   fmt := formats[formatName]
   chars := StrLen(string)
   if !DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
                                            , "Ptr", 0, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
      throw "CryptStringToBinary failed. LastError: " . A_LastError
   VarSetCapacity(outData, bytes)
   DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
                                        , "Str", outData, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
   Return bytes
}

GetMax(pData, len) {
   VarSetCapacity(sortedData, len, 0)
   DllCall("RtlMoveMemory", "Ptr", &sortedData, "Ptr", pData, "Ptr", len)
   pCmp := RegisterCallback("cmp", "C F", 2)
   DllCall("msvcrt\qsort", "Ptr", &sortedData, "Ptr", len, "Ptr", 1, "Ptr", pCmp)
   max := NumGet(sortedData, "UChar"), (len > 1 && prevMax := NumGet(sortedData, 1, "UChar"))
   Loop % len {
      i := A_Index - 1
      res := NumGet(pData + i, "UChar")
      (res = max && maxIdx := i), (res = prevMax && prevMaxIdx := i)
   }
   Return {max: max, maxIdx: maxIdx, prevMax: prevMax, prevMaxIdx: prevMaxIdx}
}

cmp(a, b) {
   a := NumGet(a + 0, "UChar"), b := NumGet(b + 0, "UChar")
   Return a < b ? 1 : a > b ? -1 : 0
}
This approach reduces the search time by about 1/5.

Re: Search optimisation

Posted: 06 May 2021, 07:32
by pneumatic
Well as it turns out, I actually need more speed that only the mcode solution can provide.

The mcode seems to be around 10x faster -- it's fricken fast!

The problem is that I also need to search the binary file for a pattern that matches some logical rules to do with their sum and what bytes are around them, and it's causing a CPU bottleneck to perform those logical checks. I am trying to workaround this by using a lookup table containing all possible byte strings that would return true for those logical checks, and searching the file for each and every one of those strings. Because mcode is so fast, this ends up being quite a lot faster.

edit: actually the speed of mcode seems to vary depending on the string being searched for. Maybe on average it's not as hugely faster as I thought? Wouldn't those C funcs in teadrinker's solution also be low level ASM code?

Re: Search optimisation

Posted: 06 May 2021, 07:58
by pneumatic
I've noticed on AutohotkeyU32, teadrinker's solution throws a runtime error on the msvcrt DllCalls, but ErrorLevel is 0 in the catch block, and the function still seems to work fine.

So if you are putting the DllCalls in a try block then it will fail.

AutohotkeyU64 doesn't have this issue.

Also AutohotkeyU64 with the mcode solution seems to go into an infinite loop with CPU pegged at 100% usage -- might be worth putting some protection code in there.

Re: Search optimisation

Posted: 06 May 2021, 08:30
by teadrinker
pneumatic wrote: teadrinker's solution throws a runtime error on the msvcrt DllCalls
Perhaps, "Cdecl" must be added to all msvcrt functions:

Code: Select all

SetBatchLines, -1

filePath := "C:\Users\mi\AppData\Roaming\XnViewMP\Thumb.db"
bytes := "74 72 61"

size := ReadFile(filePath, buff)
len := CryptStringToBinary(bytes, needle, "CRYPT_STRING_HEX")
start := A_TickCount
offsets := SearchInBuff(&buff, size, &needle, len)
MsgBox,, % A_TickCount - start . "ms", % offsets

SearchInBuff(pHaystack, hstkSize, pNeedle, needleSize) {
   arr := GetMax(pNeedle, needleSize)
   maxByte := arr.max, offset := arr.maxIdx
   prevByte := arr.prevMax, prevByteShift := arr.prevMaxIdx - offset
   bool := needleSize > 1
   found := "", addr := pHaystack + offset - 1, maxAddr := addr + hstkSize - needleSize + 1
   while addr := DllCall("msvcrt\memchr", "Ptr", addr + 1, "Int", maxByte, "Ptr", maxAddr - addr, "Cdecl Ptr") {
      ( (bool ? NumGet(addr + prevByteShift, "UChar") = prevByte : true)
        && DllCall("msvcrt\memcmp", "Ptr", addr - offset, "Ptr", pNeedle, "Ptr", needleSize, "Cdecl") = 0
        && found .= Format("{:#x}", addr - offset - pHaystack) . "`n" )
   }
   Return RTrim(found, "`n")
}

ReadFile(filePath, ByRef buff) {
   File := FileOpen(filePath, "r")
   File.Pos := 0
   File.RawRead(buff, size := File.Length)
   File.Close()
   Return size
}

CryptStringToBinary(string, ByRef outData, formatName := "CRYPT_STRING_BASE64")
{
   static formats := { CRYPT_STRING_BASE64: 0x1
                     , CRYPT_STRING_HEX:    0x4
                     , CRYPT_STRING_HEXRAW: 0xC }
   fmt := formats[formatName]
   chars := StrLen(string)
   if !DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
                                            , "Ptr", 0, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
      throw "CryptStringToBinary failed. LastError: " . A_LastError
   VarSetCapacity(outData, bytes)
   DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
                                        , "Str", outData, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
   Return bytes
}

GetMax(pData, len) {
   VarSetCapacity(sortedData, len, 0)
   DllCall("RtlMoveMemory", "Ptr", &sortedData, "Ptr", pData, "Ptr", len)
   pCmp := RegisterCallback("cmp", "C F", 2)
   DllCall("msvcrt\qsort", "Ptr", &sortedData, "Ptr", len, "Ptr", 1, "Ptr", pCmp, "Cdecl")
   max := NumGet(sortedData, "UChar"), (len > 1 && prevMax := NumGet(sortedData, 1, "UChar"))
   Loop % len {
      i := A_Index - 1
      res := NumGet(pData + i, "UChar")
      (res = max && maxIdx := i), (res = prevMax && prevMaxIdx := i)
   }
   Return {max: max, maxIdx: maxIdx, prevMax: prevMax, prevMaxIdx: prevMaxIdx}
}

cmp(a, b) {
   a := NumGet(a + 0, "UChar"), b := NumGet(b + 0, "UChar")
   Return a < b ? 1 : a > b ? -1 : 0
}
Although, I did not have such a problem.
pneumatic wrote: Wouldn't those C funcs in teadrinker's solution also be low level ASM code?
Nope.

Re: Search optimisation

Posted: 06 May 2021, 08:36
by jNizM
teadrinker wrote:Perhaps, "Cdecl" must be added to all msvcrt functions:
Maybe at least a "ptr" after memchr

ref: https://autohotkey.com/board/topic/82986-ahk-l-decompiler-payload-method/page-2#entry529503