Hello
I am trying to search a 7MB binary file for a few byte patterns, similar to using a search function in a typical hex editor program.
I am able to implement this in AHK with a naive search algorithm ('linear search') in conjunction with FileRead *C and NumGet UChar.
Being linear search it is obviously quite slow though.
For example, a typical hex editor can find all occurrences of a 24 byte pattern in a 7MB binary in just a fraction of a second -- they are obviously not using linear search and iterating through each byte like my AHK implementation.
Therefore, I am interested in implementing a smarter search algorithm in AHK.
Questions:
1. Which algorithm should I try to implement in AHK? Should I sort the data first and then use something like binary search / divide and conquer / hash tables? Wouldn't it take a very long time to sort the binary data? Hex editors can open and search files immediately, so I'm not sure they are sorting the binary data first.
2. Is there already a third party AHK implementation available?
3. Are there Windows API functions that implement various search algorithms on unsorted data?
Thanks
Search optimisation Topic is solved
Re: Search optimisation
It would be a relatively straightforward C++ function, which would be compiled into machine code and called by the AHK script. Even a linear algorithm would be much faster. It has been done often with AHK for graphics manipulations and searches, such as this one. You would just need to pass it the addresses and lengths of the needle and haystack. The online MCode compiler that I used seems to be down, but there are other compilers available as discussed here.
Re: Search optimisation
I'm still not confident that linear search implemented with mcode would be similarly fast to hex editors such as HxD, simply because of how CPU hungry linear search is.
Perhaps there is some other optimisation technique using bitwise operations?
One idea is that since a byte can only have 256 different values, create 256 buckets with each bucket containing offsets of all occurrences of that byte. Then when you want to search for a particular byte, you only need to search that one bucket. The bucket creation could be done at file readin using different memory ranges corresponding to buckets so it probably needn't take any extra processing time.
edit: it seems this bucket sorting might also facilitate binary search? Because if we're reading the file in sequentially, the offsets in the buckets will also be sorted sequentially. So we'd end up with sequentially sorted buckets with each bucket containing sequentially sorted offsets.
But then the file would have to be read in 1 byte at a time, which I suspect is going to bog it down (in AHK it does ).
So maybe I need an mcode implementation to efficiently read the file into buckets.
Perhaps there is some other optimisation technique using bitwise operations?
One idea is that since a byte can only have 256 different values, create 256 buckets with each bucket containing offsets of all occurrences of that byte. Then when you want to search for a particular byte, you only need to search that one bucket. The bucket creation could be done at file readin using different memory ranges corresponding to buckets so it probably needn't take any extra processing time.
edit: it seems this bucket sorting might also facilitate binary search? Because if we're reading the file in sequentially, the offsets in the buckets will also be sorted sequentially. So we'd end up with sequentially sorted buckets with each bucket containing sequentially sorted offsets.
But then the file would have to be read in 1 byte at a time, which I suspect is going to bog it down (in AHK it does ).
So maybe I need an mcode implementation to efficiently read the file into buckets.
Re: Search optimisation Topic is solved
@pneumatic
I believe you’re over-thinking it. If you extract the InBuf function from BinMod this should satisfy your requirements. It does an in-memory search (you should load the file into a variable first), and is very fast as it works in machine code. It was originally written by w0xx0m and published some time ago in the old forum (see the link in the function if interested). The parameters are Haystack address, Haystack size, Needle address, Needle size, and Offset in haystack to start the search. It returns the offset in haystack where the needle was found, or a negative number if the needle was not found.
Cheers
Edit: Machine code is 32 bit only.
I believe you’re over-thinking it. If you extract the InBuf function from BinMod this should satisfy your requirements. It does an in-memory search (you should load the file into a variable first), and is very fast as it works in machine code. It was originally written by w0xx0m and published some time ago in the old forum (see the link in the function if interested). The parameters are Haystack address, Haystack size, Needle address, Needle size, and Offset in haystack to start the search. It returns the offset in haystack where the needle was found, or a negative number if the needle was not found.
Cheers
Edit: Machine code is 32 bit only.
Last edited by TAC109 on 05 May 2021, 16:41, edited 1 time in total.
My scripts:-
XRef - Produces Cross Reference lists for scripts
ReClip - A Text Reformatting and Clip Management utility
ScriptGuard - Protects Compiled Scripts from Decompilation
I also maintain Ahk2Exe
XRef - Produces Cross Reference lists for scripts
ReClip - A Text Reformatting and Clip Management utility
ScriptGuard - Protects Compiled Scripts from Decompilation
I also maintain Ahk2Exe
Re: Search optimisation
@TAC109 - That’s interesting and useful. Thanks. I went to w0xx0m’s thread to see his C source and found it was written in assembly language. Impressive, and surely about as fast as possible. I see he has a function for searching binary file contents directly as well, which also could come in handy.
Re: Search optimisation
I should have mentioned that this machine code only works with a 32 bit version of AutoHotkey.
My scripts:-
XRef - Produces Cross Reference lists for scripts
ReClip - A Text Reformatting and Clip Management utility
ScriptGuard - Protects Compiled Scripts from Decompilation
I also maintain Ahk2Exe
XRef - Produces Cross Reference lists for scripts
ReClip - A Text Reformatting and Clip Management utility
ScriptGuard - Protects Compiled Scripts from Decompilation
I also maintain Ahk2Exe
-
- Posts: 4411
- Joined: 29 Mar 2015, 09:41
- Contact:
Re: Search optimisation
Tried to implement without machine code:
For me it works quite fast.
Code: Select all
SetBatchLines, -1
filePath := "D:\Downloads\YouTube\Rammstein - Radio (Official Video).mkv"
bytes := "71 90 ef 8e"
size := ReadFile(filePath, buff)
len := CryptStringToBinary(bytes, needle, "CRYPT_STRING_HEX")
start := A_TickCount
offsets := SearchInBuff(&buff, size, &needle, len)
MsgBox,, % A_TickCount - start . "ms", % offsets
SearchInBuff(pHaystack, hstkSize, pNeedle, needleSize) {
arr := GetMax(pNeedle, needleSize)
maxByte := arr.max, offset := arr.maxIdx
prevByte := arr.prevMax, prevByteShift := arr.prevMaxIdx - offset
bool := needleSize > 1
found := "", addr := pHaystack + offset - 1, maxAddr := addr + hstkSize - needleSize + 1
while addr := DllCall("msvcrt\memchr", "Ptr", addr + 1, "Int", maxByte, "Ptr", maxAddr - addr, "Cdecl Ptr") {
( (bool ? NumGet(addr + prevByteShift, "UChar") = prevByte : true)
&& DllCall("msvcrt\memcmp", "Ptr", addr - offset, "Ptr", pNeedle, "Ptr", needleSize, "Cdecl") = 0
&& found .= Format("{:#x}", addr - offset - pHaystack) . "`n" )
}
Return RTrim(found, "`n")
}
ReadFile(filePath, ByRef buff) {
File := FileOpen(filePath, "r")
File.Pos := 0
File.RawRead(buff, size := File.Length)
File.Close()
Return size
}
CryptStringToBinary(string, ByRef outData, formatName := "CRYPT_STRING_BASE64")
{
static formats := { CRYPT_STRING_BASE64: 0x1
, CRYPT_STRING_HEX: 0x4
, CRYPT_STRING_HEXRAW: 0xC }
fmt := formats[formatName]
chars := StrLen(string)
if !DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
, "Ptr", 0, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
throw "CryptStringToBinary failed. LastError: " . A_LastError
VarSetCapacity(outData, bytes)
DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
, "Str", outData, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
Return bytes
}
GetMax(pData, len) {
VarSetCapacity(sortedData, len, 0)
DllCall("RtlMoveMemory", "Ptr", &sortedData, "Ptr", pData, "Ptr", len)
pCmp := RegisterCallback("cmp", "C F", 2)
DllCall("msvcrt\qsort", "Ptr", &sortedData, "Ptr", len, "Ptr", 1, "Ptr", pCmp, "Cdecl")
max := NumGet(sortedData, "UChar"), (len > 1 && prevMax := NumGet(sortedData, 1, "UChar"))
Loop % len {
i := A_Index - 1
res := NumGet(pData + i, "UChar")
(res = max && maxIdx := i)
(res = prevMax && prevMaxIdx := i)
}
Return {max: max, maxIdx: maxIdx, prevMax: prevMax, prevMaxIdx: prevMaxIdx}
}
cmp(a, b) {
a := NumGet(a + 0, "UChar"), b := NumGet(b + 0, "UChar")
Return a < b ? 1 : a > b ? -1 : 0
}
Last edited by teadrinker on 06 May 2021, 13:06, edited 3 times in total.
Re: Search optimisation
Incredible solutions, thank you very much.
It's a close call, but I would have to favour teadrinker's solution as it supports 64-bit and is a bit more orthodox.
Apart from that the mcode solution is very cool; AHK is amazing!
It's a close call, but I would have to favour teadrinker's solution as it supports 64-bit and is a bit more orthodox.
Apart from that the mcode solution is very cool; AHK is amazing!
-
- Posts: 4411
- Joined: 29 Mar 2015, 09:41
- Contact:
Re: Search optimisation
I edited my post, simplified the code a bit.
Re: Search optimisation
It's significantly faster, but worse for noobs like me who want to do other stuff inside the while loop like update a progress bar or something.
Re: Search optimisation
@teadrinker: To retrieve offset of the last four bytes I need to increase size by one i.e. size:= ReadFile(filePath, buff)+1... Why?
-
- Posts: 4411
- Joined: 29 Mar 2015, 09:41
- Contact:
Re: Search optimisation
Search area was incorrectly calculated, must be one byte more. Should work now, fixed.
Re: Search optimisation
Works fine, thanks! Approx. 4 times faster as my while loop.
Re: Search optimisation
@teadrinker
A suggestion: When memchr() succeeds, check whether the last byte of needle is a match (with NumGet(.."UChar")) and then call memcmp()
Nice!Code: Select all
while addr := DllCall("msvcrt\memchr", "Ptr", addr + 1, "Int", maxByte, "Ptr", maxAddr - addr) { ( DllCall("msvcrt\memcmp", "Ptr", addr - offset, "Ptr", pNeedle, "Ptr", needleSize) = 0 && found .= Format("{:#x}", addr - offset - pHaystack) . "`n" )
A suggestion: When memchr() succeeds, check whether the last byte of needle is a match (with NumGet(.."UChar")) and then call memcmp()
Re: Search optimisation
[deleted]
Reason for deletion:
I couldn't understand teadrinker's code and was asking for help, then realised I miscalculated maxByte which is actually the index of the largest byte in the needle, then it all made sense.
I presume also that searching for maxByte speeds up the search if it corresponds to the leftmost bit which I presume is the first bit to be compared at the machine level, causing a short circuit boolean evaluation and less comparisons?
Reason for deletion:
I couldn't understand teadrinker's code and was asking for help, then realised I miscalculated maxByte which is actually the index of the largest byte in the needle, then it all made sense.
I presume also that searching for maxByte speeds up the search if it corresponds to the leftmost bit which I presume is the first bit to be compared at the machine level, causing a short circuit boolean evaluation and less comparisons?
-
- Posts: 4411
- Joined: 29 Mar 2015, 09:41
- Contact:
Re: Search optimisation
Yeah, I assumed that the byte with max value is generally encountered less frequently. At least it can be non-zero, zero generally encountered very often.
Thanks! I took previous byte by value:
Code: Select all
SetBatchLines, -1
filePath := "C:\Users\mi\AppData\Roaming\XnViewMP\Thumb.db"
bytes := "74 72 61"
size := ReadFile(filePath, buff)
len := CryptStringToBinary(bytes, needle, "CRYPT_STRING_HEX")
start := A_TickCount
offsets := SearchInBuff(&buff, size, &needle, len)
MsgBox,, % A_TickCount - start . "ms", % offsets
SearchInBuff(pHaystack, hstkSize, pNeedle, needleSize) {
arr := GetMax(pNeedle, needleSize)
maxByte := arr.max, offset := arr.maxIdx
prevByte := arr.prevMax, prevByteShift := arr.prevMaxIdx - offset
bool := needleSize > 1
found := "", addr := pHaystack + offset - 1, maxAddr := addr + hstkSize - needleSize + 1
while addr := DllCall("msvcrt\memchr", "Ptr", addr + 1, "Int", maxByte, "Ptr", maxAddr - addr) {
( (bool ? NumGet(addr + prevByteShift, "UChar") = prevByte : true)
&& DllCall("msvcrt\memcmp", "Ptr", addr - offset, "Ptr", pNeedle, "Ptr", needleSize) = 0
&& found .= Format("{:#x}", addr - offset - pHaystack) . "`n" )
}
Return RTrim(found, "`n")
}
ReadFile(filePath, ByRef buff) {
File := FileOpen(filePath, "r")
File.Pos := 0
File.RawRead(buff, size := File.Length)
File.Close()
Return size
}
CryptStringToBinary(string, ByRef outData, formatName := "CRYPT_STRING_BASE64")
{
static formats := { CRYPT_STRING_BASE64: 0x1
, CRYPT_STRING_HEX: 0x4
, CRYPT_STRING_HEXRAW: 0xC }
fmt := formats[formatName]
chars := StrLen(string)
if !DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
, "Ptr", 0, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
throw "CryptStringToBinary failed. LastError: " . A_LastError
VarSetCapacity(outData, bytes)
DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
, "Str", outData, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
Return bytes
}
GetMax(pData, len) {
VarSetCapacity(sortedData, len, 0)
DllCall("RtlMoveMemory", "Ptr", &sortedData, "Ptr", pData, "Ptr", len)
pCmp := RegisterCallback("cmp", "C F", 2)
DllCall("msvcrt\qsort", "Ptr", &sortedData, "Ptr", len, "Ptr", 1, "Ptr", pCmp)
max := NumGet(sortedData, "UChar"), (len > 1 && prevMax := NumGet(sortedData, 1, "UChar"))
Loop % len {
i := A_Index - 1
res := NumGet(pData + i, "UChar")
(res = max && maxIdx := i), (res = prevMax && prevMaxIdx := i)
}
Return {max: max, maxIdx: maxIdx, prevMax: prevMax, prevMaxIdx: prevMaxIdx}
}
cmp(a, b) {
a := NumGet(a + 0, "UChar"), b := NumGet(b + 0, "UChar")
Return a < b ? 1 : a > b ? -1 : 0
}
Re: Search optimisation
Well as it turns out, I actually need more speed that only the mcode solution can provide.
The mcode seems to be around 10x faster -- it's fricken fast!
The problem is that I also need to search the binary file for a pattern that matches some logical rules to do with their sum and what bytes are around them, and it's causing a CPU bottleneck to perform those logical checks. I am trying to workaround this by using a lookup table containing all possible byte strings that would return true for those logical checks, and searching the file for each and every one of those strings. Because mcode is so fast, this ends up being quite a lot faster.
edit: actually the speed of mcode seems to vary depending on the string being searched for. Maybe on average it's not as hugely faster as I thought? Wouldn't those C funcs in teadrinker's solution also be low level ASM code?
The mcode seems to be around 10x faster -- it's fricken fast!
The problem is that I also need to search the binary file for a pattern that matches some logical rules to do with their sum and what bytes are around them, and it's causing a CPU bottleneck to perform those logical checks. I am trying to workaround this by using a lookup table containing all possible byte strings that would return true for those logical checks, and searching the file for each and every one of those strings. Because mcode is so fast, this ends up being quite a lot faster.
edit: actually the speed of mcode seems to vary depending on the string being searched for. Maybe on average it's not as hugely faster as I thought? Wouldn't those C funcs in teadrinker's solution also be low level ASM code?
Re: Search optimisation
I've noticed on AutohotkeyU32, teadrinker's solution throws a runtime error on the msvcrt DllCalls, but ErrorLevel is 0 in the catch block, and the function still seems to work fine.
So if you are putting the DllCalls in a try block then it will fail.
AutohotkeyU64 doesn't have this issue.
Also AutohotkeyU64 with the mcode solution seems to go into an infinite loop with CPU pegged at 100% usage -- might be worth putting some protection code in there.
So if you are putting the DllCalls in a try block then it will fail.
AutohotkeyU64 doesn't have this issue.
Also AutohotkeyU64 with the mcode solution seems to go into an infinite loop with CPU pegged at 100% usage -- might be worth putting some protection code in there.
-
- Posts: 4411
- Joined: 29 Mar 2015, 09:41
- Contact:
Re: Search optimisation
Perhaps, "Cdecl" must be added to all msvcrt functions:
Code: Select all
SetBatchLines, -1
filePath := "C:\Users\mi\AppData\Roaming\XnViewMP\Thumb.db"
bytes := "74 72 61"
size := ReadFile(filePath, buff)
len := CryptStringToBinary(bytes, needle, "CRYPT_STRING_HEX")
start := A_TickCount
offsets := SearchInBuff(&buff, size, &needle, len)
MsgBox,, % A_TickCount - start . "ms", % offsets
SearchInBuff(pHaystack, hstkSize, pNeedle, needleSize) {
arr := GetMax(pNeedle, needleSize)
maxByte := arr.max, offset := arr.maxIdx
prevByte := arr.prevMax, prevByteShift := arr.prevMaxIdx - offset
bool := needleSize > 1
found := "", addr := pHaystack + offset - 1, maxAddr := addr + hstkSize - needleSize + 1
while addr := DllCall("msvcrt\memchr", "Ptr", addr + 1, "Int", maxByte, "Ptr", maxAddr - addr, "Cdecl Ptr") {
( (bool ? NumGet(addr + prevByteShift, "UChar") = prevByte : true)
&& DllCall("msvcrt\memcmp", "Ptr", addr - offset, "Ptr", pNeedle, "Ptr", needleSize, "Cdecl") = 0
&& found .= Format("{:#x}", addr - offset - pHaystack) . "`n" )
}
Return RTrim(found, "`n")
}
ReadFile(filePath, ByRef buff) {
File := FileOpen(filePath, "r")
File.Pos := 0
File.RawRead(buff, size := File.Length)
File.Close()
Return size
}
CryptStringToBinary(string, ByRef outData, formatName := "CRYPT_STRING_BASE64")
{
static formats := { CRYPT_STRING_BASE64: 0x1
, CRYPT_STRING_HEX: 0x4
, CRYPT_STRING_HEXRAW: 0xC }
fmt := formats[formatName]
chars := StrLen(string)
if !DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
, "Ptr", 0, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
throw "CryptStringToBinary failed. LastError: " . A_LastError
VarSetCapacity(outData, bytes)
DllCall("Crypt32\CryptStringToBinary", "Str", string, "UInt", chars, "UInt", fmt
, "Str", outData, "UIntP", bytes, "UIntP", 0, "UIntP", 0)
Return bytes
}
GetMax(pData, len) {
VarSetCapacity(sortedData, len, 0)
DllCall("RtlMoveMemory", "Ptr", &sortedData, "Ptr", pData, "Ptr", len)
pCmp := RegisterCallback("cmp", "C F", 2)
DllCall("msvcrt\qsort", "Ptr", &sortedData, "Ptr", len, "Ptr", 1, "Ptr", pCmp, "Cdecl")
max := NumGet(sortedData, "UChar"), (len > 1 && prevMax := NumGet(sortedData, 1, "UChar"))
Loop % len {
i := A_Index - 1
res := NumGet(pData + i, "UChar")
(res = max && maxIdx := i), (res = prevMax && prevMaxIdx := i)
}
Return {max: max, maxIdx: maxIdx, prevMax: prevMax, prevMaxIdx: prevMaxIdx}
}
cmp(a, b) {
a := NumGet(a + 0, "UChar"), b := NumGet(b + 0, "UChar")
Return a < b ? 1 : a > b ? -1 : 0
}
Nope.
Last edited by teadrinker on 06 May 2021, 08:55, edited 3 times in total.
Re: Search optimisation
Maybe at least a "ptr" after memchrteadrinker wrote:Perhaps, "Cdecl" must be added to all msvcrt functions:
ref: https://autohotkey.com/board/topic/82986-ahk-l-decompiler-payload-method/page-2#entry529503
[AHK] v2.0.5 | [WIN] 11 Pro (Version 22H2) | [GitHub] Profile
Who is online
Users browsing this forum: Google [Bot], JKJadan and 267 guests