AutoHotkey Community Let's help each other out
n-l-i-d Guest
Posted: Fri Jan 09, 2009 11:40 pm
I've been searching the forum for fast file checksum code and found this posting by Laszlo, a wrapper for the CRC32 function:
| Quote: | The function CRC32 has three parameters.
- The first one is the name of a buffer, which can contain binary data.
- The second parameter is the length of the data in bytes. If omitted or not positive, Strlen(Buffer) is used internally.
- The 3rd parameter is used for continuing the CRC computation for second or later data sections. If omitted, -1 is used, the standard initial value for CRC32. If an earlier CRC operation is to be continued (which returned C), put here ~C. If a different CRC is needed than the standard CRC-32 (e.g. to resolve collisions), you can use any 32 bit integer for initialization. |
Can anybody show me, in a simple script, how to use this CRC32 function to generate the checksum of any file (chosen with FileSelectFile, for example), read as binary and in chunks if it is big, so as not to load the whole file into memory?
I'd like to use this for a duplicate file finder script.
Laszlo
Joined: 14 Feb 2005 Posts: 4515 Location: Boulder, CO
Posted: Mon Jan 12, 2009 5:53 am
| Code: | bufSz := 1 << 26 ; 64MB buffer
VarSetCapacity(buff,bufSz) ; allocate buffer
file := A_ScriptFullPath ; put your filename here
FileGetSize Sz, %file%
c := 0, offs := -bufSz
h := OpenFile(file) ; handle to file
Loop % Sz//bufSz { ; for each buff-full of data
   BinRead(h, buff, bufSz, offs+=bufSz)
   c := CRC32(buff,bufSz,~c) ; compute accumulated CRC
}
If (m:=mod(Sz,bufSz)) { ; the slack
   BinRead(h, buff, m, offs+=bufSz)
   c := CRC32(buff,m,~c)
}
CloseFile(h)
; c = CRC here
SetFormat Integer, Hex
MsgBox % c+0 ; show CRC32 in hex

OpenFile(file) { ; only for read!
   Return DllCall("CreateFile",Str,file, UInt,0x80000000, UInt,3, UInt,0, UInt,3, UInt,0, UInt,0)
}

BinRead(hFile, ByRef data, n, offset=0) { ; offset<0: counted from the end backwards
   DllCall("SetFilePointerEx",UInt,hFile, Int64,offset, UIntP,U, Int,2*(offset<0))
   DllCall("ReadFile",UInt,hFile, Str,data, UInt,n, UIntP,r, UInt,0)
   Return r ; the number of bytes read
}

CloseFile(hFile) {
   DllCall("CloseHandle", UInt,hFile)
}

CRC32(ByRef Buffer, Bytes=0, Start=-1) {
   Static CRC32, CRC32LookupTable
   If (CRC32 = "") { ; first call: build the lookup table, then load the CRC routine
      MCode(CRC32,"33c06a088bc85af6c101740ad1e981f12083b8edeb02d1e94a75ec8b542404890c82403d0001000072d8c3")
      VarSetCapacity(CRC32LookupTable, 1024)
      DllCall(&CRC32, "uint",&CRC32LookupTable, "cdecl")
      MCode(CRC32,"558bec33c039450c7627568b4d080fb60c08334d108b55108b751481e1ff000000c1ea0833148e403b450c89551072db5e8b4510f7d05dc3")
   }
   If Bytes <= 0
      Bytes := StrLen(Buffer)
   Return DllCall(&CRC32, "uint",&Buffer, "uint",Bytes, "int",Start, "uint",&CRC32LookupTable, "cdecl uint")
}

MCode(ByRef code, hex) { ; allocate memory and write Machine Code there
   VarSetCapacity(code,StrLen(hex)//2)
   Loop % StrLen(hex)//2
      NumPut("0x" . SubStr(hex,2*A_Index-1,2), code, A_Index-1, "Char")
} |
Edit 20090112: Faster BinRead, larger buffer for speedup
Last edited by Laszlo on Mon Jan 12, 2009 8:04 pm; edited 1 time in total |
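For comparison, the same read-a-chunk, update-the-CRC loop can be sketched in Python, together with the duplicate-finder use the original question mentions. This is a hedged illustration, not a port of the script above: `file_crc32`, `group_possible_duplicates` and the 1 MB default chunk size are hypothetical choices. The standard library's `zlib.crc32` takes the running CRC as its second argument, so no `~c` is needed here.

```python
import os
import zlib
from collections import defaultdict


def file_crc32(path, chunk_size=1 << 20):
    """CRC-32 of a file, read in chunks so it is never fully in memory."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:                   # empty read means end of file
                break
            crc = zlib.crc32(chunk, crc)    # continue from the running value
    return crc & 0xFFFFFFFF


def group_possible_duplicates(paths):
    """Group files that share both size and CRC-32.

    Equal size and CRC only suggest identical content; a byte-by-byte
    comparison is still needed to be sure.
    """
    groups = defaultdict(list)
    for path in paths:
        groups[(os.path.getsize(path), file_crc32(path))].append(path)
    return [g for g in groups.values() if len(g) > 1]
```

Comparing sizes first (or keying on size as done here) keeps CRC collisions between files of different lengths from showing up as candidates.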
n-l-i-d Guest
Posted: Mon Jan 12, 2009 6:13 pm
Great! Much obliged...
Is there any way to speed up the scanning of large files further? I found two (fast) command line utilities that beat this script considerably in speed:
rehash.exe (has other checksum options too)
crc32.exe (fastest of the crc32.exe's I found)
Time to scan a 700 MB iso file:
- script: 30 seconds (buffer-size: 32768, latest AutoHotkey beta)
- rehash: 19 seconds
- crc32: 18 seconds
Other command line programs I tried (but were slower than the ones mentioned above):
- crc32.exe
- crc32.exe
- crc.exe
Test script for the AHK code:
| Code: | #NoEnv
SetBatchLines -1
;Critical On
;Process, Priority, , Realtime
StartTime := A_TickCount
aBuffer := 1024 * 32 ; buffer size in bytes
VarSetCapacity(data,aBuffer) ; allocate 32KB buffer (change to your taste)
file := "E:\Downloads\gos-3.1-gadgets-20081205.iso" ;A_ScriptFullPath ; put your filename here
FileGetSize Sz, %file%
c := 0, offs := -aBuffer
Loop % Sz//aBuffer { ; for each block of data
   BinRead(file, data, aBuffer, offs+=aBuffer)
   c := CRC32(data,aBuffer,~c) ; compute accumulated CRC
}
If (m:=mod(Sz,aBuffer)) { ; the slack
   BinRead(file, data, m, offs+=aBuffer)
   c := CRC32(data,m,~c)
}
; c = CRC here
SetFormat Integer, Hex
crc := c+0 ; CRC32 in hex
SetFormat Integer, D
ElapsedTime := Round((A_TickCount - StartTime)/1000)
MsgBox % "It took " ElapsedTime " seconds to scan`n`n" file "`n`nBuffer size: " aBuffer " bytes`nCRC: " crc
Return

BinRead(file, ByRef data, n=0, offset=0) { ; opens and closes the file on every call
   h := DllCall("CreateFile",Str,file, UInt,0x80000000, UInt,3, UInt,0, UInt,3, UInt,0, UInt,0)
   DllCall("SetFilePointerEx",UInt,h, Int64,offset, UIntP,U, Int,2*(offset<0))
   m := DllCall("GetFileSize",UInt,h, Int64P,r)
   If n not between 1 and %m%
      n := m
   VarSetCapacity(data, n)
   DllCall("ReadFile",UInt,h, Str,data, UInt,n, UIntP,r, UInt,0)
   DllCall("CloseHandle", UInt,h)
   Return r
}

CRC32(ByRef Buffer, Bytes=0, Start=-1) {
   Static CRC32, CRC32LookupTable
   If (CRC32 = "") {
      MCode(CRC32,"33c06a088bc85af6c101740ad1e981f12083b8edeb02d1e94a75ec8b542404890c82403d0001000072d8c3")
      VarSetCapacity(CRC32LookupTable, 1024)
      DllCall(&CRC32, "uint",&CRC32LookupTable, "cdecl")
      MCode(CRC32,"558bec33c039450c7627568b4d080fb60c08334d108b55108b751481e1ff000000c1ea0833148e403b450c89551072db5e8b4510f7d05dc3")
   }
   If Bytes <= 0
      Bytes := StrLen(Buffer)
   Return DllCall(&CRC32, "uint",&Buffer, "uint",Bytes, "int",Start, "uint",&CRC32LookupTable, "cdecl uint")
}

MCode(ByRef code, hex) { ; allocate memory and write Machine Code there
   VarSetCapacity(code,StrLen(hex)//2)
   Loop % StrLen(hex)//2
      NumPut("0x" . SubStr(hex,2*A_Index-1,2), code, A_Index-1, "Char")
} |
Test script for the console programs:
| Code: | #NoEnv
SetBatchLines -1
;Critical On
;Process, Priority, , Realtime
StartTime := A_TickCount
;aBuffer := 1024 * 4
fileToScan := "E:\Downloads\gos-3.1-gadgets-20081205.iso"
CMDdir := A_ScriptDir "\bin"
CMDin := A_ScriptDir "\bin\rehash.exe -none -crc32 -norcrsv -f """ fileToScan """"
crc := CMDret_RunReturn(CMDin, CMDdir)
ElapsedTime := Round((A_TickCount - StartTime)/1000)
MsgBox % "It took " ElapsedTime " seconds to scan`n`n" fileToScan "`n`nCRC: " crc ; crc holds the output returned by rehash.exe
Return
; ******************************************************************
; CMDret-AHK functions
; version 1.10 beta
;
; Updated: Dec 5, 2006
; by: corrupt
; Code modifications and/or contributions made by:
; Laszlo, shimanov, toralf, Wdb
; ******************************************************************
; Usage:
; CMDin - command to execute
; WorkingDir - full path to working directory (Optional)
; ******************************************************************
; Known Issues:
; - If using dir be sure to specify a path (example: cmd /c dir c:\)
; or specify a working directory
; - Running 16 bit console applications may not produce output. Use
; a 32 bit application to start the 16 bit process to receive output
; ******************************************************************
; Additional requirements:
; - none
; ******************************************************************
; Code Start
; ******************************************************************
CMDret_RunReturn(CMDin, WorkingDir=0)
{
  Global cmdretPID
  tcWrk := WorkingDir=0 ? "Int" : "Str"
  idltm := A_TickCount + 20
  CMsize = 1
  VarSetCapacity(CMDout, 1, 32)
  VarSetCapacity(sui,68, 0)
  VarSetCapacity(pi, 16, 0)
  VarSetCapacity(pa, 12, 0)
  Loop, 4 {
    DllCall("RtlFillMemory", UInt,&pa+A_Index-1, UInt,1, UChar,12 >> 8*A_Index-8)
    DllCall("RtlFillMemory", UInt,&pa+8+A_Index-1, UInt,1, UChar,1 >> 8*A_Index-8)
  }
  IF (DllCall("CreatePipe", "UInt*",hRead, "UInt*",hWrite, "UInt",&pa, "Int",0) <> 0) {
    Loop, 4
      DllCall("RtlFillMemory", UInt,&sui+A_Index-1, UInt,1, UChar,68 >> 8*A_Index-8)
    DllCall("GetStartupInfo", "UInt", &sui)
    Loop, 4 {
      DllCall("RtlFillMemory", UInt,&sui+44+A_Index-1, UInt,1, UChar,257 >> 8*A_Index-8)
      DllCall("RtlFillMemory", UInt,&sui+60+A_Index-1, UInt,1, UChar,hWrite >> 8*A_Index-8)
      DllCall("RtlFillMemory", UInt,&sui+64+A_Index-1, UInt,1, UChar,hWrite >> 8*A_Index-8)
      DllCall("RtlFillMemory", UInt,&sui+48+A_Index-1, UInt,1, UChar,0 >> 8*A_Index-8)
    }
    IF (DllCall("CreateProcess", Int,0, Str,CMDin, Int,0, Int,0, Int,1, "UInt",0, Int,0, tcWrk, WorkingDir, UInt,&sui, UInt,&pi) <> 0) {
      Loop, 4
        cmdretPID += *(&pi+8+A_Index-1) << 8*A_Index-8
      Loop {
        idltm2 := A_TickCount - idltm
        If (idltm2 < 10) {
          DllCall("Sleep", Int, 10)
          Continue
        }
        IF (DllCall("PeekNamedPipe", "uint", hRead, "uint", 0, "uint", 0, "uint", 0, "uint*", bSize, "uint", 0 ) <> 0 ) {
          Process, Exist, %cmdretPID%
          IF (ErrorLevel OR bSize > 0) {
            IF (bSize > 0) {
              VarSetCapacity(lpBuffer, bSize+1)
              IF (DllCall("ReadFile", "UInt",hRead, "Str", lpBuffer, "Int",bSize, "UInt*",bRead, "Int",0) > 0) {
                IF (bRead > 0) {
                  TRead += bRead
                  VarSetCapacity(CMcpy, (bRead+CMsize+1), 0)
                  CMcpy = a
                  DllCall("RtlMoveMemory", "UInt", &CMcpy, "UInt", &CMDout, "Int", CMsize)
                  DllCall("RtlMoveMemory", "UInt", &CMcpy+CMsize, "UInt", &lpBuffer, "Int", bRead)
                  CMsize += bRead
                  VarSetCapacity(CMDout, (CMsize + 1), 0)
                  CMDout=a
                  DllCall("RtlMoveMemory", "UInt", &CMDout, "UInt", &CMcpy, "Int", CMsize)
                  VarSetCapacity(CMDout, -1) ; fix required by change in autohotkey v1.0.44.14
                }
              }
            }
          }
          ELSE
            break
        }
        ELSE
          break
        idltm := A_TickCount
      }
      cmdretPID=
      DllCall("CloseHandle", UInt, hWrite)
      DllCall("CloseHandle", UInt, hRead)
    }
  }
  IF (StrLen(CMDout) < TRead) {
    VarSetCapacity(CMcpy, TRead, 32)
    TRead2 = %TRead%
    Loop {
      DllCall("RtlZeroMemory", "UInt", &CMcpy, Int, TRead)
      NULLptr := StrLen(CMDout)
      cpsize := Tread - NULLptr
      DllCall("RtlMoveMemory", "UInt", &CMcpy, "UInt", (&CMDout + NULLptr + 2), "Int", (cpsize - 1))
      DllCall("RtlZeroMemory", "UInt", (&CMDout + NULLptr), Int, cpsize)
      DllCall("RtlMoveMemory", "UInt", (&CMDout + NULLptr), "UInt", &CMcpy, "Int", cpsize)
      TRead2 --
      IF (StrLen(CMDout) > TRead2)
        break
    }
  }
  StringTrimLeft, CMDout, CMDout, 1
  Return, CMDout
} |
Laszlo
Joined: 14 Feb 2005 Posts: 4515 Location: Boulder, CO
Posted: Mon Jan 12, 2009 7:36 pm
| n-l-i-d wrote: | | speed up the scanning of large files |
An AHK script that is only 1.66 times slower than the speed champion is pretty good. The machine code part was written in C and optimized for size, not speed, so you could gain a few percent by replacing it with faster, pure assembler code. The AHK overhead of the DLL calls can be reduced if you precompute the file I/O function addresses and use larger buffers, such as 4 MB (use a power of two). You can also remove a few superfluous lines from the BinRead function and open and close the file only once, but that does not matter much with large buffers. I'll update the script in the previous post in a few minutes...
Laszlo
Joined: 14 Feb 2005 Posts: 4515 Location: Boulder, CO
Posted: Mon Jan 12, 2009 8:09 pm
| The CRC32 loop script is updated in the third previous post. Do you see a speedup? |
n-l-i-d Guest
Posted: Mon Jan 12, 2009 9:14 pm
| Not really. I tried experimenting with different buffer sizes, and it seems that a buffer of 32 MB is fastest (on this particular file that is). If I change the buffer size in your latest code from 64 to 32 MB, the scanning is much faster (around 18 seconds, as fast as the fastest command line utility). |
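A buffer-size experiment like this one can be sketched as follows. This is a hedged Python illustration, not the AHK test script: `crc32_with_chunk_size` and `time_chunk_sizes` are hypothetical names, and the timings depend heavily on the disk and on the operating system's file cache, so a size tried later in the list may look faster only because the file is already cached.

```python
import time
import zlib


def crc32_with_chunk_size(path, chunk_size):
    """CRC-32 of a file read in chunks of `chunk_size` bytes."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def time_chunk_sizes(path, chunk_sizes):
    """Return a list of (chunk_size, seconds, crc) tuples, one per size."""
    results = []
    for size in chunk_sizes:
        started = time.perf_counter()
        crc = crc32_with_chunk_size(path, size)
        results.append((size, time.perf_counter() - started, crc))
    return results
```

A call such as `time_chunk_sizes(path, [1 << 15, 1 << 22, 1 << 25, 1 << 26])` would cover the 32 KB, 4 MB, 32 MB and 64 MB sizes discussed in this thread; every run should report the same CRC, which doubles as a sanity check on the chunking.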
Laszlo
Joined: 14 Feb 2005 Posts: 4515 Location: Boulder, CO
Posted: Mon Jan 12, 2009 9:31 pm
| The running time could be dominated by the actual disk I/O. Too large buffers might be bad, because Windows has to shuffle things around to make room. In theory double buffering could speed things up, by loading the data into one buffer, while processing the other, but in this case the processing is much faster than reading from disk (unless you have RAID or 15K rpm drives), so you cannot gain much. |
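The double-buffering idea can be sketched in Python with a reader thread that fetches the next chunk while the main thread checksums the current one. This is only an illustration of the technique under stated assumptions: the name `crc32_double_buffered`, the use of a thread with a one-slot queue, and the 1 MB chunk size are all hypothetical choices, and whether it gains anything depends, as noted above, on whether disk I/O or the CRC computation dominates.

```python
import queue
import threading
import zlib


def crc32_double_buffered(path, chunk_size=1 << 20):
    """CRC-32 of a file, overlapping reads with the checksum computation."""
    # One slot: at most one chunk waits in the queue while another
    # is being checksummed, i.e. two buffers are in use at a time.
    handoff = queue.Queue(maxsize=1)
    errors = []

    def reader():
        try:
            with open(path, "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    handoff.put(chunk)      # blocks while the slot is full
        except OSError as exc:
            errors.append(exc)              # reported by the main thread
        finally:
            handoff.put(None)               # sentinel: no more data

    worker = threading.Thread(target=reader)
    worker.start()

    crc = 0
    while True:
        chunk = handoff.get()
        if chunk is None:
            break
        crc = zlib.crc32(chunk, crc)
    worker.join()

    if errors:
        raise errors[0]
    return crc & 0xFFFFFFFF
```

The result is identical to a plain sequential loop; only the scheduling of reads and CRC updates differs.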
n-l-i-d Guest
Posted: Tue Jan 13, 2009 2:36 am
While searching for even faster scanning methods (CRC32 assembly code being the optimum, I guess), I stumbled across code that might interest you: CodeProject - CRC32: Generating a checksum for a file (8 functions, including assembly), and a related set of AutoIt scripts to run inline assembly, with a CRC32 example (and, if I'm not misinterpreting the code, the author also loads an inline DLL into memory and uses it from there!).
This is all way over my head, but it looks like very interesting stuff to convert to AHK.
IsNull
Joined: 10 May 2007 Posts: 166 Location: .switzerland
mitchi
Joined: 14 Jun 2008 Posts: 9
Posted: Wed Jan 28, 2009 5:09 pm
How useful is all this? Like someone said, you can only have assembly code snippets here. Far calls are out of the question. Why don't you just put all the snippets you want into a DLL and call them? And if you use a DLL, you aren't limited to self-relative code.
It's still pretty cool.
bmcclure
Joined: 24 Nov 2007 Posts: 766
Posted: Wed Jan 28, 2009 5:32 pm
However, if you use a DLL, you're not demonstrating something as cool as running machine code natively in AHK.
Laszlo
Joined: 14 Feb 2005 Posts: 4515 Location: Boulder, CO
Posted: Wed Jan 28, 2009 5:46 pm
| mitchi wrote: | | How useful is all this? |
If you need a few small functions that are hard to program or too slow with the AHK commands, you can embed them directly in your script. If you make a DLL, you have to recompile it each time you find a useful function. If others use your software, you have to make sure they have the right version of the DLL. If you update a function in the DLL, older scripts could break, so you face a complex compatibility management problem. It is also easier to include a few lines of machine code somebody else developed than to copy the source code into your ever-growing DLL source and recompile. The source of the machine code need not be C, and mixing source code in different languages is hard.
On the other hand, if your function is large, calls library functions, etc., it is better kept in a separate DLL. You may end up needing several of them for a larger project, though.
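The "few lines of machine code" are literally a hex string: the `MCode` helper in the scripts earlier in this thread decodes it into raw bytes in a buffer, and `DllCall` then jumps to that buffer. The Python sketch below only decodes and inspects the first hex string from the CRC32 function; it does not execute anything, and the names `decode_mcode` and `contains_crc32_polynomial` are hypothetical.

```python
# First hex string passed to MCode in the CRC32 function (the table builder).
TABLE_BUILDER_HEX = "33c06a088bc85af6c101740ad1e981f12083b8edeb02d1e94a75ec8b542404890c82403d0001000072d8c3"


def decode_mcode(hex_text):
    """Turn a hex string into raw bytes, two hex digits per byte."""
    return bytes.fromhex(hex_text)


def contains_crc32_polynomial(code):
    """True if the reflected CRC-32 polynomial appears as a little-endian constant."""
    return (0xEDB88320).to_bytes(4, "little") in code


code = decode_mcode(TABLE_BUILDER_HEX)
print(len(code), "bytes; polynomial present:", contains_crc32_polynomial(code))
```

Seeing the constant 0xEDB88320 inside the byte string is a quick way to confirm that a pasted snippet really is a standard CRC-32 table builder, and it illustrates why such snippets travel easily between scripts: they are just data until they are called.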
Glasso Guest
Posted: Sat Jan 31, 2009 10:48 pm
Hi, is there any way to have bitmap data in memory and have it converted to hex directly within memory?
I.e., hBitmap or pBitmap --->>> straight assignment of the hex code to a variable, or a FileAppend write to a file with hBitmap or pBitmap converted to hex code?
I commonly use the GDI+ and Gdip functions posted elsewhere in this forum, and I'm curious whether this bit wizardry is robust enough to do this.
Laszlo
Joined: 14 Feb 2005 Posts: 4515 Location: Boulder, CO
Posted: Sat Jan 31, 2009 10:56 pm
| Glasso wrote: | | Hi, is there any way to have Bitmap data in memory, and have a conversion of that to hex directly occur within memory? |
Check out the Bin2Hex and Hex2Bin functions in this thread.
| Glasso wrote: | | this bit wizardry is robust to do this? |
Yes.
Glasso Guest
Posted: Sat Jan 31, 2009 11:42 pm
Well, what I tried is:
| Code: |
hexData := Bin2Hex(&hBitmap, not_sure_how_to_get_hbitmap_size_at_this_level)
|
What can I do better?