Jump to content

Sky Slate Blueberry Blackcurrant Watermelon Strawberry Orange Banana Apple Emerald Chocolate
Photo

CRC-32


  • Please log in to reply
17 replies to this topic
Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005
CRC, the Cyclic Redundancy Check is a simple hash function: a fixed length value computed from arbitrary length data. It is used mainly for error checking: to see if the data (program code, compressed archive, Internet message) is intact; or narrowing down the search for identical files. A few random bit flips or dropped bytes almost always result in different CRC value, but it is easy to modify the data e.g. with appending a few bytes, so the result will provide any desired CRC value. Consequently, CRC cannot be used for security (message authentication, or detecting malicious modifications).

CRCs are based on division in the ring of polynomials over the integers modulo 2. A string of bits is interpreted as the coefficients of a polynomial of this sort, and to find the CRC, we divide it by another fixed polynomial. The coefficients of the remainder polynomial are the CRC. (See more details in http://www.repairfaq...K/F_crc_v3.html.) The IEEE 802.3 standard specifies the polynomial and the padding of the data (with its length represented in 4 bytes).

There are several CRC variants in use:

CRC-8 x^8 + x^2 + x^ + 1
CRC-CCITT x^16 + x^12 + x^5 + 1
CRC-16 (IBM) x^16 +x^15 + x^2 + 1
CRC-32 (802.3) x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x^ + 1
CRC32c (Castagnoli) x^32 + x^28 + x^27 + x^26 + x^25 + x^23 + x^22 + x^20 + x^19 + x^18 + x^14 + x^13 + x^11 + x^10 + x^9 + x^8 + x^6 + 1

Here is a simple script for CRC-32, the most often used CRC version. When using it the for a (binary) file, the data has to be converted to a string of hex digits, e.g. with this script, also used here. For large files the input has to be modified, so pieces of data are read, without closing and reopening the file. Also, the code can be optimized with using tables and handling several bits together, but this script could still serve as a verifier for the accelerated code.
SetFormat Integer, H

x = 00000000000000000000000000000000000000000000000000000000000000000000000000000000 ; 864d7f99
MsgBox % CRC32(x)
x = FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF ; c55e457a
MsgBox % CRC32(x)
x = 0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728 ; bf671ed0
MsgBox % CRC32(x)
Return

CRC32(x)
{
   L := StrLen(x)>>1          ; length in bytes
   StringTrimLeft L, L, 2     ; remove leading 0x
   L = 0000000%L%
   StringRight L, L, 8        ; 8 hex digits
   x = %x%%L%                 ; standard pad
   R =  0xFFFFFFFF            ; initial register value
   Loop Parse, x
   {
      y := "0x" A_LoopField   ; one hex digit at a time
      Loop 4
      {
         R := (R << 1) ^ ( (y << (A_Index+28)) & 0x100000000)
         IfGreater R,0xFFFFFFFF
            R := R ^ 0x104C11DB7
      }
   }
   Return ~R                  ; ones complement is the CRC
}


Chris
  • Administrators
  • 10727 posts
  • Last active:
  • Joined: 02 Mar 2004
It's great to have a CRC function posted. I think it could be used with PixelGetColor to identify a pattern of pixels on the screen, which might be easier than ImageSearch for some tasks. And I'm sure CRC has many other uses.

Thanks.

Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005
The script above implements the standard. For internal use (like at pattern search), we can even drop the padding with length and the final bit inversion, making the code really short.
CRC(x)                        ; CRC of x = string of hex digits

{

   x = %x%12345678            ; non-standard pad

   R =  0xFFFFFFFF            ; initial register value

   Loop Parse, x

   {

      y := "0x" A_LoopField   ; one hex digit at a time

      Loop 4

      {

         R := (R << 1) ^ ( (y << (A_Index+28)) & 0x100000000)

         IfGreater R,0xFFFFFFFF

            R := R ^ 0x104C11DB7

      }

   }

   Return R                   ; final Reg = CRC

}
Also, don't forget to convert other types of data to strings of hex digits, what the CRC script needs. The following is simplified from the list functions library.
String2Hex(x)                 ; Convert a string to hex digits

{                             ; needs SetFormat Integer, H

   Loop Parse, x

   {

      y := ASC(A_LoopField)   ; 2 digit ASCII code of chars of x, 15 < y < 256

      StringTrimLeft y, y, 2  ; Remove leading 0x

      hex = %hex%%y%

   }

   Return hex

}


Rajat
  • Members
  • 1904 posts
  • Last active: Jul 17 2015 07:45 AM
  • Joined: 28 Mar 2004
wow Laszlo, you're really good with encryption stuff! nice work here too!

MIA

CleanNews.in : Bite sized latest news headlines from India with zero bloat


Invalid User
  • Members
  • 447 posts
  • Last active: Mar 27 2012 01:04 PM
  • Joined: 14 Feb 2005
Rajat I think you mentioned Laszlo was an encryption expert, your wern't joking. Nice work here Laszlo.
my lame sig :)

corrupt
  • Members
  • 2558 posts
  • Last active: Nov 01 2014 03:23 PM
  • Joined: 29 Dec 2004
Thanks Laszlo :D .

Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005
It was my pleasure! (And I really do crypto for living.)

Rajat
  • Members
  • 1904 posts
  • Last active: Jul 17 2015 07:45 AM
  • Joined: 28 Mar 2004

Rajat I think you mentioned Laszlo was an encryption expert, your wern't joking.

actually i came to know about that from his webpage... its url was in some script of his.

MIA

CleanNews.in : Bite sized latest news headlines from India with zero bloat


SKAN
  • Administrators
  • 9115 posts
  • Last active:
  • Joined: 26 Dec 2005
Dear Laszlo, :)

Currently, for file verification I call an external dll function.
I like to have it done in pure AHK.

The CRC32() function expects a hex string.
Can you post a variant that accepts a filename and returns a CRC32 hash!

Something like :
Hash:=CRC32("Autohotkey.exe")

I require to check binary files sized less than 1 MB.

Please help me!

Regards, :)
kWo4Lk1.png

Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005
It is called CRC. I simplified the original CRC32:
MsgBox % CRC(A_ScriptFullPath)

MsgBox %ErrorLevel%                 ; Error from file read



CRC(file) {

   format = %A_FormatInteger%       ; save original integer format

   SetFormat Integer, Hex           ; for converting bytes to hex

   c := CRC32(HexRead(file))

   SetFormat Integer, %format%      ; restore original format

   Return c                         ; result in hex. Add "+0" for original integer format

}



HexRead(file, n=0, offset=0) {

   h := DllCall("CreateFile",Str,file,UInt,0x80000000,UInt,3,UInt,0,UInt,3,UInt,0,UInt,0)

   IfEqual h,-1, SetEnv, ErrorLevel, -1

   IfNotEqual ErrorLevel,0,Return,0 ; couldn't open the file



   m = 0                            ; seek to offset

   IfLess offset,0, SetEnv,m,2

   r := DllCall("SetFilePointerEx",UInt,h,Int64,offset,UIntP,p,Int,m)

   IfEqual r,0, SetEnv, ErrorLevel, -3

   IfNotEqual ErrorLevel,0, {

      t = %ErrorLevel%              ; save ErrorLevel to be returned

      DllCall("CloseHandle", UInt, h)

      ErrorLevel = %t%              ; return seek error

      Return 0

   }



   IfEqual n,0, SetEnv n,0xffffffff ; almost infinite



   Loop %n%

   {

      result := DllCall("ReadFile",UInt,h,UCharP,c,UInt,1,UIntP,Read,UInt,0)

      if (!result or Read < 1 or ErrorLevel)

         break

      c += 0x100                    ; convert to hex with a leading 1

      StringTrimLeft c, c, 3        ; remove 0x1

      data = %data%%c%              ; append 2 hex digits

   }



   IfNotEqual ErrorLevel,0, SetEnv,t,%ErrorLevel%



   h := DllCall("CloseHandle", UInt, h)

   IfEqual h,-1, SetEnv, ErrorLevel, -2

   IfNotEqual t,,SetEnv, ErrorLevel, %t%



   Return data

}



CRC32(x) {

   L := 0x100000000 + (StrLen(x)>>1) ; length in bytes with leading 0x1

   StringTrimLeft L, L, 3     ; remove leading 0x1, length in 8 digits

   x = %x%%L%                 ; standard pad

   R = 0xFFFFFFFF             ; initial register value

   Loop Parse, x

   {

      y = 0x%A_LoopField%     ; one hex digit at a time

      Loop 4

      {

         R := (R << 1) ^ ( (y << (A_Index+28)) & 0x100000000)

         IfGreater R,0xFFFFFFFF

            R := R ^ 0x104C11DB7

      }

   }

   Return ~R                  ; ones complement is the CRC

}


SKAN
  • Administrators
  • 9115 posts
  • Last active:
  • Joined: 26 Dec 2005
Dear Laszlo, :)

Many Thanks! :D

It is called CRC. I simplified the original CRC32:


I tested it with a binary file sized 193847 bytes

The time taken varies between 100 - 156 seconds!
Is this normal for this algorithm?

My system: AMD Sempron 2 Ghz / 256 MB / WIN 2000 SP4

Thanks again for the help.

Regards, :)
kWo4Lk1.png

Laszlo
  • Moderators
  • 4713 posts
  • Last active: Mar 31 2012 03:17 AM
  • Joined: 14 Feb 2005
Try it with
Process Priority,,High

SetBatchLines -1
It could speed it up. However, AHK is always much slower for large scale data processing than compiled code. The script performs several AHK commands on each data byte, so don't expect really high performance.

SKAN
  • Administrators
  • 9115 posts
  • Last active:
  • Joined: 26 Dec 2005
Dear Laszlo, :)

The script performs several AHK commands on each data byte, so don't expect really high performance.


:shock:

I was always wondering if we could have a FileCopy command written with pure AHK

- so that we have a progress bar while copying/moving large data ( especially Video files! ).

- so that simultaneously CRC32 will be computed and appended to the file name.

This is disheartening!

:D Anyways! I am very happy to have a Pure AHK solution for checking file integrity! :D

Thank you very much for all this help...

Regards, :)
kWo4Lk1.png

AhkRich
  • Guests
  • Last active:
  • Joined: --
Hi,

Great work Laszlo! I'm really glad that someone is bringing such great expertise to Autohotkey users.

I think there is a problem with your CRC32 implementation though.
The result of your code does not match the result obtained from PHP or several online calculators. Now, this could just mean that the PHP function is wrong and those online calculators all use the PHP function.

For example, let's use the text "clipboard".
The hex-encoded form is 636c6970626f617264

Your crc32 code (the "standard" one) produced the following result: ebf4f949
However, the php function (and all the online sites) produced the following: 8fdbc496

So it seems that your implementation does match the one PHP uses. Is there somewhere I can find some standard references to see which implimentation is correct?

Here are some sites to check against your results:
http://www.fileforma...o/tool/hash.htm
http://crc32-checksum.waraxe.us/

Thanks,
Richard

AhkRich
  • Members
  • 1 posts
  • Last active: Nov 10 2007 04:03 PM
  • Joined: 09 Nov 2007
Beware if you are trying to find the CRC32 of BINARY data.
The String2Hex function shown above will NOT give the correct result for binary data.

However the String2Hex function shown above only gives the correct result for bytes that have a value of 15-256 (as stated in the comments). Many binary files (including pure-text files) have characters that have values less than 15. Common examples are carriage return, line feed, tab etc. String2Hex will NOT give the correct result for any string or file that contains these characters.

Each byte in the binary (or text) data MUST be represented by two hex digits. For example a 4-character string will have an 8-digit hex representation.

For example take the string "AA[TAB]A" where [TAB] is the tab character (ASCII 0x09) and A is just the letter A (ASCII 0x41).
The correct hex representation of this string is 41410941. However String2hex gives 4141941 which actually represents a completely different string where the characters have the following ascii codes (from left to right): 0x04, 0x14, 0x19, and 0x41

The following code corrects the problem (although I'm sure there are more optimized ways of doing it).
I also hate functions that change global parameters (such as integer formatting), so I made sure this function restores A_FormatInteger to its original value (H or D).

String2Hex(x)                 ; Convert a string to hex digits
{
  prevFmt = %A_FormatInteger%
  SetFormat Integer, H ;this function requires hex format
   Loop Parse, x
   {
      y := Asc(A_LoopField)   ; 2 digit ASCII code
      StringTrimLeft, y,y,2  ;remove 0x
      y=0%y%  ;pad on left with 0 to make sure each char gives 2 hex digits.
      StringRight, y,y,2 ;take only the right-most two digits
      hex := hex y
   }
   SetFormat Integer, %prevFmt% ;restore original integer formatting
   return hex
}