 |
AutoHotkey Community Let's help each other out
Laszlo
Joined: 14 Feb 2005 Posts: 4012 Location: Pittsburgh
|
Posted: Fri Mar 14, 2008 7:57 pm Post subject: |
|
|
MemCpy is straightforward in C: copy n bytes (parameter #3) from source (parameter #2) to destination (parameter #1).
| Code: | void memcpy(char* dest, char* source, int n) { // copy n bytes from source to dest
int i;
for (i=0; i<n; ++i) dest[i] = source[i];
} |
In MemCmp we have a number of choices about the parameters. The version below compares up to n (parameter #3) bytes of buffer1 (parameter #1) and buffer2 (parameter #2). The result is the difference of the first unequal bytes (negative if buffer1 < buffer2, positive if buffer1 > buffer2) or 0 if they are the same. Parameter #4 receives the offset at which the byte difference was taken.
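| Code: | int memcmp(char* c, char* d, int n, int* i1) { // -> difference of first differing bytes, i1 <- its index
int i;
for (i=0; i<n-1; ++i) // stops at the last byte (index n-1) if no earlier byte differs; assumes n >= 1
if (c[i] != d[i]) break;
*i1 = i;
return c[i]-d[i]; // 0 only if all n bytes are equal
} |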
Test them with
| Code: | MCode(MemCpy,"568b74241085f67e138b4c24088b44240c2bc18a14088811414e75f75ec3")
MCode(MemCmp,"558bec8b451033c94885c0578b7d0c7e188b5508568bf72bd7538a1c3"
. "23a1e750641463bc87cf35b5e8b451489088b45080fbe04010fbe0c392bc15f5dc3")
A = 123
B = 1234
C = ....
DllCall(&MemCpy, UInt,&C, UInt,&B, Int,5, "cdecl")
MsgBox %C%
Msgbox % DllCall(&MemCmp, UInt,&A, UInt,&B, Int,4, IntP,I, "cdecl int") . "`n" . I
MCode(ByRef code, hex) { ; allocate memory and write Machine Code there
VarSetCapacity(code,StrLen(hex)//2)
Loop % StrLen(hex)//2
NumPut("0x" . SubStr(hex,2*A_Index-1,2), code, A_Index-1, "Char")
} |
|
|
|
 |
SKAN
Joined: 26 Dec 2005 Posts: 5880
|
Posted: Sat Mar 15, 2008 2:54 am Post subject: |
|
|
| Laszlo wrote: | | Olfen already did the work for XTEA. |
I had missed that.
Thank you very much for the binary functions.  |
|
|
 |
Azerty
Joined: 19 Dec 2006 Posts: 72 Location: France
|
Posted: Mon Mar 17, 2008 10:17 am Post subject: |
|
|
| Azerty wrote: | Hi Skan
After Laszlo's comment, I looked at ascii85, and for inlining binaries into AHK code it suits better: base64 encodes 3 bytes into 4 characters, ascii85 encodes 4 bytes into 5.
For instance, an 8 KB binary would be encoded in 10923 bytes with base64 vs 10240 with ascii85 (CR/LF not counted).
Today: the full ASM binary encoder is ready, the decoder is in progress. I hope to publish them both by the end of the week, with some sample code.
As for constraints: it will be i486+ (using BSWAP for compactness), but the code should even be Win95 compatible (though I won't test it, nor support it - I'm on W2K/WXP).
CU |
I'm late (I've been busy elsewhere these last few days), but it's here. |
|
|
 |
SKAN
Joined: 26 Dec 2005 Posts: 5880
|
Posted: Wed Mar 19, 2008 3:20 pm Post subject: |
|
|
@Laszlo
Sir, I compiled the following source code given by you with VC++ 6.0 :
| Code: | int main(void) {
}
scpyn(char* dest, char* source, int n) {
while (--n)*dest++ = *source++;
*dest = 0;
} |
This is what I get from disasm.exe: http://arian.suresh.googlepages.com/main.txt
Can you please shed some light on how to extract the relevant hex from this dump?
Can this extraction be automated with an AHK parser?
Please help.
 |
|
|
 |
Laszlo
Joined: 14 Feb 2005 Posts: 4012 Location: Pittsburgh
|
Posted: Wed Mar 19, 2008 8:28 pm Post subject: |
|
|
I don't have VC++6, but I guess you did not export your function. Add /EXPORT:scpyn to the linker command line options. When you compile the program as a console application, the function name should appear in the assembler listing, so you can search for it.
You can extract the machine code from each non-comment line. It is between the first and second white space (tab). I copy the function code to the clipboard, where a simple AHK script removes the beginning and end of each line. There are, unfortunately, exceptions. Sometimes a longer block of machine code is spread over 2 lines. I don't have an intelligent enough script, so I just process the bad code manually. |
|
|
 |
SKAN
Joined: 26 Dec 2005 Posts: 5880
|
Posted: Wed Mar 19, 2008 9:44 pm Post subject: |
|
|
| Laszlo wrote: | | I don't have VC++6, but I guess you did not export your function. Add /EXPORT:scpyn to the linker command line options. When you compile the program as a console application, the function name should appear in the assembler listing, so you can search for it. |
I had thought that was necessary only for a DLL.
Now I opted for the wizard mode, created a simple Win32 Console application and added scpyn to exports.def
| Code: | // main.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
int main(int argc, char* argv[])
{
return 0;
}
void scpyn(char* dest, char* source, int n) {
while (--n)*dest++ = *source++;
*dest = 0;
} |
The necessary part of the disasm dump:
| Code: | DEBUG :: scpyn
=========
scpyn
=========
:00401010 56 push esi
:00401011 8B742410 mov esi, dword[esp+10]
:00401015 4E dec esi
:00401016 7416 je 0040102E
:00401018 8B4C240C mov ecx, dword[esp+0C]
:0040101C 8B442408 mov eax, dword[esp+08]
---------
:00401020 8A11 mov dl, byte[ecx]
:00401022 8810 mov byte[eax], dl
:00401024 40 inc eax
:00401025 41 inc ecx
:00401026 4E dec esi
:00401027 75F7 jne 00401020
:00401029 C60000 mov byte[eax], 00
:0040102C 5E pop esi
:0040102D C3 ret
---------
:0040102E 8B442408 mov eax, dword[esp+08]
:00401032 5E pop esi
:00401033 C60000 mov byte[eax], 00
:00401036 C3 ret
:00401037 90 90 90 90 90 90 90 90 90 ......... |
Derived hex (up to the first ret) : 568B7424104E74168B4C240C8B4424088A11881040414E75F7C600005EC3
is different from yours : 558becff4d108b4508740e8b4d0c8a1188104041ff4d1075f5c600005dc3
.. though it seems to work:
| Code: | MCode( Code, "568B7424104E74168B4C240C8B4424088A11881040414E75F7C600005EC38B4424085EC60000C3" )
OldStr := "The Quick Brown Fox Jumps Over The Lazy Dog"
VarSetCapacity( NewStr,15+1 )
DllCall( &Code, Str,NewStr, Str,OldStr, UInt,15+1, "cdecl" )
MsgBox, % "[" NewStr "]`n" Errorlevel
MCode(ByRef code, hex) { ; allocate memory and write Machine Code there
VarSetCapacity(code,StrLen(hex)//2)
Loop % StrLen(hex)//2
NumPut("0x" . SubStr(hex,2*A_Index-1,2), code, A_Index-1, "Char")
} |
| Quote: | | You can extract the machine code from each non-comment line. It is between the first and second white space (tab). I copy the function code to the clipboard, where a simple AHK script removes the beginning and end of each line. |
Thanks! I have it as follows, now:
| Code: | Loop, Parse, Clipboard, `n, `r
Hex .= RegExReplace( SubStr(A_LoopField,11,20), "(^\s*|\s*$)")
Clipboard := Hex
MsgBox,0, % StrLen(Hex)//2, % Hex |
| Laszlo wrote: | | There are, unfortunately, exceptions. Sometimes a longer block of machine code is spread over 2 lines. I don't have an intelligent enough script, so I just process the bad code manually. |
I was referring to a different problem. Would a function jump (GoTo) to another offset? If that is the case, should I follow the jump to the relevant part to extract the hex code? Do these machine code functions always end with hex C3?
Thanks for all the help Sir, but I am terribly confused.
 |
|
|
 |
Laszlo
Joined: 14 Feb 2005 Posts: 4012 Location: Pittsburgh
|
Posted: Wed Mar 19, 2008 9:59 pm Post subject: |
|
|
| SKAN wrote: | | different from yours | Different compilers create different machine code. You could also experiment with different optimization settings.
| SKAN wrote: | | Would a function jump (GoTo) to another offset? If that is the case, should I follow the jump to the relevant part to extract the hex code? | If the branch is to another function, you are out of luck. The machine code functions in AHK will be at different relative locations. Unless you include all the code in one machine code block, these will not work. Also, if library functions are called, they often crash your script.
| SKAN wrote: | | Do these machine code functions always end with hex C3? | Not necessarily. The last instruction can be any of the many branch instructions, and the function returns from another place. Also, if the function is not CDECL, the last instruction is usually a return with stack cleanup. |
|
|
 |
SKAN
Joined: 26 Dec 2005 Posts: 5880
|
Posted: Wed Mar 19, 2008 10:14 pm Post subject: |
|
|
| Quote: | | Different compilers create different machine code. You could also experiment with different optimization settings. |
That means ... I did it. Thank you Sir.
| Quote: | | If the branch is to another function, you are out of luck. |
| Quote: | | Also, if library functions are called, they often crash your script. |
You mean .LIB or also .DLL ?
| Quote: | | The last instruction can be any of the many branch instructions, and the function returns from another place. Also, if the function is not CDECL, the last instruction is usually a return with stack cleanup. |
Thanks for the clarifications Sir.. very useful information!
 |
|
|
 |
Laszlo
Joined: 14 Feb 2005 Posts: 4012 Location: Pittsburgh
|
Posted: Wed Mar 19, 2008 10:26 pm Post subject: |
|
|
| Any library function call is bad. Even if the code is stand alone, and the compiler copies the used functions somewhere into the program, their locations are hard to find, and the machine code needs address patches. This was the reason I could not make floating point machine code functions: the compiler used its internal functions regardless of what I tried. |
|
|
 |
SKAN
Joined: 26 Dec 2005 Posts: 5880
|
Posted: Wed Mar 19, 2008 10:31 pm Post subject: |
|
|
Oh!
| Quote: | | If the branch is to another function, you are out of luck. |
How do I tell it by looking at the dump ? |
|
|
 |
Laszlo
Joined: 14 Feb 2005 Posts: 4012 Location: Pittsburgh
|
Posted: Wed Mar 19, 2008 10:42 pm Post subject: |
|
|
| They are marked with funny looking names. |
|
|
 |
apocalypse~r
Joined: 21 Jun 2007 Posts: 19
|
Posted: Thu Mar 20, 2008 5:27 am Post subject: Do you know how much id give... |
|
|
someone write a script that takes a function in a program or a dll or something and turns it into a hex string for use with the mcode function.
think: you could define any function no matter how complex in one line! _________________ problems := bugs + errors + glitches
code := (problems != 0) ? debug(code) : celebrate() |
|
|
 |
tic
Joined: 22 Apr 2007 Posts: 1353
|
Posted: Thu Mar 20, 2008 5:39 am Post subject: Re: Do you know how much id give... |
|
|
| apocalypse~r wrote: | someone write a script that takes a function in a program or a dll or something and turns it into a hex string for use with the mcode function.
think: you could define any function no matter how complex in one line! |
lol
i dont think its as easy as that |
|
|
 |
Azerty
Joined: 19 Dec 2006 Posts: 72 Location: France
|
Posted: Thu Mar 20, 2008 10:30 pm Post subject: |
|
|
Skan, reading your posts, I understand (but might be wrong) that you want to make an automatic conversion from C to HEX code. Am I right?
I think it'll be really difficult if C functions want to talk to each other. So will embedding static data. If you disassemble Ascii85_Encoder in ODBG, you'll see there are static tables: they are difficult to embed because the code, to be usable in AHK, must be "self-relative". That means any reference to memory or data must be relative to the starting address of the function. Usually, branches (like goto) are self-relative, but function calls are absolute (the EXE/DLL loading module in Windows does what is called "code relocation" to correct absolute addresses so they match the address at which the code is loaded into memory), as are references to static data.
In assembler, with a few little tricks, the code can be made self-relative. In C, you depend entirely on the compiler, which uses no such tricks.
And for compilation of stand-alone functions, I was wondering if using gcc (MinGW in our case) to produce object modules and exporting the binary code out of the object modules would not be easier than producing EXEs and hunting haphazardly for the start and end of a function in them. Stand-alone object modules with one function in each would further make it possible to explicitly identify whether the code is self-relative or not... |
|
|
 |
SKAN
Joined: 26 Dec 2005 Posts: 5880
|
Posted: Thu Mar 20, 2008 10:44 pm Post subject: |
|
|
| Azerty wrote: | | you want to make an automatic conversion from C to HEX code. Am I right? |
Exactly.. I give the script a starting offset, and it tries to extract dependencies, patch the offsets and put it all together into one long byte stream.
I see it is not possible .. |
|
|
 |