AutoHotkey Community
Machine code functions: Bit Wizardry
Laszlo



Joined: 14 Feb 2005
Posts: 4012
Location: Pittsburgh

Posted: Fri Mar 14, 2008 7:57 pm

MemCpy is straightforward in C: copy n bytes (parameter #3) from source (parameter #2) to destination (parameter #1).
Code:
void memcpy(char* dest, char* source, int n) {
   int i;
   for (i=0; i<n; ++i) dest[i] = source[i];
}

In MemCmp we have a number of choices about the parameters. The version below compares up to n (parameter #3) bytes of buffer1 (parameter #1) and buffer2 (parameter #2). The result is the difference of the first unequal bytes (negative if buffer1 < buffer2, positive if buffer1 > buffer2) or 0 if the buffers are the same. Parameter #4 receives the offset at which the returned byte difference was taken.
Code:
int memcmp(char* c, char* d, int n, int* i1) { // -> difference of first diff bytes, i1 <- its index
   int i;
   for (i=0; i<n-1; ++i)
      if (c[i] != d[i]) break;
   *i1 = i;
   return c[i]-d[i];
}
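The loop above stops at n-1 on purpose: the last byte is compared by the return statement itself, so all n bytes are covered. A minimal sketch in plain C to check this on a desktop compiler follows. The name mem_cmp_at is hypothetical (chosen only to avoid a clash with the C library's memcmp), and the added assert assumes n >= 1, which the original does not check.

```c
#include <assert.h>
#include <stddef.h>

/* Same logic as the MemCmp above, under a hypothetical name so that it
   does not collide with the standard library's memcmp. Assumes n >= 1. */
int mem_cmp_at(const char *c, const char *d, int n, int *i1) {
    int i;
    assert(c != NULL && d != NULL && i1 != NULL && n >= 1);
    for (i = 0; i < n - 1; ++i)   /* scan the first n-1 bytes only */
        if (c[i] != d[i]) break;
    *i1 = i;                      /* index of the first difference, or n-1 */
    return c[i] - d[i];           /* byte n-1 is compared here if no break */
}
```

Called with "123" and "1234" and n = 4 (the values used in the AHK test below, assuming one byte per character), it compares the terminating 0 of the first string with '4' and returns 0 - 52 = -52 with the index set to 3.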

Test them with
Code:
MCode(MemCpy,"568b74241085f67e138b4c24088b44240c2bc18a14088811414e75f75ec3")
MCode(MemCmp,"558bec8b451033c94885c0578b7d0c7e188b5508568bf72bd7538a1c3"
. "23a1e750641463bc87cf35b5e8b451489088b45080fbe04010fbe0c392bc15f5dc3")

A = 123
B = 1234
C = ....

DllCall(&MemCpy, UInt,&C, UInt,&B, Int,5, "cdecl")
MsgBox %C%
Msgbox % DllCall(&MemCmp, UInt,&A, UInt,&B, Int,4, IntP,I, "cdecl int") . "`n" . I

MCode(ByRef code, hex) { ; allocate memory and write Machine Code there
   VarSetCapacity(code,StrLen(hex)//2)
   Loop % StrLen(hex)//2
      NumPut("0x" . SubStr(hex,2*A_Index-1,2), code, A_Index-1, "Char")
}
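For readers who want to check a hex string outside AHK, here is a rough C counterpart of the decoding step in MCode: two hex digits become one byte. It is only a sketch; hex_to_bytes and hex_digit are hypothetical names, and unlike MCode it rejects odd-length or non-hex input instead of writing whatever NumPut would produce.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if ch is not a hex digit. */
static int hex_digit(char ch) {
    if (ch >= '0' && ch <= '9') return ch - '0';
    ch = (char)tolower((unsigned char)ch);
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return -1;
}

/* Decodes hex into out (at most cap bytes).
   Returns the number of bytes written, or -1 on malformed input or overflow. */
long hex_to_bytes(const char *hex, unsigned char *out, size_t cap) {
    size_t n = 0;
    assert(hex != NULL && out != NULL);
    while (hex[0] != '\0') {
        int hi = hex_digit(hex[0]);
        int lo = (hex[1] != '\0') ? hex_digit(hex[1]) : -1;  /* odd length -> -1 */
        if (hi < 0 || lo < 0 || n >= cap) return -1;
        out[n++] = (unsigned char)((hi << 4) | lo);
        hex += 2;
    }
    return (long)n;
}
```

Decoding the first three byte pairs of the MemCpy string, "568b74", gives the bytes 0x56, 0x8B, 0x74.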
SKAN



Joined: 26 Dec 2005
Posts: 5880

Posted: Sat Mar 15, 2008 2:54 am

Laszlo wrote:
Olfen already did the work for XTEA.


I had missed that. Surprised

Thank you very much for the binary functions. Smile
Azerty



Joined: 19 Dec 2006
Posts: 72
Location: France

Posted: Mon Mar 17, 2008 10:17 am

Azerty wrote:
Hi Skan

After Laszlo's comment, I looked at ascii85, and for inlining binaries into AHK code it is the better fit: base64 encodes 3 bytes into 4 characters, ascii85 encodes 4 bytes into 5.
For instance, an 8 KB binary would be encoded in 10923 bytes with base64 vs. 10240 using ascii85 (CR/LF not counted).

Today: the full ASM binary encoder is ready, the decoder is in progress. I hope to publish them both by the end of the week, with some sample code.

As for constraints: it will require an i486 or later (it uses BSWAP for compactness), but the code should even be Win95 compatible (though I won't test it, nor support it - I'm on W2K/WXP).

CU


I'm late (I've been busy elsewhere lately), but it's here.
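The size comparison in the quoted post can be reproduced with a little arithmetic. The sketch below uses hypothetical helper names; it counts base64 with '=' padding and counts ascii85 without the <~ ~> delimiters and without the 'z' shortcut for all-zero groups. With padding, base64 gives 10924 characters for 8192 bytes; the 10923 quoted above corresponds to unpadded output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Base64 with '=' padding: every started 3-byte group becomes 4 characters. */
size_t base64_len(size_t n) {
    assert(n <= SIZE_MAX / 2);            /* keep the arithmetic from overflowing */
    return (n + 2) / 3 * 4;
}

/* Ascii85: every full 4-byte group becomes 5 characters;
   a trailing group of r bytes (1..3) becomes r + 1 characters. */
size_t ascii85_len(size_t n) {
    size_t rest = n % 4;
    assert(n <= SIZE_MAX / 2);
    return n / 4 * 5 + (rest ? rest + 1 : 0);
}
```

For n = 8192 this gives 10924 and 10240, so ascii85 saves a bit over 6% on the inlined string.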
SKAN



Joined: 26 Dec 2005
Posts: 5880

Posted: Wed Mar 19, 2008 3:20 pm

@Laszlo

Sir, I compiled the following source code given by you with VC++ 6.0 :

Code:
int main(void) {
   return 0;
}

void scpyn(char* dest, char* source, int n) {   // copy n-1 bytes, then append the terminating 0
   while (--n) *dest++ = *source++;
   *dest = 0;
}


This is what I get from disasm.exe: http://arian.suresh.googlepages.com/main.txt

Can you please throw some light on how to extract the relevant hex from this dump?
Can this extraction be automated with an AHK parser?

Please help.

Smile
Laszlo



Joined: 14 Feb 2005
Posts: 4012
Location: Pittsburgh

Posted: Wed Mar 19, 2008 8:28 pm

I don't have VC++6, but I guess you did not export your function. Add /EXPORT:scpyn to the linker command line options. When you compile the program as a console application, the function name should appear in the assembler listing, so you can search for it.

You can extract the machine code from each non-comment line: it is between the first and second white space (tab). I copy the function code to the clipboard, where a simple AHK script removes the beginning and end of each line. There are, unfortunately, exceptions: sometimes a longer block of machine code is spread over 2 lines. I don't have an intelligent enough script, so I just process the bad code manually.
SKAN



Joined: 26 Dec 2005
Posts: 5880

Posted: Wed Mar 19, 2008 9:44 pm

Laszlo wrote:
I don't have VC++6, but I guess you did not export your function. Add /EXPORT:scpyn to the linker command line options. When you compile the program as a console application, the function name should appear in the assembler listing, so you can search for it.


I had thought that was necessary only for a DLL. Rolling Eyes
Now I opted for the wizard mode, created a simple Win32 Console application and added scpyn in exports.def

Code:
// main.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"


int main(int argc, char* argv[])
{
   return 0;
}


void scpyn(char* dest, char* source, int n) {
   while (--n)*dest++ = *source++;
   *dest = 0;
}


The necessary part of the disasm dump:

Code:
DEBUG :: scpyn
=========
scpyn
=========
:00401010 56                      push esi
:00401011 8B742410                mov esi, dword[esp+10]
:00401015 4E                      dec esi
:00401016 7416                    je 0040102E
:00401018 8B4C240C                mov ecx, dword[esp+0C]
:0040101C 8B442408                mov eax, dword[esp+08]
---------
:00401020 8A11                    mov dl, byte[ecx]
:00401022 8810                    mov byte[eax], dl
:00401024 40                      inc eax
:00401025 41                      inc ecx
:00401026 4E                      dec esi
:00401027 75F7                    jne 00401020
:00401029 C60000                  mov byte[eax], 00
:0040102C 5E                      pop esi
:0040102D C3                      ret

---------
:0040102E 8B442408                mov eax, dword[esp+08]
:00401032 5E                      pop esi
:00401033 C60000                  mov byte[eax], 00
:00401036 C3                      ret

:00401037 90 90 90 90 90 90 90 90 90                        .........


Derived hex : 568B7424104E74168B4C240C8B4424088A11881040414E75F7C600005EC3
is different from yours : 558becff4d108b4508740e8b4d0c8a1188104041ff4d1075f5c600005dc3
.. though it seems to work:

Code:
MCode( Code, "568B7424104E74168B4C240C8B4424088A11881040414E75F7C600005EC38B4424085EC60000C3" )
OldStr := "The Quick Brown Fox Jumps Over The Lazy Dog"
VarSetCapacity( NewStr,15+1 )
DllCall( &Code, Str,NewStr, Str,OldStr, UInt,15+1, "cdecl" )

MsgBox, % "[" NewStr "]`n" Errorlevel


MCode(ByRef code, hex) { ; allocate memory and write Machine Code there
   VarSetCapacity(code,StrLen(hex)//2)
   Loop % StrLen(hex)//2
      NumPut("0x" . SubStr(hex,2*A_Index-1,2), code, A_Index-1, "Char")
}


Rolling Eyes Rolling Eyes

Quote:
You can extract the machine code from each non-comment line. It is between the first and second white space (tab). I copy the function code to the clipboard, where a simple AHK script removes the beginning and end of each line.


Thanks! I have it as follows, now:

Code:
Loop, Parse, Clipboard, `n, `r
  Hex .= RegExReplace( SubStr(A_LoopField,11,20), "(^\s*|\s*$)")
Clipboard := Hex
MsgBox,0, % StrLen(Hex)//2, % Hex
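The fixed-column SubStr above depends on the exact layout of the listing. As a rough alternative, here is a token-based sketch in C of the rule Laszlo describes (the machine code sits between the first and second white space). The function name append_line_hex is hypothetical, and the line format is assumed to be the one in the dump above: a ':' followed by the address, white space, then the code bytes as one hex token. Labels, separators and blank lines are skipped; a line whose bytes are separated by spaces (like the 90 90 90 padding) contributes only its first byte.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Appends the machine-code token of one listing line to the NUL-terminated
   string out (capacity cap). Returns the number of hex characters appended,
   or 0 if the line carries no code or the buffer is too small. */
size_t append_line_hex(const char *line, char *out, size_t cap) {
    size_t used, len = 0;
    assert(line != NULL && out != NULL);
    used = strlen(out);
    if (line[0] != ':') return 0;                             /* not a code line */
    while (*line && !isspace((unsigned char)*line)) ++line;   /* skip the address */
    while (*line && isspace((unsigned char)*line)) ++line;    /* skip the gap */
    while (isxdigit((unsigned char)line[len])) ++len;         /* measure hex token */
    if (len == 0 || len % 2 != 0) return 0;                   /* need whole bytes */
    if (line[len] != '\0' && !isspace((unsigned char)line[len])) return 0;
    if (used + len + 1 > cap) return 0;                       /* would overflow */
    memcpy(out + used, line, len);
    out[used + len] = '\0';
    return len;
}
```

Feeding it the first two code lines of the scpyn dump, with the separator line in between, leaves "568B742410" in the buffer.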


Laszlo wrote:
There are, unfortunately, exceptions. Sometimes a longer block of machine code is spread over 2 lines. I don't have an intelligent enough script, so I just process the bad code manually.


I was referring to a different problem. Would a function GoTo another offset? If that is the case, should I jump to the relevant part to extract the hex code? Do these machine code functions always end with hex C3?

Thanks for all the help Sir, but I am terribly confused.

Rolling Eyes
Laszlo



Joined: 14 Feb 2005
Posts: 4012
Location: Pittsburgh

Posted: Wed Mar 19, 2008 9:59 pm

SKAN wrote:
different from yours
Different compilers create different machine code. You could also experiment with different optimization settings.
SKAN wrote:
Would a function GoTo another offset? If that is the case, should I jump to the relevant part to extract the hex code?
If the branch is to another function, you are out of luck: the machine code functions in AHK will sit at different relative locations. Unless you include all the code in one machine code block, these branches will not work. Also, if library functions are called, they often crash your script.
SKAN wrote:
Do these machine code functions always end with hex C3?
Not necessarily. The last instruction can be any of the many branch instructions, and the function returns from another place. Also, if the function is not CDECL, the last instruction is usually a return with stack cleanup.
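The two conditional jumps in the scpyn dump make a small worked example of these points. A short jump (74 xx = je, 75 xx = jne) stores a signed 8-bit distance counted from the next instruction, so it keeps working wherever the block is copied, as long as its target is copied too. That is why the tail at 0040102E has to be part of the extracted hex. The sketch below uses hypothetical helper names; C3 is the plain ret and C2 is ret with a 16-bit stack cleanup count, the form typically seen at the end of non-cdecl functions.

```c
#include <assert.h>
#include <stdint.h>

/* Target of a 2-byte short jump located at addr: the displacement rel8 is
   signed and relative to the address of the following instruction. */
uint32_t short_jump_target(uint32_t addr, uint8_t rel8) {
    return addr + 2u + (uint32_t)(int32_t)(int8_t)rel8;
}

/* Nonzero for the near-return opcodes: C3 (ret) and C2 (ret imm16). */
int is_near_ret(uint8_t opcode) {
    return opcode == 0xC3 || opcode == 0xC2;
}

/* Worked values taken from the scpyn listing. */
void check_scpyn_branches(void) {
    assert(short_jump_target(0x00401016u, 0x16) == 0x0040102Eu);  /* je  0040102E */
    assert(short_jump_target(0x00401027u, 0xF7) == 0x00401020u);  /* jne 00401020 */
    /* At a different, made-up load address the distance is unchanged. */
    assert(short_jump_target(0x00900016u, 0x16) - 0x00900016u == 0x18u);
}
```

Since a function can contain several returns, and its last bytes need not be a return at all, scanning for the first C3 is not a reliable way to find the end of the code.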
SKAN



Joined: 26 Dec 2005
Posts: 5880

Posted: Wed Mar 19, 2008 10:14 pm

Quote:
Different compilers create different machine code. You could also experiment with different optimization settings.


Means ... I did it Very Happy . Thank you Sir.

Quote:
If the branch is to another function, you are out of luck.


Sad

Quote:
Also, if library functions are called, they often crash your script.


You mean .LIB or also .DLL ?

Quote:
The last instruction can be any of the many branch instructions, and the function returns from another place. Also, if the function is not CDECL, the last instruction is usually a return with stack cleanup.


Thanks for the clarifications Sir.. very useful information!

Smile
Laszlo



Joined: 14 Feb 2005
Posts: 4012
Location: Pittsburgh

Posted: Wed Mar 19, 2008 10:26 pm

Any library function call is bad. Even if the code is stand alone, and the compiler copies the used functions somewhere into the program, their locations are hard to find, and the machine code needs address patches. This was the reason I could not make floating point machine code functions: the compiler used its internal functions regardless of what I tried.
SKAN



Joined: 26 Dec 2005
Posts: 5880

Posted: Wed Mar 19, 2008 10:31 pm

Oh! Sad

Quote:
If the branch is to another function, you are out of luck.


How do I tell it by looking at the dump ?
Laszlo



Joined: 14 Feb 2005
Posts: 4012
Location: Pittsburgh

Posted: Wed Mar 19, 2008 10:42 pm

They are marked with funny looking names.
apocalypse~r



Joined: 21 Jun 2007
Posts: 19

Posted: Thu Mar 20, 2008 5:27 am    Post subject: Do you know how much id give...

someone write a script that takes a function in a program or a dll or something and turns it into a hex string for use with the mcode function.
think: you could define any function no matter how complex in one line!
_________________
problems := bugs + errors + glitches
code := (problems != 0) ? debug(code) : celebrate()
tic



Joined: 22 Apr 2007
Posts: 1353

Posted: Thu Mar 20, 2008 5:39 am    Post subject: Re: Do you know how much id give...

apocalypse~r wrote:
someone write a script that takes a function in a program or a dll or something and turns it into a hex string for use with the mcode function.
think: you could define any function no matter how complex in one line!


lol Wink
i dont think its as easy as that
Azerty



Joined: 19 Dec 2006
Posts: 72
Location: France

Posted: Thu Mar 20, 2008 10:30 pm

Skan, reading your posts, I understand (but might be wrong) that you want to make an automatic conversion from C to HEX code. Am I right?

I think it will be really difficult if C functions want to talk to each other. So will embedding static data. If you disassemble Ascii85_Encoder in ODBG, you'll see there are static tables: they are difficult to embed because the code, to be usable in AHK, must be "self-relative". That means any reference to memory or data must be relative to the starting address of the function. Usually, branches (like goto) are self-relative, but function calls and references to static data depend on absolute addresses (the EXE/DLL loader in Windows does what is called "code relocation" to correct absolute addresses so that they match the address at which the module is loaded into memory).
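To make the relocation problem concrete, here is a sketch of what happens to a direct call when its bytes are copied elsewhere. In the common 5-byte form (opcode E8) the displacement is relative to the next instruction, but the callee stays where it was, so the copied call lands in the wrong place unless the displacement is patched. The function names and all addresses below are made up for illustration, and the code assumes a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Absolute target of a 5-byte "call rel32" (opcode E8) located at address at. */
uint32_t call_target(const uint8_t *insn, uint32_t at) {
    int32_t rel;
    assert(insn != NULL && insn[0] == 0xE8);
    memcpy(&rel, insn + 1, sizeof rel);      /* rel32 is stored little-endian */
    return at + 5u + (uint32_t)rel;          /* relative to the next instruction */
}

/* Rewrites rel32 so that the call, now located at new_at, reaches target again. */
void retarget_call(uint8_t *insn, uint32_t new_at, uint32_t target) {
    int32_t rel = (int32_t)(target - (new_at + 5u));
    assert(insn != NULL && insn[0] == 0xE8);
    memcpy(insn + 1, &rel, sizeof rel);
}
```

For example, the bytes E8 10 00 00 00 at 00401000 call 00401015; the same bytes copied to 00900000 would call 00900015 until retarget_call fixes the displacement. An automatic extractor would have to find every such call, copy its callee as well or patch it, and do the same for each reference to static data.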

In assembler, with a few little tricks, the code can be made self-relative. In C, you depend entirely on the compiler, which uses no such tricks.

And for compilation of stand-alone functions, I was wondering whether using gcc (MinGW in our case) to produce object modules, and exporting the binary code out of those object modules, would not be easier than producing EXEs and searching haphazardly for the start and end of a function in them. Stand-alone object modules with one function in each would also make it possible to tell explicitly whether the code is self-relative or not...
SKAN



Joined: 26 Dec 2005
Posts: 5880

Posted: Thu Mar 20, 2008 10:44 pm

Azerty wrote:
you want to make an automatic conversion from C to HEX code. Am I right?


Exactly. I give the script a starting offset, and it tries to extract the dependencies, patch the offsets and put everything together into one long byte stream.

I see it is not possible .. Sad
All times are GMT. Page 11 of 13.

 