convert bytes to shorts (by converting ANSI to Unicode)

jeeswg

17 Oct 2018, 18:44

- I'm interested in converting bytes to shorts, i.e. each byte becomes 2 bytes, e.g. 'AA' becomes 'AA00', doubling the amount of data.
- I have found that MultiByteToWideChar can do this (demonstrated below). I wondered if there were any other well-known ways. (That is my 'Ask For Help' question.)
- What I wanted to do was this: take binary data and convert it to UTF-16 (doubling the amount of data), replacing any null characters with a custom character. That way you have binary data stored as a string, where you can identify the nulls.
- (Commonly nulls are replaced with spaces, e.g. when you open a binary data file in Notepad, but then you can't differentiate between nulls and spaces.)
- Converting from ANSI to UTF-16 has one issue: bytes in the range 128-255 can have their numbers changed, depending on the code page. However, if you use CP28591 (ISO-8859-1), this is avoided and the numbers are maintained.
- Note: it appears that MultiByteToWideChar and WideCharToMultiByte can handle null bytes/shorts. They don't truncate the data/string when an explicit size is passed, only when -1 is used as the size.
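- For comparison, here is a minimal sketch of the same widening done with a plain NumGet/NumPut loop, with no code page involved. The function name BinByteToShortLoop is hypothetical (it is not a built-in or library function). It is likely slower than a single API call on large buffers, but it can serve as a baseline to check other methods against.

```autohotkey
;hypothetical helper: widen each byte to a 16-bit value (zero-extended)
;note: vAddrDest must be ready to receive vSize*2 bytes
BinByteToShortLoop(vAddrDest, vAddrSource, vSize)
{
	;the +0 forces the parameters to be treated as addresses
	Loop, % vSize
		NumPut(NumGet(vAddrSource+0, A_Index-1, "UChar"), vAddrDest+0, A_Index*2-2, "UShort")
	return vSize*2 ;number of bytes written
}

;usage: the 2 bytes AA BB become the 4 bytes AA 00 BB 00
VarSetCapacity(vBytes, 2, 0)
VarSetCapacity(vShorts, 4, 0)
NumPut(0xAA, &vBytes, 0, "UChar")
NumPut(0xBB, &vBytes, 1, "UChar")
BinByteToShortLoop(&vShorts, &vBytes, 2)
vHex := ""
Loop, 4
	vHex .= Format("{:02X}", NumGet(&vShorts, A_Index-1, "UChar")) " "
MsgBox, % vHex ;AA 00 BB 00
```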

Code:

;Unicode - Wikipedia
;https://en.wikipedia.org/wiki/Unicode
;The first 256 code points were made identical to the content of ISO-8859-1 so as to make it trivial to convert existing western text.

;Code Page Identifiers | Microsoft Docs
;https://docs.microsoft.com/en-us/windows/desktop/intl/code-page-identifiers
;28591 | iso-8859-1 | ISO 8859-1 Latin 1; Western European (ISO)

q:: ;bytes to shorts (by converting ANSI to Unicode)

;create example data with 1000 items (bytes/shorts)
vSize := 1000
VarSetCapacity(vBytes, vSize)
VarSetCapacity(vShorts, vSize*2)
Loop, % vSize
{
	vNum := Mod(A_Index, 256)
	, NumPut(vNum, &vBytes, A_Index-1, "UChar")
	, NumPut(vNum, &vShorts, A_Index*2-2, "UShort")
}

;bytes to shorts
VarSetCapacity(vTemp, vSize*2, 0)
;CP28591 ;ISO-8859-1
;note: the last parameter is the destination size in shorts (wide characters), not bytes
DllCall("kernel32\MultiByteToWideChar", UInt,28591, UInt,0, Ptr,&vBytes, Int,vSize, Ptr,&vTemp, Int,vSize)
;memcmp returns 0 if data matches
MsgBox, % DllCall("msvcrt\memcmp", Ptr,&vShorts, Ptr,&vTemp, UPtr,vSize*2, "Cdecl Int")

;shorts (range 0 to 255) to bytes
VarSetCapacity(vTemp, vSize, 0)
;CP28591 ;ISO-8859-1
DllCall("kernel32\WideCharToMultiByte", UInt,28591, UInt,0, Ptr,&vShorts, Int,vSize, Ptr,&vTemp, Int,vSize, Ptr,0, Ptr,0)
;memcmp returns 0 if data matches
MsgBox, % DllCall("msvcrt\memcmp", Ptr,&vBytes, Ptr,&vTemp, UPtr,vSize, "Cdecl Int")
return
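
- To illustrate the point about bytes in the range 128-255, here is a small sketch that converts the single byte 0x80 using two code pages. CP1252 is used here only as an example of an ANSI code page (an assumption, your system's ANSI code page may differ). CP1252 maps the byte to the euro sign, while CP28591 keeps the number as it is.

```autohotkey
;convert the single byte 0x80 using two code pages and compare the results
VarSetCapacity(vByte, 1, 0)
NumPut(0x80, &vByte, 0, "UChar")
vOutput := ""
for vKey, vCP in [1252, 28591]
{
	VarSetCapacity(vShort, 2, 0)
	;source size: 1 byte, destination size: 1 short
	DllCall("kernel32\MultiByteToWideChar", UInt,vCP, UInt,0, Ptr,&vByte, Int,1, Ptr,&vShort, Int,1)
	vOutput .= "CP" vCP ": " Format("0x{:04X}", NumGet(&vShort, 0, "UShort")) "`r`n"
}
;CP1252: 0x20AC (euro sign, number changed)
;CP28591: 0x0080 (number maintained)
MsgBox, % vOutput
```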
- Here's a further example, and a one-line function.

Code:

;e.g.
vSize := 10
VarSetCapacity(vBytes, vSize, 0)
VarSetCapacity(vShorts, vSize*2, 0)
vHex1 := ""
Loop, % vSize
{
	NumPut(A_Index, &vBytes, A_Index-1, "UChar")
	, vHex1 .= Format("{:02X}", A_Index) " "
}
JEE_BinByteToShort(&vShorts, &vBytes, vSize)
vHex2 := ""
Loop, % vSize*2
{
	vTemp := NumGet(&vShorts, A_Index-1, "UChar")
	, vHex2 .= Format("{:02X}", vTemp) " "
}
MsgBox, % vHex1 "`r`n" vHex2

;e.g. vShortsWritten := JEE_BinByteToShort(vAddrDest, vAddrSource, vSize)

;note: vAddrDest must be ready to receive vSize*2 bytes
;note: returns the number of shorts (wide characters) written, or 0 on failure

JEE_BinByteToShort(vAddrDest, vAddrSource, vSize, vCP:=28591)
{
	;CP28591 ;ISO-8859-1
	;the last parameter is the destination size in shorts (wide characters), not bytes
	return DllCall("kernel32\MultiByteToWideChar", UInt,vCP, UInt,0, Ptr,vAddrSource, Int,vSize, Ptr,vAddrDest, Int,vSize)
}
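
- Here's a rough sketch of the original aim: binary data to a UTF-16 string, with each null replaced by a custom character. The function name BinToStrReplaceNulls and the default replacement U+2400 (SYMBOL FOR NULL) are my own assumptions, as is the use of a Unicode build of AutoHotkey v1.1. Pick a replacement character that cannot otherwise occur in the converted data (anything above 255 qualifies when CP28591 is used).

```autohotkey
;hypothetical helper: binary data to string, nulls replaced with a custom character
;vOrdNull: character code to use in place of nulls (assumed default: U+2400)
BinToStrReplaceNulls(vAddrSource, vSize, vOrdNull:=0x2400)
{
	;+2 bytes so that the string has a null terminator
	VarSetCapacity(vOutput, vSize*2+2, 0)
	;CP28591 ;ISO-8859-1 (byte values 0-255 are maintained)
	DllCall("kernel32\MultiByteToWideChar", UInt,28591, UInt,0, Ptr,vAddrSource, Int,vSize, Ptr,&vOutput, Int,vSize)
	Loop, % vSize
	{
		if !NumGet(&vOutput, A_Index*2-2, "UShort")
			NumPut(vOrdNull, &vOutput, A_Index*2-2, "UShort")
	}
	VarSetCapacity(vOutput, -1) ;update the string length
	return vOutput
}

;usage: 3 bytes ('A', null, space) become a 3-character string
VarSetCapacity(vBytes, 3, 0)
NumPut(0x41, &vBytes, 0, "UChar") ;'A'
NumPut(0x00, &vBytes, 1, "UChar") ;null
NumPut(0x20, &vBytes, 2, "UChar") ;space
vText := BinToStrReplaceNulls(&vBytes, 3)
MsgBox, % StrLen(vText) ;3 (the null no longer truncates the string)
```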