- @FredOoo: Please rename the thread from DOM (web browsers) to BOM (text encodings). As it stands the title is incredibly confusing, and it will also hurt searches for this thread.
- Good thread topic. I make all my code Unicode/ANSI compatible. Otherwise, I'd use an AHK script to handle opening .ahk files: it would search inside the script for a comment like ';AHK v2 script' to determine which exe to use.
- Btw, in your second post, UTF-16 characters appear at even offsets.
- @swagfag: Nice encodings table. I thought I'd try and reverse-engineer it, as a speed-coding challenge. It has a lot of fiddly bits to it though. I wonder if your code was anything like mine.
- Btw, for your UTF-16 BE example, the BOM should be þÿ (bytes FE FF).
Code:
q:: ;test: write a string in different encodings, with and without a BOM
vList := "CP0,UTF-8,UTF-16,UTF-16" ;encoding names passed to StrPut
oEnc := StrSplit("ANSI,UTF 8 ,UTF 16,UTF 16", ",") ;display names
oEnc2 := StrSplit(",,LE,BE", ",") ;byte order suffixes
vText := "hello"
vOutput := ""
Loop, Parse, vList, % ","
{
	vEnc := A_LoopField
	vIndex := A_Index
	Loop, 2 ;pass 1: no BOM, pass 2: BOM
	{
		if (A_Index = 2) && (vEnc = "CP0") ;no BOM variant for ANSI
			break
		vPfx := (A_Index = 1) ? "" : Chr(0xFEFF) ;BOM
		vEnc2 := oEnc[vIndex]
		if (oEnc2[vIndex])
			vEnc2 .= " " oEnc2[vIndex]
		if (SubStr(vEnc2, 1, 1) = "U")
			vEnc2 .= (A_Index = 1) ? " NO BOM" : " BOM"
		vSize := StrPut(vPfx vText, vEnc) ;size in chars, including the null terminator
		vSize--
		if (vEnc = "UTF-16")
			vSize *= 2 ;chars to bytes
		VarSetCapacity(vData, vSize+2, 0) ;+2: room for the null terminator that StrPut writes
		if (oEnc2[vIndex] = "BE")
		{
			VarSetCapacity(vData2, vSize+2, 0)
			StrPut(vPfx vText, &vData2, vEnc)
			;LCMAP_BYTEREV := 0x800 ;swaps the two bytes of each UTF-16 code unit
			DllCall("kernel32\LCMapStringW", "UInt",0, "UInt",0x800, "Ptr",&vData2, "Int",vSize//2, "Ptr",&vData, "Int",vSize//2)
		}
		else
			StrPut(vPfx vText, &vData, vEnc)
		vHex := vPreview := ""
		Loop, % vSize ;dump each byte as hex, plus a one-char preview
		{
			vOrd := NumGet(&vData, A_Index-1, "UChar")
			vHex .= (A_Index=1?"":" ") Format("{:02X}", vOrd)
			vPreview .= vOrd ? Chr(vOrd) : "." ;show null bytes as dots
		}
		;vOutput .= vSize " " vEnc "`r`n"
		vOutput .= Format("{: -20}{: -39}{}`r`n", vEnc2, vHex, vPreview)
	}
}
Clipboard := vOutput
MsgBox, % vOutput
return
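
For the launcher idea in the second point above (check the script for a marker comment, then pick the exe), here is a rough, untested sketch of how it might look. It assumes AHK v1.1.27+ (for A_Args), and the two exe paths are placeholders I made up, so adjust them to your install. The marker ';AHK v2 script' is just the example comment mentioned above, not any kind of standard.

```autohotkey
;sketch of a launcher: pick an exe based on a marker comment in the script
;usage (hypothetical): Launcher.ahk "C:\path\to\script.ahk"
vExe1 := "C:\Program Files\AutoHotkey\AutoHotkey.exe" ;placeholder v1 path (assumption)
vExe2 := "C:\Program Files\AutoHotkey\v2\AutoHotkey.exe" ;placeholder v2 path (assumption)
vPath := A_Args[1] ;script path passed on the command line
if (!FileExist(vPath))
{
	MsgBox, % "file not found: " vPath
	ExitApp
}
FileRead, vText, % vPath
;InStr is case-insensitive by default, so ';ahk V2 Script' would also match
vExe := InStr(vText, ";AHK v2 script") ? vExe2 : vExe1
Run, % """" vExe """ """ vPath """" ;quote both paths in case they contain spaces
ExitApp
```

The marker is plain ASCII, so the search should work whether or not the file has a BOM. The launcher could then be set as the handler for .ahk files.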