text to html, html to text (recreate AHK's Transform HTML subcommand)

jeeswg

16 Oct 2017, 16:09

A description of the Transform HTML subcommand which is scheduled to be removed in AHK v2.

Note: it does text to html, but there is no equivalent to do html to text. E.g. you use it to prepare text to be stored as html.

Transform
https://autohotkey.com/docs/commands/Transform.htm#HTML
Converts String into its HTML equivalent by translating characters whose ASCII values are above 127 to their HTML names (e.g. £ becomes &pound;). In addition, the four characters "&<> are translated to &quot;&amp;&lt;&gt;. Finally, each linefeed (`n) is translated to <br>`n (i.e. <br> followed by a linefeed).
...
Converts certain characters to named expressions. e.g. € is converted to &euro;
...
Converts certain characters to numbered expressions. e.g. € is converted to &#8364;
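
For comparison, a minimal sketch of calling the built-in subcommand in AHK v1 (it will not run in AHK v2, where Transform is removed). The sample string is my own; the expected result in the comment follows the documentation quoted above, with Flags set to 0 so that only the basic characters are translated.

```autohotkey
;AHK v1 only: the Transform command does not exist in AHK v2
vText := "a < b & ""c""" ;the text: a < b & "c"
Transform, vHtml, HTML, % vText, 0
MsgBox, % vHtml ;expected per the docs: a &lt; b &amp; &quot;c&quot;
```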

Code:

q:: ;attempt at recreating the Transform HTML subcommand for use in AHK v2
vText := ""
Loop, 500
	vText .= Chr(A_Index)
Loop, 4
{
	vFlags := A_Index-1
	vHtml1 := JEE_TransformHtml(vText, vFlags)
	Transform, vHtml2, HTML, % vText, % vFlags
	MsgBox, % (vHtml1 == vHtml2) "`r`n" vHtml1
}
return

;==================================================

;for AHK Unicode only
;flags:
;0: 5 chars: Chr(10) and "&<>
;1: 5 chars, then 121 chars to named expressions
;2: 5 chars, then Chr(128) and above to numbered expressions
;3: perform mode 1, then mode 2

JEE_TransformHtml(vText, vFlags:=1)
{
	static oArray := Object(StrSplit("160,nbsp;161,iexcl;162,cent;163,pound;164,curren;165,yen;166,brvbar;167,sect;168,uml;169,copy;170,ordf;171,laquo;172,not;173,shy;174,reg;175,macr;176,deg;177,plusmn;178,sup2;179,sup3;180,acute;181,micro;182,para;183,middot;184,cedil;185,sup1;186,ordm;187,raquo;188,frac14;189,frac12;190,frac34;191,iquest;192,Agrave;193,Aacute;194,Acirc;195,Atilde;196,Auml;197,Aring;198,AElig;199,Ccedil;200,Egrave;201,Eacute;202,Ecirc;203,Euml;204,Igrave;205,Iacute;206,Icirc;207,Iuml;208,ETH;209,Ntilde;210,Ograve;211,Oacute;212,Ocirc;213,Otilde;214,Ouml;215,times;216,Oslash;217,Ugrave;218,Uacute;219,Ucirc;220,Uuml;221,Yacute;222,THORN;223,szlig;224,agrave;225,aacute;226,acirc;227,atilde;228,auml;229,aring;230,aelig;231,ccedil;232,egrave;233,eacute;234,ecirc;235,euml;236,igrave;237,iacute;238,icirc;239,iuml;240,eth;241,ntilde;242,ograve;243,oacute;244,ocirc;245,otilde;246,ouml;247,divide;248,oslash;249,ugrave;250,uacute;251,ucirc;252,uuml;253,yacute;254,thorn;255,yuml;338,OElig;339,oelig;352,Scaron;353,scaron;376,Yuml;402,fnof;710,circ;732,tilde;8211,ndash;8212,mdash;8216,lsquo;8217,rsquo;8218,sbquo;8220,ldquo;8221,rdquo;8222,bdquo;8224,dagger;8225,Dagger;8226,bull;8230,hellip;8240,permil;8249,lsaquo;8250,rsaquo;8364,euro;8482,trade", [",",";"])*)
	local vChar,vOrd,vText2

	;replace & before everything else (so the & in inserted entities is not escaped again)
	;replace `n after <> (so the inserted <br> is not escaped)
	vText := StrReplace(vText, "&", "&amp;")
	vText := StrReplace(vText, Chr(34), "&quot;")
	vText := StrReplace(vText, "<", "&lt;")
	vText := StrReplace(vText, ">", "&gt;")
	vText := StrReplace(vText, "`n", "<br>`n")

	vText2 := RegExReplace(vText, "[[:ascii:]]")
	if vFlags
	{
		while !(vText2 = "")
		{
			vChar := SubStr(vText2, 1, 1)
			vOrd := Ord(vChar)
			if (vFlags & 1) && oArray.HasKey(vOrd)
				vText := StrReplace(vText, vChar, "&" oArray[vOrd] ";")
			else if (vFlags & 2)
				vText := StrReplace(vText, vChar, "&#" vOrd ";")
			vText2 := StrReplace(vText2, vChar)
		}
	}
	return vText
}
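
To illustrate how the flag bits combine, here is a rough single-pass sketch for AHK v1 Unicode. It is not the function above: the name EncodeSketch is hypothetical and the name table is cut down to three entries purely for illustration. Like the function above, it works on single UTF-16 code units, so characters outside the BMP are not handled.

```autohotkey
;usage: flag 1 = named expressions where the (cut-down) table has an entry
MsgBox, % EncodeSketch("5 < 6 & " Chr(163), 1) ;5 &lt; 6 &amp; &pound;

;hypothetical single-pass variant, table reduced to 3 entries for illustration
EncodeSketch(vText, vFlags:=1)
{
	static oNames := {163:"pound", 233:"eacute", 8364:"euro"}
	local vOrd, vOutput := ""
	Loop, Parse, vText
	{
		vOrd := Ord(A_LoopField)
		if (vOrd = 38) ;&
			vOutput .= "&amp;"
		else if (vOrd = 34) ;double quote
			vOutput .= "&quot;"
		else if (vOrd = 60) ;<
			vOutput .= "&lt;"
		else if (vOrd = 62) ;>
			vOutput .= "&gt;"
		else if (vOrd = 10) ;linefeed
			vOutput .= "<br>`n"
		else if (vOrd < 128) ;other ASCII characters are left alone
			vOutput .= A_LoopField
		else if ((vFlags & 1) && oNames.HasKey(vOrd)) ;bit 1: named expression
			vOutput .= "&" oNames[vOrd] ";"
		else if (vFlags & 2) ;bit 2: numbered expression
			vOutput .= "&#" vOrd ";"
		else
			vOutput .= A_LoopField
	}
	return vOutput
}
```

Because each character is visited once and the output is built separately, the order-of-replacement concerns in the function above (& first, `n after <>) do not arise here.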

Other similar functions:

Code:

JEE_StrHtmlToText(vHtml)
{
	oHTML := ComObjCreate("HTMLFile")
	oHTML.write("<title>" vHtml "</title>")
	vText := oHTML.getElementsByTagName("title")[0].innerText
	oHTML := ""
	return vText
}

;==================================================

JEE_StrTextToHtml(vText)
{
	oHTML := ComObjCreate("HTMLFile")
	oHTML.write("<title></title>")
	oHTML.getElementsByTagName("title")[0].value := vText
	vHtml := oHTML.getElementsByTagName("title")[0].outerHTML
	oHTML := ""
	return SubStr(vHtml, 15, -10)
}

Note: I believe that the code below replaces characters 9-13 and 160 with spaces and trims leading/multiple/trailing spaces, and that it replaces ChrW(128) to ChrW(159) with ChrA(128) to ChrA(159).

[To achieve ChrA() in AHK Unicode, see ';get 255 ANSI characters (in AHK Unicode versions)' here:]
jeeswg's characters tutorial - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=7&t=26486

Other similar code:

Code:

;convert html special characters - Ask for Help - AutoHotkey Community
;https://autohotkey.com/board/topic/92033-convert-html-special-characters/

document := ComObjCreate("HTMLFile")
document.write(html)
MsgBox % document.body.outerText

;How to transform like "&#8364; " this code into character? - AutoHotkey Community
;https://autohotkey.com/boards/viewtopic.php?f=5&t=3638

  doc := ComObjCreate("HTMLfile")
  doc.write(strHTML)
  return doc.body.innerText
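
For numbered expressions such as &#8364; a COM object is not strictly needed. Below is a rough sketch for AHK v1 Unicode; the function name DecodeNumericEntities is hypothetical, and it handles decimal numbered expressions only (no named or hexadecimal ones).

```autohotkey
MsgBox, % DecodeNumericEntities("&#8364;5 &#38; &#163;6") ;€5 & £6

;hypothetical helper: decodes decimal numbered expressions (&#NNN;) only
DecodeNumericEntities(vHtml)
{
	local vChar, vMatch, vMatch1, vPos := 1
	while vPos := RegExMatch(vHtml, "&#(\d+);", vMatch, vPos)
	{
		vChar := Chr(vMatch1)
		;splice the character in at the match position
		vHtml := SubStr(vHtml, 1, vPos-1) vChar SubStr(vHtml, vPos+StrLen(vMatch))
		;continue after the inserted character, so decoded text is never decoded twice
		vPos += StrLen(vChar)
	}
	return vHtml
}
```

Splicing at the match position, rather than using StrReplace on the whole string, means that input like &#38;#8364; is decoded once (to the text &#8364;) and not a second time.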

Links:
text/list/table functions - AutoHotkey Community
https://autohotkey.com/boards/viewtopic ... 89#p135289
Transform's HTML subcommand: char 8218 - AutoHotkey Community
https://autohotkey.com/boards/viewtopic ... 14&t=38422