[CMD/COM/DLL] xdoc2txt - Extract text from pdf/doc/xls...

Posted: 12 Oct 2013, 11:41
by tmplinshi
xdoc2txt can convert many document formats to plain text without needing Acrobat or Word installed.

Supported formats: (listed in an image in the original post)

Three versions are provided:
  • xdoc2txt.exe - Command line tool
  • xd2txcom.dll - COM component version
  • xd2txlib.dll - DLL version
Examples:
Command line example:

xdoc2txt.exe -8 test.doc | iconv -f utf-8 -c
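
The same command can also be driven from a script. The following is only a rough sketch, not part of the original post: the function name xdoc2txt_cli and the temp file name are made up for illustration, and it assumes that xdoc2txt.exe is in the working directory (or on PATH) and that -8 produces UTF-8 output, as the iconv call above suggests.

```autohotkey
; Sketch: run xdoc2txt.exe on a file and return its output as a string.
; xdoc2txt_cli and the temp file name are hypothetical helpers, not part of xdoc2txt.
xdoc2txt_cli(fileName) {
	tmpFile := A_Temp "\xdoc2txt_out.txt"
	; Redirect the tool's standard output to a temp file (-8 is taken from the example above).
	RunWait, %ComSpec% /c xdoc2txt.exe -8 "%fileName%" > "%tmpFile%",, Hide
	; *P65001 tells FileRead to decode the file as UTF-8.
	FileRead, text, *P65001 %tmpFile%
	FileDelete, %tmpFile%
	Return text
}

MsgBox, % xdoc2txt_cli("test.doc")
```

Going through a temp file avoids the need for iconv, since FileRead does the UTF-8 decoding itself.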
DLL example:

if !A_IsUnicode {
	MsgBox, Please use unicode AutoHotkey to run.
	ExitApp
}

xdoc2txt_load(1)
MsgBox, % xdoc2txt("test.doc")
xdoc2txt_load(0)

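xdoc2txt(fileName) { ; by HotKeyIt (http://ahkscript.org/boards/viewtopic.php?f=5&t=267&p=2157#p9515)
	; ExtractText stores a pointer to the extracted text in fileText.
	; "Ptr*" (instead of "Int*") keeps the full pointer value on 64-bit builds.
	fileLength := DllCall("xd2txlib\ExtractText", "Str", fileName, "Int", False, "Ptr*", fileText)
	Return StrGet( fileText, fileLength / 2 )
}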

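xdoc2txt_load(Load := True) {
	static hModule ; module handle, kept between calls

	if Load ; return type "Ptr" so the handle is not truncated on 64-bit builds
		Return, hModule := DllCall("LoadLibrary", "Str", "xd2txlib.dll", "Ptr")
	else
		Return, DllCall("FreeLibrary", "Ptr", hModule)
}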
Homepage: http://ebstudio.info/home/xdoc2txt.html