[CMD/COM/DLL] xdoc2txt - Extract text from pdf/doc/xls...

tmplinshi
Posts: 1540
Joined: 01 Oct 2013, 14:57

12 Oct 2013, 11:41

xdoc2txt converts many document formats to plain text, without needing Acrobat or Word installed.

Supported formats: (shown as an image in the original post)

Three versions provided:
  • xdoc2txt.exe - Command line tool
  • xd2txcom.dll - COM component version
  • xd2txlib.dll - Dll version
Examples:
Command line example:

xdoc2txt.exe -8 test.doc | iconv -f utf-8 -c
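
The command line tool can also be called from a script. Below is a minimal sketch, not something shipped with xdoc2txt: the function name xdoc2txt_cli and the temp file name are made up for illustration, and it assumes xdoc2txt.exe is in the working directory or on PATH and that -8 produces UTF-8 output, as the iconv call above suggests.

```autohotkey
; Hypothetical helper (not part of xdoc2txt): runs the command line tool
; and returns whatever it writes to stdout.
; Assumes xdoc2txt.exe is in the working directory or on PATH.
xdoc2txt_cli(fileName) {
	tmpFile := A_Temp "\xdoc2txt_" A_TickCount ".txt"
	; Same -8 switch as the example above; output is redirected to a temp file.
	RunWait, %ComSpec% /c xdoc2txt.exe -8 "%fileName%" > "%tmpFile%",, Hide
	; Read the temp file as UTF-8 (code page 65001), then clean up.
	FileRead, fileText, *P65001 %tmpFile%
	FileDelete, %tmpFile%
	Return fileText
}

MsgBox, % xdoc2txt_cli("test.doc")
```

Going through a temp file avoids the code page problems of reading the console output directly, at the cost of one extra file write per call.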
Dll example:

if !A_IsUnicode {
	MsgBox, Please run this script with a Unicode build of AutoHotkey.
	ExitApp
}

xdoc2txt_load(1)
MsgBox, % xdoc2txt("test.doc")
xdoc2txt_load(0)

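xdoc2txt(fileName) { ; by HotKeyIt (http://ahkscript.org/boards/viewtopic.php?f=5&t=267&p=2157#p9515)
	; fileText receives a pointer to the extracted text, so it must be "Ptr*" to work on 64-bit AutoHotkey too
	fileLength := DllCall("xd2txlib\ExtractText", "Str", fileName, "Int", False, "Ptr*", fileText)
	; The returned length is halved to get the number of UTF-16 characters
	Return StrGet( fileText, fileLength / 2 )
}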

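xdoc2txt_load(Load := True) {
	static hModule

	; Module handles are pointer-sized, so use "Ptr" for both the return value and the argument
	if Load
		Return, hModule := DllCall("LoadLibrary", "Str", "xd2txlib.dll", "Ptr")
	else
		Return, DllCall("FreeLibrary", "Ptr", hModule)
}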
Homepage: http://ebstudio.info/home/xdoc2txt.html
