[CMD/COM/DLL] xdoc2txt - Extract text from pdf/doc/xls...

Posted: 12 Oct 2013, 11:41
by tmplinshi
xdoc2txt converts many document formats to plain text without requiring Acrobat or Word to be installed.

Supported formats: [image in the original post: list of supported formats]

Three versions are provided:
  • xdoc2txt.exe - Command line tool
  • xd2txcom.dll - COM component version
  • xd2txlib.dll - DLL version
Examples:
Command line example:

Code: Select all

xdoc2txt.exe -8 test.doc | iconv -f utf-8 -c
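
The same command can also be driven from an AutoHotkey script. The following is only a rough sketch, not something from the xdoc2txt documentation: it assumes xdoc2txt.exe and test.doc are in the script's working directory, that -8 produces UTF-8 output (as the iconv call above suggests), and it uses a made-up temp file name (xdoc2txt_out.txt). Instead of piping to iconv, the output is redirected to a file and read back as UTF-8.

```autohotkey
; Sketch: run xdoc2txt.exe and read its UTF-8 output into a variable.
; Assumption: xdoc2txt.exe and test.doc are in the working directory.
srcFile := "test.doc"
outFile := A_Temp "\xdoc2txt_out.txt"   ; hypothetical temp file name

; -8 is the same switch used in the command line example above.
; The output is redirected to a file rather than piped to iconv.
RunWait, %ComSpec% /c xdoc2txt.exe -8 "%srcFile%" > "%outFile%",, Hide

FileRead, text, *P65001 %outFile%       ; *P65001 = decode the file as UTF-8
FileDelete, %outFile%                   ; clean up the temp file
MsgBox, % text
```

Going through a temp file avoids having to decode a console pipe by hand; FileRead does the UTF-8 conversion.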
Dll example:

Code: Select all

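; The result is read as UTF-16 below, so a Unicode build of AutoHotkey is required.
if !A_IsUnicode {
	MsgBox, Please run this script with a Unicode build of AutoHotkey.
	ExitApp
}

xdoc2txt_load(1)                 ; load xd2txlib.dll once
MsgBox, % xdoc2txt("test.doc")   ; extract the text and show it
xdoc2txt_load(0)                 ; unload the DLL when done

xdoc2txt(fileName) { ; by HotKeyIt (http://ahkscript.org/boards/viewtopic.php?f=5&t=267&p=2157#p9515)
	; ExtractText stores a pointer to the text in fileText ("Ptr*" so it also works on 64-bit).
	; The return value is halved to get the character count for StrGet.
	fileLength := DllCall("xd2txlib\ExtractText", "Str", fileName, "Int", False, "Ptr*", fileText)
	Return StrGet( fileText, fileLength / 2 )
}

xdoc2txt_load(Load := True) {
	static hModule

	; Module handles are pointer-sized, so use "Ptr" for both the return value and the argument.
	if Load
		Return, hModule := DllCall("LoadLibrary", "Str", "xd2txlib.dll", "Ptr")
	else
		Return, DllCall("FreeLibrary", "Ptr", hModule)
}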
Homepage: http://ebstudio.info/home/xdoc2txt.html

Re: [CMD/COM/DLL] xdoc2txt - Extract text from pdf/doc/xls...

Posted: 24 Nov 2020, 08:42
by hasantr
Thanks, tmplinshi. I get a lot of use out of this, but one problem is giving me a lot of trouble. Some PDF files downloaded from the internet are blocked automatically, and trying to open those files causes a crash. Since I am working with files on the local network, the methods for checking and removing the block do not work properly.
Do you have any solution that would work with xd2txcom.dll? Thank you.