[CMD/COM/DLL] xdoc2txt - Extract text from pdf/doc/xls...

tmplinshi
Posts: 1604
Joined: 01 Oct 2013, 14:57

12 Oct 2013, 11:41

xdoc2txt can convert many document formats to plain text without needing Acrobat or Word installed.

Supported formats:
[Image: list of supported formats]

Three versions provided:
  • xdoc2txt.exe - Command line tool
  • xd2txcom.dll - COM component version
  • xd2txlib.dll - Dll version
Examples:
Command line example:

xdoc2txt.exe -8 test.doc | iconv -f utf-8 -c
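
The same command line call can be driven from a script. Below is a minimal sketch in AutoHotkey v1 syntax, assuming that xdoc2txt.exe is in the working directory or on the PATH and that -8 selects UTF-8 output, as the iconv call above suggests. The function name xdoc2txt_cli and the scratch file name are hypothetical and not part of xdoc2txt.

```autohotkey
; Hypothetical helper: run xdoc2txt.exe on a file and return the extracted text.
xdoc2txt_cli(fileName) {
	tmpFile := A_Temp "\xdoc2txt_out.txt" ; assumed scratch file for the redirected output
	; Let cmd.exe redirect the tool's output into the scratch file.
	RunWait, %ComSpec% /c xdoc2txt.exe -8 "%fileName%" > "%tmpFile%",, Hide
	FileRead, text, *P65001 %tmpFile% ; read the scratch file as UTF-8 (code page 65001)
	FileDelete, %tmpFile%
	Return text
}

MsgBox, % xdoc2txt_cli("test.doc")
```

Going through a scratch file avoids having to decode a console pipe by hand, at the cost of one temporary file per call.
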
Dll example:

if !A_IsUnicode {
	MsgBox, Please use unicode AutoHotkey to run.
	ExitApp
}

xdoc2txt_load(1)
MsgBox, % xdoc2txt("test.doc")
xdoc2txt_load(0)

xdoc2txt(fileName) { ; by HotKeyIt (http://ahkscript.org/boards/viewtopic.php?f=5&t=267&p=2157#p9515)
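	; ExtractText stores a pointer to the extracted text in fileText; the return value is treated as a byte length.
	; "Ptr*" (rather than "Int*") keeps that pointer from being truncated on 64-bit AutoHotkey.
	fileLength := DllCall("xd2txlib\ExtractText", "Str", fileName, "Int", False, "Ptr*", fileText)
	Return StrGet( fileText, fileLength / 2 ) ; 2 bytes per UTF-16 character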
}

xdoc2txt_load(Load := True) {
	static hModule

	if Load
		Return, hModule := DllCall("LoadLibrary", "Str", "xd2txlib.dll", "Ptr") ; module handles are pointer-sized
	else
		Return, DllCall("FreeLibrary", "Ptr", hModule)
}
Homepage: http://ebstudio.info/home/xdoc2txt.html
hasantr
Posts: 933
Joined: 05 Apr 2016, 14:18
Location: İstanbul

Re: [CMD/COM/DLL] xdoc2txt - Extract text from pdf/doc/xls...

24 Nov 2020, 08:42

Thanks, tmplinshi. I get a lot of use out of this, but one problem is giving me a lot of trouble. Some PDF files downloaded from the internet are blocked automatically, and it crashes when trying to open those files. Because I am working with files on the local network, the methods for checking and removing the block do not work properly.
Do you have a solution that works with xd2txcom.dll? Thank you, sir.
