Easy OCR

nicstella


05 Jun 2016, 02:38

I've recently been looking through the OCR posts here and was unable to get any of the scripts to work. Most were posted years ago.

After finding a mention of a great program, Capture2Text, I built a very simple OCR system around it instead. Press the hotkey, click and drag around the text you want to read, and the result is left on your clipboard.

You can find a download for Capture2Text here: http://capture2text.sourceforge.net/#download
Note that the getSelectionCoords() function is code pulled straight from another post at https://autohotkey.com/board/topic/1018 ... nsresults/

Code:

;hotkey to activate OCR
+!q::
	getSelectionCoords(x_start, x_end, y_start, y_end)
	RunWait, C:\Capture2Text.exe %x_start% %y_start% %x_end% %y_end%
	MsgBox, In area :: x_start: %x_start% --> x_end: %x_end% , y_start: %y_start% --> y_end: %y_end%`n`nFound Text:`n`n%clipboard%
return

; creates a click-and-drag selection box to specify an area
getSelectionCoords(ByRef x_start, ByRef x_end, ByRef y_start, ByRef y_end) {
	;Mask Screen
	Gui, Color, FFFFFF
	Gui +LastFound
	WinSet, Transparent, 50
	Gui, -Caption 
	Gui, +AlwaysOnTop
	Gui, Show, x0 y0 h%A_ScreenHeight% w%A_ScreenWidth%,"AutoHotkeySnapshotApp"     

	;Drag Mouse
	CoordMode, Mouse, Screen
	CoordMode, Tooltip, Screen
	WinGet, hw_frame_m,ID,"AutoHotkeySnapshotApp"
	hdc_frame_m := DllCall( "GetDC", "ptr", hw_frame_m, "ptr") ; handles are pointer-sized, so use "ptr" rather than "uint"
	KeyWait, LButton, D 
	MouseGetPos, scan_x_start, scan_y_start 
	Loop
	{
		Sleep, 10   
		KeyIsDown := GetKeyState("LButton")
		if (KeyIsDown = 1)
		{
			MouseGetPos, scan_x, scan_y 
			; repaint the whole mask first (the last argument is the screen height, not the width), then draw the selection box
			DllCall( "gdi32.dll\Rectangle", "ptr", hdc_frame_m, "int", 0,"int",0,"int", A_ScreenWidth,"int",A_ScreenHeight)
			DllCall( "gdi32.dll\Rectangle", "ptr", hdc_frame_m, "int", scan_x_start,"int",scan_y_start,"int", scan_x,"int",scan_y)
		} else {
			break
		}
	}

	;KeyWait, LButton, U
	MouseGetPos, scan_x_end, scan_y_end
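	DllCall( "ReleaseDC", "ptr", hw_frame_m, "ptr", hdc_frame_m) ; release the device context obtained with GetDC
	Gui Destroy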
	
	if (scan_x_start < scan_x_end)
	{
		x_start := scan_x_start
		x_end := scan_x_end
	} else {
		x_start := scan_x_end
		x_end := scan_x_start
	}
	
	if (scan_y_start < scan_y_end)
	{
		y_start := scan_y_start
		y_end := scan_y_end
	} else {
		y_start := scan_y_end
		y_end := scan_y_start
	}
}
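
If the MsgBox ever shows stale clipboard contents, a slightly more defensive variant of the hotkey might help. This is only a sketch: it assumes the same C:\Capture2Text.exe path as above and reuses getSelectionCoords(), and the Shift+Alt+W hotkey is picked arbitrarily so it doesn't clash with the original. It empties the clipboard first and uses ClipWait so you can tell whether Capture2Text actually delivered anything.

```autohotkey
; Variant of the hotkey above (Shift+Alt+W is an arbitrary choice)
+!w::
	getSelectionCoords(x_start, x_end, y_start, y_end)
	Clipboard := "" ; empty the clipboard so ClipWait can detect new data
	RunWait, C:\Capture2Text.exe %x_start% %y_start% %x_end% %y_end%
	ClipWait, 2 ; wait up to 2 seconds for text to arrive
	if ErrorLevel
		MsgBox, No text reached the clipboard within 2 seconds.
	else
		MsgBox, Found Text:`n`n%Clipboard%
return
```

ClipWait sets ErrorLevel to 1 on timeout, so the first branch fires when nothing was copied.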
carno
Posts: 201
Joined: 20 Jun 2014, 16:48

Re: Easy OCR

05 Jun 2016, 20:00

Does your script work with PDF files? I use http://www.newocr.com/ for PDF to TEXT. Does your script offer the same (or better) result?
nicstella

Re: Easy OCR

05 Jun 2016, 20:08

Well, it works on a specified area of the screen, so it's probably not the best choice for converting whole documents.

On a slightly different note, you can save the output to a file by specifying the output file name as the last argument to the run command,
i.e.

Code:

	RunWait, C:\Users\Nic\Documents\Mine\!PORTABLES\Capture2Text\Capture2Text.exe %x_start% %y_start% %x_end% %y_end% OUTPUT.txt
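
To get that file's contents back into the script, something along these lines should work. Treat it as a sketch rather than tested code: the outFile location and the Shift+Alt+E hotkey are made up for illustration, passing a full path as the output argument is an assumption on my part, and it relies on getSelectionCoords() from the first post.

```autohotkey
; Sketch: OCR a dragged area to a file, then load the file into a variable
+!e::
	outFile := "C:\AHK\ocr_output.txt" ; hypothetical location, change to suit
	getSelectionCoords(x_start, x_end, y_start, y_end)
	FileDelete, %outFile% ; remove any stale result first
	RunWait, C:\Capture2Text.exe %x_start% %y_start% %x_end% %y_end% %outFile%
	FileRead, ocrText, %outFile%
	if ErrorLevel
		MsgBox, Could not read %outFile%
	else
		MsgBox, %ocrText%
return
```

Deleting the old file first means a failed capture shows up as a read error instead of silently returning the previous result.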
boiler
Posts: 2639
Joined: 21 Dec 2014, 02:44

Re: Easy OCR

05 Jun 2016, 23:00

What does your script do that Capture2Text doesn't already do by itself? It was written in AHK, btw.

From their instructions:
How to OCR:
1) Position your mouse at the top-left corner of the text that you want to OCR.

2) Press the OCR key (Windows Key + Q) to begin an OCR capture.

3) Move your mouse to resize the blue box over the text that you want to OCR.

4) Press the OCR key again or left-click to complete the OCR capture.
The OCR'd text will be placed in the clipboard.
I've used it both as described above and from my AHK scripts, calling it via the command line, very successfully.
TwilightKillerX
Posts: 6
Joined: 23 Aug 2016, 21:28

Re: Easy OCR

24 Aug 2016, 01:09

Testing OCR Recognition
OCR Applications: Capture2Text vs. ABBYY Screenshot Reader
Settings: Copy Raw To Clipboard

Stock: [spoiler, contents not preserved]
Plaid: [spoiler, contents not preserved]
Hey, thanks for the ahk code in any case - I've changed it to run with the ABBYY Reader instead :)
carno
Posts: 201
Joined: 20 Jun 2014, 16:48

Re: Easy OCR

30 Aug 2016, 15:29

Wow! ABBYY OCR seems to be hands down the winner by a long shot! Could you post/share your script for ABBYY?
mons3fa
Posts: 7
Joined: 20 May 2017, 19:33
Google: https://plus.google.com/+MoeRezai

Re: Easy OCR

10 Jun 2017, 09:56

TwilightKillerX wrote:Testing OCR Recognition
OCR Applications: Capture2Text vs. ABBYY Screenshot Reader
Settings: Copy Raw To Clipboard

Stock: [spoiler, contents not preserved]
Plaid: [spoiler, contents not preserved]
Hey, thanks for the ahk code in any case - I've changed it to run with the ABBYY Reader instead :)

I know this is an old post, but would you be able to share the code for the ABBYY reader? In addition, would it be possible to export the results to an Excel spreadsheet? TIA
“There is only one good, knowledge, and one evil, ignorance.”
“I cannot teach anybody anything. I can only make them think”
-Socrates
debbie
Posts: 2
Joined: 05 Dec 2017, 17:01
Facebook: alfredo.menezes.56

Re: Easy OCR

30 Dec 2017, 22:45

It's not copying the text to the clipboard :/
zequi
Posts: 1
Joined: 02 Oct 2017, 22:53

Re: Easy OCR

31 Dec 2017, 14:58

debbie wrote:It's not copying the text to the clipboard :/
Try this older version.

https://sourceforge.net/projects/captur ... Text_v3.9/

The script doesn't work with newer versions of Capture2Text.
Nellybird

Re: Easy OCR

31 Dec 2017, 19:24

zequi wrote:
debbie wrote:It's not copying the text to the clipboard :/
Try this older version.

https://sourceforge.net/projects/captur ... Text_v3.9/

It doesn't work with newer versions of Capture2Text
What if I want to capture an area without launching the program itself? Say my script has an image and I only want to OCR a certain area of it: is there code that would let me plug into this and have the OCR results placed in a txt file?
YetAnotherHal
Posts: 1
Joined: 23 Feb 2019, 13:36

Easy OCR

17 Mar 2019, 15:34

Hi. I've been using AutoHotKey for quite a few years now, and it's just an amazing tool that has saved me incredible amounts of time and, in some cases, enabled things that I otherwise could not have achieved.

I wanted to share yet another script that matches the OP title.

On a Windows system, this AHK script captures the currently active window, saves the image via mspaint.exe, OCRs it with Tesseract, and then steers subsequent AutoHotKey script behavior based on the application's content.

My program of interest was NetHack for Windows. Windows Spy, WinGetText, and ControlGetText weren't returning the text that I could see within the application window. Since I wanted to make a macro that was responsive to particular content within the program of interest, that led me to experiment with OCR.

Tesseract was very sensitive to the screen resolution. The game used a default font size of less than 10 points. I changed the menu font to 10 (fail), then to 12 (fail), then to 16 (fail), then to 20 points (success!). This was somewhat disappointing, but ultimately gave adequate results.

I've hard-coded my working directory path. You'll probably want to change it to your own location.

;-----------capture and OCR the application's screen----------

; delete contents of temporary working folder (I never figured out the switches to make Tesseract overwrite the output file)

FileDelete, C:\AHK\WorkingFolder\*.*

; save nethack screen image into temp.png

WinActivate, NetHack for Windows - Graphical Interface ; Activate nethack
WinWaitActive, NetHack for Windows - Graphical Interface ; Make sure nethack is active
SendInput, !{PrintScreen} ; Save the screen of the currently active window to the clipboard
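Run, mspaint.exe ; Launch Paint
WinWait, ahk_exe mspaint.exe ; Wait for the Paint window to exist (its title starts with the document name, not "Paint")
WinActivate ; Activate the window just found by WinWait
WinWaitActive ; Wait until Paint is the active window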
SendInput, ^v ; Paste from clipboard into Paint
SendInput, !Fa ; Select menu File/Save As
Sleep 300 ; I don't always understand why, but some operations need a little time to work properly.
SendInput, C:\AHK\WorkingFolder\temp.png{Enter}
Sleep 100
SendInput, y ; If file exists, say Y to the overwrite prompt (an extra "Y" is harmless here if the file does not exist)
SendInput, !Fx ; Exit Paint
Sleep 1000 ; Paint needs a little disk access time before it has really saved the image

; OCR from the image file to a text file (Tesseract was installed into my local appdata folder)

RunWait, %localappdata%\Tesseract-OCR\tesseract.exe C:\AHK\WorkingFolder\temp.png C:\AHK\WorkingFolder\temp

;-------- load OCR text file into AHK variable-----------

Sleep 1000
FileRead, myWinText, C:\AHK\WorkingFolder\temp.txt


;-------- this is only uncommented when debugging, to examine the OCR text file -----------
; MsgBox %myWinText%

;------------find text--------------

myString := "digest" ; unique string which, in this case, means the newly created nethack character possesses a ring of slow digestion

if (InStr(myWinText, myString) <> 0) ; found the desired text when return <> 0
{
Break ; found it! (break out of a larger loop that has not been shown in this example)
}

It would have been nice if I had figured out how Tesseract could have behaved as a subroutine that returns the OCR string after OCRing the clipboard. This would have eliminated the very significant time cost of file handling the intermediate products. However, so far, I've only discovered examples that use files to handle intermediate products such as the image and the OCR text.
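
One way to drop at least the intermediate text file is to have Tesseract print to standard output and read that directly. The following is a sketch under a few assumptions: the installed Tesseract build accepts `stdout` as the output base, the paths are the same ones used above, and OcrImageToString is just a name made up for this example. It still needs the image file on disk.

```autohotkey
; Sketch: run Tesseract and return its OCR text without writing temp.txt
OcrImageToString(imagePath) {
	EnvGet, localAppData, LOCALAPPDATA ; read the environment variable explicitly
	tesseract := localAppData . "\Tesseract-OCR\tesseract.exe"
	q := Chr(34) ; a double-quote character, used to quote both paths
	; "stdout" as the output base asks Tesseract to print the text (assumed supported by your build)
	cmd := q . tesseract . q . " " . q . imagePath . q . " stdout"
	shell := ComObjCreate("WScript.Shell")
	exec := shell.Exec(cmd) ; note: a console window flashes briefly
	return exec.StdOut.ReadAll() ; blocks until Tesseract closes its output
}

myWinText := OcrImageToString("C:\AHK\WorkingFolder\temp.png")
```

With this, the FileDelete, the FileRead and the Sleep before it would no longer be needed for the text side. Non-ASCII characters may not survive ReadAll intact, which shouldn't matter for plain English game text.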

It would have been nice if Tesseract could have delivered satisfactory results at the default font size, which seemed very clean and readable to this human, even with my marginal eyesight. *sigh*

Anyway, this approach worked out OK for my needs. I hope it might be helpful as an example to others.

Thanks again for this incredible AutoHotKey tool!
