Vis2 - Image to Text OCR()

Post your working scripts, libraries and tools for AHK v1.1 and older
paulpma
Posts: 65
Joined: 08 Sep 2018, 22:05

Re: Vis2 - Image to Text OCR()

19 May 2020, 01:22

@AHKMode
use

Code: Select all

 MouseClickDrag
https://www.autohotkey.com/docs/commands/MouseClickDrag.htm

or

Code: Select all

SendEvent {Click 6, 52, down}{click 45, 52, up}
or

Code: Select all

text:=OCR(x,y,x1,y1)
; see help file.. (last example most likely to work)
AHKMode
Posts: 10
Joined: 20 Apr 2020, 20:21

Re: Vis2 - Image to Text OCR()

20 May 2020, 01:06

Thanks for the reply @paulpma

^w::
text := OCR()
Sleep, 1000
SendEvent {Click 6, 52, down}{click 45, 52, up}
Sleep, 1000
MsgBox, % text

This is my script, and it is not working. The line SendEvent {Click 6, 52, down}{click 45, 52, up} only executes after I click the left mouse button. I want to know which command I should use so that the OCR() function and the MouseClickDrag method work together. I've searched on Google, but I could not put together a script that combines OCR() with something like ControlClick or ControlSend, so I want to know what I should put in the script to combine the OCR() function with MouseClickDrag/SendEvent.

Hopefully you understand me. Thanks again!
paulpma
Posts: 65
Joined: 08 Sep 2018, 22:05

Re: Vis2 - Image to Text OCR()

22 May 2020, 08:50

Hi, @AHKMode
I have tested your script and I am getting the same results. I think it's not working because of the way the Vis2 script was designed. Have you tried the third method?
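
A possible reading of the third method: called with no arguments, OCR() appears to wait for a manual selection, which would explain why the SendEvent line only runs after a click. Below is a minimal sketch, assuming Vis2 accepts a screen rectangle as [x, y, w, h] (the form used elsewhere in this thread). The rectangle values are made up for illustration and need adjusting to your screen.

```autohotkey
#Include <Vis2>  ; assumes Vis2 is installed in a Lib folder

^w::
    ; Hypothetical rectangle around the text between x=6 and x=45 near y=52.
    ; No drag or manual selection is needed when the area is passed directly.
    text := OCR([6, 42, 40, 20])
    MsgBox, % text
return
```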
paulpma
Posts: 65
Joined: 08 Sep 2018, 22:05

Re: Vis2 - Image to Text OCR()

22 May 2020, 09:03

I am using GDIP with VIS and with another application as well, and I am getting an error. It happens every time another application has used the GDIP library and I then manually select text in VIS. The error occurs immediately after the selection. See the error image for details. The line in GDIP that raises the error is the following:

Code: Select all

DllCall("gdiplus\GdipGetImageEncodersSize", "uint*", nCount, "uint*", nSize)
	
which is part of function:

Code: Select all

Gdip_SaveBitmapToFile(pBitmap, sOutput, Quality:=75)
[Image: error screenshot]

Any idea what can cause this issue?

This is the code that is executed prior to using VIS:

Code: Select all

fileName :=  A_YYYY "-" A_MM "-" A_DD "-" A_Hour "-" A_Min "-" A_Sec ".png"
pBitmap := Gdip_Startup()
pBitmap := Gdip_BitmapFromScreen()
saveFileTo := fileName                   
Gdip_SaveBitmapToFile(pBitmap, saveFileTo)
Gdip_DisposeImage(pBitmap)
Gdip_Shutdown(pBitmap)
The whole Gdip_SaveBitmapToFile function:

Code: Select all

Gdip_SaveBitmapToFile(pBitmap, sOutput, Quality:=75)
{
	Ptr := A_PtrSize ? "UPtr" : "UInt"
	nCount := 0
	nSize := 0
	_p := 0

	SplitPath sOutput,,, Extension
	if !RegExMatch(Extension, "^(?i:BMP|DIB|RLE|JPG|JPEG|JPE|JFIF|GIF|TIF|TIFF|PNG)$")
		return -1
	Extension := "." Extension

	DllCall("gdiplus\GdipGetImageEncodersSize", "uint*", nCount, "uint*", nSize)
	VarSetCapacity(ci, nSize)
	DllCall("gdiplus\GdipGetImageEncoders", "uint", nCount, "uint", nSize, Ptr, &ci)
	if !(nCount && nSize)
		return -2

	If (A_IsUnicode){
		StrGet_Name := "StrGet"

		N := (A_AhkVersion < 2) ? nCount : "nCount"
		Loop %N%
		{
			sString := %StrGet_Name%(NumGet(ci, (idx := (48+7*A_PtrSize)*(A_Index-1))+32+3*A_PtrSize), "UTF-16")
			if !InStr(sString, "*" Extension)
				continue

			pCodec := &ci+idx
			break
		}
	} else {
		N := (A_AhkVersion < 2) ? nCount : "nCount"
		Loop %N%
		{
			Location := NumGet(ci, 76*(A_Index-1)+44)
			nSize := DllCall("WideCharToMultiByte", "uint", 0, "uint", 0, "uint", Location, "int", -1, "uint", 0, "int",  0, "uint", 0, "uint", 0)
			VarSetCapacity(sString, nSize)
			DllCall("WideCharToMultiByte", "uint", 0, "uint", 0, "uint", Location, "int", -1, "str", sString, "int", nSize, "uint", 0, "uint", 0)
			if !InStr(sString, "*" Extension)
				continue

			pCodec := &ci+76*(A_Index-1)
			break
		}
	}

	if !pCodec
		return -3

	if (Quality != 75)
	{
		Quality := (Quality < 0) ? 0 : (Quality > 100) ? 100 : Quality
		if RegExMatch(Extension, "^\.(?i:JPG|JPEG|JPE|JFIF)$")
		{
			DllCall("gdiplus\GdipGetEncoderParameterListSize", Ptr, pBitmap, Ptr, pCodec, "uint*", nSize)
			VarSetCapacity(EncoderParameters, nSize, 0)
			DllCall("gdiplus\GdipGetEncoderParameterList", Ptr, pBitmap, Ptr, pCodec, "uint", nSize, Ptr, &EncoderParameters)
			nCount := NumGet(EncoderParameters, "UInt")
			N := (A_AhkVersion < 2) ? nCount : "nCount"
			Loop %N%
			{
				elem := (24+(A_PtrSize ? A_PtrSize : 4))*(A_Index-1) + 4 + (pad := A_PtrSize = 8 ? 4 : 0)
				if (NumGet(EncoderParameters, elem+16, "UInt") = 1) && (NumGet(EncoderParameters, elem+20, "UInt") = 6)
				{
					_p := elem+&EncoderParameters-pad-4
					NumPut(Quality, NumGet(NumPut(4, NumPut(1, _p+0)+20, "UInt")), "UInt")
					break
				}
			}
		}
	}

	if (!A_IsUnicode)
	{
		nSize := DllCall("MultiByteToWideChar", "uint", 0, "uint", 0, Ptr, &sOutput, "int", -1, Ptr, 0, "int", 0)
		VarSetCapacity(wOutput, nSize*2)
		DllCall("MultiByteToWideChar", "uint", 0, "uint", 0, Ptr, &sOutput, "int", -1, Ptr, &wOutput, "int", nSize)
		VarSetCapacity(wOutput, -1)
		if !VarSetCapacity(wOutput)
			return -4
		_E := DllCall("gdiplus\GdipSaveImageToFile", Ptr, pBitmap, Ptr, &wOutput, Ptr, pCodec, "uint", _p ? _p : 0)
	}
	else
		_E := DllCall("gdiplus\GdipSaveImageToFile", Ptr, pBitmap, Ptr, &sOutput, Ptr, pCodec, "uint", _p ? _p : 0)
	return _E ? -5 : 0
}

My knowledge of AHK is not advanced enough to troubleshoot this. Can anyone point me in the right direction? So far I have updated my GDIP library to 1.54. Note that this occurs on Windows 7.
iseahound
Posts: 1434
Joined: 13 Aug 2016, 21:04

Re: Vis2 - Image to Text OCR()

22 May 2020, 15:04

Don't call Gdip_Shutdown() and see if that works.

I wrote a new image library to do what you're doing now :)
paulpma
Posts: 65
Joined: 08 Sep 2018, 22:05

Re: Vis2 - Image to Text OCR()

22 May 2020, 23:23

iseahound wrote:
22 May 2020, 15:04
Don't call Gdip_Shutdown() and see if that works.
Yes, it worked. I am just wondering what happened in the background that caused this issue?
I wrote a new image library to do what you're doing now :)
That is awesome that you can do that. I am happy for you. I am wondering why you would write another one if GDIP exists? My level is nowhere near creating a library like that or working with DLL calls, but I am learning slowly.

Thank you Iseahound for your help. I really appreciate it.
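
For reference, here is a sketch of the capture snippet with the startup token kept separate from the bitmap pointer. In the posted code the return value of Gdip_Startup() is overwritten by the bitmap, so Gdip_Shutdown() receives an already disposed bitmap pointer instead of the token. Whether that fully explains the error is an assumption; the thread only confirms that dropping Gdip_Shutdown() made it go away. The include name Gdip_All is also an assumption about how the library file is named locally.

```autohotkey
#Include <Gdip_All>  ; assumption: the Gdip library is saved as Gdip_All.ahk in a Lib folder

; Start GDI+ once and keep the token in its own variable.
if !(pToken := Gdip_Startup())
{
    MsgBox, GDI+ failed to start.
    ExitApp
}

fileName := A_YYYY "-" A_MM "-" A_DD "-" A_Hour "-" A_Min "-" A_Sec ".png"
pBitmap := Gdip_BitmapFromScreen()                  ; capture the screen
result := Gdip_SaveBitmapToFile(pBitmap, fileName)  ; 0 means success
Gdip_DisposeImage(pBitmap)                          ; free the bitmap only

if (result != 0)
    MsgBox, % "Saving failed with code " result

; Gdip_Shutdown(pToken) is deliberately not called here, because other
; code in the same script (such as Vis2) may still need GDI+.
```

If a shutdown is wanted at all, it would take pToken rather than pBitmap, and would belong at script exit once nothing else uses GDI+.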
stfur
Posts: 1
Joined: 18 Jun 2020, 19:28

Re: Vis2 - Image to Text OCR()

18 Jun 2020, 19:50

Hey, is there any way to set the array of OCR but relative to the specific window?
Something like OCR([X,Y,W,H], "Notepad") or so.
Or maybe set CoordMode to Window?
tom098656
Posts: 1
Joined: 22 Jun 2020, 18:34

Re: Vis2 - Image to Text OCR()

22 Jun 2020, 18:57

stfur wrote:
18 Jun 2020, 19:50
Hey, is there any way to set the array of OCR but relative to the specific window?
Something like OCR([X,Y,W,H], "Notepad") or so.
Or maybe set CoordMode to Window?
I think the use of screen coordinates is baked into the GDip library used by Vis2. Your best bet is probably to do something like this:

Code: Select all

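WindowOCR(x1,y1,x2,y2, windowName)
{
	WinActivate, %windowName%
	WinWaitActive, %windowName%,, 2	; give the window up to 2 seconds to come to the front
	; Pass the title explicitly so the position is read from the intended window
	WinGetPos, WinX, WinY,,, %windowName%
	x := WinX + x1	; convert window-relative coordinates to screen coordinates
	y := WinY + y1
	w := x2-x1
	h := y2-y1
	coords := [x,y,w,h]
	return OCR(coords)
}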
andy5566888
Posts: 10
Joined: 07 May 2020, 02:20

Re: Vis2 - Image to Text OCR()

23 Jun 2020, 04:40

I have a problem: I have a photo that shows the number one, but the background has many colors. When I use OCR, it can't find any text and outputs NULL.
I want to know how to OCR the number one in the photo.
RedRaccoon
Posts: 12
Joined: 04 Dec 2019, 12:42

Re: Vis2 - Image to Text OCR()

21 Aug 2020, 09:22

stfur wrote:
18 Jun 2020, 19:50
Hey, is there any way to set the array of OCR but relative to the specific window?
Something like OCR([X,Y,W,H], "Notepad") or so.
Or maybe set CoordMode to Window?
I was looking for something similar, but since tom098656's solution didn't work for me (because I suck at coding), I used WinActivate.
Activate the desired window, run the code, then activate your initial window. There is a slight delay depending on how big an area you need to read.

Code: Select all

Q::
if WinExist("ahk_exe BlueStacks.exe")
    WinActivate ; use the window found above
OCR("A", ,[77, 682, 741, 800]).clipboard()
WinExist("ahk_exe firefox.exe") ; back to firefox
WinActivate ; use the window found above
return
Albireo
Posts: 1747
Joined: 16 Oct 2013, 13:53

Re: Vis2 - Image to Text OCR()

24 Aug 2020, 15:13

I think the project looks exciting.
Analyzing an area on an image seems to work well (have not tested yet).

Will the result be improved with a large, high-resolution image?
How to analyze an area that is larger than the monitor?

What I miss / want is to be able to analyze fields / columns from a PDF file. (one or many pages) Will it be possible?
Akadin
Posts: 1
Joined: 24 Aug 2020, 15:07

Re: Vis2 - Image to Text OCR()

24 Aug 2020, 18:43

This looks super awesome. Any plans to add multi-monitor support for the coordinates? Every time I add coordinates that are off my main monitor, I get the error "Could not find source image".
gregster
Posts: 8921
Joined: 30 Sep 2013, 06:48

Re: Vis2 - Image to Text OCR()

24 Aug 2020, 18:58

Albireo wrote:
24 Aug 2020, 15:13
What I miss / want is to be able to analyze fields / columns from a PDF file. (one or many pages) Will it be possible?
Afaik, there are third-party PDF-to-text tools available on the interwebs which do not rely on OCR, I think (but I haven't used one, at least not recently, and I don't know how they'd handle PDFs with restricted rights).
OCR should perhaps serve as a last resort for this use case.
Albireo
Posts: 1747
Joined: 16 Oct 2013, 13:53

Re: Vis2 - Image to Text OCR()

27 Aug 2020, 06:33

From the little I have tested, Vis2 seems to interpret text very well.
But how do I automatically select an area to convert to text in a .png file?
Is it possible to graphically select the area of an image, and then use the result in an AHK program?
(Something I missed?)
iseahound
Posts: 1434
Joined: 13 Aug 2016, 21:04

Re: Vis2 - Image to Text OCR()

31 Aug 2020, 09:03

I'm sure you can do something like: OCR("mypic.png", , [0,0,100,100]) where the array is the part of the image to crop.
Albireo
Posts: 1747
Joined: 16 Oct 2013, 13:53

Re: Vis2 - Image to Text OCR()

01 Sep 2020, 07:11

iseahound wrote:
31 Aug 2020, 09:03
I'm sure you can do something like: OCR("mypic.png", , [0,0,100,100]) where the array is the part of the image to crop.
Yes! (thanks)
It works in some way, but… I have a problem with the coordinates.
How do I get the coordinates? (Window Spy doesn't show me the right values.)
When I run the OCR program like this:

Code: Select all

FileAppend % OCR(imgFile, "swe", [200,700,100,75]), %resFile%
It scans an area somewhere else. (I never open the picture before running OCR.)
iseahound
Posts: 1434
Joined: 13 Aug 2016, 21:04

Re: Vis2 - Image to Text OCR()

01 Sep 2020, 09:01

If you just want to OCR an image:

Code: Select all

MsgBox % OCR("text.jpg")
If you want to crop the image, pass [x, y, w, h], where x,y is the top-left corner of the region to keep and w,h are the width and height of that region. Open the image in an editor like Paint to view the x,y coordinates of your image.

Code: Select all

MsgBox % OCR("text.jpg", "swe", [x,y,w,h])
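
Since an image editor typically shows corner positions rather than a width and height, a small conversion step can help. The helper below is hypothetical (CornersToRect is not part of Vis2); it only turns two corner points into the [x, y, w, h] array form described above.

```autohotkey
; Hypothetical helper: convert two corner points, as read in an image
; editor, into the [x, y, w, h] array form.
CornersToRect(x1, y1, x2, y2)
{
    x := (x1 < x2) ? x1 : x2    ; left edge
    y := (y1 < y2) ? y1 : y2    ; top edge
    return [x, y, Abs(x2 - x1), Abs(y2 - y1)]
}

; Example: corners (200, 700) and (300, 775) give a 100 x 75 region.
rect := CornersToRect(200, 700, 300, 775)
MsgBox, % rect[1] ", " rect[2] ", " rect[3] ", " rect[4]
```

The resulting array could then be passed as the third argument, as in OCR("text.jpg", "swe", rect).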
Botsy
Posts: 19
Joined: 25 Aug 2020, 16:59

Re: Vis2 - Image to Text OCR()

01 Sep 2020, 10:43

Hi all, maybe you can help me with this:

How do I specify coordinates in OCR() that are obtained from the mouse position when a key is pressed?
In general, I want to make a script like this: I place the mouse over the desired area and, on pressing the key, the coordinates are written to variables. I use these variables in OCR([]) as the area for continuous scanning. Then I compare a predetermined value with what was read in the OCR area.

Code: Select all

 
#include <Vis2> 

HS = 2
F11::

Highlight(HS)

Highlight(TH)
{
	local
  if (x="")
  {
    VarSetCapacity(pt,16,0), DllCall("GetCursorPos","ptr",&pt)
    x:=NumGet(pt,0,"uint"), y:=NumGet(pt,4,"uint")
  }
  x:=Round(x), y:=Round(y)

   w = 25
   h = 25
   Gui, +LastFound +ToolWindow -Caption +AlwaysOnTop
   Gui, Color, Red
   Gui, Show, x%X% y%Y% w%W% h%H% Hide
   Options := "0-0 " W "-0 " W "-" H " 0-" H " 0-0 " TH "-" TH
      . " " W-TH "-" TH " " W-TH "-" H-TH " " TH "-" H-TH " " TH "-" TH
   WinSet, Region, % Options
   Gui, Show, NA
   KeyWait, F11
   Gui, Destroy
}

text := OCR([%x%, %y%, %w%, %h%])
msgBox, % text

return

Esc:: ExitApp

In this script, pressing F11 highlights the area where the cursor is located; then I get the coordinates and write them to variables.
[Attachment: test.jpg]
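
One way the idea above might be wired up is sketched here, with the highlight GUI left out for brevity. It assumes OCR() accepts a screen rectangle as [x, y, w, h], as used elsewhere in this thread, and passes the variables directly in the array rather than wrapping them in percent signs, since array literals are expressions.

```autohotkey
#Include <Vis2>

CoordMode, Mouse, Screen   ; report the cursor position in screen coordinates
w := 25                    ; size of the scanned square, taken from the post
h := 25

F11::
    MouseGetPos, x, y      ; top-left corner = current cursor position
    text := OCR([x, y, w, h])
    MsgBox, % text
return

Esc::ExitApp
```

Because x, y, w and h are ordinary global variables here, they stay available after the hotkey returns and could be reused for repeated scans of the same area.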
Albireo
Posts: 1747
Joined: 16 Oct 2013, 13:53

Re: Vis2 - Image to Text OCR()

02 Sep 2020, 08:15

iseahound wrote:
01 Sep 2020, 09:01
… If you want to crop out some spaces [x, y, w, h] where x,y are the top left coordinate of the image, and w,h are the width and height of the image to keep. Open the image in an editor like paint to view the x,y coordinates of your image.
I had come to understand it that way (in the end).
I divided a row into several fields (columns), but one field, the quantity field, is not interpreted correctly by OCR.
That area contains e.g. 10.0 (or 24.0 or ..), but Vis2 reads 10. or 24.
And when the area contains 6.0 or 8.0, I get nothing (empty result).
Right now I have no explanation for this result.

Another thing I am thinking about is the time Vis2 needs to interpret the content.
Right now each row takes about 4 seconds. (20 rows = 80 seconds, 100 rows = 400 seconds)
OCR Invoice - My test program (unfortunately, the comments are in Swedish)
iseahound
Posts: 1434
Joined: 13 Aug 2016, 21:04

Re: Vis2 - Image to Text OCR()

02 Sep 2020, 12:26

There are three tessdata trained models:

https://github.com/tesseract-ocr/tessdata_best
https://github.com/tesseract-ocr/tessdata
https://github.com/tesseract-ocr/tessdata_fast

The tessdata_best folder is used when you call OCR() without the GUI.
When using the GUI the tessdata_fast folder is used.

I recommend you replace what is in tessdata_best with the tessdata or tessdata_fast model to get faster performance.

Regarding nothing showing up - you may have to change the x,y,w,h values by a pixel or two - that seems to have a large effect on the final outcome.
