Vis2 - Image to Text OCR()

iseahound · Post by **iseahound** » 22 Jun 2019, 07:15

To input a list of arrays, just loop through the list and call OCR() with each array. You’ll need to save each output to your own list.
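A minimal sketch of that loop, assuming Vis2.ahk sits in a lib folder so that `#Include <Vis2>` resolves, and that each array uses the OCR([x, y, w, h]) form that appears later in this thread. The coordinates and the names `regions` and `results` are made up for illustration.

```autohotkey
#Include <Vis2>   ; assumption: Vis2.ahk is in a lib folder next to the script

; Hypothetical screen rectangles, each as [x, y, w, h].
regions := [[10, 10, 200, 40], [10, 60, 200, 40], [10, 110, 200, 40]]

; Collect one OCR result per rectangle, in the same order.
results := []
for index, rect in regions
    results.Push(OCR(rect))

; Show each result next to its index.
for index, text in results
    MsgBox, % "Region " index ": " text
```

Keeping the results in a parallel array means results[n] always belongs to regions[n].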

Alternatively, if you’re talking about retrieving the bounding box of each text element (for example, if there are 3 paragraphs and you want the coordinates of each paragraph), that’s something I’m currently working on.

teosc · Post by **teosc** » 21 Nov 2019, 10:06

I think I found a bug.

I have a dual-screen setup: main 1080p and secondary 2160p.

OCR([x, y, w, h]) does not accept negative values.

Example:
clipboard := OCR([3385, -235, 165, 20])

It returns the error: "Could not find source image"

It would also be enough to be able to set CoordMode to "Relative", but even when I issue that command, OCR() seems to always use screen coordinates.

Tips?

Trigun · Post by **Trigun** » 04 Jan 2020, 09:20

Hello,
I'm trying to read numbers like this

but I get almost all wrong results. The comma is always translated into something else (sometimes 1, sometimes 4).
For example, that image is read as "281000" instead of 23000 or 23,000.
I think it's because it tries to read the black part instead of the yellow.
How can I improve the OCR?
Any hints?

Post by **Hill** » 04 Jan 2020, 13:03

Vis2 is a great script, very useful, and it has always worked like a charm.
I use a script that scans various windows and uses Vis2 to read a portion of their screen. Very recently it randomly started to pop up a window with this error warning:

Exception Thrown!
what:Vis2.provider.tesseract.convert file c:...\vis2 library directory\vis2.ahk line 2120 message:tesseract failed
etc.....

Line 2120 contains the following code:

whr.Open("POST", "https://cxl-services.appspot.com/proxy?url=https%3A%2F%2Fvision.googleapis.com%2Fv1%2Fimages%3Aannotate", true)

When I confirm the error window (there is an OK button), the loop continues with the next window and Vis2 works fine again.
I can't really understand where the problem is, so any help would be great.
In the meantime I just commented out that line and the script no longer shows errors.

Post by **6U7D6dwAcJ** » 09 Jan 2020, 11:48

Hello,
Thanks to iseahound for this great tool!

Does anyone know how to scan a given area in the active window under the mouse pointer? I do not need to scan the entire contents of the window, only a specific area.
Thank you.

bourdin07 · Post by **bourdin07** » 04 Feb 2020, 14:28

Very good plugin; I just integrated it into my AHK framework. I have one question:

Does Google Cloud Vision work?

paulpma · Post by **paulpma** » 21 Apr 2020, 12:42

Hello all,

Has anyone found an option to disable the subtitles (the preview of the text) on the screen? I have tried the following, but no luck:

Code: Select all

text:= Vis2.OCR(,,,{"bypass":true})
text:= Vis2.OCR(,,,bypass:=true) 
text:= Vis2.OCR(,,bypass:=true) 
text:= OCR(,,textPreview:=false) 
text:= OCR(,,{"textPreview":false})

My next thought is to manually edit the script.

Also, does anyone know why, if I call the script via ~RButton:: OCR(), I can only execute it once and have to reload it for the next capture, whereas #z:: OCR() allows multiple executions?

Your help is appreciated. Thank you.

iseahound · Post by **iseahound** » 21 Apr 2020, 16:31

So you just want the grey box with no subtitles at all? That's a weird request. Comment out every instance of Vis2.obj.subtitle.render

You can't bind to the right-click key. If you really want to, try deleting every instance of Hotkey, RButton. Results may be somewhat undesirable, but it should have no large impact on the script. Right click is a shortcut for repositioning the window. Honestly, try using XButton1 or MButton.
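As a rough sketch of the suggested alternative, binding the capture to the middle mouse button could look like this. It assumes `#Include <Vis2>` resolves to the library and uses only the parameterless OCR() call already shown in this thread.

```autohotkey
#Include <Vis2>   ; assumption: Vis2.ahk is in a lib folder next to the script

; Middle mouse button starts the interactive capture instead of RButton.
MButton::
    text := OCR()
    MsgBox, % text
return
```

Swap MButton for XButton1 if the middle button is needed elsewhere.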

iseahound · Post by **iseahound** » 21 Apr 2020, 16:38

Hill wrote: ↑
04 Jan 2020, 13:03
Vis2 is a great script, very useful, and it has always worked like a charm.
I use a script that scans various windows and uses Vis2 to read a portion of their screen. Very recently it randomly started to pop up a window with this error warning:

Exception Thrown!
what:Vis2.provider.tesseract.convert file c:...\vis2 library directory\vis2.ahk line 2120 message:tesseract failed
etc.....

Line 2120 contains the following code:

whr.Open("POST", "https://cxl-services.appspot.com/proxy?url=https%3A%2F%2Fvision.googleapis.com%2Fv1%2Fimages%3Aannotate", true)

When I confirm the error window (there is an OK button), the loop continues with the next window and Vis2 works fine again.
I can't really understand where the problem is, so any help would be great.
In the meantime I just commented out that line and the script no longer shows errors.

I discovered that adding the Critical command fixes lots of bugs.

Code: Select all

         class process {

            selectImage(){
            Critical                                ; Add this
            static selectImage := ObjBindMethod(Vis2.core.ux.process, "selectImage")

Code: Select all

            textPreview(bypass:=""){
            Critical                                ; Add this
            static textPreview := ObjBindMethod(Vis2.core.ux.process, "textPreview")

If you use Vis2 a lot, making the loops critical stops them from interrupting each other.

paulpma · Post by **paulpma** » 21 Apr 2020, 21:50

iseahound, thank you for the fast reply. I really appreciate it.

iseahound wrote: ↑
21 Apr 2020, 16:31
So you just want the grey box with no subtitles at all? That's a weird request. Comment out every instance of Vis2.obj.subtitle.render

I did something similar, but only in the places where it was really needed. Sorry, I misunderstood the options="" parameter.

iseahound wrote: ↑
21 Apr 2020, 16:31
You can't bind to the right-click key. If you really want to, try deleting every instance of Hotkey, RButton. Results may be somewhat undesirable, but it should have no large impact on the script. Right click is a shortcut for repositioning the window. Honestly, try using XButton1 or MButton.

I will give this a shot. Most likely I will try MButton.

Thank you very much.

paulpma · Post by **paulpma** » 25 Apr 2020, 00:16

Hello,

I have been using

Code: Select all

text:= OCR()

I see that the text is also copied to the clipboard. Is there a way to disable the copy-to-clipboard feature? Or can I just edit a few lines in Vis2.ahk? Or is this even possible without major coding? The reason I am asking: I need to run multiple OCR calls and would prefer to keep my clipboard clean for other functions.

Thank you very much all.

Paul.

paulpma · Post by **paulpma** » 01 May 2020, 22:09

I have looked at the Vis2 file, and disabling the clipboard copy there seems too complex. However, I have found a simple solution.

Code: Select all

text := OCR()
Clipboard := ""  ; clear the text that OCR() copied

Thank you all.
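If the goal is to keep whatever was on the clipboard before the capture, rather than emptying it, one possible variant is to back it up first with AHK v1's built-in ClipboardAll and restore it afterwards. This is only a sketch: it assumes OCR() has finished writing to the clipboard by the time it returns, and `saved` is an illustrative variable name.

```autohotkey
#Include <Vis2>   ; assumption: Vis2.ahk is in a lib folder next to the script

saved := ClipboardAll   ; back up the current clipboard contents (all formats)
text := OCR()           ; OCR() also copies its result to the clipboard
Clipboard := saved      ; put the previous contents back
saved := ""             ; release the memory held by the backup

MsgBox, % text
```

The OCR result stays available in `text`, so nothing is lost by restoring the old clipboard.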

paulpma · Post by **paulpma** » 01 May 2020, 22:51

10 second difference??

Code: Select all

text := OCR()

vs.

Code: Select all

text := OCR([21,33,750,540])

*The captured area and the image are the same.

I have noticed that using the selection tool is much faster (about 10 seconds faster for my amount of text) than specifying coordinates. I would have expected the opposite, or the same amount of time. I have also noticed the same phenomenon when using OCR(pBitmap). Can someone enlighten me on this subject?

Thank you.

maleko48 · Post by **maleko48** » 05 May 2020, 08:54

Greetings, I have been tinkering with Vis2 to automate some stuff I do at work that requires basic OCR and projects like yours inspire me to learn as much as I can to hopefully help contribute at some point in the future as I progress through my programming and computer science education.

Just like @paulpma I too have noticed if I do the manually-initiated drag box it spits out the results really fast, but if I use the coordinates method it takes 20-60 seconds or more in some cases... If there is a way I can get the faster performance while using coordinate driven OCR, please help me learn how to set it up. (btw I have a 3 monitor setup both at home and work)

Additionally, I was considering initiating the manual drag-box via OCR(), and then using AHK to manually MouseMove() Click and hold and then MouseMove() again to the other corner to get around the performance limitation of using a custom-made function and the coordinates OCR([]) but first I need to figure out how to even do that much.

One last thing... Idk if you are familiar with Project Naptha, but it appears to use tesseract also. Is it possible to use it (especially if an offline mode can be adapted/ported) as the primary OCR engine for this AHK app somehow? Can I use its traineddata file in place of yours or will that not really do anything? I wish it were possible to merge the performance and robustness of Project Naptha with the system-wide universality and ease-of-use your AHK core app brings to the table.

Regardless, THANK YOU for putting this out into the world. Cheers! (^_^)

submeg · Post by **submeg** » 06 May 2020, 06:38

@iseahound,

I just wanted to say THANK YOU. This is amazing! I have implemented it into my main script and have shared it with the others that use AHK. Freaking incredible.

iseahound · Post by **iseahound** » 07 May 2020, 18:14

Yeah, there's a difference between using the UI and running a command. I'd recommend replacing both .traineddata files with the one here:

https://github.com/tesseract-ocr/tessdata

The defaults included are the ones here: fast and best.
https://github.com/tesseract-ocr/tessdata_best
https://github.com/tesseract-ocr/tessdata_fast

AHKMode · Post by **AHKMode** » 09 May 2020, 20:39

Hello Guys,

I've been using this in my script with fixed coordinates on the active window, but I get no response. Can you help me? Below is my code. Thanks!

text := OCR([56,217,91,232]) ; these coordinates are inside the active window (.exe)
MsgBox, % text

Nothing happens.

AHKMode · Post by **AHKMode** » 14 May 2020, 07:51

Hello @euras
Can you message me on how to script the OCR with MouseClickDrag?

"second: does coordinates method shares the same functions as mouseclickdrag method? because when I use coordinates mode to get the text, I get the incorrect text translation, but when I use mouseclickdrag method, then the text is converted without errors..."

thanks!

iseahound · Post by **iseahound** » 14 May 2020, 16:55

Try text := OCR("A", , [56,217,91,232]). "A" stands for the active window; it is AHK's special WinTitle value, not a variable. Putting an array in the 3rd parameter will crop inside the active window.
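Put together as a small script, that call might look like the sketch below. The ^q hotkey is borrowed from elsewhere in this thread, and `rect` is an illustrative name. Going by the OCR([x, y, w, h]) form quoted earlier, the array is presumably read as x 56, y 217, width 91, height 232, not as two corner points.

```autohotkey
#Include <Vis2>   ; assumption: Vis2.ahk is in a lib folder next to the script

; Crop rectangle inside the active window, presumably [x, y, w, h].
rect := [56, 217, 91, 232]

; Ctrl+Q reads the cropped area of whichever window is active.
^q::
    text := OCR("A", , rect)
    MsgBox, % text
return
```

If the text sits between the points (56, 217) and (91, 232), the width and height would be 35 and 15 instead.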

AHKMode · Post by **AHKMode** » 16 May 2020, 00:44

Thanks @iseahound!

By the way, what if I want to activate OCR plus a MouseClickDrag (draw a rectangle) method? Can you help with my script below?

^q::
text := OCR()
textmouse := (OCR() && mouse)

mouse:
MouseClick, L, 580, 230
sleep, 1000
MouseClick, L, 400, 300
sleep, 1000

MsgBox, % textmouse
return

Thank you!
