Optical character recognition (OCR) with UWP API

feiy
Posts: 5
Joined: 16 Apr 2020, 01:39

Re: Optical character recognition (OCR) with UWP API

01 Jun 2020, 09:53

Yes. English (United States)
feiy
Posts: 5
Joined: 16 Apr 2020, 01:39

01 Jun 2020, 10:02

While this is a very useful script, I think there is some system-level problem on my computer, so I decided to give up on it. Thank you for your answer.
morph
Posts: 1
Joined: 17 Aug 2020, 00:34

17 Aug 2020, 00:58

Hi, first of all thanks a lot for this tool; as others say, it is way better than Tesseract. At the moment I'm trying to run the OCR tool in a loop over my Discord messages at intervals of less than 1 second. Given how fast it did the job, I thought this wouldn't pose a problem. But after a few loops it randomly gets stuck in a loop waiting for a DLL call:

Code:

400: WaitForAsync(SoftwareBitmap)  
462: AsyncInfo := ComObjQuery(Object, IAsyncInfo := "{00000036-0000-0000-C000-000000000046}")
463: Loop
465: DllCall(NumGet(NumGet(AsyncInfo+0)+7*A_PtrSize), "ptr", AsyncInfo, "uint*", status)  
466: if (status != 0)  
477: Sleep,10 (0.02)
It will then receive status = 0 about 20 seconds later and resume, only to get stuck the same way again a few seconds later. I'm really not that savvy with Windows DLL calls. At first I thought that maybe I was overwriting the async task too fast, so that it didn't get to finish before I asked it to run again. But I get the same result whether I loop every 200 ms, 1000 ms or 3000 ms...

Does anyone have any idea why this is happening? Any hints on how to solve it?
Thanks!
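
One way to keep a stuck call from blocking the whole loop is to bound the polling with a timeout. This is only a sketch and not part of the original script: WaitForAsyncTimeout and its TimeoutMs parameter are hypothetical names, and it assumes AsyncInfo is the IAsyncInfo pointer returned by ComObjQuery, as in the lines quoted above. It stops the wait; it does not explain or cure the delay itself.

```autohotkey
; Hypothetical helper (not in the original script): poll IAsyncInfo::get_Status
; (vtable slot 7, as in the quoted lines) but give up after TimeoutMs milliseconds.
; Returns the AsyncStatus value (1 = Completed, 2 = Canceled, 3 = Error),
; or -1 if the operation is still in the Started state (0) when the timeout expires.
WaitForAsyncTimeout(AsyncInfo, TimeoutMs := 5000)
{
    start := A_TickCount
    Loop
    {
        status := 0
        DllCall(NumGet(NumGet(AsyncInfo+0)+7*A_PtrSize), "ptr", AsyncInfo, "uint*", status)
        if (status != 0)
            return status
        if (A_TickCount - start >= TimeoutMs)
            return -1
        Sleep, 10
    }
}
```

A caller could treat -1 as "skip this capture and try again on the next pass". What to do with the abandoned operation (cancelling it, releasing the pointers) is deliberately left out here.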
rommmcek
Posts: 1470
Joined: 15 Aug 2014, 15:18

17 Aug 2020, 17:05

Try this approach. Works for me.
Spoiler
teehehe
Posts: 2
Joined: 20 Aug 2020, 05:46

20 Aug 2020, 05:50

After running my script for a bit I am getting an asyncinfo error. Is there somewhere to see what the error numbers mean? asyncinfo status error 2147942414
malcev
Posts: 1769
Joined: 12 Aug 2014, 12:37

20 Aug 2020, 06:09

It means E_OUTOFMEMORY.
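
The number is easier to look up once it is shown as a hexadecimal HRESULT. A small sketch using the built-in Format function (the variable names are only for illustration):

```autohotkey
code := 2147942414                ; value from the "asyncinfo status error" message
hex := Format("0x{:08X}", code)   ; "0x8007000E"
MsgBox, % hex
```

0x8007000E is the HRESULT value of E_OUTOFMEMORY, and the hex form is what the Windows error code tables list.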
william_ahk
Posts: 481
Joined: 03 Dec 2018, 20:02

25 Oct 2020, 00:41

It seems that it returns blank for text without much whitespace. Is there any way to extend the whitespace or bypass this?
malcev
Posts: 1769
Joined: 12 Aug 2014, 12:37

25 Oct 2020, 02:05

This API does not have such a method.
william_ahk
Posts: 481
Joined: 03 Dec 2018, 20:02

28 Oct 2020, 08:58

I used Gdip to upscale the smaller selections and now it's working great 🎉

Code:

area := GetArea()
pBitmap := Gdip_BitmapFromScreen(area.x "|" area.y "|" area.w "|" area.h)
; Assuming Gdip_ResizeBitmap returns a new bitmap (as in Gdip_All.ahk), keep both
; pointers so the original capture can be disposed too instead of being leaked.
pResized := Gdip_ResizeBitmap(pBitmap, 500, 500, true)
Gdip_DisposeImage(pBitmap)
;Gdip_SaveBitmapToFile(pResized, "output_test.png")
hBitmap := Gdip_CreateHBITMAPFromBitmap(pResized)
pIRandomAccessStream := HBitmapToRandomAccessStream(hBitmap)
DllCall("DeleteObject", "Ptr", hBitmap)
Gdip_DisposeImage(pResized)
text := ocr(pIRandomAccessStream, "en-US")
MsgBox, % text
Return
There is just one problem: does this API expose any settings like page segmentation mode = single character in Tesseract? It doesn't seem to recognize single numbers or letters.
malcev
Posts: 1769
Joined: 12 Aug 2014, 12:37

28 Oct 2020, 16:01

does this api expose any settings like page segmentation mode = single character in tesseract?
No.
Qhimin
Posts: 16
Joined: 30 Nov 2020, 19:24

30 Nov 2020, 19:30

william_ahk wrote:
28 Oct 2020, 08:58
I used Gdip to upscale the smaller selections and now it's working great 🎉

Thank you for the code! I have tried to use that but I'm getting the following error: "asyncinfo status error 2291674960". Do you know what could be happening?
I'm trying to get the current number in this image:
Image
[Mod edit: Image fixed.]
If someone has advice on how I should use the OCR to identify it, that would be amazing, because I'm failing 100%.
malcev
Posts: 1769
Joined: 12 Aug 2014, 12:37

30 Nov 2020, 20:06

This api is for windows >= 8.1
Qhimin
Posts: 16
Joined: 30 Nov 2020, 19:24

30 Nov 2020, 20:09

malcev wrote:
30 Nov 2020, 20:06
This api is for windows >= 8.1
I'm running Windows 10 Pro 20H2. :|
malcev
Posts: 1769
Joined: 12 Aug 2014, 12:37

30 Nov 2020, 20:24

If You use the code from the first post will it work for You?
Qhimin
Posts: 16
Joined: 30 Nov 2020, 19:24

30 Nov 2020, 20:34

malcev wrote:
30 Nov 2020, 20:24
If You use the code from the first post will it work for You?
Yep, it does work. I'm running Gdip v1.85; I just included it and changed the beginning of the file:

Code:

^X::
area := GetArea()
pBitmap := Gdip_BitmapFromScreen(area.x "|" area.y "|" area.w "|" area.h)
pBitmap := Gdip_ResizeBitmap(pBitmap, 500, 500, true)
;Gdip_SaveBitmapToFile(pBitmap, "output_test.png")
hBitmap := Gdip_CreateHBITMAPFromBitmap(pBitmap)
pIRandomAccessStream := HBitmapToRandomAccessStream(hBitmap)
DllCall("DeleteObject", "Ptr", hBitmap)
Gdip_DisposeImage(pBitmap)
text := ocr(pIRandomAccessStream, "pt-BR")
MsgBox, % text
Return
I want to find text in a fixed image on screen (the image in the post above), and since the numbers in the image are too small I want to resize it and see if the text can finally be recognized.
If you know a better way to do that, I am totally open to suggestions.
malcev
Posts: 1769
Joined: 12 Aug 2014, 12:37

30 Nov 2020, 21:31

You have to load gdiplus library for such functions.
Qhimin
Posts: 16
Joined: 30 Nov 2020, 19:24

30 Nov 2020, 22:06

malcev wrote:
30 Nov 2020, 21:31
You have to load gdiplus library for such functions.
Which version specifically? I've tried v1.45 and, like I said before, v1.85 too, but neither worked.
Also thanks for the help :D
william_ahk
Posts: 481
Joined: 03 Dec 2018, 20:02

30 Nov 2020, 22:11

Qhimin wrote:
30 Nov 2020, 20:34
If you know a better way for that I am totally open to suggestions.
Actually I'm using Capture2Text now since this API cannot recognize single letters/digits. Here's a simple function I've written for Capture2Text:

Code:

Capture2Text(coords) {
    static program_path := ".\Capture2Text\Capture2Text_CLI.exe"
    text := RunCMD(program_path . " --screen-rect """ . coords[1] . " " . coords[2] . " " coords[3] . " " . coords[4] """")
    ;if an error occurred, return an empty string
    if (text = "Error, OCR failure.`r`n<Error>`r`n") {
        return ""
    } else {
        return RTrim(text, "`r`n")
    }
}

Also needs RunCMD by SKAN

Testing your image it appears Capture2Text can recognize it perfectly fine:
c2t.png
malcev
Posts: 1769
Joined: 12 Aug 2014, 12:37

01 Dec 2020, 01:46

Just insert Gdip_Startup() in Your code.
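
For anyone hitting the same error, here is a minimal sketch of where the call goes. It assumes the Gdip library is included as Gdip_All.ahk (adjust the file name to your copy), that GetArea, HBitmapToRandomAccessStream and ocr are defined elsewhere as in the scripts from this thread, and that Gdip_ResizeBitmap returns a new bitmap. Without Gdip_Startup() the Gdip_* calls have no working gdiplus behind them, so the stream handed to the OCR code does not contain a usable image. The status 2291674960 reported above is 0x88982F50 in hex, which as far as I can tell is WINCODEC_ERR_COMPONENTNOTFOUND, and that would be consistent with this.

```autohotkey
#Include Gdip_All.ahk          ; assumed file name of the Gdip library

pToken := Gdip_Startup()       ; load gdiplus once, before any other Gdip_* call
if !pToken
{
    MsgBox, Gdiplus failed to start.
    ExitApp
}
return                         ; end of the auto-execute section

^x::
    area := GetArea()          ; from the script in this thread (not defined here)
    pBitmap := Gdip_BitmapFromScreen(area.x "|" area.y "|" area.w "|" area.h)
    pResized := Gdip_ResizeBitmap(pBitmap, 500, 500, true)   ; assumed to return a new bitmap
    Gdip_DisposeImage(pBitmap)
    hBitmap := Gdip_CreateHBITMAPFromBitmap(pResized)
    pIRandomAccessStream := HBitmapToRandomAccessStream(hBitmap)
    DllCall("DeleteObject", "Ptr", hBitmap)
    Gdip_DisposeImage(pResized)
    text := ocr(pIRandomAccessStream, "pt-BR")
    MsgBox, % text
return
```

When the script is finished with Gdip, Gdip_Shutdown(pToken) releases the library again.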
Qhimin
Posts: 16
Joined: 30 Nov 2020, 19:24

01 Dec 2020, 09:19

malcev wrote:
01 Dec 2020, 01:46
Just insert Gdip_Startup() in Your code.
Hum, understood! Thank you sir!
william_ahk wrote:
30 Nov 2020, 22:11
Qhimin wrote:
30 Nov 2020, 20:34
If you know a better way for that I am totally open to suggestions.
Actually I'm using Capture2Text now since this API cannot recognize single letters/digits.
Thank you very very much!
