Optical character recognition (OCR) with UWP API

Post your working scripts, libraries and tools for AHK v1.1 and older
mebo
Posts: 14
Joined: 22 Apr 2021, 16:08

Post by mebo » 23 Apr 2021, 15:16

The lag is definitely gone, but now it's just extremely choppy when dragging. I ended up changing

Code: Select all

SetTimer, % timer, -10
to

Code: Select all

SetTimer, % timer, -0
and this thing is performant! The quick assistance is much appreciated. Have a great weekend!
doubledave22
Posts: 343
Joined: 08 Jun 2019, 17:36

Post by doubledave22 » 15 Jun 2021, 12:25

Hi, I'm noticing a small memory leak with the OCR code. My script starts out at 45 MB and after an hour or so it's up to over 100 MB. I know it's not a big deal, but sometimes my users run my script for many days or weeks in a row and I don't want to cause issues. I have tried removing every extra bit from the code, but the only thing that stops the leak is removing the lines with pIRandomAccessStream. Is there anything I need to do to prevent such a leak?

Code: Select all


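read_bitmap(hwnd)
{
	; Capture the window, then crop the region that contains the text
	pBitmap := Gdip_BitmapFromScreen("hwnd:" hwnd)
	pBitmap2 := Gdip_CropImage(pBitmap, 300, 413, 150, 60)

	; Convert to an HBITMAP and wrap it in a stream for the OCR function
	hBitmap := Gdip_CreateHBITMAPFromBitmap(pBitmap2)
	pIRandomAccessStream := HBitmapToRandomAccessStream(hBitmap)
	Found_Text := ocr(pIRandomAccessStream)

	; Free the GDI and GDI+ objects
	DeleteObject(hBitmap)
	Gdip_DisposeImage(pBitmap)
	Gdip_DisposeImage(pBitmap2)
	return Found_Text
}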

Gdip_CropImage(pBitmap, x, y, w, h)
{
   pBitmap2 := Gdip_CreateBitmap(w, h), G2 := Gdip_GraphicsFromImage(pBitmap2)
   Gdip_DrawImage(G2, pBitmap, 0, 0, w, h, x, y, w, h) 
   Gdip_DeleteGraphics(G2)
   return pBitmap2
}

teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41

Post by teadrinker » 15 Jun 2021, 15:03

@malcev
I see that not all objects are completely released:

Code: Select all

   MsgBox, % ObjRelease(IRandomAccessStream)
   MsgBox, % ObjRelease(BitmapDecoder)
   MsgBox, % ObjRelease(BitmapFrame)
   MsgBox, % ObjRelease(BitmapFrameWithSoftwareBitmap)
   MsgBox, % ObjRelease(SoftwareBitmap)
   MsgBox, % ObjRelease(OcrResult)
   MsgBox, % ObjRelease(LinesList)
malcev
Posts: 1769
Joined: 12 Aug 2014, 12:37

Post by malcev » 15 Jun 2021, 15:59

Why do you think so?
teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41

Post by teadrinker » 15 Jun 2021, 16:09

Not all of them return 0, so some references stay in memory.
malcev
Posts: 1769
Joined: 12 Aug 2014, 12:37

Post by malcev » 15 Jun 2021, 16:15

I do not think so.
Compare

Code: Select all

MsgBox, % ObjRelease(IRandomAccessStream)
ObjRelease(BitmapDecoder)
ObjRelease(BitmapFrame)
ObjRelease(BitmapFrameWithSoftwareBitmap)
ObjRelease(SoftwareBitmap)
ObjRelease(OcrResult)
ObjRelease(LinesList)
and

Code: Select all

ObjRelease(BitmapDecoder)
ObjRelease(BitmapFrame)
ObjRelease(BitmapFrameWithSoftwareBitmap)
ObjRelease(SoftwareBitmap)
ObjRelease(OcrResult)
ObjRelease(LinesList)
MsgBox, % ObjRelease(IRandomAccessStream)
teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41

Post by teadrinker » 15 Jun 2021, 16:28

Yes, this shows that some of the objects hold references to others. But can you arrange them so that every ObjRelease() call returns 0?
teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41

Post by teadrinker » 15 Jun 2021, 17:25

Perhaps not. Now I think it's enough that ObjRelease() returns 0 for whichever object is released last.
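
To illustrate what the return value of ObjRelease() means, here is a minimal sketch for AutoHotkey v1.1 that uses a plain script object instead of the WinRT objects from the OCR function (an assumption made only to keep the example self-contained). ObjAddRef() and ObjRelease() both return the new reference count, and the AutoHotkey documentation notes that this value should be used only for debugging. A non-zero result simply means that something else still holds a reference at that moment.

```autohotkey
; Sketch only: a plain AHK object stands in for a COM interface pointer.
obj := {}                     ; the variable obj holds one reference
ptr := &obj                   ; raw address; taking it does not add a reference
afterAdd := ObjAddRef(ptr)    ; count goes up by one, new count is returned
afterRel := ObjRelease(ptr)   ; count goes down by one, obj is still alive
MsgBox, % "After ObjAddRef: " afterAdd "`nAfter ObjRelease: " afterRel
```

The second number is one lower than the first but not zero, because the variable obj still holds its own reference. The same logic would explain why an object released before the objects that depend on it can report a non-zero count.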
kczx3
Posts: 1640
Joined: 06 Oct 2015, 21:39

Post by kczx3 » 15 Jun 2021, 21:05

Regardless… there’s memory leaking. Seems oddly similar to the pdf leak of uwp also
doubledave22
Posts: 343
Joined: 08 Jun 2019, 17:36

Post by doubledave22 » 16 Jun 2021, 09:26

kczx3 wrote:
15 Jun 2021, 21:05
Regardless… there’s memory leaking. Seems oddly similar to the pdf leak of uwp also
Are you noticing this as well? I'd definitely like to know if it's not just me.
kczx3
Posts: 1640
Joined: 06 Oct 2015, 21:39

Post by kczx3 » 16 Jun 2021, 09:32

I don’t use this but if the memory climbs like that then yes, I’d say something isn’t getting disposed of properly.
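
One way to check whether the memory really climbs is to call the function in a loop and log the script's working set. The following is a rough sketch, not a definitive test: it assumes the read_bitmap() function from doubledave22's post and its dependencies (GDI+ library, OCR script) are already included and initialized, and GetWorkingSetMB() and leaklog.txt are hypothetical names introduced here.

```autohotkey
; Hypothetical helper: working set of the current script in MB, queried via WMI.
GetWorkingSetMB() {
	pid := DllCall("GetCurrentProcessId")
	for proc in ComObjGet("winmgmts:").ExecQuery("Select WorkingSetSize from Win32_Process where ProcessId=" pid)
		return Round(proc.WorkingSetSize / 1048576, 1)
}

hwnd := WinExist("A")          ; assumption: test against the active window
Loop 500 {
	read_bitmap(hwnd)          ; function from the post above
	if !Mod(A_Index, 50)       ; log every 50 iterations
		FileAppend, % A_Index "`t" GetWorkingSetMB() " MB`n", leaklog.txt
}
```

If the logged values keep rising across the whole run instead of leveling off, that would support the suspicion that something is not being released.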
byzod
Posts: 87
Joined: 21 Jun 2021, 06:46

Post by byzod » 19 Jul 2021, 06:03

teadrinker wrote:
07 Feb 2021, 13:27
@ewerybody
Can't reproduce the issue. All latest updates are installed. Windows 10 Pro 20H2.
I have the same issue as ewerybody, on Windows 10 Pro 20H2 too.

This code works perfectly

Code: Select all

#Include <OCRold> ; Optical character recognition (OCR) with UWP API, original version

NumpadMult::
{
	img := "Q:\Down\1.png"
	text := OCR(img)
	Msgbox % text
	return
}

This code (exactly the same, but using the other script from the OP) fails with a "No valid COM object" error

Code: Select all

#Include <OCR> ; Optical character recognition (OCR) with UWP API, the version that can ocr from screen

NumpadMult::
{
	img := "Q:\Down\1.png"
	text := OCR(img)
	Msgbox % text
	return
}
Attachment: 2.png (15.06 KiB)
teadrinker
Posts: 4309
Joined: 29 Mar 2015, 09:41

Post by teadrinker » 19 Jul 2021, 17:01

byzod wrote: This code (exactly the same but use different script in op) got No valid COM object error
Perhaps your OCR class is incorrect. I can't know what it contains.
byzod
Posts: 87
Joined: 21 Jun 2021, 06:46

Post by byzod » 19 Jul 2021, 22:03

teadrinker wrote:
19 Jul 2021, 17:01
byzod wrote: This code (exactly the same but use different script in op) got No valid COM object error
Perhaps your OCR class is incorrect. I can't know what it contains.
ocrold.ahk is the exact code copied from the first block of the OP
(select all in the page -> copy to newfile.txt -> rename to ocrold.ahk)

ocr.ahk is the exact code copied from the second block of the OP (which was written by you, as @malcev said)
(select all in the page -> copy to newfile.txt -> rename to ocr.ahk)

When I said "script in OP", I meant that I just copied the code from the post and changed nothing.
malcev
Posts: 1769
Joined: 12 Aug 2014, 12:37

Post by malcev » 20 Jul 2021, 05:26

First block is for recognizing from files.
Second block is for recognizing from screenshots.
byzod
Posts: 87
Joined: 21 Jun 2021, 06:46

Post by byzod » 23 Jul 2021, 05:43

Fine, is this demo clear enough now?

I thought there's no need to add ocr.ahk as it's EXACTLY the code from the OP, but I will include everything to reproduce this bug (if it can be reproduced), in case of any other "I can't know what it contains" problem

Demo:
Attachment: demo.gif (890.65 KiB)
Just run the two test_ ahk files:
Attachment: ocrbugdemo.zip (42.25 KiB)
swagfag
Posts: 6222
Joined: 11 Jan 2017, 17:59

Post by swagfag » 23 Jul 2021, 09:20

like @malcev wrote, one code is for use with files(or rather filenames, strings) and the other for use with screenshots(or rather IRandomAccessStreams, the bytes of which could come from anywhere u like)
what ure doing in both ur test examples is pass a filename(as a string), so its no wonder one of the examples isnt working(its expecting an IRandomAccessStream interface pointer, not a filename string!!)
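
For an image file on disk combined with the screenshot version of the script, one possible approach is to load the file with GDI+ and convert it to a stream before calling ocr(). This is a minimal sketch under several assumptions: tic's GDI+ library is available as Gdip_All in a Lib folder, the include names match your setup, and HBitmapToRandomAccessStream() is provided alongside ocr(), as in doubledave22's snippet earlier in the thread. Whether ocr() releases the stream itself is not confirmed here.

```autohotkey
#Include <Gdip_All>  ; assumption: tic's GDI+ library
#Include <OCR>       ; the screenshot version from the OP (expects a stream pointer)

img := "Q:\Down\1.png"
pToken  := Gdip_Startup()
pBitmap := Gdip_CreateBitmapFromFile(img)            ; load the file as a GDI+ bitmap
hBitmap := Gdip_CreateHBITMAPFromBitmap(pBitmap)     ; convert to a GDI HBITMAP
pIRandomAccessStream := HBitmapToRandomAccessStream(hBitmap)
text := ocr(pIRandomAccessStream)                    ; pass the pointer, not the file name

; Free the GDI and GDI+ resources
DeleteObject(hBitmap)
Gdip_DisposeImage(pBitmap)
Gdip_Shutdown(pToken)
MsgBox % text
```

The key difference from the failing test is the argument: ocr() receives an IRandomAccessStream pointer rather than the string "Q:\Down\1.png".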
byzod
Posts: 87
Joined: 21 Jun 2021, 06:46

Post by byzod » 23 Jul 2021, 21:55

swagfag wrote:
23 Jul 2021, 09:20
like @malcev wrote, one code is for use with files(or rather filenames, strings) and the other for use with screenshots(or rather IRandomAccessStreams, the bytes of which could come from anywhere u like)
what ure doing in both ur test examples is pass a filename(as a string), so its no wonder one of the examples isnt working(its expecting an IRandomAccessStream interface pointer, not a filename string!!)
Oh thanks, I know what the problem is now.

I thought @teadrinker had added a wrapper around the original ocr, something like

(pseudocode)

Code: Select all

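; originalOCR() and screenOCR() are hypothetical names for the two versions from the OP
ocr(source, lang := "FirstFromAvailableLanguages") {
	if source is integer                ; an interface pointer is a plain integer
		return screenOCR(source, lang)  ; IRandomAccessStream version
	return originalOCR(source, lang)    ; otherwise treat it as a file name
}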
Since the function signature remains the same: ocr(file, lang := "FirstFromAvailableLanguages")

But he actually replaced the original function; the ocr function now takes a completely different argument, something like ocr(randomAccessStream, lang := "FirstFromAvailableLanguages")