My use case is very straightforward. I want to locate very clear and specific text labels in the GUI of an app. They are quite limited in number and in where they can appear, and their size and font are fixed.
I'm really hoping for something that works with V2 test scripts out of the box. I'm working with a 3rd monitor placed above and to the right of the primary one (negative Y coordinates), which seems to make various AHK primitive functions not work (I have to use DllCall, etc.), so I'm wary of this and hoping for something that works OK without too much R&D...
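For reference, here is a minimal sketch of how the monitor layout can be inspected in V2 with the built-in MonitorGetCount and MonitorGet functions. The helper names DescribeMonitors and MonitorFromPoint are made up for illustration, and this only reports the coordinates; it is not a fix for whichever primitives misbehave on such a setup.

```autohotkey
#Requires AutoHotkey v2.0

; List every monitor's bounding rectangle in screen coordinates.
; A monitor above or to the left of the primary one reports negative values.
DescribeMonitors() {
    lines := []
    Loop MonitorGetCount() {
        MonitorGet(A_Index, &mLeft, &mTop, &mRight, &mBottom)
        lines.Push(Format("Monitor {1}: x={2} y={3} w={4} h={5}"
            , A_Index, mLeft, mTop, mRight - mLeft, mBottom - mTop))
    }
    return lines
}

; Return the index of the monitor containing the point (x, y), or 0 if none does.
MonitorFromPoint(x, y) {
    Loop MonitorGetCount() {
        MonitorGet(A_Index, &mLeft, &mTop, &mRight, &mBottom)
        if (x >= mLeft && x < mRight && y >= mTop && y < mBottom)
            return A_Index
    }
    return 0
}

; Send the layout to the debugger output (viewable with DebugView or an IDE).
for line in DescribeMonitors()
    OutputDebug line "`n"
```

On a layout like the one described, the monitor above the primary should show a negative y value, which is the offset any capture or click coordinates need to account for.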
Thanks for any hints.
Does anyone have a good OCR solution with V2?
Re: Does anyone have a good OCR solution with V2?
I've just tested Vis2 which seems to work. Problem is it's written in V1.
I'm new to AHK and it seems only reasonable to do all my coding in V2, which is a more conventional and orthogonal language than V1.
V1 seems very idiosyncratic and ad hoc. I'd really rather not deal with it, if possible.
Problem is, so many examples and so much existing code are written in it....
Re: Does anyone have a good OCR solution with V2?
A fantastic learning opportunity! You can translate the script and then post it!
I would also like to ask a small favor. If you can translate Gdip_All and also JSON.ahk at the same time, it would be great, even fantastic.
Re: Does anyone have a good OCR solution with V2?
https://github.com/thqby/ahk2_lib/tree/master/RapidOcr
This is a local ocr, which uses CPU for reasoning.
Re: Does anyone have a good OCR solution with V2?
thqby wrote: ↑14 Jan 2023, 22:23
https://github.com/thqby/ahk2_lib/tree/master/RapidOcr
This is a local ocr, which uses CPU for reasoning.
Thanks, looks very promising. Is there a simple usage example(s) anywhere?
Re: Does anyone have a good OCR solution with V2?
At the back of the source code.
Re: Does anyone have a good OCR solution with V2?
Great, the jpg file ocr test works for me!
I want to ocr parse the app window to determine the gui layout state before I blindly send mouse clicks to it.
This will involve capturing smaller rectangles to test their contents.
Does RapidOcr have screen grab capabilities, or would ahk2_lib/wincapture be the way to do it?
btw; Thanks for the library.
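A rough sketch of the region-checking idea, independent of which capture or OCR library ends up being used. The function names, the region fields (x, y, w, h, label) and the `recognize` callback are all hypothetical; the callback stands in for whatever capture-plus-OCR call gets wired in, and nothing here is RapidOcr's actual API.

```autohotkey
#Requires AutoHotkey v2.0

; Convert a window-relative rectangle to screen coordinates.
; winX/winY may be negative on monitors above or left of the primary one.
ToScreenRect(winX, winY, region) {
    return { x: winX + region.x, y: winY + region.y, w: region.w, h: region.h }
}

; Check each named region of a window for its expected label.
; `regions` is a Map of name -> { x, y, w, h, label } (window-relative).
; `recognize` is a hypothetical callback: it receives a screen rectangle
; and returns the text found there.
CheckLayout(winTitle, regions, recognize) {
    WinGetPos(&winX, &winY, , , winTitle)
    state := Map()
    for name, region in regions {
        text := recognize(ToScreenRect(winX, winY, region))
        state[name] := InStr(text, region.label) > 0  ; case-insensitive match
    }
    return state
}
```

The result is a Map of region name to true/false, so a script can refuse to send clicks unless every expected label was found where it should be.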
Re: Does anyone have a good OCR solution with V2?
I had a similar situation last year when I needed to automate some tasks in an app that involved recognizing specific text labels. The regular tools I tried weren't cutting it, especially because my setup also included a multi-monitor arrangement with quirky coordinates. After a bit of searching and trial and error, I landed on an OCR solution that worked wonders for my project.
What really did the trick was using ID Analyzer's OCR Visual Data Scanning technology. Their advanced OCR could accurately pick up text from the GUI of my app without the fuss. It wasn't just the accuracy that impressed me, but how it handled different languages and tricky documents without any hitches. This made integrating into my V2 test scripts a breeze. Plus, their solution didn't require any complex setup to work with my monitor layout, which saved me a ton of R&D time. For anyone diving into similar projects, checking out their Identity Verification services could be a solution.
Re: Does anyone have a good OCR solution with V2?
@Allen — Would you like to share the details of how you implemented their product into your AHK code? Is their OCR package standalone? Is it free?
Re: Does anyone have a good OCR solution with V2?
Yep. The fact is paid APIs offer a much more compelling OCR solution than what's possible on the local system. You could also create an ensemble of APIs, where you use about 3 services, and mix those results up for a better truth model.
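One simple way to combine the results is a majority vote; this is only a sketch of one possible mixing strategy, and the function name MajorityVote is made up for illustration. It assumes each service has already returned a plain text string for the same image region.

```autohotkey
#Requires AutoHotkey v2.0

; Majority vote over text results from several OCR services.
; Results are compared after trimming whitespace and lowercasing;
; on a tie, the result that reached the winning count first is kept.
MajorityVote(results) {
    counts := Map()
    best := ""
    bestCount := 0
    for text in results {
        key := StrLower(Trim(text))
        counts[key] := counts.Get(key, 0) + 1
        if (counts[key] > bestCount) {
            best := key
            bestCount := counts[key]
        }
    }
    return best
}
```

For short fixed labels this kind of exact-match voting is usually enough; longer text would need a fuzzier comparison.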
If you're asking about integration, there's only 3 ways image data can be uploaded. Either use ImagePutBase64, ImagePutSafeArray, or CreateFormData.
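As a sketch of the base64 route: the snippet below posts an already-encoded image to an OCR endpoint using the WinHttp.WinHttpRequest.5.1 COM object. The URL, the API key, the bearer-token header and the JSON field name "image" are all placeholders, since every service defines its own request format. The base64 string is assumed to come from something like ImagePutBase64 and to contain no line breaks.

```autohotkey
#Requires AutoHotkey v2.0

; Wrap a base64-encoded image in a JSON body.
; The field name "image" is an assumption; real services define their own schema.
BuildOcrRequestBody(base64Image) {
    ; Standard base64 uses only A-Z, a-z, 0-9, +, / and =, so no JSON
    ; escaping is needed as long as the string has no line breaks.
    return '{"image":"' base64Image '"}'
}

; POST the body to an OCR endpoint and return the raw response text.
; `url` and `apiKey` are placeholders, as is the Authorization header format.
PostOcrRequest(url, apiKey, base64Image) {
    req := ComObject("WinHttp.WinHttpRequest.5.1")
    req.Open("POST", url, false)  ; false = synchronous request
    req.SetRequestHeader("Content-Type", "application/json")
    req.SetRequestHeader("Authorization", "Bearer " apiKey)
    req.Send(BuildOcrRequestBody(base64Image))
    if (req.Status != 200)
        throw Error("OCR request failed with HTTP status " req.Status)
    return req.ResponseText
}
```

The response is returned as raw text; parsing it depends on the service and on whichever V2 JSON library is at hand.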