FindText tutorial

Descolada · 12 Apr 2023, 10:26

@Ralf_Reddings200244, sorry, for some reason I didn't get a forum notification about this post, so I totally missed it!

Your code works fine in my tests (after changing the Text, because the example image is different and our screen DPIs differ).
FindText().OCR works by analyzing the result array of the FindText call: it loops through the ok variable and combines the comments of results that are within 3 pixels of each other (in your example you set max offsets of 3). This means that the first FindText call must actually find the Text. Check whether X and Y are being set at all with MsgBox, % X " " Y, and if not, try to work out why. Perhaps the coordinate limits you set are wrong, so the Text isn't found?
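
The two-step flow described above can be sketched roughly as follows. This is a minimal AHK v1 sketch, not the original poster's script: the empty Text placeholder and the full-primary-screen search region are assumptions, and each character entry in your own Text is expected to carry its character as the comment (the part between < and >).

```autohotkey
#Include <FindText>

; Placeholder (assumption): paste your own captured character Texts here,
; one "|<comment>..." entry per character.
Text := ""

; Step 1: find the characters. FindText returns an array of matches, or 0 on failure.
ok := FindText(X, Y, 0, 0, A_ScreenWidth, A_ScreenHeight, 0, 0, Text)
if !ok
{
	MsgBox, Nothing was found. Check the Text and the search coordinates.
	ExitApp
}
MsgBox, % "First match at " X " " Y

; Step 2: combine the comments of matches that lie within 3 pixels of each other.
MsgBox, % FindText().OCR(ok, 3, 3).text
```

If the first MsgBox never appears, the problem is in the search itself (Text or coordinates) and not in the OCR step.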

dostroll · 18 Apr 2023, 20:07

The purpose of this code is to wrap the FindText search and click so that they can be written in one line.

The "clicktype" part doesn't work.
How should I specify the arguments?

I also asked ChatGPT 3.5 and 4, but I could not solve it.
I would appreciate any help!

Code: Select all

FindTextClick(clicktype:="L 2",ByRef X:="", ByRef Y:="", args*) {
	max_tries := 5
	tries := 0
	while (!FindText(X, Y, args*) && tries < max_tries)
	{
		Sleep, 350
		tries++
	}
	if (tries < max_tries)
	{
		FindText().Click(X, Y, clicktype)
		return true
	}
	else
	{
		return false
	}
}

The FindText library is great.
Some jobs can't be done without it :)

Descolada · 19 Apr 2023, 09:11

@dostroll, the FindTextClick works perfectly in my testing...

Code: Select all

#include <FindText>

Text:="|<>*138$71.zk00000000zzzU00000001zzz000000003zzy000000007zzw0A000000Dzzs0T000000zzzk1zk00001zzzk3zk00003zzzU3rU00007zzz037U0000Dzzy7kD00000Tzzw7UT00001zzzsD0y00003zzzkz0Q00007zzzly0E0000DzzzXs000000Tzzz3U000000zzzy7U1k0001zzzw7k7k0007zzzsDVbU000DzzzkT7j0000TzzzUCTy0000zzzzU0Tw0001zzzz00Tk0003zzzy00T0000Dzz" ; I took a Text of the Recycle Bin on the desktop to test double-clicking

FindTextClick("L 2",&X, &Y, 57-150000, 43-150000, 57+150000, 43+150000, 0, 0, Text)


FindTextClick(clicktype:="L 2",ByRef X:="", ByRef Y:="", args*) {
	max_tries := 5
	tries := 0
	while (!FindText(X, Y, args*) && tries < max_tries)
	{
		Sleep, 350
		tries++
	}
	if (tries < max_tries)
	{
		FindText().Click(X, Y, clicktype)
		return true
	}
	else
	{
		return false
	}
}

dostroll · 19 Apr 2023, 20:21

@Descolada
Solved! Thank you very much, Descolada!
The cause was that I had not passed &X and &Y.
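
One edge case in the wrapper is worth noting. As written, the while loop runs the search one more time after the last Sleep, and if that final search succeeds, tries already equals max_tries, so the function returns false without clicking. A hedged variant that returns as soon as any search succeeds might look like this; the name FindTextClickRetry is hypothetical, and the call style is otherwise unchanged from the posts above.

```autohotkey
#Include <FindText>

; Hypothetical variant of FindTextClick: click and return on the first successful search.
FindTextClickRetry(clicktype := "L 2", ByRef X := "", ByRef Y := "", args*) {
	max_tries := 5
	Loop, % max_tries
	{
		; args* forwards the region, error margins and Text unchanged to FindText
		if FindText(X, Y, args*)
		{
			FindText().Click(X, Y, clicktype)
			return true
		}
		Sleep, 350
	}
	return false   ; every attempt failed
}
```

It is called the same way as the original, with the X and Y placeholders in second and third position so that the remaining arguments line up with FindText's own parameters.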

Ralf_Reddings200244 · 24 Apr 2023, 10:48

@Descolada

"sorry, for some reason I didn't get a forum notification about this post, so I totally missed it!"

It's quite all right. I think I was at fault: I forgot to use @Descolada, which may be why you were not notified.

"Your code works fine in my tests (after changing the Text, because the example image is different, and your/my screen DPI-s differ)."

Thanks for looking into this on your end. Now I am certain it is possible.

In a week, when I get some free time, I will mount a second effort. I have been using it with numbers and letters and it's just amazing.

dostroll · 30 May 2023, 06:08

@Descolada
Is it possible to OCR text on the screen and click it using the FindText function?

I understand how to extract text information with "GetTextFromScreen", but I don't know how to match keywords and click them.
I have already worked out how to split the path and extract the keywords I want to click.

Code: Select all

+dir001
+dir002
↓
\dir001\fruits\banana\banana_cake.txt
-dir001 ← Click
  -fruits ← Click
    -banana ← Click
      -banana_cake.txt
+dir002

Descolada · 30 May 2023, 07:37

@dostroll, GetTextFromScreen returns the Text (the pixel image) of the screenshot, not the actual text in the screenshot. You can use that Text from GetTextFromScreen with FindText to find the corresponding image on the screen, but not for OCR.

If you want to do OCR then you need to have Text for all individual characters of the text you want to recognise: for example for the text "dir001" you would have to have Text for the letters "d", "i", "r", "0" and "1". Then you can use FindText with the Text of the characters, and finally use OCR on that (see the OCR section of the tutorial).

For AHK v2 I've written a library to do actual OCR, which can be used to click text on the screen without capturing it with FindText. See OCR v2 examples for that (in GitHub example 5), but the code would look something like this:

Code: Select all

#Requires AutoHotkey >=2.0-
#include OCR.ahk
result := OCR.FromDesktop()
found := result.FindString("dir001")
result.Highlight(found)
result.Click(found)

Note that this requires AHK v2. There might be similar libraries available for AHK v1.

FrankenBOOM · 30 May 2023, 12:08

Hey @Descolada, thanks for posting the in-depth tutorial explaining the inner workings, I really appreciate it.

I have little experience in scripting, and I'm trying to piece together working code based on my limited understanding, but I can't seem to grasp how to do so.

In my case, there will be 2 popups that can appear anywhere on screen. The first popup will contain the question and its answer, like so:
Question : Which of the following is the least dense?
Answer : Air

The second popup will have a list of answers, like so:
>>Water
>>Air
>>Iron
>>Glass
Note: the question and answers are randomised, but the answer will always be provided in the first popup.
I managed to piece together a script that finds the text "Answer" and moves the mouse to the right of it, where I want to capture again.

Could you guide me on how to modify the script to mimic clicking in the capture GUI to do the following?
1. Click Capture to initiate the capture sequence on the answer (i.e. Air in this case).
2. Click Gray2Two, click Accept, click OK, click Copy.
3. Paste the copied bitmap into the script, find the text based on this captured answer, and click on it.

Much appreciated!

Much appreciated!

Descolada · 30 May 2023, 14:04

@FrankenBOOM, there is no need to use the FindText user interface to capture the sequence; that is, you don't need to initiate the capture sequence or click any of the buttons. Instead you can use GetTextFromScreen with Gray2Two mode: Text := FindText().GetTextFromScreen(x1, y1, x2, y2, "*180"). Here "*180" uses Gray2Two with threshold value 180, and since you already know the location of the answer text, the coordinates shouldn't be a problem. Then you can simply use that Text to find the answer and click it.
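
Put together, the capture-then-find approach might look like the sketch below. The coordinates of the answer region, the 0.1 error margins and the full-primary-screen search area are assumptions for illustration; in practice x1..y2 would be derived from where the "Answer" label was found. Because the captured image also matches itself in the first popup, the sketch skips any match whose centre lies inside the capture region.

```autohotkey
#Include <FindText>

; Assumed screen coordinates of the answer text in the first popup.
x1 := 100, y1 := 100, x2 := 300, y2 := 130

; Capture that region as a Text string using Gray2Two with threshold 180.
Text := FindText().GetTextFromScreen(x1, y1, x2, y2, "*180")

; Look for the same image on the primary screen, allowing a 10% mismatch.
ok := FindText(X, Y, 0, 0, A_ScreenWidth, A_ScreenHeight, 0.1, 0.1, Text)

for i, match in ok
{
	; match.x and match.y are the centre of the match; skip the capture region itself
	if (match.x >= x1 && match.x <= x2 && match.y >= y1 && match.y <= y2)
		continue
	FindText().Click(match.x, match.y, "L")
	break
}
```

This only has a chance of working if the answer is rendered with the same font and size in both popups.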

dostroll · 11 Jun 2023, 05:05

@Descolada
Sorry for the late reply...
I haven't moved to v2 yet so testing took some time, but OCR.ahk was the library I was looking for. Thanks, I'll take the time to learn:)

ahkHereToLearn · 19 Aug 2023, 14:47

This seems like a really cool library. I found my way here after reading your tutorial for Feiyue's screen capture / image-to-text tool. I am wondering whether your library or his would be 'better' for the following objective:

Each day, when I start work, I open a browser window that I use to access a legal database in which I perform my work. The database contains myriad buttons, checkboxes and other GUI elements. They are all labeled, so if I find the text of a label and MouseClick at that location, I can select or toggle what I need at that point. Their scale really doesn't change, but their positions on the page can change depending on what else I'm working on. Therefore, I'd like something that doesn't rely on clicking at fixed locations (which I currently have to find each morning).

It seems that with your OCR library I could keep a list of, say, 15 buttons/elements whose locations I want to find, and then call a function to click whichever one I need at the moment. Could that all be done by running just one AHK script and triggering the function calls with hotkeys (making these up: F1::, F2::, F3::, etc.)? Or could I instead use Feiyue's FindText to accomplish the same thing? Would one method be 'better' than the other?

I am a real newbie and will try to cobble together whatever I can. Using Feiyue's FindText I've been able to convert one sample button to a Text string and have gotten the script to click there (by removing the ; before FindText.Click), but that of course only gets me one click on one element. I just can't wrap my head around how I would put all my strings into a single AHK file and then trigger the clicks at their individual locations on demand.

I'm *not* asking you (or anyone else) to do any work or coding. I would love to arrive at a working solution on my own, so that I actually understand it and can do more independently in the future, but pointers in the right direction or an outline of what I should try would be really appreciated. Thanks so much.

Descolada · 20 Aug 2023, 10:21

@ahkHereToLearn, Feiyue's screen capture / image to text tool is included with the FindText library. This thread is only a tutorial on that library, so there is no "your library or his", only "his" (Feiyue's).

The best method is whichever one you can get working reliably. If you have control over your workstation, I recommend the Chrome.ahk library for browser automation, since it is the most reliable and the fastest. If you don't have such control, check whether UIAutomation works: if the browser layout doesn't change often, it might work reliably. If you decide to use UIAutomation, I recommend using AHK v2 (if you haven't switched already), since UIA.ahk (the AHK v2 UIAutomation library) is much better than v1's UIA_Interface.
Lastly, you are left with the image-based detection options: FindText or OCR. I haven't been in a situation where I needed to pick one or the other, so I can't tell you which to choose, but I can tell you that I have had trouble using FindText in browsers. Character smoothing changes the image slightly and the space between characters varies, so it's hard to create reliable Text. Maybe your specific website is different, though; you'd need to try it out.

mblom122 · 17 Sep 2023, 12:59

Is it possible to take an already saved image and convert it into the search string, instead of going the capture-region route? I have a bunch of icons and images I want to convert with fixed settings, such as a gray threshold of 100, and they are all already cropped to 40x40. Going through the full manual process for each one takes too much time.
It would be great if anyone could help me achieve this, if it is possible.

Thanks

feiyue · 18 Sep 2023, 11:53

@mblom122 You can do it like this :

Code: Select all


s:=""
Loop Files, c:\pic\*.jpg
{
  Text:=GetTextFromFile(A_LoopFileLongPath, "100")
  , s.= StrReplace(Text, "|<>", "|<" A_LoopFileName ">")
}
Clipboard:=s
MsgBox, 4096,, Finished ! please check the clipboard

GetTextFromFile(file, Threshold:="")
{
  FindText().ShowPic(file, show:=0, x, y, w, h)
  return FindText().GetTextFromScreen(x, y, x+w-1, y+h-1, Threshold, ScreenShot:=0)
}

Nevermore · 26 Sep 2023, 08:52

Hi, using FindText, is it possible to search for text regardless of font size? It works fine with the Text I prepared in advance, but as soon as I zoom the browser view the script stops working.
I would like the script to work regardless of screen resolution.

Descolada · 27 Sep 2023, 09:30

@Nevermore, the answer is that it depends on what kind of text you are looking for and how the program changes it at different screen resolutions. Sometimes it works, sometimes it doesn't. I wrote a tutorial for AHK v2 on adjusting for screen resolution and put some FindText examples there as well, and it seemed to work reasonably well at multiple DPIs if the error levels were set high enough. At least it worked with images; I'm not sure how good it would be with text. However, that method wouldn't work when zooming a browser view, because then you'd also need to somehow get the zoom level in the browser.

If you are searching for text only, it might be easier and faster to use some kind of OCR method instead. There are multiple AHK libraries available to do that, and you don't even have to install anything extra if you use Microsoft's UWP OCR.

28 Sep 2023, 13:06

Thank you for your tutorial

Ralf_Reddings200244 · 20 Jan 2024, 15:17

@Descolada

FindText().PixelSearch() fails when I specify a rectangular region

I am following your amazing guide again to get a bit deeper into this library. I am trying to get FindText().PixelSearch() to look only within a rectangular region, but it keeps failing on me.

The function works when I don't specify a region at all, but when I do, it does not find anything. I even set up a test area in the top-left corner of my primary monitor and gave FindText().PixelSearch() coordinates that roughly approximate that area, and it finds nothing:

Code: Select all

#singleinstance, force
#Persistent
#NoEnv
#Warn, All, OutputDebug
#Include <FindText>
CoordMode, mouse, screen
;ok := FindText().PixelSearch(,,,,,,"0xFF0000")         ; This works fine
ok := FindText().PixelSearch(var1,var2,10,10,500,500,"0xFF0000")
MsgBox,% var1        ;Empty
MsgBox,% ok[1].x     ;Empty

Here is an image of the setup the above code is based on:

Any help would be greatly appreciated!

Descolada · 20 Jan 2024, 16:31

@Ralf_Reddings200244, FindText().ImageSearch and FindText().PixelSearch are similar to the native ImageSearch and PixelSearch in almost every way, including the way they handle CoordModes. In this case although you set CoordMode, mouse, screen, you didn't set CoordMode, pixel, screen which would've been necessary for normal PixelSearch as well.
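
For reference, a minimal corrected version of the test script could look like this. It is a sketch based on the code posted above, with the CoordMode, Pixel, Screen line added and the result check made explicit; the region and colour are the ones from the original post.

```autohotkey
#SingleInstance, Force
#NoEnv
#Include <FindText>
CoordMode, Mouse, Screen
CoordMode, Pixel, Screen   ; the missing line: the search region is now in screen coordinates

; Search the rectangle (10,10)-(500,500) for pure red.
ok := FindText().PixelSearch(var1, var2, 10, 10, 500, 500, "0xFF0000")
if ok
	MsgBox, % "Found at " var1 " " var2
else
	MsgBox, Not found in the given region.
```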

Ralf_Reddings200244 · 21 Jan 2024, 16:55

@Descolada

I have been using FindText for these kinds of tasks for a while now, and I completely forgot about CoordMode, Pixel! I can confirm that it is working now. Thank you, man!
