ImageSearch without external images


2 replies to this topic
Jamie
  • Members
  • 129 posts
  • Last active: Dec 02 2012 04:59 AM
  • Joined: 26 Mar 2010
Others have figured out how to include images in code, but sadly those images cannot be used with ImageSearch.

I wanted to ship a single executable without extra image files, so I developed a way to package template images within the code, and an alternative to ImageSearch that uses them.

It requires:
1. An image search function written in C and included via the MCode/DllCall mechanism. This function searches a 24-bit DIB for a template held in memory in a special binary format.
2. Functions to grab a window into a buffer in memory (a 24-bit DIB).
3. Functions to load a bitmap from disk and process it into the format required by the new image search. There is also an export function that produces AHK code to reconstruct the image from its base64 encoding. Once that AHK code is included in the script, the bitmap on disk is no longer needed.
4. Some wrappers to simplify the interfaces.
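
For anyone who wants to read the snapshot buffer directly, here is a minimal sketch in C of how a pixel is addressed in a 24-bit DIB. It assumes a bottom-up DIB with rows padded to a multiple of 4 bytes and bytes stored in B, G, R order, which is what the search function's indexing implies. The helper names (dib24_stride, dib24_offset, dib24_get) are made up for illustration and are not part of is2.ahk.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char uchar;

/* Bytes per row of a 24-bit DIB: 3 bytes per pixel, padded up to a multiple of 4. */
size_t dib24_stride(size_t width) {
    return (width * 3 + 3) / 4 * 4;
}

/* Byte offset of the blue byte of pixel (x, y), where y = 0 is the top row.
   Rows are stored bottom-up, so the top row comes last in memory. */
size_t dib24_offset(size_t width, size_t height, size_t x, size_t y) {
    assert(x < width && y < height);
    return (height - 1 - y) * dib24_stride(width) + x * 3;
}

/* Read pixel (x, y) into r, g, b. The bytes are stored in B, G, R order. */
void dib24_get(const uchar *dib, size_t width, size_t height,
               size_t x, size_t y, uchar *r, uchar *g, uchar *b) {
    size_t off = dib24_offset(width, height, x, y);
    *b = dib[off];
    *g = dib[off + 1];
    *r = dib[off + 2];
}
```

For a 5-pixel-wide image each row takes 16 bytes rather than 15, so forgetting the padding skews every row after the first.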

Here is bmpread.ahk. It provides basic bmp reading and, most importantly, conversion into the format required by the new image search.

You shouldn't have to worry about the format, but if you're curious, it's run-length encoded to save space on large uniform areas, and it is reordered to search the least common colors first. In some cases this can be much faster than the normal method for detecting when the template matches the image.
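
If you are curious what such an encoder might look like, here is a rough sketch in C. The byte layout (a 5-byte header of format, width, height and a 16-bit little-endian run count, followed by 6-byte entries of r, g, b, x, y, count) is taken from what the search function reads. Everything else is an assumption for illustration: the function name encode_template, the Run struct, the tightly packed top-down RGB input, and the tie-breaking order. The real converter is in bmpread.ahk and may differ in its details.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned char uchar;

typedef struct {
    uchar r, g, b;   /* run color */
    uchar x, y;      /* position of the first pixel of the run */
    uchar count;     /* number of consecutive pixels of that color */
    unsigned freq;   /* total pixels of this color in the template (sort key only) */
} Run;

static int same_color(const Run *a, const Run *b) {
    return a->r == b->r && a->g == b->g && a->b == b->b;
}

/* Rarest color first; ties broken by position so the output is deterministic. */
static int by_rarity(const void *a, const void *b) {
    const Run *ra = a, *rb = b;
    if (ra->freq != rb->freq) return ra->freq < rb->freq ? -1 : 1;
    if (ra->y != rb->y) return ra->y < rb->y ? -1 : 1;
    return (ra->x > rb->x) - (ra->x < rb->x);
}

/* Encode a w*h template (packed R,G,B bytes, top row first) into out.
   out must hold at least 5 + 6*w*h bytes. Returns bytes written, 0 on failure. */
size_t encode_template(const uchar *rgb, unsigned w, unsigned h, uchar *out) {
    if (w == 0 || h == 0 || w > 255 || h > 255) return 0;
    Run *runs = malloc(sizeof(Run) * w * h);
    if (!runs) return 0;
    size_t n = 0;

    /* 1. Collapse each row into runs of identical pixels. */
    for (unsigned y = 0; y < h; y++) {
        for (unsigned x = 0; x < w; ) {
            const uchar *p = rgb + (y * w + x) * 3;
            unsigned len = 1;
            while (x + len < w && memcmp(p, p + len * 3, 3) == 0) len++;
            Run run = { p[0], p[1], p[2], (uchar)x, (uchar)y, (uchar)len, 0 };
            runs[n++] = run;
            x += len;
        }
    }
    /* 2. Count how many pixels each color covers in the whole template. */
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (same_color(&runs[i], &runs[j])) runs[i].freq += runs[j].count;

    /* 3. Put the least common colors first so mismatches are found early. */
    qsort(runs, n, sizeof(Run), by_rarity);

    /* 4. Write the header and the 6-byte entries. */
    out[0] = 1;
    out[1] = (uchar)w;
    out[2] = (uchar)h;
    out[3] = (uchar)(n & 0xFF);
    out[4] = (uchar)((n >> 8) & 0xFF);
    for (size_t i = 0; i < n; i++) {
        uchar *e = out + 5 + i * 6;
        e[0] = runs[i].r; e[1] = runs[i].g; e[2] = runs[i].b;
        e[3] = runs[i].x; e[4] = runs[i].y; e[5] = runs[i].count;
    }
    free(runs);
    return 5 + n * 6;
}
```

With the rare colors at the front, most candidate positions are rejected after comparing a single pixel, which is where the speedup comes from.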

Here are functions to grab a window to a DIB, and do the searching: is2.ahk. It will require encodebin.ahk for base64 and hex encoding and decoding.

The functions to grab the screen and to search are separate, so you can grab the screen once and then search it for several templates. On my machine the search runs substantially faster than grabbing the window.

BlitSnap(wid) grabs a window, where wid is the window id from WinExist or similar. It saves the buffer to a global variable (the "last snap") and then SearchLastSnap() searches it. BlitSnap uses BitBlt, so if the window is obscured or offscreen, it won't get the window contents correctly (same is true of ImageSearch).

These are the lower level functions.

Tying everything together into the main API is what I call the fragment database: fragdb.ahk.

Each fragment has a name, but multiple fragments can have the same name. Each name corresponds to a list of fragments. The main search function is FragDBSearch, which given a fragment name, searches the "last snap" for all the fragments by that name.

Images are loaded into the fragment database using FragDBPutFile, which loads bitmaps (only .bmp is supported). The files must conform to a special filename convention, which is rather quirky. The file names encode the search region, so you don't have to hard-code it in the AHK file.
The files must be named like so:
name~100~200~10~15~w.bmp
A tilde (~) separates the parts of the filename. The first part is the name of the fragment. The second and third parts are the (x, y) coordinates of the upper left corner of the search area, and the fourth and fifth parts are the width and height of the search area. The 'w' at the end signals the app to search the entire window. If the width or height is zero or negative, the upper left corner coordinates are ignored and the entire window is searched. So the simplest way to search the entire window for a fragment is to name the image like so:
name~0~0~0~0~w.bmp
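
To make the naming convention concrete, here is a small sketch in C that splits such a filename into its parts. It is only an illustration of the convention described above, not the code in fragdb.ahk. The FragSpec struct and parse_frag_filename are hypothetical names, the 63-character name limit is arbitrary, and the sketch only accepts a lowercase .bmp extension. The trailing tag ('w' in the examples) is stored as-is without being interpreted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    char name[64];        /* fragment name (first part) */
    int x, y;             /* upper left corner of the search area */
    int width, height;    /* size of the search area */
    char tag[8];          /* trailing part, e.g. "w" */
    int whole_window;     /* 1 if width or height is zero or negative */
} FragSpec;

/* Parse "name~x~y~width~height~tag.bmp". Returns 1 on success, 0 otherwise. */
int parse_frag_filename(const char *fname, FragSpec *out) {
    int end = -1;  /* set by %n only if the literal ".bmp" was matched */
    int n = sscanf(fname, "%63[^~]~%d~%d~%d~%d~%7[^.~].bmp%n",
                   out->name, &out->x, &out->y,
                   &out->width, &out->height, out->tag, &end);
    if (n != 6 || end < 0 || fname[end] != '\0') return 0;
    /* A zero or negative size means the corner is ignored and the
       entire window is searched. */
    out->whole_window = (out->width <= 0 || out->height <= 0);
    return 1;
}
```

Parsing name~100~200~10~15~w.bmp this way gives a 10x15 search area whose upper left corner is at (100, 200), while name~0~0~0~0~w.bmp sets whole_window.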

The key for searching without external images is FragDBExport, which will write the entire database to an AHK file that will look something like this:
tmp:="AQ0FKwDDw8MDAQH29vYCAAGdnZ0EBAG8vLwGAgGOjo4HBAGwsLAEAwHExMQFAgHNzc0AAgF/f"
tmp.="38FBAGQkJAMBAHAwMAMAgHQ0NAABAGBgYEGAwHt7e0EAAHGxsYCAQG+vr4EAgHS0tIAAQGcnJ"
tmp.="wGBAHw8PAAAAHR0dEDAgHR0dEBBAHx8fEBAAHx8fEFAAHPz88AAwHPz88DBAHp6ekGAAODg4M"
tmp.="FAwGDg4MHAwLOzs4EAQHOzs4CAwLHx8cFAQHHx8cBAgHHx8cCBAGEhIQJAwTMzMwBAQHMzMwK"
tmp.="AQOPj48IBATq6uoDAAHq6uoJAAS/v78HAgXLy8sGAQTLy8sCAgHLy8sBAwE="
FragDBPut("wbcorner", "win", 0, 0, -1, -1, tmp)
VarSetCapacity(tmp, 0)
(It uses concatenation into a temp variable instead of line continuations because the AHK parser rejects very long line continuations.)
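
If you want to check what is inside an exported string, the following C sketch decodes standard base64 and reads the 5-byte template header (format, width, height, and a 16-bit little-endian run count, as the search function expects). The names b64_decode, TemplHeader and read_header are made up for this example. The script itself uses encodebin.ahk for decoding.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char uchar;

static int b64_value(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode base64 text into out (at most cap bytes). Stops at '=' padding or
   the end of the string. Returns bytes written, or 0 on an invalid character. */
size_t b64_decode(const char *in, uchar *out, size_t cap) {
    unsigned acc = 0;  /* bit accumulator */
    int bits = 0;      /* number of pending bits in acc */
    size_t n = 0;
    for (; *in && *in != '='; in++) {
        int v = b64_value(*in);
        if (v < 0) return 0;
        acc = (acc << 6) | (unsigned)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            if (n == cap) return n;
            out[n++] = (uchar)((acc >> bits) & 0xFF);
            acc &= (1u << bits) - 1;  /* keep only the leftover bits */
        }
    }
    return n;
}

typedef struct {
    uchar format, width, height;
    unsigned runs;  /* number of 6-byte entries after the header */
} TemplHeader;

/* Read the 5-byte header. Returns 1 on success, 0 if the buffer is too short. */
int read_header(const uchar *buf, size_t len, TemplHeader *h) {
    if (len < 5) return 0;
    h->format = buf[0];
    h->width = buf[1];
    h->height = buf[2];
    h->runs = (unsigned)buf[3] | ((unsigned)buf[4] << 8);
    return 1;
}
```

Decoding the first 16 characters of the export above ("AQ0FKwDDw8MDAQH2") gives format 1, a 13x5 template with 43 runs, and a first entry with color 0xC3C3C3 at (3, 1) with a run length of 1.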

In my scripts, I generally #include the code that populates the fragment database on startup (using only code, no files). I then have a key or special function (disabled in the compiled version) that rebuilds the database from the bmp files on disk. Note: because a single name can have multiple fragments, you should wipe the database with FragDBClear() before adding the bmp files and re-exporting. If you just add bmp files and re-export, you can easily end up with multiple copies of the same fragment.

Then, once your fragment database is loaded, use BlitSnap() to grab the window and then FragDBSearch to find the fragments. FragDBSearch is defined like so:
FragDBSearch(name, ByRef foundx="", ByRef foundy="", getcenter=0) {
...
The function returns 1 if it finds the fragment in the last snap, or 0 if it does not. name is the fragment's name within the database; foundx and foundy receive the position of the match. getcenter is off by default. When enabled, the function returns the center of the fragment instead of its upper left corner, which is useful if you are searching for a button that you want to click.
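
The getcenter adjustment presumably amounts to adding half the template size to the upper left corner. A minimal sketch in C, assuming integer division that rounds down (the exact rounding in fragdb.ahk is not shown here, and report_position and Point are illustrative names only):

```c
#include <assert.h>

typedef struct { int x, y; } Point;

/* Return the match position to report: the upper left corner, or the
   template's center when getcenter is nonzero. */
Point report_position(int foundx, int foundy,
                      int templwidth, int templheight, int getcenter) {
    Point p = { foundx, foundy };
    if (getcenter) {
        p.x += templwidth / 2;   /* half the width, rounded down */
        p.y += templheight / 2;  /* half the height, rounded down */
    }
    return p;
}
```

For a 13x5 template found at (100, 50), this reports (106, 52) with getcenter on and (100, 50) with it off.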

I hope this is helpful. Let me know if you have questions or issues.

Jamie
  • Members
  • 129 posts
  • Last active: Dec 02 2012 04:59 AM
  • Joined: 26 Mar 2010
I forgot to mention, the templates cannot be any larger than 255x255. This constraint shrinks and greatly simplifies the encoded format.

Also, the search function does not support variations, but it does support transparency. The transparent color is 0xFF00FF (100% magenta). It should probably be a parameter but I haven't had a need for it to change.

Finally, here is the C source code for the search function that's encoded in SearchLastSnap, if you're curious:
typedef unsigned char uchar;
typedef unsigned short ushort;
typedef unsigned int uint;

#define PLIST_HSIZE 5

uint __stdcall imgsearch24(uchar *DIB24, uint dibwidth, uint dibheight, uint searchx0, uint searchy0, 
				 uint searchwidth, uint searchheight, uchar *pixlist, uint *foundx, uint *foundy) 
{
	const uchar format = pixlist[0];
	if (format != 1) { // this function can only handle the 1-byte templates (dimensions up to 255)
		return 2;
	}
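	const uchar templwidth = pixlist[1];
	const uchar templheight = pixlist[2];
	// number of 6-byte entries in the list (16-bit read at an odd offset, which x86 allows)
	const ushort pixcount = *(ushort *)(pixlist + 3);
	// exclusive upper limits for the template's upper left corner
	const uint ylim = searchy0 + searchheight - templheight + 1;
	const uint xlim = searchx0 + searchwidth - templwidth + 1;
	// each DIB row is padded to a multiple of 4 bytes
	const uint dibstride = (dibwidth*3+3)/4*4;
	// first list entry: color (r, g, b) and position (x, y) within the template
	const uchar firstr = pixlist[PLIST_HSIZE+0];
	const uchar firstg = pixlist[PLIST_HSIZE+1];
	const uchar firstb = pixlist[PLIST_HSIZE+2];
	const uchar firstx = pixlist[PLIST_HSIZE+3];
	const uchar firsty = pixlist[PLIST_HSIZE+4];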
	for (uint y=searchy0; y < ylim; y++) {
		for (uint x=searchx0; x < xlim; x++) {
			uint match = 0;
			// check the first pixel outside the loop, might be faster
			if (firstr==DIB24[(dibheight-1-y-firsty)*dibstride+(x+firstx)*3+2] &&
				firstg==DIB24[(dibheight-1-y-firsty)*dibstride+(x+firstx)*3+1] &&
				firstb == DIB24[(dibheight-1-y-firsty)*dibstride+(x+firstx)*3]) 
			{
				match = 1;
				// look for a nonmatching pixel in the list
				for (ushort pixno=0; pixno < pixcount && match; pixno++) {
					const uchar pr = pixlist[pixno*6+PLIST_HSIZE+0];
					const uchar pg = pixlist[pixno*6+PLIST_HSIZE+1];
					const uchar pb = pixlist[pixno*6+PLIST_HSIZE+2];
					const uchar px = pixlist[pixno*6+PLIST_HSIZE+3];
					const uchar py = pixlist[pixno*6+PLIST_HSIZE+4];
					const uchar pc = pixlist[pixno*6+PLIST_HSIZE+5];

					for (uchar j=0; j < pc && match; j++) {
						const uint yoffs = (dibheight-1-y-py)*dibstride;
						const uint xoffs = (x+px+j)*3;
						const uchar b = DIB24[yoffs+xoffs];
						const uchar g = DIB24[yoffs+xoffs+1];
						const uchar r = DIB24[yoffs+xoffs+2];
						if (b != pb || g != pg || r != pr) {
							match = 0; // found a nonmatching pixel
						}
					}
				}
			}
			if (match) {  // didn't find any nonmatching pixels, we're good!
				*foundx = x;
				*foundy = y;
				return 0;
			}
		}
	}
	return 1;
}


lokok
  • Members
  • 1 posts
  • Last active: Nov 15 2014 02:07 PM
  • Joined: 08 Nov 2014

The links to the scripts don't work. Does anyone have a tutorial for this?