New data type: Image and a set of routines around it

Humbug
Posts: 18
Joined: 14 Jul 2017, 14:11

21 Nov 2021, 17:50

Hi all,
I would like to have a new native data type called image.
It has to be native so that it's referenceable, lightning-fast and follows the same garbage collection logic as other variables.
It would be basically a byte array, with metadata around it.

The bytes would hold the pixel data while the metadata would describe the pixel data:
- IsValid (i.e. has image data?)
- source coords x and y (i.e. the top left coords, 0 when read from file),
- dimensions width and height,
- colour; default to RGBA32 when not specified. Does alpha channel make sense? If yes, include that, too. I think the transparent pixels would have a critical role to play in comparisons (i.e. simply skipping transparent pixels in comparisons would speed things up considerably).
- LastResult; the resulting error code of the last call. A generic set of codes could be defined to cover most calls (0 means success, negative numbers mean some sort of failure), and individual calls could add their own values. How error handling is defined is really up to you. My priority is making AHK scripts easy to troubleshoot, so a clear understanding of the error would be very helpful. On error, the image should not change in any way, except of course LastResult.
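
To make the metadata list above concrete, here is a minimal sketch of the proposed type. It is written in Python rather than AutoHotkey purely as neutral pseudocode, and every name in it (Image, blank_image, get_pixel, the field names) is a hypothetical illustration, not an existing AHK or ImagePut API. It assumes RGBA32 stored row by row and shows the "on error only LastResult changes" rule.

```python
from dataclasses import dataclass, field

BYTES_PER_PIXEL = 4  # RGBA32: one byte each for red, green, blue, alpha


@dataclass
class Image:
    """Hypothetical model of the proposed type: a byte array plus metadata."""
    width: int = 0
    height: int = 0
    source_x: int = 0        # top-left x of the captured area (0 for files)
    source_y: int = 0        # top-left y of the captured area (0 for files)
    colour: str = "RGBA32"   # assumed default format
    last_result: int = 0     # 0 = success, negative = some sort of failure
    pixels: bytearray = field(default_factory=bytearray)

    @property
    def is_valid(self):
        # Valid only when the buffer holds exactly width * height pixels.
        expected = self.width * self.height * BYTES_PER_PIXEL
        return self.width > 0 and self.height > 0 and len(self.pixels) == expected

    def get_pixel(self, x, y):
        """Return (r, g, b, a), or None on error; pixel data is never modified."""
        inside = 0 <= x < self.width and 0 <= y < self.height
        if not self.is_valid or not inside:
            self.last_result = -1  # only LastResult changes on failure
            return None
        offset = (y * self.width + x) * BYTES_PER_PIXEL
        self.last_result = 0
        return tuple(self.pixels[offset:offset + BYTES_PER_PIXEL])


def blank_image(width, height, rgba=(0, 0, 0, 0)):
    """Create an image filled with a single colour (transparent by default)."""
    data = bytearray(bytes(rgba) * (width * height))
    return Image(width=width, height=height, pixels=data)
```

For example, blank_image(3, 2, (255, 0, 0, 255)) gives a valid 3x2 red image, while reading a pixel outside it returns nothing and sets last_result to -1.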

This is how I'd like to be able to create it:
MyImage:=ReadPixelsFromScreen x1, y1, x2, y2
or
MyImage2:=ReadPixelsFromImage, MyImage, x, y, w, h
or
MyImage:=ReadPixelsFromFile "C:\path\filename.ext", x1, y1, x2, y2
All of these would read the pixels into an internal data structure, which would default to RGBA32 unless the source image uses a different format. The alpha channel would need to be added when the source does not have one.
File formats could be those already supported by ImageSearch, no need for anything fancy. PNG for the win ;-)

Once read, the image could be used as follows:
if MyImage.IsValid ; an if expression to check that we have a valid image to work with, using direct read-only access to the image's metadata.
or
width:=MyImage.width ; another example of direct read-only access to the image's metadata.
or
MyImage:=0 ; explicitly destroy the pixel data (freeing up memory) and set IsValid to false.
or
MyImage:=ConvertPixels, MyImage, RGB8 ; explicit conversion to RGB8. Not sure what other colour conversions might be needed, maybe black and white? I'll leave this to the pros. I suppose converting up is easy; it is probably the down-sampling where it gets tricky.
or
MyImage2:=CropPixels, MyImage, x, y, w, h ; crop the image (cut off the area that should not be part of the image) by keeping the part that starts at the x, y position and extends for the specified width and height. When the crop area reaches beyond the image, use the part that does exist in the source (do not throw an error). Think of it as the intersection of an image and a selection mask. If the resulting image is empty, set IsValid to 0.
or
ComparePixelsExact MyImage, MyImage2; would return a boolean value, indicating whether the two images are identical or not. Set LastResult to 0 when identical, -1 when even a single pixel is different, or -2 when comparison makes no sense (different size or colour mismatch).
or
ComparePixelsFuzzy MyImage, MyImage2; would return a value, indicating the tolerance that would be required to consider the two images as "close enough". Set LastResult to 0 when identical, -1 when relative distance is too large (overflow), or -2 when comparison makes no sense (different size or colour mismatch).
or
ComparePixelsLimit MyImage, MyImage2, 0.1; would return a value, indicating whether the two images are within the specified tolerance limit (e.g. they are "close enough"). Set LastResult to 0 when within limit, -1 when threshold was breached, or -2 when comparison makes no sense (different size or colour mismatch). This is basically an alternative to the fuzzy comparison, allowing early termination in case it makes no sense to check the remainder of the image when it's clear it does not fit the bill.
or
ComparePixelSubSetExact MyImage, MyImage2; would return a value, indicating whether the second image can be found in the first image. Set LastResult to 0 when found, -1 when not found, or -2 when comparison makes no sense (second image is larger in either dimension than the first or colour mismatch). The idea here is to scan for a set of pixels, which could be a single row, single column, or an area. Same as a ComparePixelsExact, but tries to find an exact copy of an image in another.
or
ComparePixelSubSetFuzzy MyImage, MyImage2; would return a value, indicating the tolerance that would be required to consider the second image to be found in the first image. Set LastResult to 0 when identical, -1 when relative distance is too large (overflow), or -2 when comparison makes no sense (different size or colour mismatch). Same as a ComparePixelsFuzzy, but tries to find the "closest match" of an image in another. Possibly very slow because the comparison must be repeated starting at each pixel (until the subpicture can't fit into the first image anymore).
or
ComparePixelSubSetLimit MyImage, MyImage2, 0.1; would return a value, indicating whether the second image can be found in the first image within the specified tolerance limit (e.g. they are "close enough"). Set LastResult to 0 when within limit, -1 when threshold was breached, or -2 when comparison makes no sense (different size or colour mismatch). Possibly very slow because the comparison must be repeated starting at each pixel until a "good enough" match is found, which, in worst case scenario means: until the subpicture can't fit into the first image anymore.
or
WritePixelsToFile, MyImage, PNG, "C:\path\filename.ext", true ; write the pixels to a file using the specified format (same formats as ImageSearch). The last flag is optional and forces overwriting the file if it exists.
or
PixelSetColor; a pair for the already existing PixelGetColor, to change colour of one specific pixel. I think transparent pixels would have a large part to play in comparisons. Naturally, this would fail when trying to write onto the screen instead of an image object.
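
To pin down the crop and comparison semantics from the list above, here is a rough sketch, again in Python as neutral pseudocode rather than AutoHotkey. The function names mirror the proposal but are hypothetical, an image is simplified to a list of rows of (r, g, b, a) tuples, and each comparison returns the LastResult code directly. The tolerance measure (mean absolute channel difference scaled to 0..1) is an assumption, since the proposal does not define one.

```python
def crop_pixels(rows, x, y, w, h):
    """Keep only the part of the crop rectangle that overlaps the image."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    left, top = max(x, 0), max(y, 0)
    right, bottom = min(x + w, width), min(y + h, height)
    if left >= right or top >= bottom:
        return []  # nothing left: the result would have IsValid = 0
    return [row[left:right] for row in rows[top:bottom]]


def same_shape(a, b):
    """True when both images have the same number of rows and columns."""
    return len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))


def compare_pixels_exact(a, b):
    """0 = identical, -1 = at least one pixel differs, -2 = not comparable."""
    if not a or not same_shape(a, b):
        return -2
    return 0 if a == b else -1


def compare_pixels_limit(a, b, limit):
    """0 = within limit, -1 = limit breached, -2 = not comparable."""
    if not a or not same_shape(a, b):
        return -2
    pixel_count = sum(len(row) for row in a)
    # Largest summed channel difference that still counts as "close enough".
    budget = limit * 255 * 4 * pixel_count
    total = 0
    for row_a, row_b in zip(a, b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += sum(abs(ca - cb) for ca, cb in zip(pixel_a, pixel_b))
            if total > budget:
                return -1  # early termination: the rest cannot help
    return 0
```

The early return in compare_pixels_limit is the point of having a separate "limit" call: once the accumulated difference exceeds the budget, the remaining pixels are never read.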

The image could also be drawn into a control to view it (for testing the AHK code) but I am not sure what would be the best way to do it.
Maybe a new menuitem under the View menu where all images in memory could be checked like all the other variables?
You decide.

Existing routines - PixelGetColor, PixelSearch and ImageSearch - should be updated to take the new Image data type as inputs.

Using the ComparePixelSubSet* routines effectively would require being able to "create" an image at runtime. Not sure what would be the best way to do that aside from snapping it from the screen. Creating a blank canvas with a background colour (transparent?) and setting each pixel to a specific colour might be good enough. Maybe a "draw rectangle" and a "flood fill" routine would be useful.
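
As a sketch of how a runtime-created image could feed an exact sub-image search, the following Python pseudocode builds a blank canvas, fills a rectangle, and then scans for it. All names (blank_canvas, draw_rectangle, find_sub_image) are hypothetical, images are plain lists of rows of (r, g, b, a) tuples, and the search is the naive "try every top-left position" approach, not an optimised one.

```python
TRANSPARENT = (0, 0, 0, 0)


def blank_canvas(width, height, colour=TRANSPARENT):
    """Create a width x height image filled with one colour."""
    return [[colour] * width for _ in range(height)]


def draw_rectangle(rows, x, y, w, h, colour):
    """Fill a rectangle in place; parts outside the image are ignored."""
    for row in rows[max(y, 0):max(y + h, 0)]:
        for col in range(max(x, 0), min(x + w, len(row))):
            row[col] = colour


def find_sub_image(haystack, needle):
    """Return (x, y) of the first exact match of needle, or None."""
    if not haystack or not needle or not needle[0]:
        return None
    needle_h, needle_w = len(needle), len(needle[0])
    hay_h, hay_w = len(haystack), len(haystack[0])
    # The ranges are empty when the needle is larger than the haystack.
    for y in range(hay_h - needle_h + 1):
        for x in range(hay_w - needle_w + 1):
            if all(haystack[y + dy][x:x + needle_w] == needle[dy]
                   for dy in range(needle_h)):
                return (x, y)
    return None
```

A needle can be a single row, a single column or an area; the same loop handles all three. This sketch does not distinguish "not found" from "needle too large", which the proposal would report as -1 and -2 respectively.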

Last, but not least, there is the matter of animations, which adds a third dimension to the images. I doubt anyone would need this at this point, but it might be worth considering in the design in case of a future expansion, when animated files like GIF, PNG and ANI become supported. Metadata here would include IsAnim, frame rate, frame count and IsLooped. The pixel data of each frame would need to become indexable somehow. MyImage[0], MyImage[1], etc. comes to mind to access the individual images of the entire animated image. I doubt animations would ever be subject to comparisons.

Obviously, the function names and the purpose of these functions should be reviewed by people way smarter than me.
I'm sure there is a better way to structure these, but it should be good enough to understand what I'd like to achieve and trigger some discussions.

Well, that's my wish, Santa ;-)

Thank you for reading and considering!
iseahound
Posts: 1472
Joined: 13 Aug 2016, 21:04
Re: New data type: Image and a set of routines around it

25 Nov 2021, 17:50

That's interesting. Have you seen some of my projects?

ImagePut
ScreenBuffer

What you are proposing could be done as an extension to ImagePut, in the form of ImagePutBuffer.

Code:

#include ImagePut.ahk

image := ImagePutBuffer([x, y, w, h]) ; Read from screen.
image2 := ImagePutBuffer("vacation.jpg") ; Read from file.

; View image
ImagePutWindow(image)

; Validate Image
MsgBox % ImageEqual(image)

; Compare two images.
MsgBox % ImageEqual(image, image2)

; Crop
image := ImagePutBuffer({buffer: image, crop: [20, 20, -20, -20]})

; Scale
image := ImagePutBuffer({buffer: image, scale: 2})

; WritePixelsToFile
ImagePutFile(image, ".png")

; ImageSearch
ImageSearch, OutputVarX, OutputVarY, X1, Y1, X2, Y2, % "HBITMAP:" ImagePutHBitmap(image)

; Delete the image
image := ""

Currently not supported but could be added:
* width and height
* pixel := image[x, y] getter and setter.

Will never be added:
  • ComparePixelsFuzzy/ComparePixelsLimit - Unfortunately, there are many ways to compare an image. A better way would be to use an imagehash. Because there are many implementations, each with their own pros and cons, it does not make sense to compare anything except exact pixel values.
  • ConvertPixels - RGBA32 only, for speed and as an interchange format. Any other format is slow. The class will, however, be extensible, so if you need to convert to a different pixel format, chances are you also need other specialized functions in conjunction.
  • ComparePixelSubSetExact - Interesting, but again, not super relevant. This one is a maybe.
  • ComparePixelSubSetFuzzy/ComparePixelSubSetLimit - No.
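
For readers unfamiliar with image hashing, the following is a minimal sketch of one common variant, the average hash, written in Python as neutral pseudocode. It is not part of ImagePut, the function names are hypothetical, and it assumes the input has already been reduced to a small grid of greyscale values (a real implementation would first shrink the image, typically to 8x8). Two images are then compared by the number of differing hash bits.

```python
def average_hash(grey_rows):
    """One bit per pixel: 1 if the pixel is brighter than the mean, else 0."""
    values = [value for row in grey_rows for value in row]
    mean = sum(values) / len(values)
    bits = 0
    for value in values:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")
```

Small brightness changes leave the hash unchanged, so a distance of 0 or a few bits suggests "probably the same picture", while a large distance suggests different content. Other hashes make different trade-offs, which is why no single fuzzy comparison fits every use.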
Let me know what you think.
iseahound
Posts: 1472
Joined: 13 Aug 2016, 21:04
Re: New data type: Image and a set of routines around it

11 May 2022, 14:39

Just wanted to keep you updated on the progress I've made.

Done:
  • Use an internal byte array.
  • Expose width, height, ptr, size properties.
  • A crop method that returns a new BitmapBuffer.
  • PixelSearch & ImageSearch
  • PixelGetColor is done by passing the x, y coordinates. Such as: image[x, y]
  • PixelSetColor via image[x, y] := 0x00FF00
  • Some machine code functions that can replace colors and set transparency.
https://github.com/iseahound/ImagePut/wiki/PixelSearch-and-ImageSearch

If there is continued interest, I can actually add most of what is specified in the first post. So, variation in pixel/imagesearch would be next, as well as being able to return a set of all matches.
