APIs

The Recognizer class operates on a list of actions defined in a configuration. Each action can then be executed using Recognizer.execute(). Actions mainly retrieve information from screen pixels or interact with the screen. Actions can be piped together, and the screen area can be preprocessed.

Actions

ActionType defines the different kinds of operations that can be performed on screen data or user input.

class guirecognizer.ActionType(*values)

Available action types.

CLICK = 'click': Click on the selected point.

COMPARE_IMAGE_HASH = 'compareImageHash': Compute the image hash of the area selection then compute the difference with the hash in reference.

COMPARE_PIXEL_COLOR = 'comparePixelColor'

Compute the pixel color of the point selection or the average pixel color of the area selection then compute the difference with the pixel color in reference.

The difference is the average of the per-channel RGB differences. It’s always between 0 and 1.

COORDINATES = 'coordinates': Return the coordinates of a point or an area.

FIND_IMAGE = 'findImage': Find the locations of an image inside the selected area. Specify a detection threshold, the maximum number of locations and a resize interval to find the same image but a bit smaller or bigger.

IMAGE_HASH = 'imageHash'

Compute the image hash of the area selection. The color is taken into account. Similar images generate close hashes.

More about image hashes: https://pypi.org/project/ImageHash .

IS_SAME_IMAGE_HASH = 'isSameImageHash': Compute the image hash of the area selection and compare it to the hash in reference.

IS_SAME_PIXEL_COLOR = 'isSamePixelColor': Compute the pixel color of the point selection or the average pixel color of the area selection and compare it to the pixel color in reference.

NUMBER = 'number': Try to recognize a number. Return None if no number has been recognized.

PIXEL_COLOR = 'pixelColor': Compute the pixel color of the point selection or the average pixel color of the area selection.

SELECTION = 'selection': Return a point selection as a pixel or an area selection as an image.

TEXT = 'text': Try to recognize text. Return the empty string if no text has been recognized.

Even though the following examples define an action by giving a valid configuration using guirecognizer.recognizer.RecognizerData, it’s best to create a configuration file with guirecognizerapp because the companion application helps create and preview actions.

Coordinates

Actions of type ActionType.COORDINATES return absolute screen coordinates computed from the action ratios and the defined borders.

From a point:

from guirecognizer import ActionType, Recognizer

recognizer = Recognizer({'borders': (0, 0, 50, 50), 'actions': [{'id': 'coord',
    'type': ActionType.COORDINATES, 'ratios': (0.5, 0.5)}]})
coord = recognizer.executeCoordinates('coord')
print(coord)

(25, 25)

From an area selection:

from guirecognizer import ActionType, Recognizer

recognizer = Recognizer({'borders': (0, 0, 50, 50), 'actions': [{'id': 'coord',
    'type': ActionType.COORDINATES, 'ratios': (0.2, 0.4, 0.6, 0.8)}]})
coord = recognizer.executeCoordinates('coord')
print(coord)

(10, 20, 30, 40)
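
The two results above are consistent with a simple linear mapping from ratios to pixels. The following is a minimal sketch of that mapping, not the library's code: the helper ratios_to_coordinates is hypothetical, and it assumes the borders are given as (left, top, right, bottom).

```python
def ratios_to_coordinates(borders, ratios):
    """Map ratios in [0, 1] to absolute coordinates inside the borders.

    Assumption: borders are (left, top, right, bottom); ratios alternate x, y.
    """
    left, top, right, bottom = borders
    width, height = right - left, bottom - top
    coords = []
    for index, ratio in enumerate(ratios):
        if index % 2 == 0:
            # Even positions are horizontal ratios.
            coords.append(left + round(ratio * width))
        else:
            # Odd positions are vertical ratios.
            coords.append(top + round(ratio * height))
    return tuple(coords)


print(ratios_to_coordinates((0, 0, 50, 50), (0.5, 0.5)))            # (25, 25)
print(ratios_to_coordinates((0, 0, 50, 50), (0.2, 0.4, 0.6, 0.8)))  # (10, 20, 30, 40)
```

Expressing selections as ratios rather than pixels means the same action keeps pointing at the same relative spot when the borders change.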

Selection

Actions of type ActionType.SELECTION return either the color of a selected pixel (for point selections) or a PIL.Image.Image (for area selections).

From a point:

from guirecognizer import ActionType, Recognizer

recognizer = Recognizer({'borders': (0, 0, 50, 50), 'actions': [{'id': 'point',
    'type': ActionType.SELECTION, 'ratios': (0.5, 0.5)}]})
color = recognizer.executeSelection('point')
print(color)

(218, 213, 214)

From an area selection:

from guirecognizer import ActionType, Recognizer

recognizer = Recognizer({'borders': (0, 0, 50, 50), 'actions': [{'id': 'area',
    'type': ActionType.SELECTION, 'ratios': (0.2, 0.4, 0.6, 0.8)}]})
image = recognizer.executeSelection('area')
print(image)

<PIL.Image.Image image mode=RGB size=20x20 at 0x1F0BCA68A50>

Find images

Actions of type ActionType.FIND_IMAGE search for an image (or visually similar ones) inside a larger image and return the coordinates of the matches.

Here is an example. Let’s try to find the camera inside the following image from Where’s Wally?.

Let’s use guirecognizer to find the camera inside this image. Let’s call this file wallyExcerpt.webp.

Here is the camera we are looking for.

The most convenient way is to define the action with guirecognizerapp. Without guirecognizerapp, we can manually extract an area of pixels representing the camera.

A subsection of the camera. Let’s call this file wallyCamera.webp.

Then we are going to find a similar area of pixels inside the image wallyExcerpt.webp.

You may need to install pillow to follow the example.

(venv) $ pip install pillow

import base64
from io import BytesIO
from guirecognizer import ActionType, Recognizer
from PIL import Image

excerpt = Image.open('wallyExcerpt.webp')
camera = Image.open('wallyCamera.webp')

# Encode the camera image as a base64 PNG string for the 'imageToFind' parameter.
buffered = BytesIO()
camera.save(buffered, format='PNG')
imageToFind = base64.b64encode(buffered.getvalue()).decode('utf-8')

recognizer = Recognizer({'borders': (0, 0, excerpt.width, excerpt.height), 'actions': [{'id': 'find',
    'type': ActionType.FIND_IMAGE, 'ratios': (0, 0, 1, 1), 'imageToFind': imageToFind, 'maxResults': 1, 'threshold': 15}]})
coords = recognizer.executeFindImage('find', screenshot=excerpt)
print(coords)

[(639, 398, 660, 411)]

The camera is found at the returned coordinates. It’s circled in blue.

Three parameters are available: guirecognizer.recognizer.ActionData.maxResults, guirecognizer.recognizer.ActionData.threshold and guirecognizer.recognizer.ActionData.resizeInterval.

Pixel color

Actions of type ActionType.PIXEL_COLOR return the color of the selected pixel or the average color of the selected area.

from guirecognizer import ActionType, Recognizer

recognizer = Recognizer({'borders': (0, 0, 50, 50), 'actions': [{'id': 'color',
    'type': ActionType.PIXEL_COLOR, 'ratios': (0.5, 0.5)}]})
color = recognizer.executePixelColor('color')
print(color)

(97, 96, 101)

Actions of type ActionType.COMPARE_PIXEL_COLOR return the difference between the color of the selected pixel (or the average color of the selected area) and the color reference. The returned difference is a float between 0 and 1, where 0 means identical colors and higher values indicate greater differences.

from guirecognizer import ActionType, Recognizer

recognizer = Recognizer({'borders': (0, 0, 50, 50), 'actions': [{'id': 'compareColor',
    'type': ActionType.COMPARE_PIXEL_COLOR, 'ratios': (0.5, 0.5), 'pixelColor': (100, 100, 100)}]})
diff = recognizer.executeComparePixelColor('compareColor')
print(diff)

0.01045751633986928
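
The value above equals 8 / 765, which is what the average per-channel difference gives for the color (97, 96, 101) against the reference (100, 100, 100) once it is scaled by 255. The sketch below reproduces that arithmetic. The helper color_difference is hypothetical and the scaling by 255 is an assumption inferred from the printed value, not a description of the library's code.

```python
def color_difference(color, reference):
    """Average absolute per-channel difference, scaled to the range [0, 1].

    Assumption: channels are 8-bit values (0 to 255).
    """
    total = sum(abs(a - b) for a, b in zip(color, reference))
    return total / (len(color) * 255)


diff = color_difference((97, 96, 101), (100, 100, 100))
print(diff)       # about 0.0105, i.e. (3 + 4 + 1) / (3 * 255)
print(diff == 0)  # False: the colors are close but not identical
```

Under this reading, a check for identical colors is just the case where the difference is exactly 0, so comparing the difference against a small tolerance is a softer alternative when slight rendering variations are expected.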

Actions of type ActionType.IS_SAME_PIXEL_COLOR return whether the difference between the color of the selected pixel (or the average color of the selected area) and the color reference is 0.

from guirecognizer import ActionType, Recognizer

recognizer = Recognizer({'borders': (0, 0, 50, 50), 'actions': [{'id': 'isSameColor',
    'type': ActionType.IS_SAME_PIXEL_COLOR, 'ratios': (0.5, 0.5), 'pixelColor': (100, 100, 100)}]})
isSame = recognizer.executeIsSamePixelColor('isSameColor')
print(isSame)

False

Image hash

Actions of type ActionType.IMAGE_HASH return an image hash of the selected area. Two visually similar images produce hashes with a small difference. Color information is taken into account when computing the hash. The hash is computed with the ImageHash library: https://pypi.org/project/ImageHash.

from guirecognizer import ActionType, Recognizer

recognizer = Recognizer({'borders': (0, 0, 50, 50), 'actions': [{'id': 'imageHash',
    'type': ActionType.IMAGE_HASH, 'ratios': (0.2, 0.4, 0.3, 0.8)}]})
imageHash = recognizer.executeImageHash('imageHash')
print(imageHash)

f852add897319465,07000000000

Actions of type ActionType.COMPARE_IMAGE_HASH return the difference between the image hash of the selected area and the image hash reference. The difference is a non-negative integer, where 0 means the hashes are identical.

from guirecognizer import ActionType, Recognizer

recognizer = Recognizer({'borders': (0, 0, 50, 50), 'actions': [{'id': 'compareImageHash',
    'type': ActionType.COMPARE_IMAGE_HASH, 'ratios': (0.2, 0.4, 0.3, 0.8), 'imageHash': 'f852add897319465,07000000001'}]})
diff = recognizer.executeCompareImageHash('compareImageHash')
print(diff)

Usually a difference of 10 or below means the two images are very similar.
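
To give an intuition for what such a difference measures: the ImageHash library defines the difference between two hashes as the number of bits that differ. The sketch below applies that idea to the hash strings used in this section. It is an illustration under assumptions, not the library's code: the helper hash_difference is hypothetical, and it assumes the string is a comma-separated list of hexadecimal parts whose bit differences are summed.

```python
def hash_difference(hash_a, hash_b):
    """Count differing bits between two comma-separated hexadecimal hash strings."""
    parts_a, parts_b = hash_a.split(','), hash_b.split(',')
    if len(parts_a) != len(parts_b):
        raise ValueError('hashes must have the same number of parts')
    # XOR leaves a 1 bit wherever the two parts disagree.
    return sum(bin(int(a, 16) ^ int(b, 16)).count('1')
               for a, b in zip(parts_a, parts_b))


diff = hash_difference('f852add897319465,07000000000',
                       'f852add897319465,07000000001')
print(diff)        # 1
print(diff <= 10)  # True: within the "very similar" range mentioned above
```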

Actions of type ActionType.IS_SAME_IMAGE_HASH return whether the difference between the image hash of the selected area and the image hash reference is 0.

from guirecognizer import ActionType, Recognizer

recognizer = Recognizer({'borders': (0, 0, 50, 50), 'actions': [{'id': 'isSameImageHash',
    'type': ActionType.IS_SAME_IMAGE_HASH, 'ratios': (0.2, 0.4, 0.3, 0.8), 'imageHash': 'f852add897319465,07000000001'}]})
isSame = recognizer.executeIsSameImageHash('isSameImageHash')
print(isSame)

False

Text

Actions of type ActionType.TEXT return the recognized text, or an empty string if no text was detected. They rely on an optical character recognition (OCR) library. The supported OCR libraries are listed in the section below.

As an example, let’s retrieve the text from the following image. One of the available OCR libraries must be installed first.

Use guirecognizer to retrieve the text inside this image. Let’s call this file textExample.webp.

from guirecognizer import ActionType, Recognizer

imagePath = 'textExample.webp'
recognizer = Recognizer({'borders': (0, 0, 500, 221), 'actions': [{'id': 'text',
    'type': ActionType.TEXT, 'ratios': (0.1, 0.1, 0.9, 0.9)}]})
text = recognizer.executeText('text', screenshotFilepath=imagePath)
print(text)

Hello World

Number

Actions of type ActionType.NUMBER return a float, or None if no number was detected. They rely on an optical character recognition (OCR) library. The supported OCR libraries are listed in the section below.

As an example, let’s retrieve the number from the following image. One of the available OCR libraries must be installed first.

Use guirecognizer to retrieve the number inside this image. Let’s call this file numberExample.webp.

from guirecognizer import ActionType, Recognizer

imagePath = 'numberExample.webp'
recognizer = Recognizer({'borders': (0, 0, 500, 221), 'actions': [{'id': 'number',
    'type': ActionType.NUMBER, 'ratios': (0.7, 0.1, 0.9, 0.9)}]})
number = recognizer.executeNumber('number', screenshotFilepath=imagePath)
print(number)

42.0
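
The "float or None" contract can be illustrated with a small stand-in that turns recognized text into a number. This is only a sketch of the return convention: the helper parse_number is hypothetical and says nothing about how guirecognizer post-processes the OCR output.

```python
import re


def parse_number(text):
    """Return the first number found in text as a float, or None if there is none."""
    match = re.search(r'-?\d+(?:\.\d+)?', text)
    return float(match.group()) if match else None


print(parse_number('42'))           # 42.0
print(parse_number('Hello World'))  # None
```

Because 0.0 is falsy in Python, callers should test the result with "is None" rather than with a plain truth check, otherwise a recognized zero would be mistaken for a failed recognition.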

Recognizer class

class guirecognizer.Recognizer(data=None)

__init__(data=None)

Parameters: data (str | RecognizerData | None) – (optional) config filepath or config data
Raises: RecognizerValueError – invalid data

clearAllData()

Remove all data.

Return type: None

execute(*args, expectedActionType=None, **kwargs)

Overloads:

self, args (str | ActionType), expectedActionType (Literal[ActionType.COORDINATES]), kwargs (Unpack[ExecuteParams]) → Coord
self, args (str | ActionType), expectedActionType (Literal[ActionType.SELECTION]), kwargs (Unpack[ExecuteParams]) → Point | Image.Image
self, args (str | ActionType), expectedActionType (Literal[ActionType.FIND_IMAGE]), kwargs (Unpack[ExecuteParams]) → list[AreaCoord]
self, args (str | ActionType), expectedActionType (Literal[ActionType.CLICK]), kwargs (Unpack[ExecuteParams]) → None
self, args (str | ActionType), expectedActionType (Literal[ActionType.PIXEL_COLOR]), kwargs (Unpack[ExecuteParams]) → PixelColor
self, args (str | ActionType), expectedActionType (Literal[ActionType.COMPARE_PIXEL_COLOR]), kwargs (Unpack[ExecuteParams]) → int | float
self, args (str | ActionType), expectedActionType (Literal[ActionType.IS_SAME_PIXEL_COLOR]), kwargs (Unpack[ExecuteParams]) → bool
self, args (str | ActionType), expectedActionType (Literal[ActionType.IMAGE_HASH]), kwargs (Unpack[ExecuteParams]) → str
self, args (str | ActionType), expectedActionType (Literal[ActionType.COMPARE_IMAGE_HASH]), kwargs (Unpack[ExecuteParams]) → int
self, args (str | ActionType), expectedActionType (Literal[ActionType.IS_SAME_IMAGE_HASH]), kwargs (Unpack[ExecuteParams]) → bool
self, args (str | ActionType), expectedActionType (Literal[ActionType.TEXT]), kwargs (Unpack[ExecuteParams]) → str
self, args (str | ActionType), expectedActionType (Literal[ActionType.NUMBER]), kwargs (Unpack[ExecuteParams]) → float | None
self, args (str | ActionType), kwargs (Unpack[ExecuteParams]) → AnyActionReturnType

Return the result of the given action(s).

When several action ids or action types are specified, the actions are executed as a pipeline.

Parameters:

args (str | ActionType) – Action ids or types. At least one must be given.
expectedActionType (ActionType | None) – Expected type of the last action, whether it is the action’s own type or the type given through the reinterpret option. An exception is raised if the type does not match.
kwargs (Unpack[ExecuteParams]) – Extra parameters, see ExecuteParams

Returns:

The result of the last action in the pipeline.

If none of the parameters screenshot, screenshotFilepath, bordersImage and bordersImageFilepath is given, the screen is used when necessary.

With the reinterpret option, the last action, if given by an id, is executed as if it were of the given type. An exception is raised if some parameters are missing.

The selected area can be preprocessed by passing the id of a defined preprocessing operation through the preprocessing option.

Raises:

RecognizerValueError –

No action id or type is specified
An action id is unknown
Could not open image from the borders filepath
Could not open image from the selected area filepath
One of the parameters is invalid
A needed parameter is missing while using an action type
Last action type is not the expected one
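
The pipeline behaviour of execute(), where the result of each action feeds the next and the result of the last action is returned, can be pictured with plain functions. This is a conceptual sketch only: run_pipeline, average_color and is_gray are hypothetical names and are not part of guirecognizer.

```python
from functools import reduce


def run_pipeline(initial, *steps):
    """Feed the result of each step into the next one and return the last result."""
    if not steps:
        # Mirrors the rule that at least one action must be given.
        raise ValueError('at least one step must be given')
    return reduce(lambda value, step: step(value), steps, initial)


def average_color(pixels):
    """Average a list of RGB tuples channel by channel (integer division)."""
    count = len(pixels)
    return tuple(sum(pixel[i] for pixel in pixels) // count for i in range(3))


def is_gray(color):
    """Return True when the three channels are equal."""
    return color[0] == color[1] == color[2]


pixels = [(90, 100, 110), (110, 100, 90)]
print(run_pipeline(pixels, average_color))           # (100, 100, 100)
print(run_pipeline(pixels, average_color, is_gray))  # True
```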

executeClick(*args, **kwargs)