Image Find Tags

Finds tags within an image.

Library

QUARC Targets/Image Processing/Generic

Description

The Image Find Tags block finds tags within an image. It is specifically designed for AprilTags but can also locate some ARToolkit tags. However, ARToolkit tags are not recommended because the Hamming distance between their codes is smaller, which makes them easier to confuse with one another. The recommended tag family is tag36h11, which has a minimum Hamming distance of 11 between tags and supports over 500 unique tags.

The Image Find Tags block currently supports only grayscale (HxW) uint8 images. The algorithm performs any thresholding required to locate the tags. If a camera is being used, a camera with a global shutter is recommended as motion blur makes tag detection more difficult.

For each tag found, the block outputs the identifier of the tag, the center of the tag, the four corner locations, a homography matrix which relates an "ideal" tag to the tag in the image, the number of error bits corrected, and measures of the localization and decoding quality.

These outputs are vectors or matrices containing the information about each tag. If the Use variable-sized outputs option is checked, then there is one element in the vector, or one column in the matrix, for each tag found. The dimensions vary according to the number of tags found.

If the Use variable-sized outputs option is not checked, then the outputs are fixed size, according to the number of tag identifiers specified. In this case, each element of the vector or column of the matrix corresponds to the tag at the same index. In other words, the Nth element corresponds to the Nth tag. This makes it much easier to track individual tags because the information for a particular tag is always found in the same element of the output. In this case an additional boolean output, fnd, is provided which indicates which tags were found. The corresponding element is true when the tag was found and false when it was not.
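The fixed-size layout described above can be illustrated with a small sketch. This is plain Python operating on made-up data, not QUARC code: the names tag_ids, fnd, ctr and found_centers are hypothetical, and the zeros stored for the tag that was not found are only placeholders, since entries for tags that were not found should simply be ignored.

```python
# Hypothetical fixed-size outputs for three requested tags (values are made up).
tag_ids = [3, 7, 12]            # Identifiers of tags to find
fnd = [True, False, True]       # fnd output: which requested tags were found
ctr = [[120.5, 0.0, 48.25],     # row of each tag center
       [310.0, 0.0, 95.75]]     # column of each tag center


def found_centers(tag_ids, fnd, ctr):
    """Map each found tag identifier to its (row, column) center."""
    return {
        tag_id: (ctr[0][i], ctr[1][i])
        for i, tag_id in enumerate(tag_ids)
        if fnd[i]
    }


centers = found_centers(tag_ids, fnd, ctr)
print(centers)
```

Because column i always belongs to the ith requested identifier, the lookup needs no searching: the fnd vector alone decides which columns are used.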

The homography matrix is a 3x3 matrix for each tag that maps the coordinates of an "ideal" tag, whose corners are (-1,-1), (1,-1), (1,1) and (-1,1), to the pixel coordinates of the corners of the tag in the image. Let H be the 3x3 homography matrix. Given an "ideal" coordinate:

u = [ur uc 1]'

where ur is the row and uc is the column, then the pixel coordinates are found by applying the equations:

v = H * u

v = v / v(3)

whence:

v = [vr vc 1]'
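As a minimal sketch of the equations above, the following plain Python applies a homography to the four ideal corners. The matrix H here is a made-up example (a pure scaling by 20 pixels with the tag centered at row 100, column 200), and apply_homography is a hypothetical helper name. Note that the one-based v(3) in the equations becomes the zero-based v[2] in Python.

```python
def apply_homography(H, ur, uc):
    """Map an ideal tag coordinate (row, column) to pixel (row, column)."""
    u = (ur, uc, 1.0)
    # v = H * u
    v = [sum(H[i][j] * u[j] for j in range(3)) for i in range(3)]
    # v = v / v(3): normalize by the homogeneous component
    return v[0] / v[2], v[1] / v[2]


# Hypothetical homography: scale by 20 pixels, center at (100, 200).
H = [[20.0, 0.0, 100.0],
     [0.0, 20.0, 200.0],
     [0.0, 0.0, 1.0]]

ideal_corners = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
pixel_corners = [apply_homography(H, r, c) for r, c in ideal_corners]
print(pixel_corners)
```

For a real tag the last row of H is generally not [0 0 1], which is why the division by the third component cannot be skipped.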

Input Ports

image

The input is an HxW grayscale uint8 image.

Output Ports

#

The number of tags actually found. This value indicates the number of valid elements or columns in the other outputs. When the Use variable-sized outputs option is not checked, the other outputs are fixed size, so the valid elements or columns may not be consecutive. Use the fnd output to determine which elements or columns are actually valid in that case.

id

The identifiers of each tag found. If the Use variable-sized outputs option is checked then the output is a variable-size uint32 N-vector, where N is the number of tags found. Otherwise it is a fixed-size vector whose length is equal to the length of the Identifiers of tags to find parameter. In this case, element i corresponds to the ith tag in the Identifiers of tags to find parameter.

ctr

The center of each tag found. If the Use variable-sized outputs option is checked then the output is a variable-size 2xN single-precision floating-point matrix, where N is the number of tags found. Otherwise it is a fixed-size matrix where the number of columns is equal to the length of the Identifiers of tags to find parameter. In this case, column i corresponds to the ith tag in the Identifiers of tags to find parameter.

Each center coordinate is a 2-tuple of the form: (row, column). Row and column indices are zero-based. The values are single-precision floating-point because the calculated center of the tag may not lie on a pixel boundary.

If the Use complex output option is checked, then the output is a complex N-vector instead of a matrix. In this case, the real part of each complex number is the row, and the imaginary part is the column.
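The complex form packs each coordinate into a single number, which can be convenient for arithmetic. The sketch below is plain Python on made-up values; ctr_complex and to_row_col are hypothetical names, not part of the block's interface.

```python
# Hypothetical ctr output with the complex option enabled (values are made up).
ctr_complex = [complex(120.5, 310.0), complex(48.25, 95.75)]


def to_row_col(z):
    """Split a complex coordinate into (row, column)."""
    return z.real, z.imag


centers = [to_row_col(z) for z in ctr_complex]

# The pixel distance between two centers is the magnitude of their difference.
distance = abs(ctr_complex[0] - ctr_complex[1])
print(centers, distance)
```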

crn

The corners of each tag found. If the Use variable-sized outputs option is checked then the output is a variable-size 2x4xN single-precision floating-point matrix, where N is the number of tags found. Otherwise it is a fixed-size matrix where the number of pages is equal to the length of the Identifiers of tags to find parameter. In this case, page i corresponds to the ith tag in the Identifiers of tags to find parameter.

Each 2x4 submatrix defines the coordinates of the four corners of the tag. The corners are always ordered counter-clockwise around the tag, starting at the top-left corner of the tag. Each corner coordinate is a 2-tuple of the form: (row, column). Row and column indices are zero-based. The values are single-precision floating-point because the calculated corner location may not lie on a pixel boundary.

If the Use complex output option is checked, then the output is a complex 4xN matrix instead. In this case, the real part of each complex number is the row, and the imaginary part is the column.

xform

The homography transformation for each tag found. If the Use variable-sized outputs option is checked then the output is a variable-size 3x3xN single-precision floating-point matrix, where N is the number of tags found. Otherwise it is a fixed-size matrix where the number of pages is equal to the length of the Identifiers of tags to find parameter. In this case, page i corresponds to the ith tag in the Identifiers of tags to find parameter.

Each 3x3 submatrix defines the homography matrix representing the projection from an "ideal" tag, with corners (-1,-1), (1,-1), (1,1) and (-1,1), to the pixels in the image. The values are single-precision floating-point.

err

The number of error bits corrected for each tag found. If the Use variable-sized outputs option is checked then the output is a variable-size uint32 N-vector, where N is the number of tags found. Otherwise it is a fixed-size vector where the number of elements is equal to the length of the Identifiers of tags to find parameter. In this case, element i corresponds to the ith tag in the Identifiers of tags to find parameter.

The number of error bits corrected will lie between 0 and 2. A Hamming distance larger than 2 is treated as an unrecoverable error because correcting more bits is computationally costly and would begin to produce too many false positives.
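To make the role of the Hamming distance concrete, here is a small illustrative sketch in plain Python. It is not the block's implementation: hamming_distance and match_code are hypothetical helpers, the 8-bit codes are invented for the example and do not belong to any real tag family, and tag rotation is ignored.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def match_code(observed, codes, max_errors=2):
    """Return (index, errors) of the closest code, or None if too far away."""
    best = min(range(len(codes)),
               key=lambda i: hamming_distance(observed, codes[i]))
    errors = hamming_distance(observed, codes[best])
    return (best, errors) if errors <= max_errors else None


# Hypothetical 8-bit codes, for illustration only (not a real tag family).
codes = [0b00000000, 0b11110000, 0b00111111]

print(match_code(0b11110001, codes))  # one bit away from codes[1]
print(match_code(0b10101010, codes))  # more than 2 bits from every code
```

A family with a large minimum Hamming distance, such as the 11 of tag36h11, leaves room to correct up to 2 bits without an observed code becoming equally close to two different tags.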

loc

The localization quality. If the Use variable-sized outputs option is checked then the output is a variable-size single-precision N-vector, where N is the number of tags found. Otherwise it is a fixed-size vector where the number of elements is equal to the length of the Identifiers of tags to find parameter. In this case, element i corresponds to the ith tag in the Identifiers of tags to find parameter.

The localization output is only valid if the Refine pose option is checked. It is a measure of the quality of localization based on the average contrast of the pixels around the border of the tag.

dec

The decoding quality. If the Use variable-sized outputs option is checked then the output is a variable-size single-precision N-vector, where N is the number of tags found. Otherwise it is a fixed-size vector where the number of elements is equal to the length of the Identifiers of tags to find parameter. In this case, element i corresponds to the ith tag in the Identifiers of tags to find parameter.

The decoding output provides a measure of the quality of the binary decoding process based on the average difference between the intensity of a data bit and the decision threshold. Higher numbers generally indicate better decoding quality. It is really only a meaningful measure of decoding quality for very small tags in the image.

fnd

A boolean vector indicating the tags which were found (true) and those that were not (false). This output is only available if the Use variable-sized outputs option is not checked. It is a fixed-size vector where the number of elements is equal to the length of the Identifiers of tags to find parameter. Element i corresponds to the ith tag in the Identifiers of tags to find parameter.

This output may be used to determine which elements of the other outputs are currently valid. When variable-size outputs are used, then all the elements are valid so this output is not needed. Use of such a "found" output makes it much easier to track particular tags because the information for each tag is always in the same element of the outputs whenever the tag is found.

Parameters and Dialog Box

Tag family

The tag family from which to choose the tags to find.

The tag families are defined in XML files in the fullfile(qc_root, 'blocks', 'image_processing', 'april_tags') folder. New tag families may be added to this folder if desired. Each bit in a 64-bit code defines a white (1) or black (0) pixel in the tag image going from left-to-right, top-to-bottom, with the least-significant bit being the last (bottom-right) pixel.
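The bit ordering described above can be sketched as follows. This is an illustration only: code_to_grid is a hypothetical helper, the 3x3 code is invented, and the sketch assumes that the tag's bits occupy the low-order size*size bits of the code, so that the most significant of those bits is the top-left pixel and the least-significant bit is the bottom-right pixel.

```python
def code_to_grid(code, size):
    """Unpack the low size*size bits of code into rows of 0 (black) / 1 (white)."""
    nbits = size * size
    # Bit nbits-1 is the first (top-left) pixel; bit 0 is the last (bottom-right).
    bits = [(code >> (nbits - 1 - k)) & 1 for k in range(nbits)]
    return [bits[r * size:(r + 1) * size] for r in range(size)]


# Hypothetical 3x3 code, for illustration only (real families use larger grids).
grid = code_to_grid(0b101010011, 3)
for row in grid:
    print("".join("#" if bit else "." for bit in row))
```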

Identifiers of tags to find

A vector of tag identifiers to find. The block will only search for these tags in the image. Use the browse button next to the edit field to select the tags graphically. The tag selection dialog also allows the tags to be saved to disk for printing. The Paint program in Windows, for instance, can be used to print each tag. Even though the tags are very small, the Paint program will print them full page, without blurring the image.

Minimum number of pixels in a cluster (tunable offline)

The minimum number of pixels that a quad must contain to be considered in the search. A quad is a four-sided polygon. The algorithm finds the quads in the image first and then attempts to match the quads found to tags. This parameter may be used to ignore noise in the image. Set it as large as possible without excluding desired tags for maximum performance.

Maximum number of corner candidates (tunable offline)

The maximum number of corner candidates to consider when segmenting a group of pixels into a quad. A quad is a four-sided polygon. The algorithm finds the quads in the image first and then attempts to match the quads found to tags.

Critical angle (tunable offline)

The angle in radians between pairs of edges in a quad below which the quad is rejected. A quad is a four-sided polygon. The algorithm finds the quads in the image first and then attempts to match the quads found to tags. When the angle between pairs of edges is close to zero then the quad is very distorted and is not likely a tag. A value of zero will mean that no quads are rejected.

Maximum line fit mean-squared error (tunable offline)

When fitting lines to the contours of a quad, a mean-squared error is computed between the line and the contour. If this error is too large then the quad is too distorted and is not likely a tag. Rejecting these potential quads early saves expensive decoding processing so this parameter should be set as small as possible without interfering with tag detection.

Minimum white to black difference (tunable offline)

The algorithm builds up a model of the tag of white and black pixels. This parameter determines how much brighter the white model must be than the black model for the quad to be considered. Hence, this option should be as large as possible without compromising the ability to locate tags (particularly with varying lighting conditions).

Number of threads (tunable offline)

The number of threads to use to accelerate the algorithm. This parameter will not provide any gains if it is set larger than the number of CPU cores in the system.

Decimation (tunable offline)

Detection of quads can be performed on a decimated version of the image to reduce the computation time at the expense of pose accuracy and a slight decrease in detection rate. Decoding the tag is still performed at the full resolution of the image.

Standard deviation for Gaussian blur (tunable offline)

Setting this parameter to a non-zero value causes the algorithm to apply Gaussian blur to the segmented image. Very noisy images benefit from non-zero values such as 0.8. Set this parameter to zero to avoid doing any Gaussian blur.

Deglitch image (tunable offline)

Check this option to deglitch the thresholded image during quad detection. This option is only useful for very noisy images.

Refine edges (tunable offline)

Check this option to adjust the edges of quads to "snap to" strong gradients nearby. This option is useful when decimation is being used. It can increase the quality of the initial quad estimate substantially. It is not computationally expensive, so it is generally recommended.

Refine decoding (tunable offline)

Check this option to refine the decoding process to increase the number of detected tags. It is especially effective for very small tags near the resolution threshold.

Refine pose (tunable offline)

Check this option to refine the extraction of pose information from the quads. The accuracy of pose extraction is increased by maximizing the contrast around the black and white border of the tag. This option generally increases the number of successfully detected tags, though not as quickly or effectively as the Refine decoding option.

If this option is checked then the loc output will be computed. Otherwise the localization quality output will be zero.

Use variable-sized outputs

If this option is checked then all outputs but the # output will be variable-size signals, whose dimension varies with the number of tags actually found, up to the number of tags specified in the Identifiers of tags to find parameter. If it is not checked, then the outputs will be fixed-size signals, whose dimension is determined by the length of the Identifiers of tags to find parameter.

Use complex output

If this option is checked then the center output, ctr, becomes a complex N-vector, and the corner output, crn, becomes a complex 4xN matrix. Refer to the discussion of those output ports for details.

Targets

Target Name                         Compatible*  Model Referencing  Comments
QUARC Win32 Target                  Yes          Yes
QUARC Win64 Target                  Yes          Yes
QUARC Linux Nvidia Target           Yes          Yes
QUARC Linux QBot Platform Target    Yes          Yes
QUARC Linux QCar 2 Target           Yes          Yes
QUARC Linux QDrone 2 Target         Yes          Yes
QUARC Linux Raspberry Pi 3 Target   Yes          Yes
QUARC Linux Raspberry Pi 4 Target   Yes          Yes
QUARC Linux RT ARMv7 Target         Yes          Yes
QUARC Linux x64 Target              Yes          Yes
QUARC Linux DuoVero Target          Yes          Yes
QUARC Linux DuoVero 2016 Target     Yes          Yes
QUARC Linux Verdex Target           Yes          Yes
QUARC QNX x86 Target                Yes          Yes                Last fully supported in QUARC 2018.
Rapid Simulation (RSIM) Target      Yes          Yes
S-Function Target                   No           N/A                Old technology. Use model referencing instead.
Normal simulation                   Yes          Yes

* Compatible means that the block can be compiled for the target.