What is machine vision? 3 points to understand

Machine vision refers to imaging-based technologies and methods that provide the data needed for automated analysis.
In a broad sense, it also includes related software, hardware, systems, and expertise. It is also a field of engineering.

In industry, machine vision is used for automated inspection, process control, and robot guidance. Each of these is carried out in three steps: imaging, image processing (automatic analysis), and output.
Of these steps, “image processing” is where information is automatically extracted from the image, and the technologies and methods for doing so are the core of machine vision.

In this article, I will explain three points that are essential for understanding machine vision.

Three points to understand machine vision

  • Task
  • Component
  • Process

A task represents a “purpose”, a component represents “what is needed to achieve the goal”, and a process represents “how the goal is achieved (method)”.

“Tasks” performed by machine vision

Machine vision systems mainly automate visual inspections that humans would otherwise perform.
They carry out narrowly defined tasks such as reading, inspecting, identifying, and measuring at high speed and with high accuracy. For example:


  • Read barcodes and serial numbers.
  • Inspect quality-related properties such as the presence or absence of scratches and differences in color.
  • Measure the dimensions of products and parts.
  • Identify products and sort out waste.

Machine vision also serves as a guide for robots, the so-called “machine eye”.
It provides position and orientation information so that the robot can correctly grasp the state of its target. The robot then performs its task based on that information.

Tasks such as “reading barcodes” and “measuring dimensions” described above can also be performed by humans, but machine vision performs them much faster and more accurately.
In addition, machine vision can detect minute scratches and differences that the human eye cannot discern. (For example, a hyperspectral camera makes it possible to identify the type of plastic from the image information.)

So what is needed to accomplish these tasks?

Next, let’s look at the components of machine vision. A rough understanding of them will help you work out what you need to accomplish your task.

“Components” of machine vision systems

Machine vision is a concept that covers hardware, software, and systems. Concretely, a machine vision system is made up of the following elements.

Camera: A camera suited to the application and purpose is selected, considering factors such as monochrome or color, resolution, and frame rate.
Frame grabber: A frame grabber is needed to convert an analog camera signal to digital format. Also called an image input board.
Illumination: Fluorescent lamps, fiber-optic halogen lights, and xenon strobe light sources are often used in machine vision systems, but in recent years they are being replaced by LED lighting.
Software: The captured image is automatically processed to determine pass / fail. Image processing methods are often specialized for the application and purpose.

In addition to these, a system includes many other components such as optics, lenses, processors, and displays.

The requirements for machine vision components vary from task to task.
For example, the task of “color discrimination” requires a camera that can capture color images, appropriate lighting that illuminates the products and parts evenly, and software that analyzes the captured image and judges, based on programmed rules, whether the color is acceptable.
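As a rough illustration of the software side of “color discrimination”, here is a minimal Python sketch of one possible programmed rule: the average color of the part must stay within a tolerance of a target color. The function names, the target color, and the tolerance are all assumptions made up for this example, and real systems typically use more robust color spaces and calibration.

```python
def mean_color(pixels):
    """Average the (R, G, B) values of a list of pixels."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def color_ok(pixels, target, tolerance):
    """Pass if every channel of the mean color is within tolerance of the target."""
    avg = mean_color(pixels)
    return all(abs(a - t) <= tolerance for a, t in zip(avg, target))


# Hypothetical values: a red part, with an allowed deviation of 20 per channel.
TARGET_RED = (200, 30, 30)
good_part = [(205, 28, 35), (198, 33, 27), (201, 31, 30)]
bad_part = [(120, 90, 40), (118, 95, 38), (125, 88, 42)]

print(color_ok(good_part, TARGET_RED, 20))  # True
print(color_ok(bad_part, TARGET_RED, 20))   # False
```

The pixel lists stand in for the region of the camera image that covers the part. How well such a rule works depends heavily on stable lighting, which is why illumination is listed as a component in its own right.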

So far, we have covered the tasks that machine vision automates and the elements needed to carry them out.
Next, let’s look at how those tasks are executed, that is, the process of machine vision.

The “process” of machine vision execution

The process of machine vision execution can be divided into three parts: imaging, image processing, and output.

1. Imaging

First, an image is acquired with an imaging device such as a camera.

2. Image processing

The image data is processed by software and a pass / fail judgment is made.

3. Output

The judged result is output.
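The three steps can be sketched as three small functions chained together. This is only a toy outline under stated assumptions: the camera is replaced by a hard-coded grayscale image, and the function names, the dark-pixel rule, and the "PASS" / "REJECT" strings are invented for illustration.

```python
def acquire_image():
    """Imaging: stand-in for a camera; returns a tiny grayscale image (0-255)."""
    return [
        [200, 200, 200, 200],
        [200,  40, 200, 200],
        [200, 200, 200, 200],
    ]


def process_image(image, dark_threshold=100, max_dark_pixels=0):
    """Image processing: count dark pixels and decide pass (True) or fail (False)."""
    dark = sum(1 for row in image for value in row if value < dark_threshold)
    return dark <= max_dark_pixels


def output_result(passed):
    """Output: turn the judgment into a signal for a downstream device."""
    return "PASS" if passed else "REJECT"


result = output_result(process_image(acquire_image()))
print(result)  # REJECT (the sample image contains one dark pixel)
```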

Let’s look at the details.

Step.1 Imaging

Imaging is the process of “getting the data needed to accomplish the desired task”.
Machine vision is used mostly for automated inspection and robot guidance, so this step must acquire the information needed for the inspection or for operating the robot.

When humans “see”, they receive optical information through their eyes. When a machine “sees”, it receives optical information through an imaging device (camera).

As of 2020, 2D visible-light cameras are the mainstream, but a variety of cameras specialized for particular applications and purposes are also used for imaging.

For example, a visible-light camera captures only light that the human eye can see, but some applications use special cameras that capture light invisible to the human eye, such as infrared or X-rays.

Imaging technology / method: Explanation
2D visible light: Captures the wavelengths visible to the human eye as a two-dimensional image. The wavelength range can be limited or the frame rate adjusted depending on the application.
Infrared: Used in applications that cannot be handled in the visible wavelength range, such as agricultural sorting and inspection, web inspection, semiconductor analysis, environmental monitoring, and defense.
X-ray: Expected to be useful in quality inspection and control for finding hidden defects, such as defects inside electronic devices.
Line scan: Captures the image one line at a time. It is characterized by high resolution.
Snapshot: Captures a specific area in an instant. It can also be used on very fast lines.
3D: A field / technology that is expected to grow in the future. Currently, imaging by “triangulation” is the mainstream.

Images are captured with cameras such as these.
The imaging equipment and the image processing may be integrated into one unit or kept separate.

Step.2 Image processing

The captured images are processed in a way that suits the task being performed.
The following image processing methods are available.

Processing method: Explanation
Stitching / registration: Combine multiple images. Refers to image stitching and image registration.
Filtering: For example, mathematical morphological filtering.
Thresholding: Start by setting (or determining) a gray value. That value is used to separate the image into parts, and in some cases to convert each part to simple black and white depending on whether it is below or above the gray value.
Pixel counting: Count the number of bright or dark pixels.
Segmentation: Divide a digital image into multiple segments to simplify or change its representation so that it is easier to analyze.
Edge detection: Detect the contours of the target.
Color analysis: Use color to identify parts, products, and items, to evaluate quality, and to isolate features.
Blob detection and extraction: Inspect the image for discrete blobs of connected pixels (for example, a black hole in a gray object) to use as landmarks in the image.
Pattern recognition: Find, match, and / or count specific patterns. This includes objects that may be rotated, partially hidden by another object, or varying in size.
Barcode reading: Read barcodes, data matrix codes, and 2D barcodes.
Optical character recognition: Automatically read text such as serial numbers.
Measurement: Measure the dimensions of an object.
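To make a few of these methods more concrete, here is a minimal pure-Python sketch of thresholding, pixel counting, and blob detection on a tiny grayscale image. The function names and the sample image are assumptions for illustration. The blob step uses a simple 4-connected flood fill, which is one common way to group connected pixels, and production systems would normally rely on an image processing library instead.

```python
def threshold(image, gray_value):
    """Binarize: 1 where the pixel is darker than gray_value, else 0."""
    return [[1 if v < gray_value else 0 for v in row] for row in image]


def count_pixels(binary):
    """Count the foreground (dark) pixels in a binary image."""
    return sum(sum(row) for row in binary)


def find_blobs(binary):
    """Group 4-connected foreground pixels into blobs using flood fill."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            stack, blob = [(y, x)], []
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            blobs.append(blob)
    return blobs


# Hypothetical image: a bright part with one 3-pixel mark and one 1-pixel mark.
image = [
    [180, 180, 180, 180, 180],
    [180,  20,  25, 180, 180],
    [180,  30, 180, 180, 180],
    [180, 180, 180, 180,  15],
]
binary = threshold(image, 100)
blobs = find_blobs(binary)
print(count_pixels(binary))     # 4
print([len(b) for b in blobs])  # [3, 1]
```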

In this step, methods such as those above are used to extract the information needed for the pass / fail judgment from the captured image data.
Based on the extracted information, a pass / fail judgment is made, such as whether the product meets the quality standard.
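As a small example of going from extracted information to a pass / fail judgment, the sketch below measures the width of an object on one row of a binary image and compares it with a nominal dimension. The scale of 0.5 mm per pixel, the nominal width, and the tolerance are hypothetical calibration and specification values, not figures from any real system.

```python
def measure_width_mm(binary_row, mm_per_pixel):
    """Width of the object on one row: span from first to last foreground pixel."""
    columns = [x for x, v in enumerate(binary_row) if v]
    if not columns:
        return 0.0
    return (columns[-1] - columns[0] + 1) * mm_per_pixel


def judge(width_mm, nominal_mm, tolerance_mm):
    """Pass if the measured width is within the tolerance of the nominal width."""
    return abs(width_mm - nominal_mm) <= tolerance_mm


# Hypothetical row of a thresholded image: the object covers 6 pixels.
row = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0]
width = measure_width_mm(row, 0.5)
print(width)                   # 3.0
print(judge(width, 3.0, 0.1))  # True
```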

The result of the judgment is “output”.

Step.3 Output

Machine vision reaches a “judgment” by processing the image, and the result of that judgment is then “output”.
For example, in quality inspection, products that do not meet the standards, such as scratched products, are removed according to the judgment, and that judgment is made by image processing.

Data can be output in various forms, such as numerical values, measured values, character data, alarms, and signals.

When the output becomes an “input” signal for another device

When machine vision outputs the result of a pass / fail judgment, that result may serve as an input signal to another device.

For example, if a “problem product” is found, the machine vision system “outputs” the judgment result to the robot arm.
The robot arm receives this signal as its “input” and removes the defective product from the line.
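This hand-off can be pictured with a toy sketch in which the vision system's output is passed straight to a device that treats it as input. The `InspectionResult` and `RejectArm` classes are hypothetical stand-ins, and a real line would typically exchange such signals over digital I/O or an industrial network rather than a Python method call.

```python
from dataclasses import dataclass


@dataclass
class InspectionResult:
    """Output of the vision system for one part (hypothetical structure)."""
    part_id: int
    passed: bool


class RejectArm:
    """Hypothetical stand-in for a robot arm controller."""

    def __init__(self):
        self.removed = []

    def on_signal(self, result):
        # The vision system's output arrives here as the arm's input.
        if not result.passed:
            self.removed.append(result.part_id)


arm = RejectArm()
results = [InspectionResult(1, True), InspectionResult(2, False), InspectionResult(3, True)]
for result in results:
    arm.on_signal(result)

print(arm.removed)  # [2]
```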

For robot guidance

When machine vision guides a robot, it first identifies the target and determines its position and orientation, then converts that data into the robot’s coordinate system before it is “output”.

Target identification

Targets are identified by template matching, brightness (thresholding), or color. (In some cases, “machine learning” is needed for these.)
Once image processing has identified the target, its position (x, y coordinates) and orientation are determined.
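One way to obtain a position and orientation from an identified target is to use image moments on its binary mask: the centroid gives the position, and the second-order central moments give the angle of the principal axis. The sketch below assumes the target has already been separated from the background. The function name is my own, and the angle is expressed in image coordinates, where the y axis points down.

```python
import math


def position_and_direction(binary):
    """Return the centroid (x, y) and principal-axis angle in degrees."""
    points = [(x, y) for y, row in enumerate(binary) for x, v in enumerate(row) if v]
    if not points:
        raise ValueError("no foreground pixels: target not found")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Second-order central moments describe how the pixels spread around the centroid.
    mu20 = sum((x - cx) ** 2 for x, _ in points) / n
    mu02 = sum((y - cy) ** 2 for _, y in points) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / n
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return (cx, cy), math.degrees(angle)


# Hypothetical mask: a horizontal bar four pixels long.
binary = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
center, angle = position_and_direction(binary)
print(center, angle)  # (2.5, 1.0) 0.0
```

For a symmetric shape such as a circle the principal axis is not well defined, so this approach suits elongated targets best.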

Coordinate system translation

Once the machine vision system has determined the position of the target, the position is converted into the robot’s coordinate system.
The machine vision system and the robot each have their own coordinate system with mutually orthogonal “x” and “y” axes, so communicating with the robot requires a translation from the “machine vision coordinate system” to the “robot coordinate system”.

The coordinates have to be translated because, if the position measured by the machine vision system were “output” as is and passed to the robot, the robot could not interpret it correctly.
It is like translating Japanese into English when a Japanese speaker and an American talk to each other. In the same way, machines need the information “translated into a form that the other party can understand”.

To pass information from the machine vision system to the robot, the machine vision system converts the coordinate information into a “robot-understandable format”.
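For a flat, 2D setup, this translation can be sketched as a scale, a rotation, and an offset applied to the image coordinates. The calibration values below (millimeters per pixel, rotation between the two coordinate systems, and the position of the image origin in the robot frame) are hypothetical. In practice they come from a calibration procedure, and real setups may also need to correct for lens distortion and perspective.

```python
import math


def camera_to_robot(point_px, mm_per_pixel, rotation_deg, offset_mm):
    """Map an image point (pixels) to robot coordinates (mm): scale, rotate, translate."""
    x_mm = point_px[0] * mm_per_pixel
    y_mm = point_px[1] * mm_per_pixel
    theta = math.radians(rotation_deg)
    x_r = math.cos(theta) * x_mm - math.sin(theta) * y_mm + offset_mm[0]
    y_r = math.sin(theta) * x_mm + math.cos(theta) * y_mm + offset_mm[1]
    return (x_r, y_r)


# Hypothetical calibration: 0.2 mm per pixel, camera frame rotated 90 degrees
# relative to the robot frame, image origin at (300, 150) mm in the robot frame.
robot_xy = camera_to_robot((100, 50), 0.2, 90.0, (300.0, 150.0))
print(tuple(round(v, 3) for v in robot_xy))  # (290.0, 170.0)
```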


After translating into the robot’s coordinate system, these coordinate data are “output”.

In closing

In this article, I focused on three things that are important for understanding machine vision: tasks, components, and processes.

Machine vision performs tasks that humans can do, but faster and more accurately, and it automates tasks that humans cannot do.
As 3D imaging and robot technology advance, machine vision is expected to develop further.
