Overview
TOKYO - Toshiba Corporation (TOKYO: 6502) has developed a world-first AI technology that can determine actual-scale 3D measurements from monocular camera images alone, including images taken under varying conditions such as from a distance with a zoom lens or with autofocus. Using several photos taken from a distance by a monocular camera during infrastructure inspections, this AI makes it easy to measure the size of areas needing repair without approaching dangerous and difficult-to-inspect locations such as high places and slopes.
Conventionally, a monocular camera alone provides only relative values, so a gyroscope or reference information about sizes has been needed to obtain absolute values. The developed AI, however, can provide absolute values by combining relative depth information obtained from multiple viewpoints of a monocular camera with blur information contained in the captured images. Considering that the average age of infrastructure facilities in Japan now exceeds 35 years, this AI can be used to develop efficient maintenance plans that prioritize repairs and carry them out to an appropriate extent, neither excessive nor insufficient. Toshiba has been developing robot-integrated management and inspection image analysis technologies like those shown in Figure 1 to save labor and automate infrastructure inspections, and the newly developed technology improves the performance of size measurements.
Toshiba will present the details of this technology on November 22 at the British Machine Vision Conference 2021 (BMVC 2021).
Development background
Infrastructure maintenance is becoming increasingly important for the long-term stable operation of social infrastructure. Particularly in Japan, a country where roads, bridges, and tunnels built during a period of high economic growth are rapidly aging, and where an aging workforce is causing labor shortages, there is a need for safe and efficient infrastructure maintenance. Improving the efficiency of infrastructure maintenance requires well-prioritized maintenance plans. Measuring the sizes of areas requiring repair is an effective way to set priorities, but an outstanding challenge is that such measurements can be difficult to make on site in dangerous areas such as high places and slopes. Inspections of infrastructure facilities can thus be greatly streamlined if sizes can be measured solely from photographs taken from a distance, without the need for a human to approach.
However, using current cameras and image recognition AI technologies to measure sizes in images taken from a distance remains difficult. For example, some smartphones are equipped with 3D reconstruction technologies combined with a gyroscope, but their small lenses produce large errors when imaging from a distance. There are also existing techniques for determining depth by learning differences in blur due to lens aberration, but these can measure sizes only in images acquired at the distances used in pretraining, making them difficult to apply in outdoor inspections and other situations where imaging distances vary.
Features of the technology
To address this issue, Toshiba has developed an AI technology that can easily measure the sizes of areas to be repaired by using several photos taken from a distance by a monocular camera, even in high places, on slopes, and in other situations where inspections are difficult (Figure 2).
The developed AI is the world’s first technology that does not require a gyroscope or reference information regarding sizes, as has been required in the past, and that can perform actual-scale 3D measurements even from imaging distances that have not been pretrained.
The key feature of the developed AI is that, by combining blur information contained in captured images with relative depth information obtained from multiple imaging positions (multi-view images), it can measure absolute sizes using only a monocular camera. Because depth information obtained from multi-view images consists only of relative values, it has previously been necessary to supply absolute scale separately, from a gyroscope or from reference information about sizes. Furthermore, size measurements from blur information require a camera parameter called focal length, which conventionally requires pretraining.
Toshiba recently discovered that absolute size values can be obtained using only acquired images by solving an optimization problem in which depth information obtained from multi-view images and blur information from the captured images are used as inputs, and scale information and focal length are treated as unknown parameters (Figure 2). Applying this AI to crack measurement, Toshiba confirmed that it can accurately measure crack sizes from a distance of 7 m, which is difficult with smartphone-based size measurement technologies because of the large error involved (Figure 3). Measuring object sizes from distances of 5–7 m at 11 outdoor locations, Toshiba found that the size error was 2.5% under ideal conditions with a fixed lens, and was held to 3.8% even under more difficult conditions with a zoom lens (Figure 4). Numerical simulations (assuming that cracks can be accurately detected in images) based on concrete crack repair guidelines established by the Japan Concrete Institute confirmed this accuracy and also showed that the necessity of repair can be determined with high accuracy (Figure 5). Toshiba also confirmed the possibility of measuring the absolute sizes of fine cracks with widths of less than 2 mm (Figure 6), and of obtaining reasonable measurements for cracks in high walls (Figure 7), which had previously been difficult.
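The idea of treating scale and focal length as unknowns can be illustrated with a toy example. The sketch below is not Toshiba's formulation, which this release does not specify. It assumes an idealized thin-lens defocus model, a known F-number, a known lens-to-sensor distance, and signed blur diameters already measured in metres on the sensor; all function and parameter names are hypothetical. Under those assumptions the signed blur is a linear function of inverse relative depth, so an ordinary least-squares line fit recovers both unknowns.

```python
from dataclasses import dataclass


@dataclass
class ScaleFit:
    scale: float         # metres per unit of relative depth (assumed unknown)
    focal_length: float  # metres (assumed unknown)


def signed_blur(depth_m, focal_length, sensor_dist, f_number):
    """Signed blur-circle diameter (metres on the sensor), thin-lens model.

    Assumption: aperture diameter = focal_length / f_number, and the sensor
    sits at sensor_dist behind the lens. The sign flips at the focus distance.
    """
    return (focal_length * sensor_dist / f_number) * (
        1.0 / focal_length - 1.0 / sensor_dist - 1.0 / depth_m
    )


def fit_scale_and_focal_length(rel_depths, blurs, sensor_dist, f_number):
    """Least-squares estimate of scale and focal length.

    With absolute depth = scale * relative depth, the model above becomes
        blur = alpha + beta * (1 / relative_depth)
    where alpha = (sensor_dist - f) / N and beta = -f * sensor_dist / (N * scale).
    Fitting the line gives alpha and beta, which are then inverted.
    """
    xs = [1.0 / r for r in rel_depths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(blurs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, blurs))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    focal_length = sensor_dist - f_number * alpha
    scale = -focal_length * sensor_dist / (f_number * beta)
    return ScaleFit(scale=scale, focal_length=focal_length)


if __name__ == "__main__":
    # Synthetic, noise-free data: a 50 mm lens at f/2.8 focused at 6 m,
    # with points whose true depths range from 5 m to 7 m.
    true_f, n_stop, true_scale = 0.05, 2.8, 2.5
    v = 1.0 / (1.0 / true_f - 1.0 / 6.0)   # lens-to-sensor distance
    rel = [2.0, 2.2, 2.4, 2.6, 2.8]        # relative depths from multi-view
    blurs = [signed_blur(true_scale * r, true_f, v, n_stop) for r in rel]
    fit = fit_scale_and_focal_length(rel, blurs, v, n_stop)
    print(fit)
```

Once a scale factor is available, any length measured in the relative multi-view reconstruction can be multiplied by it to give a metric size. Real images would add noise, unsigned blur estimates, and lens aberrations, so a practical system would need a more robust nonlinear optimization than this line fit.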
Future developments
The developed AI can be applied not only to infrastructure inspections, but also to various other situations where cameras are used for size measurements, including manufacturing, logistics, and medical care. Toshiba will continue conducting demonstration tests using various cameras and lenses and will further increase the speed of computational processing, so that the developed AI can be put into practical use as soon as possible.