AI technologies have made dramatic advances in recent years, and many companies are starting to use AI in their business operations. However, a good number of these are still in the verification testing stage. One of the issues they face in moving on to the next stage, actual deployment, is guaranteeing the quality of the AI. There is a growing global need for AI quality assurance, and the topic is being discussed in numerous companies and organizations. In order to deploy Toshiba's AI in various industrial fields, we are creating systems for assuring the quality of AI in line with this trend. Let's look at some of the efforts we have made.


There is a need for high quality, high reliability AI


AI is already built into products and services that we use every day, such as smartphones and marketing on e-commerce (EC) sites. By now, almost everyone has encountered AI in some form.

As the use of AI rises throughout the world, Proof of Concept (PoC) trials are being carried out in social infrastructure, factories, and other parts of the industrial field. The goals of these verification tests are to explore how AI can be used in business and where it can be used most effectively. However, deploying AI in actual operation requires AI that is highly reliable and high quality: AI that can address the unique problems of the industrial field and its mission-critical systems, such as the large size and complexity of those systems and the need for them to operate continuously, without interruption.

In typical software development, we create software that works reliably by following rules designed to meet the customer's functional requirements. Many years of research and practical implementation have gone into creating the quality assurance processes used in this kind of software development. The Toshiba Group, as well, has established its own highly reliable quality assurance processes. However, with AI development, it is not enough to simply apply this quality assurance approach as-is. AI is designed based on customer requirements, but its behavior is created by training on actual data, so the training data determines how the AI will function. This makes it difficult to determine how the AI is actually operating, which is why it is often called a "black box." Developing AI, whose operation involves this kind of uncertainty, requires a new quality assurance approach that takes these characteristics into consideration.

In Japan, there are growing calls for quality assurance for AI, and there are ongoing discussions on the topic within businesses, academic groups, and governmental organizations. For example, the Consortium of Quality Assurance for Artificial-Intelligence-based Products and Services has been created to bring together experts to deliberate on AI quality assurance. In the industrial field, as well, organizations such as the Task Force on Evaluating the Reliability of AI in Plants have been established. Within the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC), there is also rising momentum to explore the issue of quality assurance for AI.

In response to these domestic and international AI quality assurance movements, Toshiba is leveraging the technologies and expertise it has accrued to create AI quality management systems. These systems are based on the latest approaches to quality assurance for AI. Toshiba's goal is to produce AI that is used in actual society and that people can feel safe using (Fig. 1).


Formulating quality assurance guidelines for the development of high quality AI services


Toshiba has formulated AI quality assurance guidelines as guiding principles for its AI quality management development activities. Toshiba systematized the guidelines based on the concept of a global standard of quality assurance that takes into consideration the unique characteristics of AI. They reflect the perspectives of personnel involved in every process of AI services, from planning and development to operation, perspectives that we have identified through our extensive experience with developing AI. Specifically, we have comprehensively organized and integrated the following three elements, each viewed from a vantage essential to quality assurance.

  • Quality assurance axes: The areas of quality evaluation: "Data," "Models," "Systems," "Development processes," and "Customers"
  • Stakeholders: The people involved: "technical sales personnel*," "AI model developers," "AI system developers," "AI system operators," and "quality assurance personnel"
    *Technical sales: This refers to making proposals to customers based on specialized technical knowledge.
  • Development processes: The stages of the development lifecycle: "Deliberation," "Proof of Concept (PoC)," "Development," and "Operation"
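
One way to picture how the three elements combine is as a checklist keyed by quality assurance axis and lifecycle stage, with a stakeholder assigned to each item. The sketch below is purely illustrative and is not taken from Toshiba's guidelines: the `CheckItem` type, the `coverage_gaps` helper, and the two sample questions are hypothetical names and content chosen for this example.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Axis(Enum):
    DATA = "Data"
    MODELS = "Models"
    SYSTEMS = "Systems"
    DEVELOPMENT_PROCESSES = "Development processes"
    CUSTOMERS = "Customers"


class Stakeholder(Enum):
    TECHNICAL_SALES = "technical sales personnel"
    MODEL_DEVELOPER = "AI model developers"
    SYSTEM_DEVELOPER = "AI system developers"
    SYSTEM_OPERATOR = "AI system operators"
    QUALITY_ASSURANCE = "quality assurance personnel"


class Stage(Enum):
    DELIBERATION = "Deliberation"
    POC = "Proof of Concept (PoC)"
    DEVELOPMENT = "Development"
    OPERATION = "Operation"


@dataclass(frozen=True)
class CheckItem:
    """One hypothetical checklist entry; not an item from the actual guidelines."""
    axis: Axis
    stage: Stage
    owner: Stakeholder
    question: str


def coverage_gaps(items):
    """Return every (axis, stage) pair that no checklist item covers yet."""
    covered = {(item.axis, item.stage) for item in items}
    return [pair for pair in product(Axis, Stage) if pair not in covered]


# Illustrative entries only.
checklist = [
    CheckItem(Axis.DATA, Stage.POC, Stakeholder.MODEL_DEVELOPER,
              "Does the training data cover the expected operating conditions?"),
    CheckItem(Axis.MODELS, Stage.OPERATION, Stakeholder.SYSTEM_OPERATOR,
              "Is model performance monitored after deployment?"),
]

gaps = coverage_gaps(checklist)
total_cells = len(Axis) * len(Stage)
print(f"{len(gaps)} of {total_cells} axis/stage cells still need a check item")
```

Listing the uncovered cells is one simple way to see where quality assurance measures are still missing across the lifecycle.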

In developing AI services for industrial fields, we use these guidelines as a foundation, assess trends in the field for which the AI is being developed, and develop AI that will meet the AI quality management standards and benchmarks to be defined in the future. We believe this makes it possible to implement quality assurance measures throughout the AI service lifecycle.


Developing quality evaluation technologies envisioned for real-world use


When performing quality management based on these guidelines, it is important to evaluate the quality of AI. Toshiba has developed various quality evaluation technologies based on envisioned real-world AI use in order to conform with the perspectives essential for quality assurance. Let's look at two types of these technologies: explainability and robustness evaluation.

Explainability technologies address one of the issues with AI, its "black box" nature. These technologies enhance the explainability of AI by demonstrating the bases on which AI models draw their conclusions. For example, AI is sometimes used on factory production lines to detect defects. When a defect is detected, the AI outputs the class of defect it has identified and the level of confidence it has in that conclusion. The confidence level alone, however, is not sufficient to determine the basis on which the model reached its conclusion. That's why Toshiba has developed an explainability technology for use in the visual inspection of products. This technology shows which part of the input image affected the AI model's decision-making. It improves the trust placed in AI inference results and makes it possible to determine the causes of incorrect AI determinations (Fig. 2).
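
To make the idea of showing which part of an input image affected a decision more concrete, here is a minimal sketch of occlusion sensitivity, one generic way to produce such a map. It is not Toshiba's technology, whose method this article does not describe. The `toy_defect_confidence` function is a hypothetical stand-in for a trained inspection model, and the 4x4 image is made up.

```python
from typing import Callable, List

Image = List[List[float]]


def occlusion_map(image: Image,
                  confidence: Callable[[Image], float],
                  patch: int = 2,
                  fill: float = 0.0) -> Image:
    """Mask one patch at a time and record how far the confidence drops."""
    height, width = len(image), len(image[0])
    baseline = confidence(image)
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in image]  # copy so the input is untouched
            for y in rows:
                for x in cols:
                    occluded[y][x] = fill
            drop = baseline - confidence(occluded)
            for y in rows:
                for x in cols:
                    heat[y][x] = drop
    return heat


def toy_defect_confidence(image: Image) -> float:
    """Hypothetical model: confidence equals the brightest pixel value."""
    return max(max(row) for row in image)


# Made-up image: uniform background with a bright "scratch" in row 1.
image = [
    [0.25, 0.25, 0.25, 0.25],
    [0.25, 0.25, 1.00, 1.00],
    [0.25, 0.25, 0.25, 0.25],
    [0.25, 0.25, 0.25, 0.25],
]

heat = occlusion_map(image, toy_defect_confidence)
for row in heat:
    print(row)
```

Cells with a large drop mark regions the model relied on. In this toy case, only the patch containing the bright pixels is highlighted, which is the kind of evidence an inspector could check against the actual defect.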

Next, let's look at the robustness evaluation technology we have developed. Robustness is the ability of an AI model to consistently output the desired results even when the input data changes. When AI models are used in actual applications, they must maintain a consistent level of performance in the face of changes in the usage environment, such as seasonal changes, temperature changes, and equipment aging, as well as changes in input data caused by the condition of the articles being inspected, such as components. We developed a robustness evaluation technology for evaluating the degree to which AI systems maintain consistent performance in the face of changes such as these. This makes it possible not only to evaluate the functions normally provided by AI, but also to quantitatively evaluate its robustness, that is, its ability to handle various environmental changes. This enables the provision of AI that offers more stable and consistent performance in real-world use.
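
As a rough illustration of what quantitatively evaluating robustness can mean, the sketch below measures how a classifier's accuracy changes as its inputs are perturbed more and more strongly. It is a generic example, not Toshiba's evaluation technology. A uniform brightness shift stands in for an environmental change, and `toy_model`, the sample data, and the shift levels are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (pixel values in [0, 1], label)


def shift_brightness(pixels: List[float], shift: float) -> List[float]:
    """Simulate an environment change with a constant offset, clamped to [0, 1]."""
    return [min(1.0, max(0.0, p + shift)) for p in pixels]


def accuracy(model: Callable[[List[float]], int],
             samples: Sequence[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for pixels, label in samples if model(pixels) == label)
    return correct / len(samples)


def robustness_curve(model: Callable[[List[float]], int],
                     samples: Sequence[Sample],
                     shifts: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (shift, accuracy) pairs, one per perturbation level."""
    curve = []
    for shift in shifts:
        perturbed = [(shift_brightness(p, shift), label) for p, label in samples]
        curve.append((shift, accuracy(model, perturbed)))
    return curve


def toy_model(pixels: List[float]) -> int:
    """Hypothetical classifier: flags a defect (1) when mean brightness exceeds 0.5."""
    return 1 if sum(pixels) / len(pixels) > 0.5 else 0


# Made-up inspection data: two normal samples (0) and two defective ones (1).
samples = [([0.2, 0.3], 0), ([0.35, 0.45], 0), ([0.7, 0.8], 1), ([0.6, 0.7], 1)]

curve = robustness_curve(toy_model, samples, shifts=[0.0, 0.2, 0.4])
worst_drop = curve[0][1] - min(acc for _, acc in curve)

for shift, acc in curve:
    print(f"brightness shift {shift:+.1f}: accuracy {acc:.2f}")
print(f"worst-case accuracy drop: {worst_drop:.2f}")
```

Reporting accuracy per perturbation level, together with a single worst-case figure, turns robustness into a number that can be compared across models or tracked over time.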

By developing quality evaluation technologies such as these, Toshiba is continuously working to promote the evolution of AI services that customers can use with greater confidence.


Accelerating the development of high reliability AI services


AI quality management must be carried out appropriately, rooted in knowledge and expertise regarding the field where the AI is used. That's why we coordinate with the engineers who develop AI services to carry out development and perform quality management using tools based on our quality evaluation technologies. Through these activities, we aim to provide high quality, high reliability AI services based on global AI quality approaches, Toshiba's AI development expertise, and Toshiba's unique AI quality management, which leverages our MLOps Platform*, a framework developed to support AI development and operation.

* The MLOps Platform is introduced in detail in the second article of this series.

Toshiba has developed diverse strengths through its experience with developing AI technologies. We take our customers' problems head-on and meet and exceed our customers' expectations by contributing AI that draws on Toshiba's strengths.

We will continue to develop systems of quality assurance for AI as we strive to further the deployment of highly reliable AI that people can feel safe using. We will support social infrastructure and data services through the high performance, high functionality AI that Toshiba is uniquely positioned to provide.

  • The corporate names, organization names, job titles and other names and titles appearing in this article are those as of June 2021.
