The release of ChatGPT by OpenAI shook the world. Within just five days of its release, ChatGPT was being used by one million people. Within two months, that number rose to 100 million. The use of ChatGPT is spreading at a rate unprecedented in history. Everyone -- not only AI researchers and specialists, but also companies, politicians, experts, and members of the general public -- is surprised and fascinated by generative AI such as ChatGPT and is watching closely to see what impact it will have on society. How will generative AI change the IT industry? How will it affect businesses and social systems? And how will our own lives be changed by generative AI? In this running feature, we will focus on large language models (LLMs), the foundation of generative AI. We will learn about their key technical points, how they can be used in business, and their future prospects.

In the first issue, we talked about the nature and position of generative AI with respect to AI technology as a whole and discussed its technical implementation. In Part 2, we will look at how generative AI will be used in business and the impact it will have, using as examples some of the initiatives of Toshiba Digital Solutions.


What are generative AI and ChatGPT?


Discriminative AI, which is already in wide use, is AI that learns to distinguish between correct and incorrect data by first training on massive amounts of examples of both. It excels at tasks such as image identification, data analysis, and product defect detection.

Generative AI, on the other hand, is AI that produces new results in a form that is easy for people to understand. For example, it is adept at tasks such as creating meeting minutes from records of meeting contents, generating images of an armchair in the shape of an avocado[1], or producing new data.

New technologies are being developed across many generative AI fields, with especially rapid progress in the field of large language models (LLMs). The Transformer[2] model has proven particularly powerful, and countless ways of implementing it have been proposed. The Transformer variant used by OpenAI is called the Generative Pre-trained Transformer, or GPT. This GPT was outfitted with a chat-based user interface and made available to the general public in the form of the ChatGPT service, and this generative AI technology stunned the world. Microsoft, which has invested over 10 billion USD in OpenAI, offers a commercial generative AI service for business use called the Microsoft Azure OpenAI Service. This service supplements the GPT service with additional functions and service guarantees, such as enhancing GPT’s data security, protecting privacy, and preventing log data tampering or the use of log data for retraining. Toshiba Digital Solutions uses this secure commercial service in its own generative AI solutions.
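
To give a concrete feel for how such a managed service is typically accessed, the minimal Python sketch below sends a chat request to a GPT model deployed on the Microsoft Azure OpenAI Service via the openai library. It is a hedged illustration only: the endpoint, API key, API version, and deployment name are hypothetical placeholders and do not describe Toshiba's actual systems.

    # Minimal sketch of calling a GPT model deployed on the Azure OpenAI Service.
    # Endpoint, key, API version, and deployment name are hypothetical placeholders.
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint="https://example-resource.openai.azure.com/",  # placeholder endpoint
        api_key="YOUR_API_KEY",                                       # placeholder key
        api_version="2024-02-01",                                     # example API version
    )

    response = client.chat.completions.create(
        model="example-gpt-deployment",  # hypothetical Azure deployment name
        messages=[
            {"role": "system", "content": "You are an assistant for internal business questions."},
            {"role": "user", "content": "Summarize these meeting notes in three bullet points: ..."},
        ],
        temperature=0.2,
    )

    print(response.choices[0].message.content)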


The current state of generative AI use in business


Following the debut of ChatGPT in November 2022, the generative AI market has undergone explosive growth. By 2025, it is expected to reach 28 billion USD worldwide, and by 2030 it is even predicted to reach 128 billion USD[3]. This is because, unlike discriminative AI, which is effective for specific operational processes such as numerical data analysis and defect detection, generative AI has the potential for use in all kinds of business fields[4]. The greatest feature of generative AI is that it can be controlled by issuing commands in natural language. It is expected to function almost like a personal assistant, contributing to all kinds of administrative and on-site work.

According to rough calculations by one company that is fully leveraging generative AI, the technology can reduce the company’s administrative workload by 400 hours per person per year[5]. This is equivalent to roughly 22% of the annual workload of 1,800 hours per person. The company has announced that it will use the time this frees up to provide better customer support, develop new products, and improve work-life balance.

In this way, generative AI is expected to serve as a universal tool that can significantly transform conventional business styles. That is why it is being actively deployed in a wide range of industries and business fields (Fig. 1).


Toshiba Digital Solutions’ initiatives for applying generative AI to three utilization fields


For over half a century, the Toshiba Group has been researching and developing AI technologies. For example, in 1967, the Toshiba Group developed the world’s first automatic postal code reading and sorting device that could recognize handwritten characters. In 1978, it released the JW-10 Japanese word processor, Japan’s first word processor with kana/kanji conversion capabilities. We continued this trend of AI technology development with the creation of the RECAIUS communication AI and the SATLYS analytics AI. We have also paid attention to generative AI technology from early on.

There is now a growing movement to apply generative AI to administrative work such as document processing. We believe there is tremendous potential for generative AI not only in such administrative work but also in industries and production fields that work with multimodal data, such as log data, sensor data, and behavior data. We are actively applying the knowledge we have developed through both our communication AI and our analytics AI to generative AI technologies.

Specifically, we have identified three key utilization fields -- “enterprise use,” “multimodal application,” and “improvement of design and development operation efficiency” -- and we are applying generative AI technologies to each of these fields (Fig. 2).

The first field, enterprise use, is the one in which the market has the highest hopes for generative AI. In this field, AI can be used to work with internal documents within organizations, to assist with handling inquiries from customers, and to help with creating office documents.

In the second field, multimodal application, generative AI is envisioned as being used in production, manufacturing, maintenance, inspection, and the like. Worksites use a tremendous range of data, from documents like work manuals, daily work reports, inspection reports, and failure reports to image and video data taken at worksites, audio data containing reports or notifications from workers, sensor data, system log data, worker activity data, and various other types of data. Combining various types of data like this and processing them with generative AI will make it possible to analyze and assess site conditions, issue warnings regarding dangerous activities, issue work instructions, assist with passing on skills to new generations of workers, and more.
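
To make this idea concrete, here is a hedged sketch, assuming a vision-capable chat model is available through the Azure OpenAI Service, of how a worksite photograph, a few sensor readings, and a manual excerpt might be combined into a single request. The file name, sensor values, manual text, and deployment name are all hypothetical illustrations, not part of Toshiba's actual solutions.

    # Conceptual sketch: combine multimodal worksite data (photo, sensor readings,
    # manual excerpt) into one request to a vision-capable chat model on Azure OpenAI.
    # All data values and the deployment name below are hypothetical.
    import base64
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint="https://example-resource.openai.azure.com/",  # placeholder endpoint
        api_key="YOUR_API_KEY",                                       # placeholder key
        api_version="2024-02-01",
    )

    # Hypothetical worksite data: a photo, a sensor summary, and a manual excerpt.
    with open("pump_station_photo.jpg", "rb") as f:
        photo_b64 = base64.b64encode(f.read()).decode("utf-8")
    sensor_summary = "vibration 7.2 mm/s (threshold 4.5 mm/s), bearing temperature 82 degC"
    manual_excerpt = "Section 4.3: if vibration exceeds 4.5 mm/s, stop the pump and inspect the bearing."

    response = client.chat.completions.create(
        model="example-multimodal-deployment",  # hypothetical vision-capable deployment
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Assess the site condition shown in the photo and list any required actions.\n"
                         f"Sensor readings: {sensor_summary}\n"
                         f"Relevant manual excerpt: {manual_excerpt}"},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{photo_b64}"}},
            ],
        }],
    )

    print(response.choices[0].message.content)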

For the third field, the improvement of design and development operation efficiency, generative AI can be applied to the massive amounts of design documentation, drawings, regulations, program code, and the like that companies build up over time. Our goal is to apply generative AI to help automatically generate program code, boost the efficiency of design and development work by applying unique industry and business category know-how, help read and interpret program code for legacy assets, and assist with migration.

We will apply generative AI to existing solutions in these three fields, helping accelerate the digital transformation (DX) efforts of our customers.


Steps for effectively utilizing generative AI


Professor MATSUO Yutaka of the University of Tokyo says that there are three steps to effectively utilizing generative AI[6]. Based on these steps and our own usage approach, described above, we have defined our own steps for effective generative AI use (Fig. 3).

Step 1 is the dialog-based input stage, equivalent to ChatGPT. This step has already been taken by many companies. Step 2 is the application of generative AI to documents and regulations within an organization. For generative AI to be used in corporate administrative operations and workflows, it has to be able to work with the documents used within the organization. However, ordinary generative AI models are not trained on an organization's confidential internal documents. Because of this, step 2 requires some special technical measures to be applied during implementation.

We have mapped our major activities for applying generative AI to enterprise applications to step 2. We have also defined our own step, step 2.5, situated between steps 2 and 3. This step covers cases in which the generative AI utilization approaches of step 2 are applied to maintenance and inspection work, as well as to software design and development.

The documents handled in step 2.5 are often more complex than the documents used within organizations in step 2. For example, they might involve more rigorous notation rules and guidelines, use industry-specific expressions and phrasing, or contain symbols or figures. Various additional measures are needed to handle documents such as these. The program code accumulated by companies often contains unique processing steps and notation regulations that are specific to the company’s operation systems, along with processing modules or libraries that cannot be publicly disclosed. Therefore, this step requires special measures of a more technical nature than those of step 2. In addition to enterprise AI utilization, we have also mapped step 2.5 to solutions for improving design and development operation efficiency and solutions that can be used in the initial stages of multimodal applications.

Step 3 is the use of generative AI in the manufacturing and industrial fields. This step aims to combine various types of multimodal data, such as image data and sensor data acquired from the field, with field documents, design drawings, and the like, and to apply generative AI to the processes used in diverse worksites, such as production, manufacturing, maintenance, and inspection operations. We believe that step 3 will require additional training of generative AI internal models, known as fine-tuning, and the development of unique large language models (LLMs) optimized for multimodal processing. We will take on step 3 by effectively combining this fundamental research and technology development with our own unique multimodal data processing technologies, which we have nurtured through our RECAIUS and SATLYS development.
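
As a loose illustration of what "fine-tuning" means in practice, the sketch below submits additional training examples in chat format through the OpenAI fine-tuning API. This is an assumption-laden example for explanation only: the training file, its contents, and the base model name are placeholders, and it does not represent the fine-tuning or multimodal LLM development described above.

    # Hedged sketch of fine-tuning (additional training of a generative AI model)
    # via the OpenAI fine-tuning API. File name, data, and base model are placeholders.
    from openai import OpenAI

    client = OpenAI(api_key="YOUR_API_KEY")  # placeholder key

    # Training data is a JSONL file of chat-format examples, e.g. one line like:
    # {"messages": [{"role": "user", "content": "Report: bearing temperature 82 degC"},
    #               {"role": "assistant", "content": "Above threshold; schedule an inspection."}]}
    training_file = client.files.create(
        file=open("site_reports_train.jsonl", "rb"),  # hypothetical training file
        purpose="fine-tune",
    )

    job = client.fine_tuning.jobs.create(
        training_file=training_file.id,
        model="gpt-4o-mini-2024-07-18",  # example base model; availability varies
    )

    print(job.id, job.status)  # poll the job until the fine-tuned model is ready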


Specific initiatives for utilizing generative AI in enterprise applications


The majority of the inquiries we get from customers are about how to carry out step 2 in actual environments. The majority of the administrative work done within organizations consists of document processing. Workflows are defined for each type of work, and documents must be created and processed in accordance with rules and procedures stipulated in various regulations and guidelines. Because of this, there are a lot of hidden costs in organizations, such as employees rereading regulations or asking other employees about them. Even if a regulation document search system is deployed, employees will not be able to find the regulations and past documents they need if they do not enter the correct search keywords for the work they are trying to perform and for the information they need. This is one of the reasons that document search systems do not get utilized to their fullest.

One of the greatest features of generative AI is that it can be given instructions using the natural language that people use every day. For example, you can tell it “I need to submit ○○, but I don’t know how to write △△” or you can ask it “Who should I consult with in advance regarding ○○ payment?” If the generative AI is trained on documents regarding workflows, users can obtain accurate answers regarding administrative work rules and document creation simply by talking with the generative AI using natural language. This would greatly increase the efficiency of many different kinds of operations.

However, a major problem presents itself here. As mentioned earlier, typical generative AI is trained on massive amounts of documents from around the world, containing all kinds of different information. However, they cannot be trained on the confidential regulations and documents used within an organization. This prevents them from being able to generate appropriate answers regarding administrative work that is performed within the organization. Furthermore, unlike conventional discriminative AI, generative AI training processes require prodigious computing resources, said to amount to billions of yen per month. Therefore, performing additional training on confidential data within an organization is no easy matter.

A special technique is needed to leverage the features of generative AI while drawing out information from the documents used within an organization. That special technique is called retrieval-augmented generation (RAG). However, different companies have different types of internal documents, so the method of implementing RAG varies by company. We use the method shown in Fig. 4 to provide solutions that leverage generative AI in utilizing internal documents.

The way our RAG system works is, broadly speaking, as follows. First, the user uses natural language prompts to enter what they want to do or what they are having difficulty with. For example, they might enter the following prompt: “When I got back from my Hiroshima business trip to my home in Tokyo, it was already past 11:00 p.m. Please tell me the correct way to apply for business trip expenses. Also, are there any other related applications that I need to submit?” The user does not need to think of any search keywords.

The RAG system, upon receiving this prompt, uses generative AI to analyze the intent of the question and generates the search query needed to search the documents within the organization. The documents within an organization are often saved in commercial databases and file servers in appropriate formats, so the RAG system has been equipped with templates (which use prompts and other methods) to generate search queries that are appropriate for the formats of files to be searched and the methods used to search them.
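
As a rough sketch of this query-generation step (the prompt template, function, and deployment name below are illustrative assumptions, not the actual templates behind Fig. 4), the generative AI can be asked to rewrite the user's request as search keywords:

    # Illustrative query-generation step: the LLM rewrites the user's natural-language
    # request into search keywords for the internal document store.
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint="https://example-resource.openai.azure.com/",  # placeholder endpoint
        api_key="YOUR_API_KEY",                                       # placeholder key
        api_version="2024-02-01",
    )

    # Illustrative template; a real system would tailor this to each document store.
    QUERY_TEMPLATE = (
        "You are a search assistant for internal company regulations.\n"
        "Rewrite the user's request as 3 to 5 short search keywords, one per line.\n"
        "User request: {question}\n"
        "Keywords:"
    )

    def generate_search_keywords(question: str) -> list[str]:
        response = client.chat.completions.create(
            model="example-gpt-deployment",  # hypothetical deployment name
            messages=[{"role": "user", "content": QUERY_TEMPLATE.format(question=question)}],
            temperature=0.0,
        )
        lines = response.choices[0].message.content.splitlines()
        return [line.strip("- ").strip() for line in lines if line.strip()]

    keywords = generate_search_keywords(
        "I got home from my Hiroshima business trip after 11 p.m. How do I apply for "
        "business trip expenses, and are any related applications needed?")
    # Expected output (illustrative): ["business trip expense application",
    # "late-night return allowance", "travel expense regulation"]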

The documents within the organization are then searched using the generated search queries. These searches are performed using the organization’s IT infrastructure, so there is no security risk of documents being leaked or being used to retrain the generative AI. After extracting the passages from the regulations or documents that match the question, the RAG system then calls on the generative AI to generate a response, using both the original prompt and the extracted regulation and document passages. Through this process, the system engages in a back-and-forth of natural questions and natural answers. Furthermore, the corresponding regulations and documents are displayed as sources, so the system can recommend that the user check the accuracy of the information and confirm any details.
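
Continuing the sketch, the snippet below shows one simplified way the retrieval and grounded-answer steps might be wired together. The search_internal_documents function is a stub standing in for the organization's own search infrastructure, and the regulation excerpts, prompts, and deployment name are hypothetical examples rather than the actual Fig. 4 implementation:

    # Illustrative RAG flow: search internal documents, then answer only from the
    # retrieved passages and cite their sources. All data below is hypothetical.
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint="https://example-resource.openai.azure.com/",  # placeholder endpoint
        api_key="YOUR_API_KEY",                                       # placeholder key
        api_version="2024-02-01",
    )

    def search_internal_documents(keywords: list[str]) -> list[dict]:
        """Stub: in practice this queries the organization's own databases and file servers."""
        return [
            {"source": "Travel Expense Regulation, Art. 12 (hypothetical)",
             "text": "Expense claims must be filed within five business days of returning from a trip."},
            {"source": "Travel Expense Regulation, Art. 15 (hypothetical)",
             "text": "If the return time is after 10 p.m., a late-night allowance application may also be filed."},
        ]

    def answer_with_sources(question: str, passages: list[dict]) -> str:
        context = "\n\n".join(f"[{p['source']}]\n{p['text']}" for p in passages)
        prompt = (
            "Answer the question using only the excerpts below, and cite the source of "
            "each statement in brackets.\n\n"
            f"Excerpts:\n{context}\n\nQuestion: {question}"
        )
        response = client.chat.completions.create(
            model="example-gpt-deployment",  # hypothetical deployment name
            messages=[{"role": "user", "content": prompt}],
            temperature=0.0,
        )
        return response.choices[0].message.content

    question = ("I got home from my Hiroshima business trip after 11 p.m. How do I apply for "
                "business trip expenses, and are any related applications needed?")
    passages = search_internal_documents(["business trip expense application", "late-night return"])
    print(answer_with_sources(question, passages))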

* Click here for actual case examples of Commendry scenario-less AI chatbot service.

In this part, we used examples of our own initiatives to explain how generative AI is used in business. The internet was once primarily used as a system for sharing information between universities, but it has now become an essential part of societal infrastructure, supporting the activities of companies and people. In the same way, generative AI will, without a doubt, completely transform the very way people work. Differences in the degree to which companies can leverage generative AI will result in a major divide in those companies’ activities. All companies would benefit by advancing into this new world together with an IT partner they can trust. In Part 3, we will imagine the society of the future, in which generative AI plays a greater part in our daily lives, and discuss the new relationships between people and AI.

Up next: (First half of part 3) The future of generative AI and the new relationship between humankind and AI

 

Reference materials
[1] Automated image generation by DALL-E: “an armchair in the shape of an avocado”
     https://openai.com/blog/dall-e/
[2] “Attention Is All You Need” Ashish Vaswani, Noam Shazeer et al., 12 Jun 2017
[3] According to Toshiba research performed using information sources such as the following.
     https://www.gii.co.jp/report/dmin1336687-global-generative-ai-market.html
     https://japan.zdnet.com/article/35209507/
[4] Ministry of Economy, Trade and Industry, 9th Review Meeting Regarding Human Resource Policies in the Digital Age (July 6, 2023)
     https://www.meti.go.jp/shingikai/mono_info_service/digital_jinzai/009.html
[5] Nissin Foods Holdings and Givery Collaborate to Use ChatGPT in Work Automation
     https://www.youtube.com/watch?v=hEvXH-onVRI
[6] From speech by professor MATSUO Yutaka at Digital Business Days -SaaS EXPO- 2023 Summer (August 22, 2023)

KOYAMA Noriaki

Senior Fellow
ICT Solutions Division
Toshiba Digital Solutions Corporation


KOYAMA Noriaki researched software design optimization and real-time distributed processing at Toshiba's Corporate Research & Development Center. At iValue Creation Company, he was engaged in new business development for cloud services, knowledge AI, and networked appliance services. At Toshiba Digital Solutions, he has led business, technology, and product development for the RECAIUS communication AI, and he currently directs several projects related to generative AI, product management, and cloud delivery platforms.

  • The corporate names, organization names, job titles and other names and titles appearing in this article are those as of February 2024.
  • All other company names, product names, and function names mentioned in this article may be trademarks or registered trademarks of their respective companies.
