Using AI translation developed in Japan to improve the efficiency of translating Japanese laws and regulations and contribute to the promotion of global business


The Ministry of Justice works together with the ministries with jurisdiction over individual laws to translate Japanese laws and regulations and publish these English translations. Its goal is to deepen the world’s understanding of Japan and to promote overseas investment and business in Japan. Laws and regulations contain a great deal of specialized terminology and have complex sentence structures. In addition, translations must be accurate and must not lead to misunderstandings. Because of this, until now, the entire translation process was done by hand, and it took a long time for English translations to be released after Japanese laws or regulations were made public. From perspectives such as promoting investment in Japan, it is essential that up-to-date English translations of current laws and regulations are always available. One important government initiative to achieve this has been the use of AI to accelerate the translation process. Toshiba Digital Solutions has developed an AI system that produces accurate and natural translations by utilizing the natural language processing and advanced AI technologies that we have refined over the years and by collaborating with partners. The Ministry of Justice has completed its pilot test of the system, and starting in FY2024, the ministries with jurisdiction over individual laws and regulations have started putting it to use. Let’s learn more about this Law and Regulation Translation System.


Shortening the time it takes to publish English translations of laws and regulations


In general, the Diet, Japan’s legislative organ, formulates laws, while administrative bodies formulate ordinances and ministerial orders, which are collectively referred to as regulations. Laws and regulations are rules that must be abided by when living in Japan. They must also be understood and complied with by overseas investors and companies when investing or doing business in Japan. In 2009, the Ministry of Justice began providing English translations of laws and regulations on a dedicated website, the Japanese Law Translation Database System (JLT). As of the end of June 2024, it provides English translations of roughly 950 laws and regulations and roughly 90 outlines. The site has been accessed by people in over 100 countries.

From the perspective of encouraging overseas investment in Japan and attracting foreign companies to come to Japan, it is essential to always provide English translations that are up to date with current laws and regulations. However, on average, it took two and a half years for an English translation to be published after a law was made public. Shortening the time it took to publish English translations was a pressing issue.

The first step in creating an English translation is for the ministry with jurisdiction over the law to create a draft translation. In many cases, the ministry would contract with an outside translator to create the draft translation. Preparing the draft thus involved the ministry selecting a translation contractor and placing a translation order, the translator creating the translation, and the ministry checking and revising the submitted translation. The completed draft translation would then be submitted to the Ministry of Justice, which would have experts well-versed in English, laws, and regulations inspect the translation to check whether the English is natural and whether it is consistent with related laws and regulations. The experts would work with the relevant ministries to make revisions to the translation. The revised translation would then undergo a final check by the relevant ministry, and then the Ministry of Justice would publish it on the JLT.

This entire process took, on average, two and a half years. In preparing the English draft translation, it took time to arrange for an outside translator to take on the job, and then additional time for the actual translation to be produced. As a result, it took roughly two years for ministries to submit draft translations to the Ministry of Justice. To address this issue, the Ministry of Justice has been working on AI translation, which can be used across relevant ministries, to shorten the draft translation preparation process.

Reference materials
* “The Japanese Law Translation Project - June 2024” Material 1 from the first meeting in 2024 of the Japanese Law Translation Council.
https://www.moj.go.jp/content/001419956.pdf (PDF)(829KB)

* SOKI Shiori, “Utilizing Machine Translation in Translating Laws and Regulations into Foreign Languages,” AAMT Journal, No. 80, 2024
https://aamt.info/wp-content/uploads/2024/06/AAMT-journal-No80.pdf (PDF)(1.63MB)


Producing accurate and natural translations


As part of these initiatives by the national government and the Ministry of Justice, starting in December 2023, the Ministry of Justice conducted a four-month trial run of the Law and Regulation Translation System developed by Toshiba Digital Solutions. The trial confirmed the effectiveness of the system, which shortened the draft translation preparation process to just a matter of weeks. In April 2024, the ministries with jurisdiction over individual laws and regulations also began using the system, and it is expected that the system will make it possible to publish English translations of laws and regulations in less time than was previously required (Fig. 1).

The Law and Regulation Translation System combines an AI translation engine developed in Japan, the latest neural translation models specialized for laws and regulations, and additional learning to address issues specific to the texts of laws and regulations. This has made it possible to create accurate, natural AI translations that comply with law translation rules.

The system uses the latest AI translation engine developed by the National Institute of Information and Communications Technology (NICT). However, deep learning alone is not sufficient for producing AI translations that comply with law and regulation English translation rules. This is because laws and regulations contain a great deal of specialized terminology and have complex sentence structures. Translations generated using deep learning alone sometimes omit necessary information or contain terms which are unrelated to the source text. This makes it difficult to ensure the accuracy of translation results. To compensate for the deficiencies of deep learning, we used the natural language processing and rule-based machine translation technologies we have developed over the years, applying them both before and after translation production. This approach successfully improved the quality of translations.

For the front end, which is used for inputting the files and text to be translated and outputting the translation results, we worked with translation company Kawamura International Co., Ltd. We used application programming interfaces (APIs) not only for translation screens, but also to connect to the translation engine and to the servers that provide user authentication and organization management functions.


User-focused functions and screens created through dialogue with the Ministry of Justice


The Law and Regulation Translation System has two types of translation: “document translation,” in which prepared documents are uploaded, and “text translation,” in which text is entered directly. With document translation, files containing the original texts of laws and regulations, in Word, Excel, or PowerPoint formats, are uploaded for batch translation, and the translated results are available for download. With text translation, the material to be translated is entered into a form as text to produce a translation. In many cases, law and regulation translation consists of translating revised laws and regulations, so this is useful for translating only the portions that have been revised.

To efficiently confirm the translation results and make revisions, the system supports post-editing, in which a human revises the AI translation. In post-editing, the translation results can be compared against the original document, or the translation results can be translated back to Japanese to show how well they match. This makes it possible to determine the accuracy of translations and make revisions. Furthermore, if translation rules are configured in advance, the system can use color-coding to point out areas requiring special rule compliance attention, improving the efficiency of the verification process. There are also other functions that make it possible to see, at a glance, the status and accuracy of the translation, such as the status of each operation and comparisons of current and pre-revision translations.


Making AI translation usable through close handling of details


One of the major features of the Law and Regulation Translation System is that it can produce accurate and natural draft translations that comply with English law and regulation translation rules. For laws and regulations, in particular, it is common for certain terms or expressions to be used repeatedly within the same document, or across multiple documents. If these terms and expressions are not uniform, they can confuse the reader of the English translation. The Ministry of Justice has created a Standard Legal Terms Dictionary and a Legal Translation Guide and provided a uniform policy for translations.

To comply with this uniform policy, that is, this set of rules on English law and regulation translation, two measures are used to unify the terms and expressions in translations: terminology is registered in a terminology list and associated with specialized terminology, and English translations of laws and regulations published on the JLT are used for additional learning. Three other measures are being used to improve compliance with English law and regulation translation rules.
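The idea of checking a translation against a registered terminology list can be illustrated with a minimal sketch. It is not the system's implementation: the function name `find_term_violations` and the two glossary entries (taken from the "ko (項)" and "go (号)" examples in this article) are assumptions for illustration, and the actual Standard Legal Terms Dictionary is not reproduced here.

```python
# Hypothetical glossary: Japanese term -> English term the translation should use.
# The two entries come from examples in this article, not from the real dictionary.
GLOSSARY = {
    "項": "paragraph",
    "号": "item",
}


def find_term_violations(source: str, translation: str, glossary: dict[str, str]) -> list[tuple[str, str]]:
    """Return (Japanese term, expected English term) pairs for terms that
    appear in the source but whose expected English term is missing
    from the translation (simple case-insensitive substring check)."""
    lowered = translation.lower()
    return [
        (ja, en)
        for ja, en in glossary.items()
        if ja in source and en.lower() not in lowered
    ]


if __name__ == "__main__":
    print(find_term_violations("第二項", "section (2)", GLOSSARY))
```

A substring check like this is deliberately naive; a production check would need tokenization and handling of inflected forms.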

The first is the handling of the articles, paragraphs, and items in legal documents. This is one of the notable characteristics of legal translation. AI translation can sometimes misinterpret numbers in Japanese texts that are written with kanji instead of Arabic numerals. Because kanji numerals are often used in the numbering of articles, paragraphs, and items, this leads to mistranslations. The system helps address this issue through pre- and post-processing of AI translations. In pre-processing, an “Article,” “Paragraph,” or “Item” designation appearing at the start of a sentence is separated from the rest of the sentence using regular expressions. This removes these designations from the AI translation scope. In post-processing, they are then converted into the appropriate final form based on the Legal Translation Guide. For example, “Daikyujo-no-ni (第九条の二)” is converted to “Article 9-2”. Other conversions include translating “ko (項)” as “paragraph” and then following it with a number in Arabic numerals, enclosed in parentheses, or translating “go (号)” as “item” and following it with a number in Arabic numerals, enclosed in parentheses. For elements below the item level, there are detailed English notation rules, such as converting “i (イ), ro (ロ), ha (ハ)” to “(a), (b), (c),” and the system’s post-processing deals with all of these (Fig. 2).

The second is the elimination of unnecessary words. For example, article titles often include the Japanese word “to (等).” Normally, AI translation would render this as “etc.,” but the Legal Translation Guide states that “As a rule, when an article title contains a ‘to (等),’ do not use an ‘etc.’ in the English translation.” In post-processing, these instances of “etc.” are removed. There are several such rules and issues that must be kept in mind in order to produce the results that the Ministry of Justice seeks, such as eliminating words from foreign language translations that are present in the original Japanese laws and regulations. We engaged in ongoing dialogue with the Ministry to address each of these issues.
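A post-processing step of this kind could look something like the sketch below. The regular expression and the function name `strip_etc` are assumptions for illustration, and the sample titles are invented rather than taken from published translations.

```python
import re

# Matches "etc." together with an optional preceding comma and whitespace.
ETC_RE = re.compile(r",?\s*\betc\.", re.IGNORECASE)


def strip_etc(title: str) -> str:
    """Remove 'etc.' from an English article title and tidy the spacing."""
    cleaned = ETC_RE.sub("", title)
    # Collapse any double spaces left behind by the removal.
    cleaned = re.sub(r"\s{2,}", " ", cleaned)
    return cleaned.strip()


if __name__ == "__main__":
    # Invented example titles.
    for title in ["(Definitions, etc.)", "Notification, etc. of Changes"]:
        print(strip_etc(title))
```

Because the Guide states this "as a rule", a real pipeline would presumably apply the removal only to article titles and leave room for exceptions.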

The third is translation tuning. The Legal Translation Guide indicates words whose usage requires special care, or which should be avoided. For example, the word “shall” can refer to obligations, potential, rights, and the future, so it can lead to misunderstandings by readers. This is why training data which does not contain “shall” was used to perform additional learning, enabling the system to produce appropriate translations. Another example is the handling of sentences which do not specify the gender of the party being discussed. From the perspective of gender neutrality, words like “he” or “she” which indicate gender must be avoided whenever possible. The system deals with terms like this appropriately, complying with modern rules.
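The tuning itself is done through additional learning, but the same guide rules can also be checked mechanically during post-editing. The following is a minimal sketch of such a checker, assuming two hypothetical rules: one for “shall” and one for gendered pronouns. The rule names, the `Flag` type, and the extension of the pronoun list beyond “he” and “she” are assumptions, not the system's actual rule set.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Flag:
    rule: str   # name of the rule that matched
    start: int  # character offset where the match begins
    end: int    # character offset just past the match
    text: str   # the matched word


# Hypothetical rules; the pronoun list beyond "he"/"she" is an assumption.
RULES = {
    "avoid-shall": re.compile(r"\bshall\b", re.IGNORECASE),
    "gendered-pronoun": re.compile(r"\b(?:he|she|him|her|his|hers)\b", re.IGNORECASE),
}


def find_flags(translation: str) -> list[Flag]:
    """Return every span that a reviewer should look at, in text order."""
    flags = [
        Flag(name, m.start(), m.end(), m.group(0))
        for name, pattern in RULES.items()
        for m in pattern.finditer(translation)
    ]
    return sorted(flags, key=lambda f: f.start)


if __name__ == "__main__":
    # Invented sentence for illustration.
    for flag in find_flags("The applicant shall submit his application."):
        print(flag)
```

The character offsets returned for each flag are what a front end would need in order to highlight the spans for a reviewer.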


Further evolving AI translation to ensure translation quality and contribute to operational efficiency


Through this attentive handling, we achieved a high level of AI translation quality. During the trial deployment at the Ministry of Justice, the time it takes to produce an English draft translation was shortened dramatically, from roughly two years to just a matter of weeks. We also continue to measure the quality of those translations quantitatively.

Machine translation evaluations are performed using BLEU scores*. Five laws and regulations were selected for pre- and post-improvement evaluations. The document with the greatest improvement saw a score increase of over 5 points. A score of 40 or higher indicates a highly accurate translation, and the document’s score rose to 54.89. Human evaluations by translation agencies have also found that the number of translations requiring no changes is rising, and the number of translations requiring major changes is falling.

* BLEU score: A widely used method for evaluating machine translations. It mechanically evaluates the similarity between a machine translation and a reference translation by a human, using a scale of 0 to 100. The closer to 100, the more positive the evaluation. A score of 40 or above is considered a highly accurate translation.
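To make the metric concrete, here is a simplified, self-contained sketch of a BLEU-style score for a single sentence pair: the geometric mean of 1- to 4-gram precisions multiplied by a brevity penalty, scaled to 0 to 100. It is not the evaluation setup used in the project. Real evaluations are normally computed over a whole corpus with standardized tokenization, and the whitespace tokenization, the absence of smoothing, and the example sentences here are assumptions.

```python
import math
from collections import Counter


def ngrams(tokens: list[str], n: int) -> Counter:
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified single-reference BLEU on a 0-100 scale (no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clipped overlap: each n-gram counts at most as often as in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives a zero score
        log_precision_sum += math.log(overlap / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100 * bp * math.exp(log_precision_sum / max_n)


if __name__ == "__main__":
    # Invented sentences for illustration.
    reference = "the minister must notify the applicant of the decision without delay"
    candidate = "the minister must notify the applicant without delay"
    print(round(bleu(candidate, reference), 2))
```

Because the score depends only on n-gram overlap with a reference translation, it complements rather than replaces the human evaluations mentioned above.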

In this way, we combine the latest neural translation specially tailored to law and regulation translation with the highly reliable natural language processing and rule-based machine translation technologies that we have developed. This ensures a sufficient level of translation quality, even for highly complex and specialized legal translations, and contributes to greater operational efficiency. However, compared to translations of revised laws or regulations, there is still room for improvement in the quality of translations of new laws and regulations that have no prior translations to use for reference. We will use future law and regulation translations for additional learning and for updating definitions files to continuously improve the quality of translations.

In recent years, there has been strong demand for high quality translations not only for laws and regulations, but also for international communications, conducting studies involving foreign technical documents, creating manuals, and introducing Japanese technologies and culture. Toshiba Digital Solutions will use the knowledge gained from the development of this complex and specialized Law and Regulation Translation System to further evolve its translation solutions. In doing so, it will contribute to Japan’s global development by meeting the wide-ranging needs of Japanese government offices, research institutes, and companies.

Members involved in the Law and Regulation Translation System development project for the Ministry of Justice (From left to right) SONOH Satoshi, HAMAGAMI Takuya, KAWAI Midori

  • The corporate names, organization names, job titles and other names and titles appearing in this article are those as of July 2024.
  • All other company names or product names mentioned in this article may be trademarks or registered trademarks of their respective companies.