(Part 3) How Should We Address Copyright Issues Related to AI?

A diverse range of consumer-oriented devices, such as appliances and car navigation systems, use speech to relay information to users. Speech is also used for announcements on trains and buses, and for the lines spoken by characters in smartphone apps and video games. Until now, human speakers were recorded and the recordings were played back on devices, but thanks to recent advances in speech synthesis technology, artificially created synthesized voices are increasingly being used. Toshiba has many years of experience with speech synthesis technologies and has developed numerous fundamental technologies that create more natural, higher-quality speech. In this three-part special feature, we discuss societal trends related to speech synthesis, the features of Toshiba's technologies, the frontlines of product development, and the future outlook for the use of these technologies throughout the world.

In Part 1, we introduced typical use cases of speech synthesis, its underlying technologies, and Toshiba’s proprietary technologies. In Part 2, we described Toshiba’s speech synthesis SDK products in the context of technological evolution. In this third and final part of the series, we examine copyright-related issues raised by generative AI as a recent topic in speech synthesis technology and its applications. We also present Toshiba’s approach to addressing these issues, drawing on its long experience in speech synthesis technology and application development.

Issues that have emerged with the advent of generative AI

In Parts 1 and 2, we focused primarily on speech synthesis technologies and products, along with specific usage scenarios. In this part, we first explain key points for properly understanding copyright-related issues surrounding generative AI, which is becoming an indispensable technology in the field of speech synthesis. We then introduce Toshiba’s initiatives in speech-related generative AI (speech AI).

In Japan, a new movement has been launched, with the slogan "No More Unauthorized Generative AI."

This movement involves performers, including voice actors, whose voices have been synthesized and used without permission by generative AI. They speak out about their experiences and call for an end to the unauthorized creation and use of AI-generated content. Beyond this movement, companies and organizations involved in voice-related businesses have held press conferences calling for the protection of voice rights, and issues related to voice and speech have been widely covered in the media. Copyright issues related to voice and speech have become concrete only with the advancement of speech AI, and they are now attracting significant attention.

Why, then, has AI-generated speech become an issue? It is because (under current legal frameworks) voice itself is not protected by copyright.

* Generally, voice itself is not protected by copyright; however, creative performances may be protected as part of a copyrighted work.

However, listening to the views of those who have spoken out on AI-related copyright issues reveals that these concerns are not limited to voice alone. While human creators invest significant effort and creativity into their works, there is currently no clear framework governing how AI-generated content that imitates such works should be handled. Despite the lack of consensus among stakeholders, generative AI technologies continue to advance rapidly and are increasingly being adopted by the general public, giving rise to serious concerns.

Background of copyright-related issues surrounding AI

First, this section provides an overview of copyright issues arising in AI across all fields—not limited to speech and voice—and the relevant aspects of Japanese copyright law.

Article 30-4 of Japan’s Copyright Act, introduced in 2019, provides that under certain conditions copyrighted works may be used without the copyright holder’s permission during the development and training of AI systems. Why does the law adopt what may appear to be a counterintuitive position, allowing copyrighted works to be used without the author’s permission?

The answer lies in the timing of the law’s introduction (Fig. 1). At the time, improving AI recognition and big data analysis capabilities required access to vast amounts of training data. Obtaining permission from the copyright holder for each individual dataset made large-scale data collection extremely difficult. It is widely understood that this legal change was intended to prevent such barriers from hindering competition in the development of AI technologies for recognition and analysis.

At the time, it was not anticipated that AI would evolve to generate content comparable to that produced by human creators. The law is therefore often described as having been designed solely to advance recognition and analysis technologies. Just five years later, though, AI had become capable of generating all kinds of content based on the data used to train it.

The rapid evolution of AI made it easy for anyone to output content that was identical to the data that the AI was trained on, as often as they wished. However, Japan has a law on its books that makes it legal to use other people's copyrighted creations to train AI without the copyright owners' permission. This has made Japan a "machine learning paradise" for training AI on copyrighted materials from around the world, without the need to gain permission.

Despite this, the Agency for Cultural Affairs, which has jurisdiction over the Copyright Act, has taken a passive stance with respect to revising Article 30-4 of the Act (as of September 2025). Apparently, this is because Japan's Copyright Act looks at the AI training stage and usage stage in very different ways. Specifically, even if an AI is trained on data (copyrighted works) without permission during the training stage, in accordance with Article 30-4 of the Copyright Act, the law is said to view copyright infringement as being avoidable by giving due consideration to copyrighted works during the usage stage (Fig. 2).

However, the Copyright Act currently has no clear provisions related to the usage stage. As a result, some users use AI to generate content that imitates copyrighted works, and opinion has ultimately split into two opposing camps: pro-AI and anti-AI.

Facing AI‑related copyright issues

Recently, a major provider released a generative AI feature that converts facial photographs into illustrations in the style of Studio Ghibli, which attracted significant public attention. This case highlighted the difficulty of placing responsibility for copyright compliance solely on end users.

Legal experts noted that most of the so‑called Ghibli‑style images merely resembled the studio’s artistic style and did not depict identifiable characters created by Studio Ghibli, and therefore did not constitute copyright infringement (as of September 2025). But there remains a non‑zero possibility that a generative AI system could output an image that closely resembles an existing character. If such an image is generated and used without the user recognizing the similarity, the user may be held liable for violating copyright law.

One might initially assume that the provider of the AI system should bear responsibility. However, under Japanese law, training AI models on copyrighted material is permitted. In addition, most providers have usage terms which state that they cannot be held responsible for the output of their AIs, and users agree to these terms before using the AIs. Consequently, legal responsibility ultimately falls on users. At present, it is said that there are no definitive measures that allow users to completely avoid copyright risks.

Given this situation, companies that use generative AI for creative work must assume responsibility, and individuals using such tools for personal purposes must likewise exercise caution.

Initiatives for addressing copyright issues

So far, we have focused on complex legal issues. What we aim to convey is that resolving copyright‑related problems arising from generative AI—which is rapidly evolving, learning from ever‑growing datasets, and already being deployed across diverse domains—is extremely challenging, and requires the understanding and cooperation of all relevant stakeholders. This point is explicitly stated in the government’s Intellectual Property Promotion Plan 2025, which notes that legal, technological, and contractual measures must complement one another, and that coordination between the public and private sectors is essential to ensuring the healthy use of AI.

So, what can be done at present to avoid copyright risks when using generative AI? Below, we provide a brief explanation using the example of preparing planning materials.

From a copyright standpoint, caution is required when materials containing AI-generated images, videos, or audio of people or characters are released outside the company or provided to clients in exchange for compensation. As a general rule, using such materials internally—for tasks such as drafting proposal structures or brainstorming ideas—does not pose copyright issues.

If materials released externally contain AI‑generated people or characters that are deemed identical or substantially similar to those in existing copyrighted works, the use may be treated as unauthorized exploitation of protected content, potentially leading to various liabilities. This is the same type of risk involved in generating Ghibli-style characters. Likewise, similar issues arise when generated audio resembles the voice of a famous character or a real celebrity.

It is unrealistic to expect those involved in preparing planning materials to be familiar with the vast number of characters that appear across comics, anime, and other content. For this reason, verifying the safety of AI‑generated outputs is extraordinarily difficult—so difficult that, in practical terms, it may be considered nearly impossible.

So how can copyright‑safe materials be created? One effective approach is to avoid using AI‑generated output as is, and instead have humans apply additional modifications to alter its impression before use. This approach is already employed by content production companies. It is important that humans perform the final adjustments rather than relying solely on generative AI.

Of course, responsibility ultimately lies with the person who produces the final deliverable. This point must always be kept in mind.

Toshiba's initiatives as a developer of generative AI

So far, we have explained key points to keep in mind when utilizing generative AI in various domains. From here, we will narrow our focus specifically to the field of speech synthesis. Toshiba is one of Japan’s longstanding commercial companies engaged in the development of speech‑related AI technologies. As described in Part 2, we have long provided the ToSpeak series—our speech synthesis software development kits (SDKs). Across all generations of our products, we have taken copyright‑related issues seriously. When procuring the audio sources required for development, we have consistently ensured that fair agreements^* are concluded.

^* Fair agreements: Agreements whose conditions are mutually acceptable to all parties, reached without unfair pressure or bias.

Specifically, such agreements must be clear in their terms, contain no clauses that unduly disadvantage any one party, and be established through a process in which sufficient information is disclosed and decisions are made freely.

For example, for over a decade, we have collaborated with customers to develop “sound‑alike speech AI^*,” which uses AI to reproduce the voices of highly prominent copyrighted characters. In these projects, before concluding agreements, we explain to performers (voice actors), their agencies, and other related parties how sound‑alike AIs are created and how they may be used in business contexts. We also discuss usage scope, compensation, and related terms with all parties, and reach a consensus before signing contracts.

^* Sound-alike speech AI: Speech AI capable of producing output that closely resembles the voice of actual people or characters.

From the very start, Toshiba has been mindful of the risks that could arise from training AI models on unauthorized audio sources. Because we have taken appropriate measures throughout product development, the copyright issues now surfacing in the industry should, in principle, not directly concern us. Regrettably, however, performers and related parties who have suffered harm from other AI systems may perceive our speech AI as no different.

That said, what's important in such circumstances is not confrontation but dialogue. In particular, it is essential to understand and empathize with those who have been affected, to propose concrete solutions, and to put those solutions into practice. As a company that has diligently concluded contracts and built up experience in developing speech AI, we will continue to take fair and responsible action with even greater commitment.

Latest initiatives to promote fair speech AI

We now turn to Toshiba’s latest initiatives addressing copyright‑related issues in the area of speech AI.

As a developer of speech AI, Toshiba Digital Solutions respects voice rights in the development and use of speech AI. To help foster an environment in which speech AI can be used with confidence, we are a member of the JAPAN AI voice Learning Data Approval Service Association (AILAS), which operates a certification scheme for this purpose.

This certification scheme is expected to be adopted by a growing number of speech‑AI providers, and it has also attracted attention from overseas developers. It is mentioned in the government’s Intellectual Property Promotion Plan 2025 as an example of private‑sector activity aligned with national initiatives.

Founded in 2024, AILAS operates under the mission of building and running a framework that prevents the cultural value of Japan’s uniquely developed content industry from being undermined by disorderly use or development of AI, and of enabling the coexistence and mutual prosperity of AI and creators.

AILAS's activities were decided on through extensive consultations with companies engaged in voice-related businesses, performer associations, content production companies and organizations, and speech-AI developers. As a result, its framework is structured to be broadly acceptable to a wide range of stakeholders.

AILAS’s primary activities are twofold:

Performer intent registration: Collecting and disclosing the intentions of individual performers and rights holders regarding the use of AI.
Business registration certification: Verifying whether a developer’s speech‑AI systems respect the intentions of performers and rights holders, and issuing and managing certification numbers and labels for compliant businesses and products.

Business registration certification enables providers to demonstrate, via certification numbers and labels, that their speech AI has been developed using audio sources duly authorized by the performers. This allows end users to identify certified products and services and to use them with confidence, without undue concern over copyright‑related risks.

Throughout its history of speech AI development, Toshiba has properly handled rights clearances for training datasets in coordination with rights holders. In the future, an increasing number of speech‑AI systems will offer audio quality and capabilities so advanced that they are virtually indistinguishable from real human voices. To provide customers, and the end users of their products and services, with greater assurance, Toshiba is currently pursuing certification for its speech‑AI technologies. This certification will help alleviate concerns and enable our technologies to be used with confidence.

We expect that as such mechanisms spread, issues arising from unauthorized generative AI will gradually be resolved.

Summary

In this article, we introduced the copyright-related problems emerging from the rapid evolution and widespread adoption of generative AI. We outlined how current Japanese law views these issues, highlighted key points for using generative AI safely, and explained the measures Toshiba is taking to address copyright risks in its ToSpeak speech AI.

Because this article is intended to provide an accessible overview for a broad audience, some terminology and definitions have been simplified.

As AI‑related regulations and legislation continue to be developed around the world, numerous rules are expected to emerge in parallel. As a company that deploys AI in its business, we will keep abreast of the latest regulatory developments and provide customers with products and services that are compliant and responsibly designed. We hope that this article has deepened your understanding of copyright issues in AI and has fostered interest in Toshiba’s speech‑AI technologies, which continue to address these challenges.

KURATA Yoshinori

Fellow, Managed Services Promotion Dept.,
Digital Engineering Center,
Toshiba Digital Solutions Corporation

Representative Director,
JAPAN AI voice Learning Data Approval Service Association (AILAS)

KURATA Yoshinori has been engaged in the development of speech‑dialogue AI for more than 20 years. He led the development of an anime‑character voice chat application that achieved over one million downloads and reached No. 1 in the Android app rankings. He received the Technology Award at the Digital Signage Awards 2016. Since joining Toshiba Digital Solutions Company, he has planned, developed, and commercialized Voice Track Maker, a speech synthesis tool for general customers.

The corporate names, organization names, job titles and other names and titles appearing in this article are those as of September 2025.
ToSpeak is a registered trademark or trademark of Toshiba Digital Solutions Corporation in Japan.
All other company names or product names mentioned in this article may be trademarks or registered trademarks of their respective companies.