Corporate Research & Development Center

Toshiba’s Auto-Subtitling System for Online Classes is a Win-Win for Educators and Students
-In our COVID-19 world, high-accuracy subtitling helps educators deliver material and students understand it-

June 10, 2020
Toshiba Corporation

TOKYO—In the new normal of the COVID-19 pandemic, schools and universities have to come up with novel and effective ways to reach and teach their students. Many have turned to distance learning and online classes, including almost half of Japan’s universities.

While online teaching works, it has drawbacks. Educators have to explain complex subjects over networks that may have patchy video or sound quality, and students find it difficult to review lessons and lectures efficiently, especially to pinpoint exactly where the teacher said something. With ToScLive™, Japan’s Toshiba Corporation (Tokyo: 6502) has developed a powerful tool that benefits both teachers and students.

ToScLive™ is an automatic subtitling system for online classes that delivers a real-time record of what the teacher is saying. It is self-contained and operates independently, and can be used with any of today’s popular video- and voice-streaming conferencing software.

For educators who have had to bring courses and classes online quickly, without the luxury of long-term planning or preparation, ToScLive™ is a real boon. Many courses use specialist vocabulary or technical terms that an automated speech recognition system might not handle well on the fly. ToScLive™ can automatically scan the text of lecture materials in advance, extract and learn the terms, and transcribe them accurately during class.
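As a rough illustration of scanning lecture materials for specialist terms, the sketch below collects words that recur in the material but are missing from a general vocabulary. It is a simplified stand-in, not Toshiba's method: the function name `extract_terms`, the `base_vocabulary` set, and the `min_count` threshold are all assumptions made for this example.

```python
import re
from collections import Counter


def extract_terms(lecture_text, base_vocabulary, min_count=2):
    """Return candidate specialist terms found in lecture_text.

    Hypothetical helper: a word is a candidate if it is absent from
    base_vocabulary and appears at least min_count times.
    """
    # Lowercase the text and keep alphabetic words of two or more characters.
    words = re.findall(r"[a-z][a-z\-]+", lecture_text.lower())
    counts = Counter(w for w in words if w not in base_vocabulary)
    return sorted(w for w, c in counts.items() if c >= min_count)


slides = (
    "The eigenvalue of a matrix. Each eigenvalue has an eigenvector. "
    "The eigenvector is scaled."
)
common_words = {"the", "of", "a", "matrix", "each", "has", "an", "is"}
terms = extract_terms(slides, common_words)
```

In this sketch the extracted terms would then be handed to the recognizer as extra vocabulary before the class starts; how ToScLive™ actually learns them is not described in the source.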

Teachers are supported in other ways too. High-accuracy speech recognition needs good sound to work with, and usually has to be set up by someone who understands sound balance and microphone positioning. ToScLive™ makes this easy with a guide function that lets the teacher confirm the microphone setting simply by speaking a few sentences before the class. It measures the microphone input volume and the ambient noise to ensure good sound and accurate speech recognition.
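A pre-class microphone check of this kind can be pictured as comparing the level of a short speech sample with the level of background noise. The sketch below is only an illustration under assumptions: the function names and the thresholds `min_level_db` and `min_snr_db` are invented for this example and are not taken from ToScLive™.

```python
import math


def rms_dbfs(samples):
    """RMS level of samples in the range [-1.0, 1.0], in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Floor the value so that pure silence does not produce log10(0).
    return 20 * math.log10(max(rms, 1e-10))


def check_mic(speech, ambient, min_level_db=-30.0, min_snr_db=15.0):
    """Classify a mic setup from a speech sample and an ambient-noise sample.

    The thresholds are illustrative assumptions, not published values.
    """
    level = rms_dbfs(speech)
    snr = level - rms_dbfs(ambient)  # signal-to-noise ratio in dB
    if level < min_level_db:
        return "too quiet"
    if snr < min_snr_db:
        return "too noisy"
    return "ok"


result = check_mic([0.5, -0.5] * 100, [0.01, -0.01] * 100)
```

A real guide function would capture both samples from the audio device; here they are passed in as plain lists so the logic can be followed in isolation.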

For students, ToScLive™ improves the class experience. During the class, they get streaming subtitles of real-time voice-to-text that reinforce what the teacher is saying, and if they miss something they can refer back to that section during or after class. They also get a copy of the transcript once the class is over, delivered via a web browser to their PC, tablet or smartphone. Students who want to review all or part of the class in an archived video can also use this record as a guide to fast-forwarding or rewinding to a particular section.

Figure 1: Outline of ToScLive™

Figure 2: ToScLive™ demonstration image

For both educators and students, ToScLive™ is a high-precision subtitling system that can be used by non-experts, and that helps maintain and improve the quality of online classes.

Toshiba has a long history of R&D in media intelligence, and the speech recognition AI for conferences and lectures that the company unveiled in March 2019 provides the template for ToScLive™. Without any prior familiarization with the speaker’s voice or the material covered, ToScLive™ now achieves a voice recognition rate of 85%, a level that allows people to fully understand the content. The software can also recognize unnecessary fillers, such as “uh” and “umm,” and expressions of agreement that add nothing to the content. Rather than deleting them, it shows them in a muted color in the subtitles, which improves readability. Keeping them visible assists people who lip-read during the class and who might otherwise be confused by apparent speech with no subtitles.
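The filler handling described above amounts to tagging words and leaving the display decision to the subtitle renderer. The sketch below uses a fixed word list purely for illustration; the `FILLERS` set and both function names are assumptions, and the actual system presumably relies on its speech recognition AI instead of a lookup table. Parentheses stand in for the muted color.

```python
# Hypothetical filler list for this example only.
FILLERS = {"uh", "um", "umm", "er", "hmm"}


def tag_fillers(words):
    """Pair each word with a flag that says whether it looks like a filler."""
    return [(w, w.lower().strip(".,!?") in FILLERS) for w in words]


def render(words):
    """Render a subtitle line, keeping fillers but marking them as muted."""
    return " ".join(
        f"({w})" if is_filler else w for w, is_filler in tag_fillers(words)
    )


line = render("Uh, the umm derivative is zero".split())
```

Because fillers are kept in the output, every stretch of visible speech still has matching subtitle text.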

This June, Toshiba plans to start trials to verify the effectiveness of ToScLive™ at two of Japan’s leading private universities, Keio and Hosei. Feedback from teachers and students will support Toshiba in improving the functionality and quality of ToScLive™, and help to secure its faster introduction into the education sector.