Toshiba AI Technology Catalog

  • Media recognition

Large vocabulary speech recognition

Recognizes spoken language with high accuracy and detects words that are not needed to understand the meaning (e.g., fillers and hesitations).

  • Uses neural networks to model fillers and hesitations within the same framework as short speech units equivalent to one syllable.
  • Detected fillers and hesitations can be displayed or removed, depending on the application.
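
As a rough illustration of displaying or removing detected fillers and hesitations, the sketch below filters a recognized token sequence. It is not Toshiba's implementation: the `Token` type, the `kind` labels, and the `render` function are hypothetical names invented for this example, and it assumes the recognizer has already labeled each token.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    kind: str  # hypothetical labels: "word", "filler", or "hesitation"


def render(tokens, show_disfluencies):
    """Build a transcript, either marking or dropping fillers and hesitations."""
    parts = []
    for token in tokens:
        if token.kind == "word":
            parts.append(token.text)
        elif show_disfluencies:
            # Bracket disfluencies so a reader can tell them from content words.
            parts.append(f"[{token.text}]")
    return " ".join(parts)


# Made-up recognizer output, for illustration only.
tokens = [
    Token("um", "filler"),
    Token("the", "word"),
    Token("meeting", "word"),
    Token("st-", "hesitation"),
    Token("starts", "word"),
    Token("now", "word"),
]

clean = render(tokens, show_disfluencies=False)
verbatim = render(tokens, show_disfluencies=True)
print(clean)
print(verbatim)
```

A minutes-taking application might use the cleaned form, while a captioning application that aims to reproduce speech faithfully might keep the marked form.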


  • Supports increased understanding of online lectures and presentations.
  • Ensures information accessibility for people with hearing impairments.
  • Supports the creation of meeting minutes.

Benchmarks, strengths, and track record

  • Achieves a speech recognition rate of 85%, which is considered sufficient to understand spoken content.


Please include the title “Toshiba AI Technology Catalog: Large vocabulary speech recognition” or the URL in the inquiry text.
Please note that because this technology is currently the subject of R&D activities, immediate responses to inquiries may not be possible.