Toshiba AI Technology Catalog

  • Media recognition
  • Media data analysis
  • Speech dialogue

Speech recognition technology for domain-specific terms

We improve speech recognition performance in environments with many domain-specific terms and proper nouns, without requiring additional training.


  • We achieve highly accurate speech recognition using end-to-end speech recognition technology based on deep learning.
  • By simply preparing a user dictionary containing the spelling and pronunciation of technical terms and proper nouns, our system can handle domain-specific vocabulary without time-consuming additional training.
  • We improve speech recognition accuracy by detecting keywords such as technical terms and proper nouns, and applying post-processing correction using large language models (LLM).

Applications



  • Speech Dialogue System
  • Speech-to-Text Captioning System
  • Broadcasting System

Benchmarks, strengths, and track record



  • We evaluated our system using speech data from eight internal seminars containing many technical terms. Compared to conventional methods, our approach maintained the character error rate while improving the F-score for user-defined terms by an average of 3.05 points, with a maximum gain of 8.08 points.

Inquiries



Inquiries to Toshiba Corporate Laboratory (Komukai region)

Please include the title “Toshiba AI Technology Catalog: Speech recognition technology for domain-specific terms” or the URL in the inquiry text.
Please note that because this technology is currently the subject of R&D activities, immediate responses to inquiries may not be possible.