Toshiba Digital Solutions Releases the RECAIUS Voice Trigger Speech Recognition Middleware
- Use of Specific Keywords Supports Swift AI Speech Recognition by Various Embedded Devices -
January 30, 2018
Toshiba Digital Solutions Corporation
KAWASAKI, Japan - Toshiba Digital Solutions Corporation (TDSL) will release the RECAIUS voice trigger speech recognition middleware for embedded devices as part of Toshiba’s RECAIUS™ communication AI service.
The RECAIUS voice trigger speech recognition middleware is a product for embedded devices that detects specific trigger words*1 in Japanese speech, as in “Hey, RECAIUS, increase the volume,” so that a device can respond by launching a voice assistant*2 or accepting a command. It needs little memory and computation, which allows quick response and continuous standby. Trigger words can be chosen freely. The middleware can be used in a variety of fields and applications, including smart speakers, robots, home appliances and in-car devices.
Key features of the RECAIUS voice trigger speech recognition middleware
1. Small memory footprint and low computational load enable quick response
The middleware requires only about 300KB of memory*3, so it can be embedded in devices with limited memory capacity. Because the computational load is also small, the device can respond quickly to the trigger words.
2. Always listening for ease of use
The device listens continuously for the trigger words, which eliminates the need for a push-button talk switch*4. Users can speak normally near the device without having to avoid words other than the intended commands.
3. Trigger words can be set up freely
Users can set up their own trigger words, so the middleware can be used with a wide range of devices.*5
4. Supported language: Japanese (English, Chinese and other languages are scheduled to be added sequentially from FY2018)
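As a rough illustration of the trigger-then-command behavior described in the features above, the following Python sketch checks whether an utterance begins with a registered trigger phrase and, if so, returns the remainder as the command. It is not the RECAIUS middleware or its API, which this release does not describe: it works on already-transcribed text rather than audio, and the function names (`normalize`, `is_recommended_trigger`, `split_trigger`) and the two-word minimum are assumptions made for illustration only.

```python
from typing import Optional, Sequence, Tuple


def normalize(text: str) -> str:
    """Lowercase the text and replace punctuation with spaces."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return " ".join(kept.split())


def is_recommended_trigger(trigger: str, min_words: int = 2) -> bool:
    """Hypothetical check: prefer multi-word triggers to reduce unwanted activation.

    The two-word minimum is an assumed value, not a figure from the release.
    """
    return len(normalize(trigger).split()) >= min_words


def split_trigger(utterance: str, triggers: Sequence[str]) -> Optional[Tuple[str, str]]:
    """Return (trigger, command) if the utterance starts with a trigger, else None."""
    words = normalize(utterance).split()
    for trigger in triggers:
        trigger_words = normalize(trigger).split()
        if trigger_words and words[: len(trigger_words)] == trigger_words:
            command = " ".join(words[len(trigger_words):])
            return " ".join(trigger_words), command
    return None


if __name__ == "__main__":
    triggers = ["Hey RECAIUS"]
    print(split_trigger("Hey, RECAIUS, increase the volume", triggers))
    print(split_trigger("Please increase the volume", triggers))
```

In this sketch, speech that does not start with a trigger phrase is simply ignored, which mirrors the always-listening behavior in which ordinary conversation does not activate the device.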
In addition to the voice trigger, which responds to key words, Toshiba’s communication AI, RECAIUS™, offers other middleware for embedded devices. These include the ToSpeak™ text-to-speech middleware for high-quality speech synthesis and speech recognition middleware capable of handling long sentences. Advanced speech recognition, synthesis, translation, voice interaction, intent understanding and image recognition functions are provided through cloud-based services (web API, etc.). Hybrid packaged services are also available, in which a terminal carries out quick speech and image processing locally while the cloud handles advanced processing.
RECAIUS is focusing its efforts on three areas. The first is field operations support, such as automating situational awareness and work reports. The second is call and contact center support, which provides operators with real-time assistance and reduces recording tasks. The third is products embedded in IoT devices, such as the voice trigger announced here. Through these efforts, TDSL will go beyond simple operational support to support the automation of operations. TDSL will also promote the development and provision of new services by embedding RECAIUS in devices and linking them seamlessly with the cloud-based AI platform.
Notes:
- *1 Trigger words are the key words used to wake up a voice assistant or give a voice command.
- *2 Voice assistants help users by responding to voice queries and commands made by users, whether by answering the question using a synthesized voice or performing a task.
- *3 The required memory size depends on such matters as the number of trigger words.
- *4 A talk switch is a button or other physical switch that the user operates in order to speak to a voice assistant and trigger an operation or movement.
- *5 To avoid unwanted operation, the use of a short trigger word is not recommended. It is better to use a short combination of several words.
- * RECAIUS™ and ToSpeak™ are registered trademarks or trademarks of Toshiba Digital Solutions Corporation in Japan and other countries.
- * Other company names and product names in this document may be trademarks or registered trademarks of the respective companies.