Advanced Technologies for Digital Media Processing

Digital Media Supporting the Information Technology Society
MORI Kenichi

Recent Advances and New Applications of Computer Vision
Roberto Cipolla / Carlos Hernandez / George Vogiatzis / Bjorn Stenger

Trends in Advanced Digital Media Processing in Familiar Digital Devices
YAMAUCHI Yasunobu / DOI Miwako
The performance of digital media devices such as PCs, TVs, and mobile phones has rapidly progressed in recent years, and people now depend on these devices for convenience and comfort in life. However, the increasing volume of Internet data, which can now be easily acquired via a broadband network, also creates problems for users in terms of accessing the data that they actually need. Although people's ability to handle large volumes of media data can be augmented by means of digital devices, it is difficult for users to access most of the data they need due to the gap between the capabilities of human users and digital devices. Human-friendly digital media processing is required to compensate for this gap. Toshiba has created various advanced media processing methods and digital devices focusing on high-quality media representation that appeals to people's sensibilities, natural and intuitive human-machine interaction, and data filtering using relational information.

High-Dimensional Texture Technology for Photorealistic Computer Graphics
SEKINE Masahiro / MIHARA Isao / YAMAUCHI Yasunobu
If high-quality computer graphics (CG) offering photorealistic surface appearances can be created, they can be applied in a broad range of markets including not only motion pictures and video games but also industrial product design and e-commerce. Complex CG modeling and complex shading calculations have conventionally been necessary for rendering photorealistic CG.
Toshiba has developed a high-dimensional texture technology that can create CG with a photorealistic surface appearance by using images of the material captured under various conditions. In addition, the high-quality CG can be controlled interactively using a graphics processing unit (GPU).

Interactive Vector Rendering for 3D User Interfaces
KOKOJIMA Yoshiyuki / SUGITA Kaoru
Toshiba has developed a new method for graphics processing unit (GPU)-accelerated rendering of vector graphics such as Flash and TrueType characters embedded in a three-dimensional space. Our method requires no expensive preprocessing, allowing it to render dynamically deformable vector objects with high efficiency. We have implemented a prototype 3D electronic program guide (EPG) browser using this method. This browser provides an easy way for users to select their favorite TV content from among a large number of videos stored on hard disk.

High-Compression PDF Conversion Technology
DOBASHI Toshimasa / MIZUTANI Hiroyuki
In order to meet the growing demand for efficient document image compression, Toshiba Solutions Corporation has developed a high-compression PDF conversion technology suitable for color document images obtained by scanners, multifunctional peripherals (MFPs), and so on. This high-compression PDF conversion technology achieves smaller file sizes and better image quality than JPEG by separating character elements and non-character elements in the document and adopting the appropriate compression method for each element.

ToSpeak™ High-Quality Text-to-Speech System
KAGOSHIMA Takehiko
Toshiba has developed ToSpeak™, a new text-to-speech (TTS) system that synthesizes speech in a high-quality, natural manner. ToSpeak™ can generate synthesized speech having the individuality of an original speaker in terms of prosody and voice quality from any input text.
This TTS system features corpus-based approaches including (1) statistical training of prosody control rules, and (2) a plural unit selection and fusion method for the speech waveform generation module (synthesizer). In the prosody training, representative fundamental frequency vectors are extracted from the speech corpus so as to minimize errors of the resultant fundamental frequency contours. In the synthesizer, the proposed method achieves stable, humanlike speech quality. Our TTS systems are used in a variety of applications such as the speech interface of car navigation systems.

Face Recognition Technology for Identification of Walking Person
YAMAGUCHI Osamu / NISHIYAMA Masashi / KAWAHARA Tomokazu
Face recognition technology is widely utilized for various media processing purposes. Toshiba has continued to improve the performance of this technology in the field of security applications. An advantage of face recognition in the security field is its higher user-friendliness compared with other biometric techniques. In order to further enhance convenience, we have developed the SmartConcierge™ walkthrough type face recognition system that can identify a walking person. Moreover, we are also developing an advanced face recognition system for the simultaneous identification of multiple walking people.

Online Overlapping Handwriting Recognition: New Character Input Interface for Mobile Phones
TONOUCHI Yojiro / KAWAMURA Akinori
In conventional Japanese online handwriting recognition systems, it is common to employ a multi-box writing interface where the user writes a character in each box in succession. The handwriting in a box is recognized as a character after the stylus moves to the next box. However, the size of the individual boxes is small because of the limited area available for writing in small devices. It is uncomfortable for users to write small characters in small boxes, particularly when writing by finger.
Toshiba has developed a novel online overlapping handwriting recognition system for mobile devices such as cellular phones. It is suitable for small devices, because the user can input characters continuously without pauses in a single writing area. It also has two other features: (1) quick response from handwriting input to display of the recognition result, and (2) character input without the user having to watch their hands. In addition, it enables users not only to input characters but also to perform basic operations directly by inputting handwritten gestures. These features provide mobile users with a comfortable character input system.

Omnidirectional Acoustic Sense Technology for Voice Differentiation
SUZUKI Kaoru / KOGA Toshiyuki
Toshiba has developed a new omnidirectional acoustic sense technology to facilitate natural interactions between humans and robots. We used the Hough transform to detect straight lines in the frequency-phase difference space for the detection and localization of sound sources. An ApriAlpha™ robot equipped with this function could localize and recognize multiple speakers from arbitrary directions and reply to each speaker.

Home Security Robot Using Life Ontologies and Blog Interface
CHO Kenta / KAWAMURA Takahiro
Toshiba has been developing a home security robot using the ApriAlpha™ home robot to integrate legacy appliances in a home. This system provides a blog interface to receive users' requests remotely in natural-language sentences and show the status of appliances via a Web browser. The robot serves as an "intelligent glue" that connects and automates the legacy appliances, allowing the users to easily introduce an intelligent environment in their home. It uses ontologies about commodities in the home, locations where these are placed, and tasks the robot can achieve. By using these ontologies, the robot can select and combine appropriate actions to respond to a wide variety of user requests.
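As an illustration of the Hough-transform localization mentioned in the omnidirectional acoustic sense abstract above, the following is a minimal sketch and not the authors' implementation. It assumes a two-microphone, far-field model in which a source with inter-microphone delay tau produces phase differences lying on the line phase = 2*pi*f*tau in the frequency-phase difference space. All function names, the microphone spacing, the candidate grid, and the tolerance are hypothetical choices made for the example.

```python
import math

def wrap_phase(phi):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (phi + math.pi) % (2 * math.pi) - math.pi

def hough_votes(points, max_delay_s, n_candidates, tol):
    """Count votes for each candidate delay (the slope of a line).

    Every candidate line passes through the origin of the
    (frequency, phase difference) plane, so the delay tau is the only
    Hough parameter. A point votes for tau when its wrapped distance
    from the line 2*pi*f*tau is below tol radians.
    """
    step = 2 * max_delay_s / (n_candidates - 1)
    votes = []
    for i in range(n_candidates):
        tau = -max_delay_s + i * step
        count = sum(1 for f, phi in points
                    if abs(wrap_phase(phi - 2 * math.pi * f * tau)) < tol)
        votes.append((count, tau))
    return votes

def localize(points, n_sources, max_delay_s=6e-4, n_candidates=121, tol=0.2):
    """Estimate n_sources delays by voting, then removing explained points."""
    remaining = list(points)
    delays = []
    for _ in range(n_sources):
        _, tau = max(hough_votes(remaining, max_delay_s, n_candidates, tol),
                     key=lambda v: v[0])
        delays.append(tau)
        remaining = [(f, phi) for f, phi in remaining
                     if abs(wrap_phase(phi - 2 * math.pi * f * tau)) >= tol]
    return delays

def delay_to_angle(tau, mic_distance_m, speed_of_sound=343.0):
    """Convert a delay to a direction in degrees (far-field assumption)."""
    s = max(-1.0, min(1.0, tau * speed_of_sound / mic_distance_m))
    return math.degrees(math.asin(s))

# Synthetic data (assumption): two sources, each dominating alternate
# frequency bins between 100 Hz and 4050 Hz, with no noise.
true_delays = [2e-4, -3e-4]
points = []
for k in range(80):
    f = 100.0 + 50.0 * k
    tau = true_delays[k % 2]
    points.append((f, wrap_phase(2 * math.pi * f * tau)))

estimated = sorted(localize(points, n_sources=2))
for tau in estimated:
    print(f"delay {tau:+.2e} s, angle {delay_to_angle(tau, 0.2):+.1f} deg")
```

Removing the points explained by each detected line before voting again is one simple way to separate several speakers; picking multiple peaks in a single accumulator would be an alternative design.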
"SASATTO Search" Human Interface Technology for Information Retrieval
SUZUKI Masaru / ISHITANI Yasuto / SAKAMOTO Kei
To realize easy and accurate information retrieval, Toshiba has developed a pen/mouse-based human interface called "SASATTO Search" for chaining searches in a Web search system. If a user selects a keyword from a document that he/she is reading, documents related to the keyword can be obtained simply by selecting the desired search method from a displayed context menu. It is easy to accomplish such a search because the meaning of the keyword is determined by semantic pattern analysis and the menu contains search methods suitable for the meaning. In an experiment involving 15 users, it was confirmed that the proposed interface makes information retrieval easier and more accurate than the conventional method.

HOTWORDLINK™ for Topical Word Extraction and Related Information Retrieval
OKAMOTO Masayuki / FUJINO Go / NEGISHI Shinichi
Technologies for extracting and visualizing topical news are a current trend in Web services. Toshiba has developed HOTWORDLINK™, a topic-extraction function for audiovisual-specialized notebook PCs. HOTWORDLINK™ visualizes topical news items and their trends and enables the easy retrieval of related Web pages with one click. The features of HOTWORDLINK™ include topic extraction with two-level clustering and statistical techniques, the classification of each topic into positive or negative, and person-name extraction. The results of experiments showed that the extracted topics and trend graphs contributed to the subjects' understanding.

Commutents™ Communication Support System via Blog and Video Contents
TSUTSUI Hideki / YAMASAKI Tomohiro / URATA Koji
Toshiba has developed Commutents™, a communication support system enabling exchanges of comments about video contents. In this system, comments about scenes on DVDs are shared via a blog system.
This makes it possible for users to share only comments without sharing the contents themselves, by identifying DVDs that the users individually own. This system has two display modes: a video synchronous display mode with gathered comments, allowing users to find blog articles that they are interested in; and a blog display mode, which is suitable for the reading of articles. By cooperatively using these two display modes, users are appropriately led to blog articles. The effectiveness of this system was confirmed by an evaluation test.

Digital Media Processing Technologies for Healthcare Solutions
OSADA Masakazu
Information technology (IT) has gained a solid position in healthcare institutions in Japan as a means of reducing costs and increasing efficiency while maintaining the quality of healthcare services. Toshiba Medical Systems Corporation has developed several innovative healthcare IT solutions such as the Rapideye™ picture archiving and communication system (PACS), the Rapideye™ hyperlink reporting system, and a teleradiology solution by applying the digital media processing technologies of the Toshiba Group.