(Part 1) Sound technology that creates new ways of hearing

The use of sound as a means of communication has advanced greatly in society and in our daily lives. Thanks to these advances, we enjoy greater freedom and convenience in handling information. However, the audio information that surrounds us also causes problems, such as sounds interfering with each other and listener fatigue from listening for long periods. Going forward, how sounds should be heard will become an even more important consideration in making sound more useful to people. Toshiba has been developing technologies for its sound solution, Soundimension Sound with Controlled Distribution and Stereophony. Our concept is to provide sound to people in a desirable way. We have commercially released two Soundimension products: Soundimension for the Sound Images with controlled Stereophony and Soundimension for the Sound Field Control.* In this three-part series, we present our concept, the expected applications, and the core technologies of our solution: sound images with controlled stereophony and sound field control for region separation.
In part one, we introduce the new ways of hearing sound that we anticipate and give an overview of the sound technologies that will make them possible.

* This includes products scheduled for release in the fiscal year 2023.

What does designing the ways of hearing sound mean?

Using sound to convey information has become an everyday part of society and people's lives. For example, equipment such as train station ticket vending machines, ATMs, and self-service cash registers provides audio guidance, and announcements can be heard in all kinds of places, such as buildings, vehicles, and trains.

With visual information, you do not know whether any information is available until you look at its source. Sound, on the other hand, draws attention to the information by itself. It also leaves people's hands and eyes free, so they can receive information freely and conveniently.

However, as the use of sound becomes more and more common, the audio information that surrounds us is causing new problems.

These include information overload in the sound environment and sound being perceived as noise. When multiple announcements or messages are heard at the same time, it is not easy to tell which sound comes from which source, and it becomes hard to understand the content and meaning of each one. Sound information also contains both necessary and unnecessary components, and the importance of the information varies depending on the listener. As a result, it is becoming difficult to distinguish between the sounds one hears and pick out those one needs.

Another issue is the loss of privacy that comes with audible information: other people may hear information that users do not want them to hear.

Furthermore, there are problems with the audio information provided in online meetings and seminars, which have rapidly risen to prominence over the past few years. Online meetings strip away some of the information conveyed through face-to-face interaction, so participants must concentrate on listening to and understanding what is being said. They are often exposed to sound from speakers or headphones for long periods of time. Listening for long periods to sound that differs from normal listening conditions can cause feelings of strangeness and fatigue, and this listening fatigue from prolonged online communication is an important issue in exchanging information through sound.

In other words, if the use of sound information is to make further progress, we must consider how the sounds surrounding us should be heard so that people can make the best use of sound as an information source.

At Toshiba, we believe that addressing these issues requires sound that is easier to understand, easier to recognize, and more pleasant, and that can be used without worrying about bothering others.

This is why we have been focusing on how sound is heard -- that is, how it is presented.

In presenting sound to people, we have focused on adding the element of "whereabouts" to the sound itself. We then set the goal of providing sound to people in a way that is better to hear, using a simple output environment.

The "whereabouts" of sound refers to the sense of where the sound appears to be, where it is coming from and where it is not located. Reproducing this feeling of sound directionality and location has the potential to make the sound more natural and easier to understand.

We realize this "whereabouts" using two technologies: sound images with controlled stereophony and sound field control for region separation. These technologies are used in our solution, Soundimension Sound with Controlled Distribution and Stereophony.

Various stereophonic technologies

Before explaining Toshiba's sound images with controlled stereophony technology, let's take a quick look at stereophonic technology in general.

Stereophonic technology provides a three-dimensional audio effect that lets listeners perceive sound with breadth and depth. It encompasses a variety of technologies.

Figure 1 shows typically used approaches for producing stereophonic sound (scene-based, channel-based, and object-based).

Scene-based approach
This approach records all the sounds around an actual location and then reproduces them accurately. Omnidirectional microphones are needed to collect the data, and the approach is mainly used to precisely reproduce the sounds heard in actual environments.
Channel-based approach
In this approach, the number and arrangement of the reproducing speakers are fixed. In many cases the speakers are placed so that they surround the listener. The audio output from each speaker is then calculated and designed to reproduce the sound content. This approach is often used to create a strong sense of presence, so it suits facilities such as movie theaters and stage theaters.
Object-based approach
Audio source locations are assigned spatial coordinates, and the sound that reaches the listener's ears from a given coordinate position is calculated. The reproduced sounds therefore give a sense of spatial location. This makes it possible not only to reproduce the sounds of actual environments with a sense of presence, but also to give each sound a desired direction regardless of the actual position of its source.

Each of these approaches is suited to different usage situations depending on facility limitations and the types of desirable sound effects to be achieved.
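As a rough illustration of the object-based idea described above, the following Python sketch assigns a coordinate to a sound source and calculates what reaches each ear. It is a deliberately simplified free-field model (straight-line delay and 1/r attenuation only) and is not taken from this article; the ear positions, the ear spacing of 0.18 m, and the function names are assumptions chosen for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature

# Hypothetical listener layout: ears 0.18 m apart on the x axis, facing +y.
LEFT_EAR = (-0.09, 0.0, 0.0)
RIGHT_EAR = (0.09, 0.0, 0.0)


def ear_arrival(source, ear):
    """Return (delay in seconds, 1/r gain) for a point source heard at one ear."""
    distance = math.dist(source, ear)
    return distance / SPEED_OF_SOUND, 1.0 / distance


def interaural_cues(source):
    """Return (time difference left minus right, level ratio left over right)."""
    delay_left, gain_left = ear_arrival(source, LEFT_EAR)
    delay_right, gain_right = ear_arrival(source, RIGHT_EAR)
    return delay_left - delay_right, gain_left / gain_right


# A virtual source placed to the front right of the listener:
# the sound arrives later and weaker at the left ear.
itd, level_ratio = interaural_cues((1.0, 1.0, 0.0))
```

Moving the coordinate passed to `interaural_cues` changes the cues, which is the sense in which an object-based renderer can give a sound a chosen direction independent of where it was recorded. A real renderer would account for the head and outer ear as well, not just path length.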

Core technology 1: The features of sound images with controlled stereophony

As explained above, Toshiba aims to provide sound to people in a desirable way by adding the element of "whereabouts" to sound. When creating directional sound, our goal is to design sound that is desirable for people, not to accurately reproduce the actual sounds of an environment. That is why we chose the object-based approach.

To make it clear that our approach consists of designing virtual sound sources within a space, we call our technology sound images with controlled stereophony rather than simply stereophony. We named the solution based on this technology Soundimension for the creation of sound images with controlled stereophony.

A good deal of research has been done on auditory localization with the object-based approach. In this field, the head-related transfer function (HRTF) is commonly used to produce sounds with directional information.
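To show in general terms how an HRTF is applied, here is a minimal Python sketch of binaural rendering: a mono signal is convolved with a pair of head-related impulse responses (the time-domain form of the HRTF), one per ear. This is the textbook procedure, not a description of Toshiba's implementation. The impulse responses `HRIR_LEFT` and `HRIR_RIGHT` are toy placeholders invented for the example, not measured data.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono signal with the left-ear and right-ear impulse responses."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy placeholder responses (assumption, not measured): the sound reaches the
# right ear first and louder, as it would for a source on the listener's right.
HRIR_LEFT = [0.0, 0.0, 0.5, 0.2]
HRIR_RIGHT = [1.0, 0.3, 0.0, 0.0]

left, right = render_binaural([1.0, 0.0, -1.0], HRIR_LEFT, HRIR_RIGHT)
```

In practice the impulse responses come from measurements or models for each source direction, and the convolution is usually carried out with an FFT for efficiency.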

Toshiba's technology also uses HRTFs; its distinguishing feature is that it provides a relatively stable and effective auditory localization experience without personal headphones or earphones.

With conventional methods of creating sound directionality, when listeners without earphones or headphones change their position slightly, the perceived sound direction changes significantly because of crosstalk, in which sound intended for one ear also reaches the other. Our solution, on the other hand, takes complex sound pressure ratios into consideration, so the perceived direction remains unaffected even if listeners who are not wearing earphones or headphones change their position slightly.

We will describe this technology in detail in the second article of this series.

Various sound field control technologies

Now let's take a look at sound field control technologies, which include the other Soundimension core technology, sound field control for region separation. Figure 2 shows some examples of how sound field control can be realized.

In general, sound field control technology creates a spatial distribution of sound pressure for the sounds output from reproduction equipment like speakers. In other words, this technology produces places where sound is clearly audible and places where it isn't.

This is typically achieved by using directional speakers. For example, an ultrasonic speaker array can produce a sound distribution in which the sound is clearly audible along a specific direction but cannot be heard elsewhere.

A spatial sound pressure distribution can also be produced by using an array of speakers. The basic approach is to control the phase of the sound emitted from each speaker and superimpose the sounds to create the desired sound pressure distribution. This usually requires a large number of speakers.
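The phase-control idea for a speaker array can be sketched with classic delay-and-sum steering. The Python example below is a generic far-field model of a uniform line array, not a method described in this article; the array size (8 speakers, 5 cm spacing), the 2 kHz frequency, and the function names are assumptions made for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air


def wavenumber(frequency):
    return 2 * math.pi * frequency / SPEED_OF_SOUND


def steering_phases(num_speakers, spacing, frequency, steer_deg):
    """Phase (rad) applied to each speaker so emissions add in phase toward steer_deg.

    Angles are measured from the direction perpendicular to the array.
    """
    k = wavenumber(frequency)
    shift = k * spacing * math.sin(math.radians(steer_deg))
    return [-n * shift for n in range(num_speakers)]


def far_field_level(phases, spacing, frequency, look_deg):
    """Normalized far-field magnitude (0 to 1) of the array in direction look_deg."""
    k = wavenumber(frequency)
    shift = k * spacing * math.sin(math.radians(look_deg))
    total = sum(cmath.exp(1j * (p + n * shift)) for n, p in enumerate(phases))
    return abs(total) / len(phases)


# Hypothetical array: 8 speakers, 5 cm apart, 2 kHz tone steered to +20 degrees.
phases = steering_phases(8, 0.05, 2000.0, 20.0)
on_target = far_field_level(phases, 0.05, 2000.0, 20.0)
off_target = far_field_level(phases, 0.05, 2000.0, -30.0)
```

With these values the level in the steered direction is at its maximum while the level at -30 degrees is much lower. Sharpening that contrast generally means adding speakers, which is consistent with the remark above that many speakers are usually needed.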

Core technology 2: The features of sound field control for region separation

We are developing our sound field control technologies with the aim of delivering sound to the people who need it. We have two design concepts. One is to use a simple system so that users can enjoy the benefits of sound field control with minimal burden. The other is to change the sound pressure distribution at selected locations within a space without changing the hardware configuration.

That is why we are designing technologies that use just two or three commercially available speakers to produce the benefits of sound field control in desired areas within a space. Because these technologies make it possible to design sound pressure distributions within selected spatial areas, we call them sound field control for region separation.

With our technologies, three speakers can create a special sound pressure distribution. The controlled areas can range in size from several dozen centimeters to roughly one meter. A solution based on these technologies is commercially available as our Soundimension Sound Field Control.
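To give a feel for how a very small number of speakers can shape a sound pressure distribution, the Python sketch below drives two speakers so that their sounds cancel at one chosen point while remaining audible at another. It is a single-frequency, free-field toy model and is not Toshiba's algorithm, which part three covers; the speaker and listener positions, the 500 Hz frequency, and all names are assumptions for the example.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air


def transfer(speaker, point, frequency):
    """Free-field point-source transfer function exp(-jkr)/r from speaker to point."""
    r = math.dist(speaker, point)
    k = 2 * math.pi * frequency / SPEED_OF_SOUND
    return cmath.exp(-1j * k * r) / r


def pressure(weights, speakers, point, frequency):
    """Complex sound pressure at a point as the weighted sum of all speakers."""
    return sum(w * transfer(s, point, frequency) for w, s in zip(weights, speakers))


def null_weights(speakers, quiet_point, frequency):
    """Complex drive weights for two speakers that cancel each other at quiet_point."""
    g1 = transfer(speakers[0], quiet_point, frequency)
    g2 = transfer(speakers[1], quiet_point, frequency)
    return [1.0 + 0j, -g1 / g2]


# Hypothetical layout in meters: two speakers 0.4 m apart, two positions 0.6 m apart.
SPEAKERS = [(-0.2, 0.0), (0.2, 0.0)]
LISTEN_POINT = (-0.3, 1.0)
QUIET_POINT = (0.3, 1.0)
FREQUENCY = 500.0  # Hz

weights = null_weights(SPEAKERS, QUIET_POINT, FREQUENCY)
quiet_level = abs(pressure(weights, SPEAKERS, QUIET_POINT, FREQUENCY))
listen_level = abs(pressure(weights, SPEAKERS, LISTEN_POINT, FREQUENCY))
```

Here `quiet_level` is essentially zero while `listen_level` stays well above it. Changing `QUIET_POINT` moves the quiet spot using the same two speakers, which loosely mirrors the goal of changing the distribution without changing the hardware. Controlling a whole region over a range of frequencies in a real room is a considerably harder problem than this single-point case.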

We will explain this technology in detail in the third article of this series.

What our new sound technologies make possible

Applying these technologies, sound images with controlled stereophony and sound field control for region separation, makes listener-friendly sound possible. For example, listeners would be able to hear clearer and more natural-feeling sound. They would also be able to receive personalized audio information without worrying about bothering others or sacrificing their own privacy.

Figure 3 shows possible applications of these new ways of hearing sound. They could be used in online meetings and seminars, where users need to listen for long periods of time, to reduce listener fatigue. They could also be applied to various forms of mobility, such as automobiles treated as moving spaces, to provide each occupant with an individually optimized listening environment. In other situations, such as homes and offices, they could be used to deliver sound messages in response to the user's actions.

In this first part of the series, we have explained the technological basis of our Soundimension Sound with Controlled Distribution and Stereophony and its potential applications. In parts two and three, we will explain each core technology in turn: sound images with controlled stereophony and sound field control for region separation. These articles will also present potential applications and our vision for the future, built on the strengths of Toshiba's technologies.

Up next: (Part 2) Technology of sound images with controlled stereophony

Yuki Yamada

Expert
New Business Development and Marketing Dept.
ICT Solutions Div.
Toshiba Digital Solutions Corporation

After joining Toshiba, Yuki Yamada was involved in the research and development of new semiconductor devices. In 2015, she began working on the development of data utilization technologies, and then became involved in new business and product development. She is now helping launch a new business that leverages sound technologies.

The corporate names, organization names, job titles and other names and titles appearing in this article are those as of October 2022.
Soundimension is a registered trademark of Toshiba Digital Solutions Corporation in Japan.
Soundimension for the Sound Images with controlled Stereophony and Soundimension for the Sound Field Control are not currently available for purchase outside Japan.