Is there any real physical sound experience that can be exactly replicated through a stereo system? Probably not. Why? Because the sound engineer who’s making the recording is limited to coding a complex three-dimensional sound field using only two channels, which are then played back from two distinct locations in a probably less than perfect listening room.
There is, however, one listening situation that closely resembles stereo listening. Imagine yourself at a concert hall, seated inside a private loge, or theatre box. Three walls surround you, and the front of the room is open to the concert hall. As you listen to the performance, you get the same kind of acoustic experience as when you listen to a pair of speakers in your own living room. The sound stage is confined to the size of the “opening” in the front wall, or the distance between the speakers. And both direct sound and reverberation tend to come from the front.
The point is that stereo is perhaps better at replicating the experience of “listening in” on another world than at giving you the feeling that you’ve actually been transported there.
Behind the Sound Stage
One key to the success of the stereophonic system is that it’s possible to achieve a sound stage of fantastic precision while using only two speakers.
The basic method behind the placement of sounds in the sound stage involves the use of phantom sources. Sounds can be made to seemingly appear between the two speakers, floating in the air where there is, in fact, no speaker at all. This is accomplished by varying the amplitude and time delay of the signals to the left and right speakers.
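To make the amplitude part of this concrete, here is a minimal sketch of one common way to split a signal between two speakers: the constant-power (sine/cosine) pan law. The function name `constant_power_pan` and the position scale from -1.0 to +1.0 are assumptions made for illustration; real mixing consoles and software may use other pan laws, and time-delay panning is not covered here.

```python
import math


def constant_power_pan(position):
    """Return (left_gain, right_gain) for a pan position.

    position is a hypothetical scale: -1.0 = fully left,
    0.0 = center (phantom image), +1.0 = fully right.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1.0 and 1.0")
    # Map the position onto a quarter circle, 0 .. pi/2.
    angle = (position + 1.0) * math.pi / 4.0
    # cos/sin keep left**2 + right**2 equal to 1, so the total
    # radiated power stays the same wherever the image is placed.
    return math.cos(angle), math.sin(angle)


if __name__ == "__main__":
    for pos in (-1.0, -0.5, 0.0, 0.5, 1.0):
        left, right = constant_power_pan(pos)
        print(f"position {pos:+.1f}: left {left:.3f}, right {right:.3f}")
```

At the center position both speakers receive the same gain, which is the situation that produces a phantom center image.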
How does a phantom center image sound compared with an actual center speaker? The two can be compared directly for any stereo recording. Signal processing is used to separate the center-panned material from the recording, so that the center signal can be sent either to a physical center speaker or to the left and right speakers as a phantom center. Do this, and the special ‘sound’ of a phantom image becomes obvious. One of the first things you will notice is that with a phantom center, the sound appears to come from a point closer to you than the line between the speakers. You can also hear the coloration, and you can hear the comb filtering change as you move your head in and out of the sweet spot.
There are a few problems associated with phantom sources, however:
1. The “Sweet Spot”
The sweet spot is the listening position between the speakers where the sound is best. For the sake of simplicity, let’s assume you are listening to an instrument that is played back equally loud, and with the same time delay, in both speakers. Let’s also assume that you’re sitting in the “sweet spot,” at equal distance from both speakers. You will then experience a phantom image between the speakers and perceive the sound as coming from straight ahead.
Phantom sources don’t work as well outside the sweet spot. If you move slightly closer to one speaker, the phantom source will seem to move toward it. If you move closer still, all the sound will appear to come from the closer speaker. This is due to the Precedence Effect: human beings localize a sound in the direction from which it first arrives, even if a later echo arrives from another direction. We are lucky that our hearing system is this smart, but the Precedence Effect is apparently the enemy of the stereo system, and one of the reasons behind the “sweet spot.”
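A rough feel for how quickly the arrival times drift apart can be had from simple geometry. The sketch below assumes a hypothetical setup, with speakers 2 m apart and the listener 2 m back from the line between them, and a speed of sound of 343 m/s. The function name and all default values are illustrative assumptions, not measurements of any real room.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air at room temperature


def arrival_time_difference_ms(listener_x, listener_y=2.0, half_spacing=1.0):
    """Return (left arrival time - right arrival time) in milliseconds.

    Assumed geometry: speakers at (-half_spacing, 0) and (+half_spacing, 0),
    listener at (listener_x, listener_y), all in meters.
    A positive result means the right speaker's sound arrives first.
    """
    dist_left = math.hypot(listener_x + half_spacing, listener_y)
    dist_right = math.hypot(listener_x - half_spacing, listener_y)
    return (dist_left - dist_right) / SPEED_OF_SOUND_M_PER_S * 1000.0


if __name__ == "__main__":
    for x in (0.0, 0.1, 0.3, 0.6):
        delay = arrival_time_difference_ms(x)
        print(f"listener {x:.1f} m off center: right leads by {delay:.3f} ms")
```

In the sweet spot the difference is zero. As the listener moves sideways, the nearer speaker’s sound arrives increasingly early, which is the condition under which the phantom image is pulled toward that speaker.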
2. Coloration
A second problem with phantom sources is that they inherently result in some unnatural coloration of the sound, compared with a real speaker placed where the phantom source is located.
Let’s look at an example. The listening situation in stereo looks like this:
For a center-panned sound, the sound waves from the left and right speakers reach each ear with slightly different delays and mix there. This causes some frequencies to add up and others to cancel out. This timbral distortion, which is built into the stereo system, can be measured at the ears of a listener*. An example is plotted in the graph below:
There is a pronounced dip in the frequency response around 2 kHz, and this will be audible on center-panned sounds like voices or snare drums. In fact, the coloration will differ depending on where in the sound stage the phantom image is located. Luckily, the cancellation dips become a bit less severe in a listening room, where room reflections are able to partly ‘fill them in.’ Of course, the sound engineer mixing the recording could partly compensate for the timbral coloration, but not for the time-smearing of transient sounds caused by the sound reaching the ears at slightly different delay times.
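The location of such a dip follows from simple comb-filter arithmetic. When two equal copies of a signal arrive with a relative delay τ, their sum has magnitude 2·|cos(π·f·τ)|, which drops to zero at f = 1/(2τ), 3/(2τ), 5/(2τ) and so on. The sketch below is an idealized model only: the 0.25 ms delay is an assumed, illustrative value (roughly what one might expect between the two ears for speakers placed about 30 degrees to each side), not a measurement, and the model ignores the head’s shadowing of the far speaker, which in practice makes the dips shallower.

```python
import math


def comb_gain_db(freq_hz, delay_s):
    """Gain in dB of a signal summed with an equal copy delayed by delay_s.

    0 dB corresponds to a single copy on its own.
    """
    magnitude = abs(2.0 * math.cos(math.pi * freq_hz * delay_s))
    # Clamp to avoid log10(0) exactly at a notch.
    return 20.0 * math.log10(max(magnitude, 1e-12))


def notch_frequencies(delay_s, max_freq_hz=20000.0):
    """List the cancellation frequencies (2k + 1) / (2 * delay) up to max_freq_hz."""
    notches = []
    k = 0
    while True:
        freq = (2 * k + 1) / (2.0 * delay_s)
        if freq > max_freq_hz:
            break
        notches.append(freq)
        k += 1
    return notches


if __name__ == "__main__":
    assumed_delay_s = 0.00025  # illustrative assumption: 0.25 ms
    print("Notches (Hz):", [round(f) for f in notch_frequencies(assumed_delay_s)])
    for f in (500.0, 1000.0, 1900.0, 4000.0):
        print(f"{f:7.0f} Hz: {comb_gain_db(f, assumed_delay_s):+.1f} dB")
```

With this assumed delay, the first cancellation falls at 2 kHz, in the same region as the dip described above.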
3. Sound Positioning
Depending on the listener, a phantom source tends to sound like it’s coming from a point slightly higher in elevation than the line between the speakers. The reason for this is quite complex, but it has to do with the fact that humans localize sounds in height by subconsciously analyzing the spectral coloration imposed by our outer ears, which selectively boost some frequencies and cut others in a way that is unique to each elevation. This height information is distorted when two speakers are playing the same sound simultaneously from different directions.
Can stereophonic listening get any better?
All of this said, after listening to many 3D-sound music demos using various numbers of speakers, I’ve almost come to appreciate stereo listening even more. I do believe there exists great potential to improve on the stereo system using a larger number of speakers. With a surround/3D sound system, it becomes possible to transport the listener out of the loge of the stereo system, and out into the space of the recording venue.
But I also believe that for pure musical enjoyment, the direction of improvement may not lie in the ability to pan musical instruments to weird places above or behind the listener. Rather, a better listening experience may come down to reducing the adverse effects of phantom image sources, enlarging the sweet spot, and producing a more convincing and enveloping recorded room/ambient sound.
– Viktor Gunnarsson, Senior Research Engineer at Dirac Research