by DUANE COOPER [Associate Professor, Electrical Engineering and Physics, University of Illinois, Urbana, Ill.]
The accuracy of the reproduction of a Sound Field is dependent on the number of discrete channels used in the system. In this article, the author examines some psychoacoustic factors involved when using two, three, or four channels.
TWO APPROACHES MAY BE TAKEN in seeking to determine the number of channels needed for a true stereophonic reproduction, that is, a reproduction characterized by the feeling of actually being in the presence of the original in its acoustic setting. The first is the objective, or holographic, approach, and the second is the subjective, or psychoacoustic, approach. A reconstruction of an adequate approximation to the original three-dimensional sound field is the goal of the first, whereas the goal of the second is to determine only those psychoacoustically-relevant attributes of the sound field that need be recorded for optimal presentation to the listener.
Both approaches may be seen in other fields. In color photography, for example, the little-used objective approach would require the reproduction of the physical color spectrum actually reflected by the colored object, whereas the subjective approach requires that only the appropriate balance be struck among the three color stimuli known to characterize human vision. Although there may be subtle human vision effects glossed over in the latter approach, it produces such satisfying results that only a major revolution in technology, or in understanding human perception, can renew the pressure to seek a new equilibrium in satisfaction, whether through reviving the objectivist program or refining the subjectivist program.
Audio engineering appears to have reached the stage at which multi-channel technology and newly-appreciated auditory phenomena have renewed the pressure towards a more satisfactory stereophonic reproduction. It is too early to tell when the new equilibrium in satisfaction will be obtained, or what system configuration it will require. While ultimate answers are not yet available, current sign posts do indicate directions in which it appears sensible to proceed.
The objective approach involves the erection of imaginary living-room boundaries in the concert hall, say, and the covering of the boundary surface with outward-directed microphones, each with its own recording channel. Then, in the actual listening room, preferably anechoic, the reproducing channels would excite loudspeakers in positions corresponding to those same microphone locations. A recent trial was made by Camras. He used twelve channels to obtain a very realistic effect. Of these, he then sought to select the ones most apt to provide a sufficient approximation, judged by listening, to objective reproduction. It was found difficult to remain content with fewer than four channels for the front and two for the back.
The subjective approach would remove the constraint that the microphones and the speakers be placed in corresponding places, or even that they be equal in number. Instead, the psychoacoustically important characteristics of the original would be recorded with microphones placed to serve that end. Similarly, the speaker placement would be designed to serve the psychoacoustic goals of providing an optimum presentation of those characteristics. For equal satisfaction, the requisite number of channels in the two approaches could well be quite different.
Of the large mass of psychoacoustic data in the literature, those in two papers have been pointed out by Madsen as indicating well-established effects clearly relevant to the perception of the directional qualities of direct and reverberant sound fields. These are the effects studied by Haas and by Damaske. In the usual two-speaker stereo setup, relative intensities govern the direct sound image localization, but, as Haas showed, if the two sounds reach the ear with differing time delays, the localization is shifted towards the earlier source.
For relative delays less than 2 milliseconds, an increase in intensity for the later sound can tend to restore the original localization at the expense of some increase in uncertainty. For steady-state signals, corresponding effects may be related to phase. The well-known instabilities in ordinary-stereo image localization with respect to changing listener positions may be explained in this manner.
For relative delays in the range from 2 to 20 milliseconds, the localization of the image at the first source is very pronounced, requiring quite substantial loudness increases to displace it to the second, and there is an increasing tendency, especially for delays beyond 20 milliseconds, for a double sound to be heard. If the double sound is not heard, however, the second source does contribute to the overall impression of loudness (integrating effect). Madsen used this integrating effect to help explain how a complicated reverberation pattern of time delays could be psychoacoustically equivalent to a much simpler pattern. A group of time-delayed replicas of a first sound, produced by reflections within the concert hall and arriving within any 20-millisecond interval, would be heard as one by the listener. However, such later groups would have a highly diffuse character compared with the sounds arriving within the very first 20 milliseconds, which first group would be localized at the source.
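The delay ranges quoted above can be summarized in a few lines of Python. This is only a minimal sketch: the function name and the three labels are hypothetical, and the 2- and 20-millisecond boundaries are taken from the text as sharp thresholds, although the effects themselves change gradually.

```python
def haas_regime(delay_ms: float) -> str:
    """Classify the relative delay between two sources (illustrative labels only)."""
    if delay_ms < 0.0:
        raise ValueError("delay must be non-negative")
    if delay_ms < 2.0:
        # A louder second sound can still pull the image back.
        return "intensity-tradable"
    if delay_ms <= 20.0:
        # Image held firmly at the earlier source; later sound adds loudness.
        return "localized at first source"
    # Growing tendency for the later sound to be heard separately.
    return "double sound likely"


for delay in (1.0, 12.0, 30.0):
    print(delay, "ms:", haas_regime(delay))
```

A delay of about 12 milliseconds, for example, falls well inside the middle range, where the later source adds to loudness without shifting the image.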
As Madsen points out, this confusion is caused by incoherence phenomena whose effects were studied by Damaske. Damaske produced pairs of mutually incoherent sounds by placing a pair of spaced microphones in a reverberant room excited by a single loudspeaker. The signals from these would drive a pair of loudspeakers in an anechoic chamber.
Subjects placed there would measure the detectability of one of these sounds relative to the other. The more widely different delay pattern (obtained with the greater microphone spacing) between the two sounds made for a greater detectability of one relative to the other. It also made for a greater feeling of vagueness as to localization.
Damaske also determined the optimal placement for the two speakers for the maximum detectability of one of these mutually-incoherent sources relative to the other. With one source in front, the movable source was equally difficult to detect in back as in front, but it was 23 dB more detectable if placed at the side.
With one source at the side, the movable source was 16 dB more detectable at the front than if placed at the same side, and 19 dB more detectable if placed at the opposite side. In such maximum detectable positions, the impression was reported of a sound that was diffused, appearing to come from everywhere.
Taken together, the Haas effects and the Damaske effects show that a very complicated source pattern, including reverberation sources, can be psychoacoustically equivalent to a very few sources suitably placed. Thus, short cuts around the more elaborate objective reproduction requirements begin to appear. Some of these requirements, namely that hall sounds, coming from the back, ought to be reproduced from the back, are actually seen to be psychoacoustically irrelevant, undoubtedly related to the uni-axial placement of the ears on the head, despite minor front-back asymmetries.
Requirements for Stereo Reproduction
At this point it is possible to begin enumerating the sources needed in the listening setup for stereo reproduction. If a feeling for the acoustical setting (ambiance) is to be obtained, then side speakers must be used, and two of these must be used if the impression is to be symmetrical. According to the Damaske effects, these should suffice for the presentation of ambiance, while neither front nor back speakers would make a significant contribution.
The side speakers will not, however, provide for a satisfactory localization of the direct sound. The direct-sound images would be too widely spread and too unstable in localization with respect to changes in listener position. Madsen's solution was to avoid using the side speakers for direct-sound localization and to rely solely upon a normally-placed stereo pair of speakers, up front, for that. To ensure that sole reliance, he delayed the reproduction through the side speakers by an amount equal to the propagation time from front speaker to side speaker (about 12 milliseconds). This ingenious use of the Haas effect meant that the side speaker could carry the same information as the corresponding front speaker, as in Fig. 1, with the only net effect being that the side speakers would contribute whatever ambiance information had been recorded, and which the front speakers could not provide. The direct sound would be affected only in loudness, not directionality, by the side speakers. The effect is stable, natural, and impressive in its recreation of ambiance from ordinary two-channel recordings.
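The 12-millisecond figure can be related to a speaker spacing with a short calculation. The sketch below is illustrative only: the speed of sound (343 m/s, dry air near room temperature) is an assumption not stated in the article, and the function names are hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value; varies with temperature


def propagation_delay_ms(distance_m: float) -> float:
    """Time for sound to travel distance_m, in milliseconds."""
    return 1000.0 * distance_m / SPEED_OF_SOUND_M_PER_S


def distance_for_delay_m(delay_ms: float) -> float:
    """Path length that corresponds to a given delay, in meters."""
    return SPEED_OF_SOUND_M_PER_S * delay_ms / 1000.0


# Spacing implied by the article's delay of about 12 ms
print(round(distance_for_delay_m(12.0), 2), "m")
# Delay needed for a hypothetical 3 m front-to-side spacing
print(round(propagation_delay_ms(3.0), 1), "ms")
```

Under that assumption, 12 milliseconds corresponds to a front-to-side path of roughly 4 meters; a smaller room would call for a proportionally shorter delay.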
Another approach is to allow the side speakers to contribute to direct-sound localization, but to use additional speakers to focus and stabilize that localization.
It has been found, with widely-separated front speakers, that such focused stabilization may be obtained with the help of a sum-connected center speaker. Thus, it is natural to think of using a single sum-connected front speaker to augment the side-speaker pair, as in Fig. 2. Since only two channels are required, this triphonic speaker setup may be easily tried with ordinary recordings.
At first, the results are disappointing; in contrast to the Madsen system, in which the side speakers rarely attract attention, the triphonic side speakers are readily sensed as independent sources, and the usual well-ordered stereo effect is not obtained. This impression persists until mono recordings are tried. With these, a good front-and-center localization results, and the reproduction of ambiance, combined with a feeling for depth, is truly astonishing.
The mono experience indicates that a proper stereo spread would be obtained with a reduction being made in the electrical separation between the two channels. It is easy to show that, with the usual evenly-spread stereo recordings, a separation limited to about 6 dB is sufficient to obtain that same even spread with triphonic reproduction. In comparison to normal stereo, the spread source, instead of having a flat perspective--pasted against the wall, so to speak--has remarkable depth and ambiance.
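The relation between the amount of blending and the resulting electrical separation can be sketched numerically. The article does not specify the blend circuit, so the symmetric cross-feed below (a fraction `b` of each channel added to the other) is an assumption, as are the function names.

```python
import math


def blend(left: float, right: float, b: float) -> tuple[float, float]:
    """Cross-feed a fraction b of each channel into the other (assumed circuit)."""
    return left + b * right, right + b * left


def separation_db(b: float) -> float:
    """Separation after blending, for a signal present in one channel only."""
    if not 0.0 < b <= 1.0:
        raise ValueError("b must lie in (0, 1]")
    return 20.0 * math.log10(1.0 / b)


def blend_for_separation(sep_db: float) -> float:
    """Cross-feed fraction that yields the requested separation in dB."""
    return 10.0 ** (-sep_db / 20.0)


print(blend(1.0, 0.0, 0.5))                   # left-only signal after blending
print(round(separation_db(0.5), 2))           # separation with half cross-feed
print(round(blend_for_separation(6.0), 3))    # cross-feed for 6 dB separation
```

With this assumed circuit, feeding about half of each channel into the other leaves roughly the 6 dB of separation mentioned in the text.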
The amount of ambiance seems to depend upon the amount that had been recorded and to characterize the recording site, often being less for chamber music, for example. Even then, solo voices have a three-dimensional quality.
Because of the Haas effects, the amount of blending required to reduce the electrical separation of the side speakers will depend upon their physical separation (distance) relative to that for the front. For a greater physical separation, more electrical separation will be required. For very large distances, the system will be more like the Madsen system, and it may be necessary to replace the blended front speaker with a moderately-spaced stereo pair.
In both the Madsen system and the triphonic system, stereo separation is not required for the ambiance information.
The Damaske effects show that it is merely necessary to present the direct sound from the front and the ambiance from the side. In the triphonic system, the presentation of the direct sound is "steered around" to the front to appear as phantom sources, so that its physical appearance at the sides does not mask the ambiance. Curiously enough, the ambiance still comes through if the head is directed at a side speaker, evidently because there remains a front-side presentation of sounds containing mutually incoherent components. The direct sound then appears to come to the head from the side (front of the room, of course), while the ambiance remains non-localizable. Upon moving close to a side speaker, the direct-sound localization shifts as usual, but the ambiance persists as before until the listener comes very close indeed.
With such little electrical separation between side speakers, reverberation overhang, which is found to contribute negligibly to the feeling of ambiance, is found to come from the front. This also happens in the Madsen system if there is no electrical separation between the side speakers. Any localization for overhang is false, of course, and results because the Haas-effect integration does not extend over a sufficient interval to allow the incoherence between direct and reverberant sounds to be compared if the direct sounds have been silent for a long enough time. This localization does not appear if the separation may be maintained for the ambiance information.
Recordings made specifically for the triphonic system would not require the use of any blending in reproduction, and could be made with rather little separation for the direct sound, but retain full separation for the ambiance information.
Recording for the Triphonic System
A way in which recordings may be made specifically for the triphonic system is shown in Fig. 3. The two unidirectional microphones are closely spaced to make possible the derivation of a well focused center channel, free of spurious out-phasing effects. Thus, a good mono compatibility should obtain for the direct sound, while the angling of these microphones should provide good separation.
Some degree of blending should be provided. The two bidirectional microphones are widely spaced with the direction of their null response oriented toward the direct-sound source. Their spacing should ensure a high degree of mutual incoherence with each other and with the direct sound pickup. This ambiance-information pickup may have its separation enhanced by mixing some difference signal in opposing phases in the two ambiance channels (anti-blend). The final mix could provide stereo amplitude contours like those shown in Fig. 4.
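Blend and anti-blend can be viewed as two settings of a single control acting on the difference signal. The sketch below assumes a sum/difference formulation that the article does not spell out; the function name and the `width` parameter are hypothetical. A width below 1 blends, a width above 1 anti-blends, and the sum (mono) signal is untouched in either case.

```python
def blend_antiblend(left: float, right: float, width: float) -> tuple[float, float]:
    """Scale the difference signal by width while leaving the sum unchanged."""
    mid = 0.5 * (left + right)    # half the sum signal
    side = 0.5 * (left - right)   # half the difference signal
    # The difference is added in opposing phases to the two outputs.
    return mid + width * side, mid - width * side


for width in (0.5, 1.0, 2.0):
    out_left, out_right = blend_antiblend(1.0, 0.0, width)
    print(width, (out_left, out_right), "sum =", out_left + out_right)
```

For a left-only input, a width of 2 puts an out-of-phase component into the right output, which is the sense in which anti-blend "enhances" separation, while the printed sum stays the same, consistent with the mono compatibility discussed in the text.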
Compatibility with ordinary mono and stereo reproduction is desirable. As noted, the mono compatibility is directly provided; the exalted ambiance separation is not important for mono, since little ambiance would be available anyway.
The same is true for ordinary stereo, except that the direct-sound separation may seem insufficient to some listeners. It is unfortunate that normal stereo amplifiers are not provided with a blend-antiblend control. In the interim, it will probably be necessary to leave some of the blending to be supplied in the triphonic reproduction. This compromise would be of little moment if abundant ambiance separation be surely provided in the recording. A blend control is necessary in triphonic systems, anyway, for playing ordinary recordings.
In relation to the requirements of triphonic reproduction, normal stereo recording systems are capable of providing a superabundance of separation. Thus, there is a capability for providing direct-sound localizations that extend beyond the usual frontal range. Extreme side localization is obviously possible. Since Damaske's results indicate a +10 to -40 degree tolerance for side-speaker placement to provide good ambiance detection, these speakers could be displaced somewhat to the back. The three speakers would then be roughly at the vertices of an equilateral triangle, with the listener in the middle. Figure 2 already indicates such a placement. Then "extreme side," as mentioned above, becomes "side and to the back," in localization.
With the two side speakers carrying oppositely-phased, direct-sound information, that sound would appear to come from the back. Actually, the experience with oppositely-phased loudspeakers is that the localization is not well-defined.
It appears to come from somewhere else than the space between the speakers, and the listener tends to deny a frontal localization. At the same time, the sound does not have the same diffuse quality noted for ambiance information. The localization is better described as unstable; it is this property, taken with a liability to front-back ambiguities inherent in the uniaxial ear configuration, that, because of a willingness to deny a frontal localization, makes the listener resolve the ambiguity in favor of the back. With the side speakers displaced somewhat toward the back, the acceptance of a back localization is unhesitatingly made.
A center-back localization will rarely be desired, however, since that would force the sum channel to be silent for that source, and void the mono compatibility. Off-center back localizations entailing a minor diminution in the sum channel will be more acceptable in the mono reproduction. A difference-connected loudspeaker may be actually placed at center-back as in the Dynaco proposal. Such a speaker would, because of the Damaske effects, make no contribution to ambiance, and, since it would be in aiding phase with one or the other of the side speakers, it would tend to produce a further off-center bias in the back localization. Thus, such a speaker probably would make no worthwhile contribution, but that is a question for further study. The exploitation of these localization possibilities will require microphone and mixing techniques more elaborate than shown in Fig. 3. For the triphonic system, however, the usual matrix-mixing formulations appear to stand in need of modification because of the greater psychoacoustic weight attaching to the side speaker placement. It is just this extra weight which avoids any necessity for loudness-dependent matrix-steering circuits in playback.
Experimenters can easily try the triphonic system for playing ordinary recordings, since no specialized equipment is involved, and many stereo amplifiers already supply a sum-channel output. In some cases, this is a powered output, while in others it is intended for driving a separate amplifier. Where the sum channel is not provided, one is easily derived, via a resistor network to excite a separate amplifier, or with the help of available transformers designed to provide a powered sum channel. Also, if the stereo amplifier makes use of a common return lead, and the same-model speaker may be used in the three locations, the Dynaco hookup (Fig. 5) is convenient.
This last hookup introduces a degree of anti-blend so that the blend circuit is mandatory. As noted, however, blending is needed in the triphonic arrangement, in any case.
The experimenter will find many ordinary stereo recordings with a sufficiently focused stereo perspective to permit triphonic reproduction, upon suitable blending, with a direct-sound directionality very close to that intended by the recording director. The depth and ambiance will lend a naturalness that the listener may feel should have characterized stereo reproduction at the outset.
The restoration of mono recordings is equally satisfying, and many will find the stereo spread, missing from such recordings, to be not so keenly missed after all. There will also be found a few recordings to contain some unexpected "far-out" effects.
Recordings likely to exhibit such far-out effects are those deriving from the use of a multiplicity of microphones recorded in a multiplicity of channels with elaborate mixing arrangements, both prior to recordings, and in stages leading to the final two-channel mix. Such techniques, increasingly common in recording rock groups, for example, can result in an unplanned, wide assortment of phasing combinations. The various instrumental and human voices often, in triphonic reproduction, then appear from a great variety of well-defined directions, although some of them may jump from one location to another, even from front to back.
While not intended by the recording director, such effects can be entertaining, and they do demonstrate the potential for a varied direct-sound localization.
Recording engineers may also experiment, of course, to find microphone and mixing techniques to bring such far-out effects under control, and to be able to obtain predictable localizations with triphonic reproduction, while, at the same time, maintaining reasonable compatibility with ordinary mono and stereo reproduction.
The ultimate answer to the question as to the number of channels needed for stereophonic sound reproduction has not been obtained. It has become clear, however, that, if the limited knowledge already at hand regarding the subjective properties of human hearing is properly exploited, then the existing two channels are capable of presenting a richness of information hitherto unsuspected. A three-speaker, or triphonic, means of displaying that information has been explained that is readily tried with ordinary recordings to demonstrate the potentialities. Recording techniques to optimize the use of these two channels in triphonic reproduction have been partially sketched.
It has also become clear that the objective approach can readily prescribe more recording channels than necessary for a specified degree of satisfaction.
Four channels, once thought necessary to the reproduction of ambiance, can be replaced by two, or even one. Moreover, the objective thought that speakers should be placed at the back is seen to be psychoacoustically faulty for ambiance reproduction. It is unfortunate that it has taken a decade of two-channel stereo practice before even the beginnings of an adequate exploitation of two channels could now be recognized. Doubtless, much remains to be appreciated about the relevant psychoacoustic phenomena, so that the art of two-channel recording would appear to be still in its infancy. So long as this is the case, a consideration of the introduction of additional channels would appear premature.
What are needed are further sign posts. Madsen's elucidation of ambiance perception, and the potentialities for wide-ranging direct-sound directionality shown by the triphonic system, are beginnings. Upon consolidation of these gains, it may be possible to provide rational formulations of psychoacoustic requirements to be met by adding further channels. Without deeper study, such formulations do not immediately come to mind, but it is possible to wish for a greater stability of the direct-sound image against varying listener positions. With sharper requirement formulations and corresponding deeper understandings, the psychoacoustic relevance of adding further channels would be less clearly an open question. Until then, two channels have a lot to offer, more than had been supposed.
1. J. S. Friedman, History of Color Photography, The American Photographic Publishing Company, Boston (1944).
2. M. Camras, Approach to recreating a sound field, J. Acoust. Soc. Am., 43, 1425-1431 (June 1968).
3. E. R. Madsen, Extraction of ambiance information from ordinary recordings, J.A.E.S., 18, (in press).
4. Helmut Haas, Über den Einfluss eines Einfachechos auf die Hörsamkeit von Sprache (On the influence of a single echo on the intelligibility of speech), Acustica, 1, 49-58 (1951).
5. P. Damaske, Subjektive Untersuchung von Schallfeldern (Subjective investigation of sound fields), Acustica, 19, 199-213 (1967/68).
6. David Hafler, A new quadraphonic system. Audio, July, 1970.
7. A number of readers who have tried the Dynaco arrangement do, in fact, find that there is a better ambiance.
(adapted from Audio magazine, Nov. 1970)