[article by Floyd Toole, orig. from Audio magazine 100 anniv. issue, May 1997]
This article deals with the problems of traditional 2-channel stereo sound and multichannel approaches to solving them.
The capture, storage, and reproduction of musical and other acoustical events has been an obsession of the audio industry for its entire existence. At first it was amazing that any sound could be captured and reproduced. With the passage of time and advancing technology, we became fussier, demanding timbral accuracy, an absence of noise and distortion, realistic dynamic range and bandwidth, and so on. With stereo came some limited impressions of direction and space. Now we demand more—more realism, more dramatic effects, and more listeners to share the auditory experiences. This article will examine our progress in meeting these objectives. (Part II will look at binaural hearing and related issues.)
In the beginning, there was monaural (it means, literally, one ear—we actually listen binaurally, through two ears, no matter how many channels are used). Everything we heard was stored in and reproduced from a single channel. In those early days of mono, listeners enthused, and critics applauded the technical accomplishments of Thomas Edison, Emile Berliner, and others as being the closest possible to reality. They were wrong, but clearly a revolution in home entertainment had taken place.
Monophonic reproduction conveys most of the musically important dimensions (melody, timbre, tempo, and reverberation) but no sense of spatial envelopment, of being there. In the 1930s, the essential principles by which the missing elements could be communicated were understood, but there were technical and cost limitations to what was practical. It is humbling to read the wisdom embodied in the Blumlein-EMI patent applied for in 1931, which describes two-channel stereo techniques that would wait 25 years before being exposed to the public. Then there are the insights of the Bell Telephone Laboratories scientists, who, considering the reproduction of auditory perspective, concluded in 1934 that there were two alternative reproduction methods that would work: binaural and multichannel.
By binaural, the Bell Labs scientists meant the technique of capturing a multidimensional sound field by using microphones at the ear locations in an artificial head (thereby encoding all of the directional cues in the left- and right-ear signals) and reproducing those signals through headphones. The listener’s ears would then hear what the dummy head “heard,” so that, in theory, perfect auditory perspective would be communicated.
Multichannel reproduction is more obvious, since each channel and its associated loudspeaker creates an independently localizable sound source, and interactions between them create even more. Inevitably, the question arose: How many channels are necessary? Bell Labs scientists concluded that a great many channels would be necessary to capture and reproduce the directional and spatial complexities of musical events. Being practical, they investigated the possibilities of simplification and concluded that, while two channels could yield acceptable results, three channels (left, center, and right) would be a desirable minimum to establish the illusion of a stable front soundstage, especially for a group of listeners. It is important to note that there was no attempt to re-create a surrounding sense of envelopment.
By 1953, ideas were more developed, and in his paper “Basic Principles of Stereophonic Sound,” William Snow describes a stereophonic system as one having two or more channels and speakers. He says, “The number of channels will depend upon the size of the stage and listening rooms, and the precision in localization required.” Snow goes on to say that “for a use such as rendition of music in the home, where economy is required and accurate placement of sources is not of great importance if the feeling of separation of sources is preserved, two-channel reproduction is of real importance.”
Thus, two-channel reproduction was known to be a compromise: “good enough for the home,” or words to that effect. So what did we end up with? Two channels! The choice had nothing to do with scientific ideals, but with technical realities: When stereo became commercially available, nobody knew how to store more than two channels in the groove of a record.
Around that same time, however, the film industry was highly motivated to do better, and several major movies were released with multichannel surround sound accompanying their panoramic images. These were discrete-channel soundtracks recorded on magnetic stripes added to the film.
Although these soundtracks were very successful artistically, the technology languished because of the high costs of production and duplication. The industry reverted to monophonic optical soundtracks, at least until the development of the “dual bilateral light valve.” This device enabled each side of an optical soundtrack to be modulated independently, thus accommodating two channels. Once that barrier was surmounted, film soundtracks moved beyond two-channel stereo relatively quickly. And in the end, it was the film industry, not the audio industry or audiophiles, that drove the successful introduction of multichannel home sound reproduction. On the way, however, it learned much from the earlier missteps of others.
Multichannel Sound—First Try
The arrival of two-channel stereo in the ’50s was a revolution, even though recording techniques being used at the time frequently resulted in hole-in-the-middle soundstages and exaggerated left/right effects. Conventional stereo is not blessed with an underlying encode/decode system or philosophy; it is merely a two-channel delivery mechanism. Over the years, microphone and mixing techniques have evolved, but the struggle to capture, store, and reproduce a realistic sense of direction and space from two channels and two speakers has been a mighty one. There has been no single satisfactory solution, as is evidenced by the diversity of microphone techniques, signal processors, loudspeaker designs, and “tweaks” that have come and gone, as well as those that survive.
What can one say about a system that accommodates speakers having directional characteristics ranging from omnidirectional through bidirectional in-phase (so-called bipole), bidirectional out-of-phase (dipole), and predominantly backward-firing, to forward-firing, with a variety of directivity characteristics within each of those broad categories? The nature of the direct and reflected sounds arriving at the listeners’ ears from these different designs runs the entire gamut of possibilities. This is not really a system at all; it is merely a foundation for individual experimentation. The history of two-channel stereo is littered with examples of efforts to generate a more engaging sense of envelopment and depth: some acoustical, some electronic, and some that appear to operate simply on faith. Remember the Hafler system sold by Dynaco? And Carver’s Sonic Holography? Nowadays we have SRS, Spatializer, and hosts of digital signal processors (DSPs) that offer dimensional embellishments. We can only conclude that, in a multichannel system, two channels are simply not enough.
Added to these fundamental problems is the inconvenience of the stereo seat, or “sweet spot.” Two-channel stereo is an essentially antisocial system; only one listener can hear it properly (Fig. 1). If one leans a little to the left or right, the featured artist flops into the left or right speaker and the soundstage distorts. Sit up straight, and the featured artist floats as a phantom image between the speakers, but the sound quality is altered because of the acoustical crosstalk. That is, the sound from each loudspeaker travels not just to the ear nearer to it, but to both ears. And when identical sounds radiate from both channels, as happens for a center image, there is a comb-filter effect at each ear when the direct sound from the nearer loudspeaker combines with the slightly delayed sound from the opposite speaker. The dominant effect is a distortion of the amplitude and phase response of the center image’s sound. Ironically, no matter how perfect a loudspeaker may be in frequency and phase response, those properties will not be appreciated in the sound of the center image because of an intrinsic limitation of two-channel stereo.
You don’t believe me? Play some monophonic pink noise and move in and out of the stereo sweet spot. As you move from the left or right toward the center, you will experience phasiness, and as you approach the precise center location, the sound will get noticeably duller as destructive interference creates a dip at around 3 to 4 kHz. Fortunately, room reflections help to minimize the annoyance of this effect in most home installations.
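The dulling described here is ordinary comb filtering, and the frequency of the first cancellation follows from the extra delay with which the far loudspeaker's sound reaches each ear. The Python sketch below is an illustration, not part of the original article: the delay values and the far-ear gain of 0.7 are assumed numbers rather than measurements, and the model ignores frequency-dependent head shadowing and room reflections.

```python
import math

def first_null_hz(delay_s):
    """First cancellation frequency when a sound sums with a copy delayed by delay_s."""
    return 1.0 / (2.0 * delay_s)

def level_db(freq_hz, delay_s, far_gain=0.7):
    """Level at one ear, in dB relative to the nearer loudspeaker alone.

    far_gain is an assumed attenuation of the delayed sound from the
    opposite loudspeaker (hypothetical value, not a measurement).
    """
    phase = 2.0 * math.pi * freq_hz * delay_s
    real = 1.0 + far_gain * math.cos(phase)
    imag = -far_gain * math.sin(phase)
    magnitude = max(math.hypot(real, imag), 1e-12)  # guard against log10(0)
    return 20.0 * math.log10(magnitude)

# Hypothetical delays between near and far loudspeaker arrivals, in seconds.
for delay_s in (0.00014, 0.00020, 0.00025):
    null_hz = first_null_hz(delay_s)
    print(f"delay {delay_s * 1e3:.2f} ms: first null near {null_hz:.0f} Hz, "
          f"level there {level_db(null_hz, delay_s):.1f} dB")
```

Under these assumptions a delay of about 0.14 ms puts the first null in the 3 to 4 kHz region mentioned in the text, and longer delays move it lower in frequency.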
In fairness, it must be said that after more than 40 years of experimentation, the best two-channel stereo recordings reproduced over the right set of speakers in the right room can be very satisfying indeed. Sadly, only a fraction of our listening experiences fall into that category, so this is not a long-term solution.
Multichannel Sound—Second Try
In the ’70s, we broke the two-channel doldrums with a misadventure into four-channel sound called quadraphonics. The intention was laudable: to deliver an enriched sense of direction and space. The key to achieving this goal lay in the ability to store four channels of information in the existing two channels of a vinyl LP and then to recover them.
Two categories of systems were in use at the time, matrixed and discrete. The matrixed systems crammed four signals into the bandwidth normally used for two channels. Something had to give, and as a result, separation was not the same between all channels. In other words, information that was supposed to be in only one channel would appear in smaller quantities in some or all of the other channels. For the listener, the result of this channel leakage, or crosstalk, was confusion about where the sound was coming from. I well remember feeling as though I were inside a cello while listening to one of my quadraphonic LPs.
Various forms of signal-adaptive “steering” (a technique for routing signals in preferred directions) were devised to assist the directional illusions during the playback process. The alphabet soup is memorable: SQ from CBS, QS from Sansui, EV-4 from Electro-Voice, and others. Peter Scheiber, a musician with a technological bent, figures prominently as a pioneer in the matrix game, with his patented encoder and decoder ideas incorporated into many designs. The best matrix systems were remarkably good in creating the impression of four completely separate, or discrete, channels. However, matrix processing breaks down when there is a demand for several simultaneously occurring discrete images.
Ultimately, there is no substitute for entirely separated channels. But getting four discrete channels into the grooves of a vinyl LP required that the recorded bandwidth be extended to about 50 kHz, which was quite a challenge. Nevertheless, it was accomplished in JVC’s CD-4 system, and although this quadraphonic format did not survive, the technology necessary to achieve the wider bandwidth did have a lasting benefit on the quality of conventional two-channel LPs. Half-speed cutting processes, better pressings, and playback cartridges with wider bandwidth and reduced tracing and tracking distortions were to live on. Discrete multichannel tape recordings were available, but open-reel tape was a nuisance, to say the least, and high-quality packaged tape formats (such as cassettes) were not yet ready for high-fidelity multichannel sound.
Years passed, with the audio manufacturers unable to agree on a single standard. Eventually, the whole thing dissolved into competitive squabbles. The industry lost a lot of money and credibility, and customers were justifiably disconcerted.
Although the failure of quadraphonics was regrettable, it has to be said that the system was not well founded psychoacoustically. Lacking an underlying encode/decode rationale, quad simply compounded the problems of two-channel stereo. There were even naive notions of panning images front to back using conventional techniques. The quadraphonic square array—of left and right, front and rear—created a more complex, but still antisocial, system (Fig. 2). The sweet spot now was constrained in the front-to-back direction as well as the left-to-right.
In addition, there was no center channel, a basic requirement if the stereo seat is to be eliminated. And placing the additional channels behind the listener is not the best arrangement for generating envelopment and a sense of spaciousness. Placement to the sides is better. Sounds arriving from the back are extremely rare in the standard repertoire of music, but the need for a credible spatial impression is common; sound from the sides is crucial to the generation of spatial impression. Ironically, the authors of a 1971 paper, “Subjective Assessment of Multichannel Reproduction,” demonstrated that listeners preferred surround speakers positioned to the sides over ones placed behind them, granting scores that were two to four times higher. It seems as though nobody with any influence read it.
Fortunately, much of the innovation that went into quadraphonics would live on in different forms.
Hollywood to the Rescue
Failure in one market was not enough to kill good ideas, and quad contributed two: multiple channels and adaptive matrixes. Dolby Laboratories was well connected to the real multichannel pioneers, the movie makers, in the application of its noise-reduction system to stereo optical soundtracks. Putting the pieces together, Dolby rearranged the quad channel configuration to one better suited to film use (Fig. 3): left, center, and right across the front, plus a single surround channel, which was used to drive numerous speakers arranged beside and behind the audience. All of this information was stored in two audio-bandwidth channels. With the appropriate adjustments to the encode matrix and to the steering algorithm in the active decoding matrix, Dolby devised the system that has become so familiar in quality films and theaters: Dolby Stereo, or, as it is known in home media, Dolby Surround.
Although they were never stated explicitly, some basic rules governed this system, and they have set a standard for multichannel sound: well-placed dialog in the center of the screen, music and sound effects across the front and in the surround channel. Reverberation and other ambient sounds are steered into the surround channel, as are sounds of aircraft passing overhead and the like. At times the audience can be enveloped in sound (as at a football game), or it can be transported to a giant reverberant cave or gymnasium, or it can be inside the confines of a car engaged in a dramatic chase, or it can be treated to an intimately whispered conversation between lovers, where the impression is that of being embarrassingly close. To fully realize such a range of spatial environments requires a flexible multichannel system, controlled-directivity speakers, and a degree of control over the acoustics of the playback environment. When it is done well, it may not be perfect, but it is remarkably entertaining, and it is not antisocial! The basic format of a front soundstage with enveloping ambience is also the basis for most of our real-life musical experiences, so audiences were immediately comfortable.
It is significant that the characteristics of the encoding and decoding matrixes and the spectral, directional, and temporal properties of the speakers and room (the theater, in this case) all are integral to the functioning of these systems. Fortunately, the film industry acknowledges the need for standardization and so from the outset tried to ensure that sound dubbing stages, where film soundtracks are assembled, would resemble theaters, where audiences are to enjoy the results. Although the industry standards provided a foundation, there were still inconsistencies. This left a need, and an opportunity, for Lucasfilm to establish its THX program to certify the audio performance of movie theaters, so that audiences would have an even greater assurance of quality.
Multichannel Sound—Third Try
With the popularity of watching movies at home on TV, it wasn’t long before Dolby Surround made its way there. Adapting it to the smaller environment required some changes, but nothing very radical (Fig. 4). Reducing the number of surround speakers to two ensured greater consumer acceptance, and recommending placement of these speakers to the sides of the listeners ensured that they would be most effective in creating the required illusions of space and envelopment. Delaying the sound to the surround speakers brought the precedence effect to bear to ensure that, even in a small room, the ambiguously localized surround sounds would be perceptually separated from those in the front channels.
At the outset, a simple fixed-matrix version of the decoding system was available in entry-level consumer systems. The fixed-matrix systems exhibited so much crosstalk among the channels (separation was as little as 3 dB) that listeners were surrounded by sound most of the time, even when it was inappropriate.
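The 3 dB figure can be reproduced with a simplified four-into-two matrix. The Python sketch below is an illustration and not Dolby's specification: the 0.707 coefficients are the commonly quoted values and are taken here as an assumption, and the phase shifting, band-limiting, and noise reduction applied to the surround channel in practice are omitted. All function names are hypothetical.

```python
import math

G = 1.0 / math.sqrt(2.0)  # assumed -3 dB mixing coefficient

def encode(left, center, right, surround):
    """Fold four channels into two (simplified: no surround phase shift)."""
    lt = left + G * center + G * surround
    rt = right + G * center - G * surround
    return lt, rt

def decode_passive(lt, rt):
    """Fixed (non-steered) decode using only sums and differences."""
    return {"L": lt, "C": G * (lt + rt), "R": rt, "S": G * (lt - rt)}

def separation_db(wanted, leaked):
    """Level difference between the intended output and a leaked one."""
    if leaked == 0:
        return math.inf
    return 20.0 * math.log10(abs(wanted) / abs(leaked))

# Feed one input at a time and see where it turns up after decoding.
feeds = {"left": (1.0, 0.0, 0.0, 0.0),
         "center": (0.0, 1.0, 0.0, 0.0),
         "surround": (0.0, 0.0, 0.0, 1.0)}
for name, feed in feeds.items():
    out = decode_passive(*encode(*feed))
    print(name, {key: round(value, 3) for key, value in out.items()})
```

In this model a left-only signal also appears in the center and surround outputs only about 3 dB down, while separation from the opposite channel stays high. Active (steered) decoders attack the adjacent-channel leakage that this fixed decode leaves in place.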
Fosgate and Shure HTS brought the first active-matrix decoders to the home theater market, albeit at premium prices. Low-cost integrated-circuit chips eventually brought active-matrix Dolby Pro Logic decoding to the masses, and home entertainment entered a new era. Admittedly, it was audio for movies, but it was multichannel audio nevertheless, and many of us began to appreciate some of the dimensions that were missing from our directionally and spatially deprived two-channel stereo lives.
Dolby Surround was designed for movie soundtracks reproduced in large theaters, and in that role it performs very well indeed. However, once audiophiles get a taste of something attractive, they want more. In this case, the “more” they wanted was realistic multichannel music reproduction in their homes.
Playing conventional stereo recordings through a Dolby Pro Logic decoder was a logical experiment, and most of us have done it. The results are spotty: Some recordings work well, and others don’t. A basic problem is that material mixed without a center channel in mind, when played through a conventional matrix decoder, yields center-channel signals that are perceived to be louder than they should be. The problem lies in the translation from large movie theaters to listening at shorter distances in smaller rooms.
The high-frequency rolloff in the surround channel is also noticeable, and the active-matrix steering is sometimes caught messing with the music. Recordings made specifically for Dolby Surround are better, but even they have failed to establish a large following in the music recording industry. None of this is surprising, but all of it means that we have not yet arrived at a general-purpose multichannel solution.
In a natural succession to its THX program for certifying movie theater sound systems, Lucasfilm established a licensing scheme for certain features intended to enhance, or in certain ways ensure, the performance of home theater systems based on Dolby Pro Logic decoders. Home THX, as it is called, added features to a basic Pro Logic processor and to the speakers used in home theater systems, and it set some minimum performance standards for the electronics and speakers. At a time when the market was being inundated with “cheap and cheerful” add-on center-channel and surround speakers and amplifiers, THX made a clear statement that that would not do; all channels had to meet the same standard.
Tomlinson Holman deserves credit for assembling this amalgam of existing and novel features into what has become a benchmark for consumer home theater.
The Home THX features relevant to this discussion are:
1. High- and low-pass filters to approximate a proper crossover between a subwoofer and satellite speakers. (Elaborate systems did this anyway, but the THX crossover brought an important feature to the mass market.)
2. Electronic decorrelation between the left and right surround channels. Reducing the number of surround speakers to two and putting them in a small room eliminates much of the acoustical decorrelation (randomization of the sounds arriving at the listeners’ left and right ears) that the many speakers at the sides and rear of a large movie theater accomplish automatically. Substituting electronic decorrelation is a good idea that was, to my knowledge, first introduced in the Shure HTS systems.
3. Timbre-matching of the surround channel to the left, center, and right (front) channels. In my view, this is a dubious feature. Sounds arriving from the sides, or even from random incidences, cannot and should not match the timbre of sounds arriving from the front. It is not natural—the complex shape of the external ears ensures that. However, it is a relatively minor matter in the larger scheme.
4. Re-equalization of the soundtrack to adjust for excessive treble that is usually built into film soundtracks to achieve correct tonal balance in large theaters. A single correction curve was chosen. This is a useful feature, but it should be an adjustable tone control because soundtracks vary in treble balance.
5. The Home THX loudspeaker standard requires some control of the vertical dispersion from the left, center, and right (front) units and a bidirectional out-of-phase configuration (an approximation of a dipole) for the surrounds. The purpose of the former is to reduce the strength of floor- and ceiling-reflected sounds, and the purpose of the latter is to increase the proportion of reflected sound that is generated by the two surround speakers, thereby compensating somewhat for the fact that there are only two of them. Both of these are good ideas, but some of the implementations have created a belief that somehow they are incompatible with the objectives of good music reproduction. While there have been some less than worthy examples of home theater speakers, one can easily say the same about conventional “music” speakers. In principle, there should be no reason to differentiate between them. Good design is good design.
Recognizing an opportunity to improve on a good thing, inventors have had a field day manipulating the parameters of the standard surround matrixes, with delays and with steering algorithms, all in an attempt to finesse the multichannel decoders to be more impressive when playing movies, more compatible with stereo music, or both. In addition to varying the five-channel, five-speaker theme, the more adventurous designers have augmented the surround system with additional speakers behind the listeners. Most provide for full-bandwidth surround or rear channels. Purists frown on such meddling, especially for film soundtracks, but lots of people, me included, find rewards in the artistry of several of the alternatives.
There have been many of these matrix-system variations. Some are decode-only, relying on Dolby Surround and regular stereo-encoded material for source material. Others are encode/decode systems that have some degree of compatibility with existing systems. All provide multichannel playback of two-channel program material that at least some listeners find attractive at least some of the time. In addition to stand-alone products like Circle Surround, there are proprietary algorithms built into surround processors from numerous companies, such as Proceed and Meridian.
The two systems described below are both long-term survivors and distinguish themselves by having evolved to the point that they include optional features and channels. And they are approaches I am particularly familiar with from my work with the companies at Harman where they were developed.
A veteran of the quadraphonic wars, Jim Fosgate found ways to decode Dolby Surround soundtracks in a manner that many people found preferable to more conventional means (Fig. 5). Part of the improvement had to do with the responsiveness of the steering logic, and part of it had to do with providing some amount of left and right distinction in the full-bandwidth surround channel. Since there is no such left/right separation in the encoded program, the art has been to judge how much, and when, left and right front information should be directed to the surrounds, with what spectral modifications (if any), and with what delay.
Fosgate practiced his art well and over the years has produced several positively received designs optimized for films and for different kinds of music, all in the analog domain. An interesting feature was the provision for separately powering the forward- and rear-firing drivers of the surround “dipoles” to generate more directional and spatial enrichment. His designs can be found in products bearing his own name as well as the Harman Kardon and now Citation brands. Fosgate’s latest effort is called 6-Axis, because in addition to the basic five steered channels, it provides for an optional sixth, behind the listener, to complete the surround effect.
Working independently, and in the digital domain, David Griesinger has done similar things to move beyond the basic Pro Logic process. He is probably best known in professional audio, as the author of the reverberation algorithms used in the Lexicon products found in most recording studios. Griesinger is driven by an intense interest in the physics and psychoacoustics of concert hall acoustics and has been a significant contributor to that area of science, so it is no surprise that his efforts in surround sound decoding and multichannel synthesis are based on his years of studying, synthesizing, and electronically enhancing the acoustics of concert halls. Accentuating the desirable aspects of complex multidimensional sound fields while avoiding undesirable artifacts is the essence of both endeavors.
The result is a suite of film- and music-playback algorithms embodied in Lexicon digital surround processors. Griesinger’s current effort is called Logic-7, since it provides for two additional channels and speakers behind the listener (Fig. 6). Using a sophisticated detection and steering process, these extra channels and rear speakers are supplied with strongly uncorrelated sounds (such as reverberation, applause, and crowd sounds) or with sounds that are strongly directed to move from front to surround or vice versa. Thus, the listeners (yes, these are still very much social systems) are treated to a truly enveloping sense of ambience and to occasional sounds that sweep dramatically forward or backward, even with appropriate left or right biases. An important focus in the continuing development of Logic-7 is the quest for compatibility in multichannel reproduction of film soundtracks and music as well as between two-channel and multichannel reproduction of stereo music mixed for two channels.
The few samples of discrete multichannel recordings from the quadraphonic era were sufficient to generate a lasting desire, if not an outright lust, to develop a viable format that did not suffer from leakage, or crosstalk, among the channels. Today we are experiencing a version of that dream in the form of Dolby Digital, also referred to as AC-3. This system was designed for soundtracks and is widely used in that capacity for motion pictures. A consumer version is now available on laserdiscs and DVD, and other carriers, including HDTV, will follow. Following the basic geometry of the existing multichannel system, Dolby Digital (Fig. 7) incorporates five main channels, including separate left and right surround channels. All channels are completely discrete and full-bandwidth, offering multichannel producers enormous flexibility. A sixth channel is used for occasional, very powerful low-frequency sound effects and is inherently bandwidth-limited. Thus we end up with the 5.1-channel appellation. In home systems, the LFE (low-frequency effects) channel normally is blended with low frequencies from the five main channels and routed to a subwoofer that handles all the deep bass.
In Europe, the MPEG-2 audio standard provides for multichannel audio that can be either five or seven channels. In the seven-channel mode, the additional channels are interpolated between the center and left and center and right front channels. It is difficult to imagine this configuration becoming popular for home applications, however. A better use of the bandwidth might have been to add some truly rear channels, as in some of the aforementioned enhanced matrix schemes. In any event, MPEG-encoded sound will be the standard for future DVD releases in Europe, with Dolby Digital an option.
Digital Theater Systems’ DTS and Sony’s SDDS systems have established presences in the professional domain, as the multichannel formats for numerous feature films. On the consumer side, DTS-encoded soundtracks are available on some laserdiscs and may be included on some DVD releases as a supplement to the standard Dolby Digital soundtrack. DTS has also been promoting its system for music, with a small but growing catalog of multichannel CD releases.
All of these discrete systems are really transparent transport media; none of them incorporates or is based on an underlying method for encoding and decoding spatial information. All of the matrix systems discussed up to now put serious constraints on the creative process and, indeed, were a part of that process. Discrete systems have no such limitations. In fact, recording engineers have had to learn new techniques, and need new production tools, to re-create some of the illusions with which we have become familiar in the matrix systems. In short, we have entered a new realm of multichannel entertainment, wherein what we hear will be almost entirely the result of individual creative artistry in the recording process and its interaction with the particulars of the playback systems. And since there are no standards whatsoever, we can expect considerable variety in the results, including some examples of extremely bad taste. Be prepared.
As multichannel transport media, however, these systems are potentially wonderful. They can store audio data encoded in forms designed to entertain large audiences (such as conventional film soundtracks) or audio data intended to reconstruct a three-dimensional sound field (such as the elaborate forms of Ambisonics) or for formats yet to be invented. They represent a freedom that we have never had before.
All of these systems are scalable; that is, they can be designed to fit into different channel or storage capacities. There are two ways to achieve this, and both are used.
Lossless data compression makes use of redundancy and signal variability to fit information into less storage space and then recover it, perfectly, during playback. Perceptual encoding, on the other hand, achieves data reduction by taking advantage of both simultaneous and temporal masking in our hearing systems. It is well known that loud sounds prevent us from hearing weaker sounds. If we know the rules governing this phenomenon, we can simply eliminate (or at least encode more simply) those small sounds that are normally masked. Either way, we can attempt to store the same perceived sound in less space. The more aggressive the data reduction, the more likely that listeners will be aware that the signal has been modified, that something has been edited out.
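The difference between the two approaches can be shown in a few lines of Python. This is only a sketch under stated assumptions: it uses the general-purpose `zlib` compressor rather than any audio codec, the test tone is synthetic, and the "lossy" step simply discards low-order bits. That step uses no masking model at all, so it is a crude stand-in that demonstrates irreversibility, not a perceptual coder.

```python
import math
import struct
import zlib

def sine_samples(freq_hz=440.0, rate_hz=8000, count=8000, peak=12000):
    """Synthetic 16-bit test tone (assumed parameters, for illustration only)."""
    return [int(round(peak * math.sin(2.0 * math.pi * freq_hz * i / rate_hz)))
            for i in range(count)]

def to_bytes(samples):
    """Pack samples as little-endian signed 16-bit PCM."""
    return struct.pack("<%dh" % len(samples), *samples)

samples = sine_samples()
raw = to_bytes(samples)

# Lossless: fewer bytes stored, and the original is recovered exactly.
packed = zlib.compress(raw, 9)
restored = zlib.decompress(packed)
print("raw bytes:", len(raw), "compressed bytes:", len(packed))
print("bit-exact after decompression:", restored == raw)

# Crude lossy stand-in: drop the 6 lowest bits of every sample.
# Information is discarded, so the original cannot be rebuilt from this.
coarse = [(s >> 6) << 6 for s in samples]
worst_error = max(abs(a - b) for a, b in zip(samples, coarse))
print("samples changed:", coarse != samples, "largest error:", worst_error)
```

A real perceptual coder decides what to discard from a model of masking, which is why its data reduction can be far less audible than blunt truncation of this kind.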
High-end paranoia would have it that perceptual coding is intrinsically flawed. But having participated in comparative listening tests of Dolby Digital, DTS, and MPEG-2, I can state categorically that among those systems, at least, the differences are not obvious. Even in the fairly aggressively data-reduced material I’ve heard, audible effects were quite infrequent and limited to certain kinds of sounds only. And the effects were not always describable as better or worse; sometimes they could be identified only as different. Naturally, it is possible to go too far, and in the most extreme examples of data reduction, things start sounding pretty bad. Needless to say, there is no reason to encumber our audio futures with systems that are annoying to listen to. However, I was frankly amazed at just how durable our auditory processes are and concluded that perceptual coding, if applied in moderation, is not a fatal flaw; in fact, it may not be detected at all.
In retrospect, perhaps one should not be totally surprised by this. After all, we have lived for many years with vinyl LP records that performed “data expansion,” adding information to the music in the form of crosstalk, noise, and distortions of every imaginable kind. It is mainly because of those very same masking phenomena exploited in data reduction that those distortions were perceptually attenuated and we were able to derive a great deal of pleasure from our LP records.
Fortunately, in the digital domain, all things tend to become possible at lower prices and higher speeds. With the end of this trend not yet in sight, it may be that the need for data reduction in critical applications will simply disappear eventually.
Too Much of a Good Thing?
Those of us who remember the quadraphonics debacle get a little queasy when we see what is going on presently. Could this wonderful progression to digital, discrete multichannel sound be stalled or stymied by a lack of agreement? Possibly, is one answer. No, is the one I prefer to believe. The reason is that now we are operating in the digital domain, and things are fundamentally different.
Personal computers have become general-purpose platforms on which we can run many programs: word processors, games, and so on. The day is fast coming when audio playback devices can have that kind of flexibility. It is entirely feasible to have the playback device read a code at the head of a program and configure itself to do the appropriate kind of decoding. We are not there yet, but the technology is available, and many of us believe that is the way things can, and should, go.
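As a rough illustration of a player that reads a code at the head of a program and configures itself, here is a minimal Python sketch. Everything in it is hypothetical: the four-byte codes, the `FORMAT_TABLE` entries, and the `configure_player` function are invented for this example and do not correspond to any actual disc or broadcast format.

```python
from collections import namedtuple

PlayerConfig = namedtuple("PlayerConfig", "decoder channels")

# Hypothetical four-byte format codes, invented for this sketch only.
FORMAT_TABLE = {
    b"STER": PlayerConfig("two-channel stereo", 2),
    b"MTRX": PlayerConfig("two-channel matrix surround", 2),
    b"AMBI": PlayerConfig("four-channel Ambisonics", 4),
    b"D5.1": PlayerConfig("5.1-channel discrete", 6),
}


def configure_player(program):
    """Read the code at the head of a program and pick the matching decoder."""
    code, payload = program[:4], program[4:]
    if code not in FORMAT_TABLE:
        raise ValueError("unrecognized format code: %r" % code)
    return FORMAT_TABLE[code], payload


# A made-up program: the format code followed by 16 bytes of silent payload.
config, audio = configure_player(b"AMBI" + bytes(16))
print(config.decoder, "using", config.channels, "channels")
```

The design choice worth noting is that the storage medium itself stays neutral: supporting a new format means adding a table entry and a decoder, not changing the medium.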
Digital, discrete multichannel storage capability should not carry with it any restriction as to the kind of signals that are stored. In a two-channel matrix system, that was not the case; the encoding was part of the storage process. Now it should be possible to envisage a six- or (pick any number) channel system that could store three two-channel programs (stereo, binaural, or Dolby Surround, for example), or a four-channel version of Ambisonics and a two-channel program, or a 5.1-channel discrete program, or...
Suffice it to say that, because technology is changing, it is now not so necessary to establish hard universal standards. We could have several formats, each optimized for different applications, ranging from uncompromised professional and high-end audio formats to those that have been adjusted in various ways to fit the cost and bandwidth limitations of portable, broadcast, or network distribution media. In the short term, there will likely be some angst, but in the long term, it is my sense that these are technical problems that will find appropriate and affordable solutions. Place your bets now.
= = = = The Ambisonics Alternative = = = =
There are two parts to the Ambisonics premise. The first is that, with the appropriate design of microphone, it would be possible to capture (record) the three-dimensional sound field existing at a point. The second part is that, with the appropriate electronic processing, it should be possible to reconstruct a facsimile of that sound field at a specified point within a square or circular arrangement of four or more speakers. Therefore, this system distinguishes itself from all others in that it is based on a specific encode/decode rationale.
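The encode/decode idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the commercial system's processing: the encoder uses the commonly published first-order B-format equations (with the W channel scaled by 1/√2), and the decoder is the simplest possible one, equivalent to aiming a virtual cardioid microphone at each speaker. Practical Ambisonic decoders are more refined. The function names and the square speaker layout are chosen for illustration.

```python
import math


def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one sample from a given direction into W, X, Y, Z signals.

    Azimuth is measured counterclockwise from straight ahead.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2)                   # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)    # front-back component
    y = sample * math.sin(az) * math.cos(el)    # left-right component
    z = sample * math.sin(el)                   # up-down component
    return w, x, y, z


def decode_horizontal(w, x, y, speaker_azimuths_deg):
    """Basic horizontal decode: one virtual cardioid aimed at each speaker."""
    feeds = []
    for deg in speaker_azimuths_deg:
        a = math.radians(deg)
        feeds.append(0.5 * (math.sqrt(2) * w + x * math.cos(a) + y * math.sin(a)))
    return feeds


# Square layout: front-left, rear-left, rear-right, front-right.
SQUARE = (45.0, 135.0, 225.0, 315.0)

w, x, y, z = encode_b_format(1.0, 45.0)      # source in the front-left direction
feeds = decode_horizontal(w, x, y, SQUARE)
print([round(f, 3) for f in feeds])
```

For a source placed in the direction of the front-left speaker, that speaker receives the full signal, its two neighbors roughly half, and the diagonally opposite speaker essentially nothing. The sum at the center of the array points back toward the source, which is also why the result is tied to a single listening position.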
Several names are associated with the technology. Duane Cooper first patented the basic idea for this form of surround sound. Patents were also granted to Peter Fellgett and Michael Gerzon, who were working simultaneously and independently in England. Peter Craven contributed to the microphone design, and aided by some government sponsorship, the United Kingdom group commercialized the Ambisonics recording and reproduction system.
Ambisonics is an enticing idea, and the spatial algebra tells us that it should work. And it does, up to a point. Ambisonics has enthusiastic supporters, but it remains a niche player in surround sound. Most people know little or nothing about it, although there are some Ambisonics-encoded recordings.
The scarcity of playback decoders is a clear problem. However, there are other considerations that may be significant. Ambisonics requires special recordings and playback apparatus. It is incompatible with other multichannel systems (although it need not be). And it ends up entertaining a single listener. Mind you, that listener can be well entertained.
I have heard the system several times in different places (including a precise setup in an anechoic chamber), and I will admit that with large, spacious classical works it creates an attractively enveloping illusion for a listener with the discipline to find and stay in the small sweet spot. It tolerates a certain amount of moving around, but leaning too far forward results in a front bias, leaning too far backward creates a rear bias, leaning too far left—well, you get the idea. Big, spacious reverberant recordings are more tolerant of listener movement, of course. All of this should be no surprise for a system in which the mathematical solution applies only at a point in space, and then only if the setup is absolutely precise in its geometry and the speakers are closely matched in both their amplitude and phase responses.
In fairness, there are numerous ways to encode and store the Ambisonics signals and other ways to process the signals into forms suitable for reproduction from different numbers of speakers in different setups. All of these I have not heard. Ambisonics may yet play a role in our audio lives. Certainly having multiple, discrete digital channels within which to store data can only be an advantage for it. As it has been demonstrated, however, there seems to be a lot of paraphernalia for just one listener.
= = = =
1. “Improvements in and Relating to Sound-Transmission, Sound-Recording and Sound-Reproducing Systems,” British Patent No. 394 325, granted to Alan Blumlein and EMI, 1933; reprinted in Journal of the Audio Engineering Society, April 1958 (Vol. 6, No. 2).
2. Steinberg, J. C. and W. B. Snow, “Auditory Perspective—Physical Factors,” Electrical Engineering, January 1934 (pp. 12-17).
3. Snow, W. B., “Basic Principles of Stereophonic Sound,” IRE Transactions—Audio, March/April 1955 (Vol. AU-3, pp. 42-53).
4. Hafler, David, “A New Quadraphonic System,” Audio, July 1970.
5. Carver, Robert W., “Sonic Holography,” Audio, March 1982.
6. Nakayama, T., T. Miura, O. Kosaka, M. Okamoto, and T. Shiga, “Subjective Assessment of Multichannel Reproduction,” JAES, October 1971 (Vol. 19, No. 9, pp. 744-751).
7. Cooper, D. H. and T. Shiga, “Discrete Matrix Multichannel Stereo,” JAES, June 1972 (Vol. 20, No. 5, pp. 346-360).
8. Gerzon, M., “Ambisonics in Multi-channel Broadcasting and Video,” AES Preprint No. 2034, 74th Convention, October 1983.
9. Web sites with comprehensive bibliographies and other useful information on Ambisonics and related subjects.