Fostle Fuels Fire (from AUDIO magazine, LETTERS, JULY 1996)
A fundamental flaw underlies both Michael Riggs’s comments in “Fast Fore Word” and D.W. Fostle’s analysis in “Digital Deliverance” in the April [1996, Audio magazine] issue. That flaw is a total lack of any kind of direct A/B/C comparison of the audible accuracy of HDCD A/D and D/A conversion versus that of any other process of A/D and D/A conversion—using a high-quality analog input source, such as a mike feed or first-generation analog master tape, as a reference. Unless the analog input source is available as a reference, declaring that one digital recording sounds better than another is pointless, as it may simply have colorations preferred by the listener.
The science and art of audio engineering deal with sound, and the only way sound can ultimately be judged (or heard!) is by listening. This is not to say subjectivism is the sole means to advance the state of audio technology; an enormous amount of technical effort must be spent in identifying and quantifying mechanisms of distortion before they can be corrected. That type of effort was a constant during the almost 10 years it took to develop the HDCD process. However, even if a device or system measures extremely well, it can still audibly alter the input source because of distortion mechanisms not quantified by the tests employed. Thus, the final criterion of quality in an audio reproduction system must be how much it alters, by subtraction or addition, the signal fed into it, as determined by controlled listening comparisons.
Some of the greatest appreciation of the accuracy of the HDCD process has come from such top professional recording engineers as Bob Ludwig of Gateway Mastering and Denny Purcell of Georgetown Masters. These engineers have built careers on the acuity of their hearing. Unlike almost all reviewers and writers in the audio press, they make direct A/B comparisons—analog source versus digital output—on a daily basis. They know the limitations of today’s digital audio technology because they are intimately familiar with the analog source.
Purcell has said that “HDCD sounds closer to the analog source even when played back undecoded, period.” Ludwig has stated that HDCD conversion is the most accurate he has heard and that it has “greater harmonic integrity than any other digital recording system.”
It is important to note that measurements, while seeming to be completely “objective,” are sometimes nothing of the sort. Of equal importance to what is measured is what is not measured. It is also essential that measured results be correctly understood within a proper context. In evaluating the HDCD process, Fostle unfortunately reaches many erroneous conclusions, resulting from inadequate measurements and errors in analysis and interpretation.
One of the confused conclusions Fostle draws about the HDCD process is that when HDCD recordings are played back undecoded, they sound “wetter,” with longer reverberation tails than standard 16-bit digital recordings, and therefore less accurate. The unstated and incorrect assumption underlying this conclusion is that standard 16-bit digital playback is accurate in its portrayal of low-level information and thus can serve as a reference. Recording industry professionals know from long experience that 16-bit digital recordings lose low-level timbral and ambient information compared to the analog source. If, at the option of the recording engineer, an HDCD recording is made using low-level range extension, then that recording played back undecoded will have more low-level information than a standard 16-bit recording. However, when an HDCD recording is played on the 95% of today’s CD players that lose low-level resolution, its additional low-level information will yield sound closer to that of the analog source. Decoded HDCD playback, which uses D/A converters whose resolution is greater than 16 bits, is even more accurate and, with the best HDCD playback equipment, is nearly indistinguishable from the analog source.
Many of Fostle’s other observations regarding the HDCD process are extremely misleading because of his lack of scientific method. Without any kind of analog source as a reference, he concludes that HDCD D/A converters have “an inclination toward the mellow.” His stated reference is an Apogee DA-1000 D/A converter that can be demonstrated by A/B comparison to have a “glisten” or “edge” not present in the analog source, even if an HDCD processor is used for A/D conversion.
Fostle further states that HDCD has a “signature” sound that is not accurate, implying that it has been tailored to sound rich and distant, in supposed imitation of an “analog” sound. As evidence for this conclusion, he mentions a comparison of standard Sony 1630 versus HDCD transfers of the same analog recording of “Moonglow,” claiming that the HDCD version violates convention by presenting the trumpet far to the rear of the other horns while the standard version does not. When Keith Johnson, the recording engineer for the “Moonglow” session, was shown this comment, he was flabbergasted. “Convention” or no, the trumpet was recorded fully 15 feet from the microphones and well to the rear of the other horns. The HDCD recording simply preserves that spatial information; the standard recording does not.
Fostle also chose to present the spectrum of the noise floor at the beginning of an HDCD track (decoding not specified) on a Reference Recordings sampler versus that of a dithered 20-bit A/D converter with no input signal. What is the point? As Fostle well knows, the graph of the HDCD noise floor only shows the noise contribution of multiple microphones in a live acoustic space; it tells nothing about the noise level of the HDCD A/D converter. Despite Fostle’s later disclaimers, the inclusion of this graph is misleading at best.
Another disturbing aspect of Fostle’s analysis is his seeming confusion of analog noise level, which he expresses in equivalent numbers of bits, with resolution. It is well known that an analog signal can contain audible information many decibels below its noise floor. If that signal undergoes A/D conversion through a system that has sufficient resolution, then the information below the noise floor will be preserved and can be reproduced.
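The claim that information below the noise floor survives a sufficiently well-behaved conversion can be demonstrated numerically. The following is a minimal NumPy sketch, not from the letter: the -110 dBFS tone, the TPDF dither, and every parameter are illustrative choices. A tone far below the 16-bit noise floor vanishes under plain rounding but survives dithered quantization and is recoverable by spectral analysis:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1 << 16
k = 997  # integer number of cycles, so the tone lands on an exact FFT bin

# A tone at -110 dBFS: well below the dithered 16-bit noise floor
# (~ -96 dBFS total), and only ~0.1 LSB in amplitude.
tone = 10 ** (-110 / 20) * np.sin(2 * np.pi * k * np.arange(n) / n)

lsb = 2.0 / (2 ** 16)                          # 16-bit step, full scale = +/-1
tpdf = (rng.random(n) - rng.random(n)) * lsb   # triangular dither, +/-1 LSB

plain = np.round(tone / lsb) * lsb             # undithered: every sample rounds to zero
dith = np.round((tone + tpdf) / lsb) * lsb     # dithered: the tone is encoded in the noise

def bin_db(x):
    """Level, in dBFS, of the tone's FFT bin."""
    return 20 * np.log10(np.abs(np.fft.rfft(x)[k]) * 2 / n + 1e-30)

print("undithered:", bin_db(plain))   # tone gone entirely
print("dithered:  ", bin_db(dith))    # ~ -110 dBFS, recovered from under the noise
```

A long FFT acts here the way averaging does in analog measurement: it separates the coherent sub-noise tone from the wideband dither noise.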
Although a complete discussion of Fostle’s conclusions isn’t possible here, one further observation needs to be made. To adequately measure the performance of a Model One HDCD processor, Pacific Microsonics uses over $100,000 worth of test equipment, including the best available 24-bit digital generator/FFTs, digital storage oscilloscopes, and spectrum analyzers. Custom-built signal-interface devices are used, and elaborate test procedures resulting from years of research are precisely followed. Unless a similar level of sophistication and thoroughness is applied to measuring the performance of a digital audio system, little-known but audibly critical distortions, such as complex signal intermodulation products, will be overlooked. This could result in mistaken conclusions that do not correlate with listening tests.
Pacific Microsonics has submitted the précis of a paper on the HDCD process to the Audio Engineering Society in time to be presented at its November 1996 convention. That paper should effectively answer any remaining technical questions concerning the HDCD process.
In the meantime, both Riggs and Fostle are cordially invited to attend controlled, blind, A/B/C listening tests of the Model One HDCD processor compared to any other A/D converters, D/A converters, or processors, with all digital outputs compared to the analog input source. This will demonstrate the unprecedented sonic accuracy of the HDCD process.
Michael D. Ritter; President, Pacific Microsonics; Berkeley, Calif.
Mr. Ritter and I talked after my article in the April issue was published. One salient topic was the measured treble rolloff on an HDCD demonstration disc (Reference Recordings RR-905CD). As reported in the article, the two HDCD cuts showed treble deficits of as much as 3 dB at 9 kHz relative to the non-HDCD versions. He did not contest that fact. Months earlier, at the October 1995 Audio Engineering Society Convention, Ritter, with Keith Johnson present, played for me those same two HDCD cuts and asked if I heard any differences in comparison to the conventional masters. I said I did. These differences, he assured me, were the result of the HDCD process. But after the article was published, Ritter told me he didn’t know about the rolloffs, nor did he know how they occurred. This would seem to imply either that he doesn’t hear the loss in treble energy or, alternatively, that it is an intended effect of the HDCD process. Now Ritter writes that the HDCD version of “Moonglow” is superior in its presentation, to the point that Keith Johnson was “flabbergasted” by my comments. Yet it’s one of the cuts that exhibits the high-frequency rolloff. Treble energy is a key determinant of sonic perspective and spatiality. Go figure.
Ritter also claims that HDCD reduces distortion. But when asked what kind of distortion and by how much, he refused to answer. When I first spoke to him in July 1993, he was promising a technical paper on HDCD. He is still promising, three years later. While claiming extensive measurement capabilities, Pacific Microsonics releases no meaningful technical data about HDCD. And in criticizing my technical analysis of the process, Ritter misconstrues one of the graphs. Figure 5, on page 30 of the April issue, shows noise spectra of an HDCD cut on the Reference Recordings sampler and of the output of a Meridian 618 processor in its “flat dither” mode, fed by a Lexicon 20/20 A/D converter with no input. Thus, the lower curve in the graph represents the noise spectrum of the output of a good 20-bit A/D converter dithered down to 16 bits in conventional fashion, which raises it a few dB above the theoretical minimum noise for 16-bit PCM. The noise in the HDCD recording is a minimum of 15 dB higher across the entire audio band and thus sets the fundamental resolution limit. (As pointed out in both the April article and, more extensively, my article on noise-shaping in the March issue, such a discrepancy is very much the rule in commercial recordings of all kinds, not an exception.) No claim is made that this represents the noise floor of the HDCD converter, however, and I think that’s quite clear in the article. The noise performance of the process itself is addressed more directly by Figs. 14 and 15 on page 34 of the April issue.
As before, I encourage readers to order the TestMasters CD produced in cooperation with Audio. For less than 10 bucks, you can compare HDCD to other processes and make up your own mind about them. What you’ll hear is what was there. No celebrity endorsements, though. —D.W. Fostle
Is Fostle a Bit Off?
D. W. Fostle’s two-part series (“19 Bits in a 16-Bit Sack,” AUDIO magazine, March 1996; “Digital Deliverance,” April 1996) was informative and presented some important points of view. Permit me to present the other side of one coin that Mr. Fostle chose not to flip.
Regarding HDCD, I agree with Fostle that its developers have been less than forthcoming about how the system works, and I also agree that HDCD’s improvements generally have been exaggerated by the high-end press. Reportedly, Pacific Microsonics will be delivering a much-needed white paper to clear up the confusion at an upcoming AES convention.
However, I feel that Fostle’s comments on the “warm” sound of the HDCD decoder chip reveal some of his own listening biases and that he did not look deep enough into the issue. He said, “Whereas the NPC filter had a certain ‘glisten’ or ‘edge’ when presenting choral voices and strings, this effect was absent when the Pacific Microsonics chip was installed.” The jury is definitely out on which of these two chips presents an accurate tonal balance, but I tend to side with the PM chip at this time: Its internal DSP design implements a superior oversampling filter, and its warmer sonic character cannot be attributed to frequency response errors or lack of monotonicity when reproducing standard (non-HDCD) material. The chip measures very well when tested with the industry-standard CBS CD-1 disc. The PM chip uses far more coefficients and longer internal word length in its calculations than the NPC, and it has lower clock jitter (which also contributes to a warmer sound). Additionally, the NPC uses a deleterious type of noise-shaping (without dither) in an attempt to reduce its long internal word length before feeding the D/A converter.
My listening tests have revealed time and again that, all other things being equal, digital recordings using longer word lengths sound warmer (as well as more spacious, dynamic, ambient, and natural) than those using shorter words. This is why we engineers universally prefer the sound of our 20- and 24-bit masters to their 16-bit derivatives. This, despite Fostle’s assertion that very few venues have a low enough noise level to “warrant” 20-bit recording. What he neglected to take into account was the well-known principle of noise masking (and unmasking). Even an analog tape sounds better when converted through a 20-bit system; you can still hear ambience and decay 10 dB, to perhaps 30 dB, below the noise level of a good analog tape!
Digital-audio mastering engineers have the opportunity to perform some unusual experiments and demonstrate sonic differences that Fostle may not have had the opportunity to experience. I can hear the superiority of calculating to 24-bit word length versus 20-bit word length every day that I master. Everyone “knows” that since the least significant bit of a 20-bit word is 120 dB down (relative to 0 dBFS), 20 bits should be enough for us. And since the least significant bit of a 24-bit word is 144 dB down, it must be far below the threshold of importance. But you can prove the importance of those four bits by the following simple test, which can be performed in a high-end digital audio workstation: Take a 20-bit (or 16-bit) digital audio signal (music), and turn it into a 24-bit signal by dropping or raising the gain an insignificant amount, perhaps 0.1 dB. (This calculation generates infinite word length, but round the result to 24 bits.) Next, re-dither the 24-bit result with two different processes. First, use a good 24-to-16-bit dithering process on the music (e.g., UV-22 or Meridian). Then try redithering while ignoring the lower four bits—i.e., perform a 20-to-16 process.
The redithering process that accounts for all 24 bits always sounds warmer than a 20-to-16 process. In other words, truncating the lower four bits of a 24-bit word creates subtle audible granulation and loss of ambience; there is meaningful information in those lower four bits. When you remove that information, you hear a kind of additional bite, “glisten,” or “edge” to the sound, very similar to the differences Fostle heard between the two filters. Therefore, based on my knowledge of how they handle word lengths in their internal calculations, I have to conclude that the more accurate digital filter is probably the one that sounds warmer.
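The core of Katz's claim, namely that discarding low bits without dither yields signal-correlated error ("granulation") while proper dithering yields benign noise, can be illustrated in a simplified NumPy sketch. A low-level test tone and a 20-bit quantizer stand in for his 24-bit music example; all parameters are illustrative, not from the letter:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1 << 16
k = 931  # integer number of cycles, so harmonics land on exact FFT bins
sig = 10 ** (-90 / 20) * np.sin(2 * np.pi * k * np.arange(n) / n)  # -90 dBFS tone

q20 = 2.0 / (2 ** 20)                           # 20-bit step, full scale = +/-1
truncated = np.floor(sig / q20) * q20           # throw the lower bits away
tpdf = (rng.random(n) - rng.random(n)) * q20
dithered = np.round((sig + tpdf) / q20) * q20   # account for them with dither

def err_spectrum_db(x):
    """Spectrum of the error relative to the high-resolution signal, in dB."""
    return 20 * np.log10(np.abs(np.fft.rfft(x - sig)) * 2 / n + 1e-30)

err_t = err_spectrum_db(truncated)
err_d = err_spectrum_db(dithered)

# Truncation error piles up at harmonics of the tone; dither error is a
# featureless noise floor with no harmonic structure.
harmonics = [m * k for m in range(2, 20)]
print("truncation, worst harmonic:", max(err_t[harmonics]))
print("dither, typical bin:       ", np.median(err_d))
```

The dithered path trades a slightly higher noise floor for the complete absence of signal-correlated distortion, which is the standard engineering argument for dithering every word-length reduction.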
It would not be difficult to create an experiment that conclusively proves which filter is the more musically accurate. (I would be happy to participate in this experiment and provide any necessary digital audio equipment.) I suggest taking a 24-bit digital audio signal, attenuating it 40 dB or so in the digital domain, and feeding it to a Mark Levinson D/A converter (which can be fitted with either chip). By amplifying the result (in the analog domain), you’ll be able to tell which chip reveals more ambience and decay and has less quantization noise (distortion).
The winner of this listening test should be quite clear. If the winner is the Pacific Microsonics chip, then I suggest Fostle preferred the sound of the NPC chip because it has an artificial “bite” rather than the demonstrably more “natural” sound (improved ambience, space, and so on) of the PM chip.
Bob Katz; Recording and Mastering Engineer, Digital Domain; New York, N.Y.
-- -- --
My comments about the sound of the Pacific Microsonics PMD-100 filter were not, as Mr. Katz seems to think, judgmental. I described what was heard, while twice suggesting that readers audition the devices for themselves. On the other hand, he seems to assume that the “glisten” was not part of the recording and thus that the PMD-100 presentation was correct, even though he wasn’t there.
He also seems to misunderstand what I’ve said about 20-bit recording. I never wrote that wide-word recordings do not sound better than 16-bit recordings. I did point out that processes for converting 20-bit masters to the 16-bit CD medium are severely limited by the noise of real-world systems and that nothing approaching 20-bit noise performance is delivered into our homes.
Through most of his letter, Katz bases his points on his own extrapolations. It’s important to note, however, that I can find no study affirming his statement that listeners can hear “10 dB, to perhaps 30 dB” into the noise. In fact, standard work on masking suggests this is false, as does Demonstration 2 on the Acoustical Society of America’s Auditory Demonstrations CD (Philips 1126-061). Even if it were true, practical use of any ability to hear 30 dB into the noise would require listening with peak levels above the threshold of pain. And that’s not even accounting for real-world hearing thresholds. One large study found a threshold rise of 19.2 dB SPL at 4 kHz between the ages of 20 and 40 for American men. By age 50, the mean threshold had climbed another 9 dB, to an absolute level of 33.3 dB SPL. If you are an average 50-year-old man, and assuming Katz is correct, hearing 30 dB into the noise requires listening with the noise at 63 dB SPL or so. Put a 15-bit (90-dB) S/N on top of that, and the peak level turns out to be 153 dB SPL. A 40-year-old needs only 144 dB SPL. Hmmmm.—D.W. F.
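Fostle's arithmetic here is easy to check. A small sketch using only the figures quoted in his reply:

```python
# All figures come from the reply above: the age-50 mean hearing threshold
# at 4 kHz, Katz's upper claim for hearing into the noise, and a 15-bit S/N.
threshold_age_50 = 33.3      # dB SPL
hear_into_noise = 30.0       # dB
snr_15_bit = 90.0            # dB

# The noise itself must sit 30 dB above the listener's threshold to be
# "heard into" by that margin; the music's peaks sit the S/N above the noise.
noise_level = threshold_age_50 + hear_into_noise
peak_level = noise_level + snr_15_bit

print(noise_level, peak_level)   # ~63 dB SPL noise, ~153 dB SPL peaks
```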
The concept of “listening into the noise” is a little bit tricky and, I think, often misunderstood. Essentially, it refers to our ability to hear tones or other narrow-band sounds at levels below the aggregate level of a wideband noise source. To take a simplified example, if you were to play white noise with a total level of 50 dB SPL over the audio band, you might be able to hear, say, a 500-Hz tone at 40 dB SPL, which is 10 dB below the level of the noise. The reason, however, is that the level of the noise energy near 500 Hz would not only be lower than that of the total noise energy but also lower than the level of the tone, so the noise wouldn’t completely mask the tone. Masking is a narrow-band phenomenon.
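The arithmetic behind this example can be sketched. The 110-Hz critical bandwidth at 500 Hz is an assumed textbook figure, not from the column; the noise and tone levels are the ones quoted above:

```python
import math

total_noise = 50.0              # dB SPL, white noise over the whole audio band
bandwidth = 20_000.0 - 20.0     # audio band, Hz
critical_band = 110.0           # approx. critical bandwidth at 500 Hz (assumed)

# Spectrum level: noise power per 1-Hz slice of a flat spectrum.
spectrum_level = total_noise - 10 * math.log10(bandwidth)        # ~7 dB SPL/Hz
# Noise the ear actually integrates when detecting a 500-Hz tone.
in_band_noise = spectrum_level + 10 * math.log10(critical_band)  # ~27 dB SPL

tone = 40.0
print(round(spectrum_level, 1), round(in_band_noise, 1))
print(tone > in_band_noise)   # the tone clears the local noise despite being
                              # 10 dB below the 50 dB SPL total
```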
Let’s translate that more specifically into the realm of digital audio. The aggregate noise of conventional 16-bit PCM is at about -96 dB relative to full scale (0 dBFS). Performing a spectral analysis of that noise with a high-resolution FFT will yield an essentially flat spectrum across the audio band at about -130 dB. Nothing has changed; it’s the sum of the noise across all those frequencies that comes to the routinely cited -96 dB. Could you hear a tone at -100 dB? Possibly, if you have good hearing, and very likely if you were to amplify both the noise and the tone substantially to get them well above the basic threshold of hearing. Could you hear a tone at -131 dB? No, because that would be below the spectral noise floor.
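The relationship between the aggregate figure and the per-bin FFT figure can be verified numerically. A minimal NumPy sketch, with the caveat that the 4096-point FFT length is an assumption of mine: the per-bin level shifts with FFT size, which is why "about -130 dB" is tied to a particular analysis resolution:

```python
import numpy as np

rng = np.random.default_rng(2)
lsb = 2.0 / (2 ** 16)   # 16-bit step, full scale = +/-1 (0 dBFS)

# Model of properly dithered 16-bit PCM noise: TPDF-dithered quantization
# of digital silence.
d = (rng.random(1 << 20) - rng.random(1 << 20)) * lsb
noise = np.round(d / lsb) * lsb

total_db = 20 * np.log10(np.std(noise))   # aggregate level across the band
print(round(total_db, 1))                 # ~ -96 dBFS

# The same noise viewed per bin of an averaged 4096-point FFT sits far lower:
# the total power is divided among nfft/2 bins, i.e. 10*log10(nfft/2) dB down.
nfft = 4096
segs = noise.reshape(-1, nfft)
power = np.mean(np.abs(np.fft.rfft(segs, axis=1)) ** 2, axis=0) * 2 / nfft ** 2
bin_db = 10 * np.log10(np.median(power))
print(round(bin_db, 1))                   # ~ -96 - 33 = ~ -129 dB per bin
```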
This fact bears on the significance of the elevated noise spectra shown for many recordings in “19 Bits in a 16-Bit Sack” (March) and “Digital Deliverance” (April). No signal that falls below the recording’s spectral noise floor will be audible. Consequently, the noise floor of the recording medium itself will become a factor in signal masking only if it is above the noise floor of the input or approaches it closely enough to raise the combined noise floor by a perceptible amount.
Also, I notice that people sometimes refer to “hearing into the noise” in ways that suggest the phenomenon is peculiar to analog signals. The same thing happens with properly dithered digital signals.