About Nyquist, Shannon, filters, and how stuff sounds
Oct 12, 2014 16:40:04 GMT -5
Post by KeithL on Oct 12, 2014 16:40:04 GMT -5
I've come to the conclusion that most people don't REALLY understand what the theory (the Nyquist-Shannon sampling theorem) MEANS.
I base this conclusion on various quotes like: "According to the theorem, a 44k sample rate should be able to perfectly reproduce anything up to 20 kHz. Since I can hear the differences between a CD and the original, then the theory must be wrong." (I made that specific quote up - based on the gist of what I've heard from several people.)
Unfortunately, that's not exactly what the theory says. What it says is that it's POSSIBLE to perfectly store and reproduce a given signal as long as the sample rate you pick is such that the Nyquist Frequency (half of the sample rate) is above the highest frequency you wish to record. (So, if you want to store all frequencies up to 20 kHz perfectly, then it is POSSIBLE to do so by using any sample rate of more than twice that frequency. This is why CDs use a sample rate of 44.1 kHz, which is slightly more than twice the 20 kHz "limitation of human hearing".)
However, carefully note the capitalized word: POSSIBLE....
The theorem DOES NOT say that, if you use a sample rate with a Nyquist Frequency higher than what you want to record, you are GUARANTEED to get a perfect reproduction of your original. Instead, it lists that as a minimum requirement. So, what it really says is that, if you want to reproduce everything up to 20 kHz, you must AT LEAST use a sample rate of more than twice that; it doesn't preclude that there might be other requirements.
In fact, if you read a bit further, there are indeed a few more requirements:
1) You MUST limit the frequencies you're trying to store to below that Nyquist Frequency. If you have information at frequencies above that, not only won't it be accurately stored, but it will interfere rather badly with the storage of the frequencies you want. In fact, if you let anything above that frequency reach your encoder (ADC) at all, it will be "magically" transformed (this is called aliasing) into noise and distortion down in the audio band where you CAN hear it.
2) When you play that signal back, in order to get what you're supposed to, you MUST remove some extra junk that "appears" as a result of the conversion process. Specifically, you MUST apply filtering to limit the bandwidth of your output signal to the same bandwidth you were allowed to have going in (below the Nyquist Frequency for your sample rate). And, again, any junk above that frequency that you fail to remove 100% will usually come back to haunt you in the form of distortion inside the audio band (and especially annoying non-harmonic distortion at that).
It is these two requirements that cause all the "problems", and that explain why digital recordings, as good as they are, still aren't perfect.
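To put a number on requirement #1, here is a rough Python sketch of what happens to a tone above the Nyquist Frequency. The 30 kHz test tone and the helper names `alias_frequency` and `sample_cosine` are assumptions picked purely for illustration; the point is only that, once sampled at 44.1 kHz, a 30 kHz tone produces the very same samples as a 14.1 kHz tone.

```python
import math

def alias_frequency(f_hz, fs_hz):
    """Frequency (0..fs/2) at which a tone of f_hz shows up after sampling at fs_hz."""
    f = f_hz % fs_hz              # sampling cannot tell f from f + k*fs
    return fs_hz - f if f > fs_hz / 2 else f

def sample_cosine(f_hz, fs_hz, n_samples):
    """Ideal samples of a unit cosine at f_hz, taken at fs_hz."""
    return [math.cos(2 * math.pi * f_hz * n / fs_hz) for n in range(n_samples)]

FS = 44_100  # CD sample rate

ultrasonic = sample_cosine(30_000, FS, 64)                      # above Nyquist
audible = sample_cosine(alias_frequency(30_000, FS), FS, 64)    # its alias
max_difference = max(abs(a - b) for a, b in zip(ultrasonic, audible))

# The alias lands at 14100 Hz; the two sample sets differ only by rounding error.
print(alias_frequency(30_000, FS), max_difference)
```

Once the samples are identical, nothing downstream can tell the two tones apart, which is why the out-of-band content has to be removed BEFORE the ADC rather than after.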
By #1, if you were recording a perfect 12 kHz sine wave on a CD, you'd be all set. A sine wave contains only one frequency, and 12 kHz is safely under 22.05 kHz (the Nyquist Frequency for a CD; half of 44.1 kHz), so you're done. However, unless you've created a perfect sine wave on a computer, nothing much in the real world is that perfect. Most musical instruments produce lots of harmonics, many of them reaching well above 20 kHz. Even things like the room ambiance, the creaking of the chairs, and the background hiss from your master tape, may have some energy above 22.05 kHz. Well, guess what? Unless you want that out-of-band content to cause nasty audible distortion, you have to TOTALLY remove it. Which means that your audio source MUST be passed through a rather powerful filter before you can record it. Since filters tend to have side effects, which tend to be worse as the filter becomes more aggressive, this is your first tradeoff - between how well the filter works and how much damage it does.
Now, by #2, you also have to apply a similar filter to the analog audio after you convert your digital recording back to analog. Please note that this extra junk, which has to be filtered out, is NOT some nasty distortion or error in the recording process; IT IS THE INEVITABLE RESULT OF THE CONVERSION PROCESS. Even in an absolutely theoretically perfect recording, this extra junk will be there in the decoded analog audio, and you will have to filter it out. So, once again, we're faced with that tradeoff between a filter that does a really great job, and any side effects it may also produce. (And, yes, SACDs have this exact issue too - even worse than with PCM. Also note that those "NOS filterless DACs" simply ignore this requirement; they're hoping that the limited frequency response of the rest of your signal chain will act as a de-facto filter, and that the distortions caused by the missing filter will be less annoying than the side effects of HAVING a filter.)
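The "extra junk" from #2 can be seen in a crude model. The sketch below assumes the unfiltered converter output is just each sample followed by zeros on a 4x finer time grid (zero-stuffing); a real DAC holds each sample instead, which lowers the level of the junk somewhat but does not remove it. The helper `tone_level` and the 12 kHz test tone are illustrative choices, not anything taken from a particular product.

```python
import cmath
import math

FS = 44_100          # CD sample rate
UP = 4               # model the "analog" side on a 4x finer time grid
N = 441              # 10 ms of samples, so 12 kHz fits a whole number of cycles

def tone_level(signal, f_hz, rate_hz):
    """Magnitude of the single DFT bin at f_hz, normalized by signal length."""
    acc = sum(x * cmath.exp(-2j * math.pi * f_hz * n / rate_hz)
              for n, x in enumerate(signal))
    return abs(acc) / len(signal)

# A perfectly legal recording: one clean 12 kHz tone.
samples = [math.cos(2 * math.pi * 12_000 * n / FS) for n in range(N)]

# Unfiltered conversion, crudely modeled: sample, then zeros, no output filter.
raw_output = []
for s in samples:
    raw_output.extend([s] + [0.0] * (UP - 1))

# Check the wanted tone, two of its images, and an unrelated frequency.
for f in (12_000, FS - 12_000, FS + 12_000, 20_000):
    print(f, round(tone_level(raw_output, f, FS * UP), 4))
```

In this model the copies at 32.1 kHz and 56.1 kHz come out at the same level as the wanted 12 kHz tone, while 20 kHz shows essentially nothing. Those copies are what the output filter has to remove.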
Both of these tradeoffs were a huge problem in the early days of CDs - because you basically needed a filter that passed everything up to 20 kHz perfectly, yet also removed anything above 22 kHz - also pretty much perfectly. (We're talking about a filter sharper than 96 dB/octave - which isn't at all practical to build - and which tends to have nasty side effects. Poorly designed filters are almost certainly a big part of the "digitalitis" present on many early CDs.)
But what about oversampling....
Oversampling is a neat trick. By using a higher sample rate, you get a higher Nyquist Frequency... which, in turn, lets you use a more gradual filter that takes effect further from the audio frequency range. (Passing everything up to 20 kHz, but nothing above 22 kHz, is rather tricky; passing everything up to 20 kHz, and stopping everything above 40 kHz, is much easier, and can be done with a much simpler filter - which is cheaper to build AND produces fewer audible side effects. That's oversampling in simplest terms.) You can then keep that higher sample rate or, if you want to record the result on a CD, you can use some really tricky math to convert it back down. By doing this you get to keep most of the benefits of your gentler filter, and do a lot of the nasty bits in nice clean math instead (a slight oversimplification there, but philosophically pretty accurate).
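A back-of-envelope calculation shows how much the wider gap helps. The sketch below assumes a target of 96 dB of attenuation (roughly the range of 16-bit audio; that figure is an assumption, not a spec) and simply divides it by the width of the transition band in octaves. Treating the slope as an average over the band is a simplification, and the function names are made up for the example.

```python
import math

def transition_octaves(pass_edge_hz, stop_edge_hz):
    """Width of the gap between passband edge and stopband edge, in octaves."""
    return math.log2(stop_edge_hz / pass_edge_hz)

def required_slope_db_per_octave(attenuation_db, pass_edge_hz, stop_edge_hz):
    """Average slope needed to reach attenuation_db across the transition band."""
    return attenuation_db / transition_octaves(pass_edge_hz, stop_edge_hz)

ATTENUATION_DB = 96  # assumed target

# Pass 20 kHz, stop by 22.05 kHz (plain CD rate).
cd_rate = required_slope_db_per_octave(ATTENUATION_DB, 20_000, 22_050)
# Pass 20 kHz, stop by 40 kHz (the oversampled case described above).
oversampled = required_slope_db_per_octave(ATTENUATION_DB, 20_000, 40_000)

print(round(cd_rate), round(oversampled))
```

Under those assumptions the CD-rate filter needs an average slope of roughly 680 dB/octave, while the 20 kHz to 40 kHz version needs 96 dB/octave, about seven times gentler.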
And now, digital filters....
Once you get to tricky math-type stuff like oversampling, you're pretty well stuck using digital filters; it's simply not practical to do this stuff in the analog domain. Virtually all digital filters, as part of how they work, do some "time shifting". They take some of the original signal, shift it by a certain amount of time, and add a certain amount of it back into the original; it is by carefully calculating the shift times and amounts of signal from each step that is added back that a digital filter works. The end result, when you're done adding up all the additions and cancellations, is that you end up with what you wanted. There is another variation, where some of the output signal is fed back to the input in calculated quantities, but it has the same flaw - that the results do not happen instantaneously.
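The two flavors described above can be sketched in a few lines of Python. This is only a toy, with tap values and a feedback amount chosen for readability rather than for any useful frequency response: `fir_filter` is the "shift, scale, and add" kind, and `one_pole_iir` is the feedback variation.

```python
def fir_filter(signal, taps):
    """Each output is a weighted sum of the current and earlier input samples."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(taps):
            if n - k >= 0:
                acc += weight * signal[n - k]   # input delayed by k samples
        out.append(acc)
    return out

def one_pole_iir(signal, feedback):
    """Feedback variant: part of the previous output is mixed into the new input."""
    out, previous = [], 0.0
    for x in signal:
        previous = (1.0 - feedback) * x + feedback * previous
        out.append(previous)
    return out

# Feed both a single-sample impulse and watch the output spread out in time.
impulse = [1.0] + [0.0] * 7
print(fir_filter(impulse, [0.25, 0.5, 0.25]))
print([round(y, 4) for y in one_pole_iir(impulse, 0.5)])
```

In both cases a single-sample input comes out smeared over several samples, which is the "results do not happen instantaneously" behavior in its simplest form.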
When you see those little pictures showing the transient response of the filter in a DAC - the ones with one big spike and little ripples fading off to the left and right - the ripples are the effect of doing things this way. The signal in the center is really very very close to right, then, a little further away, it goes a bit too low, then, a little further away, it goes a little too high - but less so than how far it went too low - in ever decreasing "ripples". Those ripples are almost entirely errors between what the signal really should be and what it is. At this point, I should clarify a few things:
1) In a properly designed filter, those ripples should be pretty small, and should be so close in time to the main signal that you don't hear them - much.
2) The ripples are part of the accurate signal. If you were to add up the main spike and all of the ripples, both negative and positive, you would end up with a TOTAL that was very close to the perfect result. (Adding the ripples will make it more precise rather than less, because they're sort of corrections to the original signal, just not at exactly the right times. You can't simply "cut them off" without creating even worse errors.)
3) That whole picture thing is very misleading. Seeing how a DAC will reproduce a perfect square wave or impulse is INVALID. Since you MUST limit the bandwidth of any signal you record to below the Nyquist Frequency, your real digital signal can ONLY contain BANDWIDTH LIMITED transients. In other words, your DAC CANNOT ever be asked to play the sort of signal they used as a test signal to make that picture - because it wouldn't be a valid signal to record to begin with. The transient pulse used for that test is a test signal - it provides useful information for the designers - but it can't exist on a real recording.
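For anyone who wants to see where the spike-and-ripples picture comes from, here is a sketch of a textbook windowed-sinc low-pass filter. The design (31 taps, Hamming window, cutoff at 0.23 of the sample rate) is an assumption chosen for illustration and is not meant to represent the filter in any particular DAC.

```python
import math

def windowed_sinc_lowpass(num_taps, cutoff):
    """Symmetric low-pass taps; cutoff is a fraction of the sample rate (0..0.5)."""
    center = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - center
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming
        taps.append(ideal * window)
    return taps

taps = windowed_sinc_lowpass(num_taps=31, cutoff=0.23)

peak_index = max(range(len(taps)), key=lambda i: taps[i])   # the big spike
deepest_ripple = min(taps)                                  # ripples dip below zero
total = sum(taps)                                           # spike plus all ripples

print(peak_index, round(deepest_ripple, 3), round(total, 3))
```

The taps are the filter's response to a single-sample impulse: one big spike in the middle, with mirror-image ripples on either side that dip below zero. Adding the spike and all the ripples together gives a total very close to 1, which is the sense in which the ripples belong to the correct result rather than being something you can trim off.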
Also, while you simply can't "just remove the ripples", you CAN use some fancy math to shift them around. One common way this is done is to push most of the ripples from before the main signal to after it - you end up with fewer, smaller ripples before the spike, and more, larger ripples after it... and the ones after carry on a bit longer. The end result, when you sum them all up, is mathematically the same. However, since the real signal masks the ripples after it better than the ones before it, the ripples we shuffled to after it are less audible - so, at least in theory, it should sound better that way. That last option is commonly called "an apodizing filter" (engineers would more often describe this behavior as a minimum-phase filter), and quite a few DACs offer it as an option (remember that it isn't necessarily more accurate, but some people think it sounds better). Incidentally, the DACs in our current products do NOT include that option, and we think they sound perfect just the way they are ... but you may see it in some future products.
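The idea that ripples can be moved from before the spike to after it without changing the frequency response can be shown with a deliberately tiny example. The two-tap building block `[1.0, 0.5]` below is a toy chosen so the arithmetic is easy to follow; real DAC filters are far longer, and this is a sketch of the principle rather than of any actual design.

```python
import cmath
import math

def convolve(a, b):
    """Plain convolution, i.e. running one filter after another."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def magnitude(taps, w):
    """|H(e^jw)| for FIR taps at angular frequency w (radians per sample)."""
    return abs(sum(h * cmath.exp(-1j * w * n) for n, h in enumerate(taps)))

section = [1.0, 0.5]          # toy building block
mirrored = section[::-1]      # time-reversed copy: same magnitude, different timing

symmetric = convolve(section, mirrored)     # energy on both sides of the peak
front_loaded = convolve(section, section)   # nothing ahead of the first big sample

freqs = [k * math.pi / 8 for k in range(9)]
worst = max(abs(magnitude(symmetric, w) - magnitude(front_loaded, w)) for w in freqs)

print(symmetric, front_loaded, worst)
```

The symmetric version comes out as 0.5, 1.25, 0.5 (something before the peak and something after), while the front-loaded version comes out as 1.0, 1.0, 0.25 (everything trails the start). Their magnitude responses match to within rounding error at every frequency checked; only the timing has been rearranged.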
***
OK, if you've read this far, and are now totally confused... I guess you've earned the right to know
THE BIG SECRET
DACs aren't perfect; like everything else on planet Earth, they involve some compromise.
The analog output of your DAC is NOT a perfect recreation of the original analog signal - NONE, NOWHERE, AND NOT AT ANY PRICE.
Even though the S/N and THD of most DACs are absurdly good, they do have some other (rather tiny but possibly audible) flaws.
Digital filters are necessary - simply because they have far fewer and smaller flaws than any of the available alternatives. However, digital filters do alter the signal slightly. The signal coming from your DAC is not an EXACT reproduction of the original. Odds are that the frequency response and distortion really are so close to perfect that no human being could hear the spots where they aren't, but there are indeed tiny areas where the signal, while correct in power amplitude, may be slightly wrong in time. These errors are mostly caused by the way in which digital filters work, and can be seen if you look at oscilloscope pictures of the transient response of various DACs. Obviously, if you look at the transient responses of several DACs, or of different choices of filters on a single DAC, and they are different, then they CAN'T all be right.... right? (The math boys may fall back on proving that the amplitude responses of all of them, taken over time, are actually precisely right, but clearly the waveform has been changed a little bit.)
HOWEVER, the important point to remember here is that the errors introduced by the digital filters on a modern DAC are FAR less audible than the errors introduced by any other ways of reproducing a signal - like the hiss and nonlinearity distortions of magnetic tape, or all the mechanical resonances and mechanical and electrical nonlinearities of analog vinyl. If you really find those few little extra ripples scary, then you really shouldn't look at an oscilloscope picture of a phono cartridge trying to "ride" a square wave, and you'd better take a few stiff drinks (or a hit of your anesthetic or tranquilizer of choice) before comparing frequency response graphs or THD numbers....
But, if you were just looking for an explanation of how different DACs could sound different, even though "all the numbers are the same", then there you have it.
You're not crazy, the differences are really there, and that's where you can find them. (Unfortunately, at least so far, there is no simple number that describes those "discrepancies", and, unless you know exactly how to interpret them, those pictures show you that there's something going on, but don't correlate to what it sounds like.)
If you were really just tired of "them digital guys" always saying that "DACs wuz perfect", then you can consider yourself vindicated.
(But Nyquist and Shannon were ALSO right.)