Quantization noise & bit-depth
There is no point in discussing sample rate and bit-depth without having a real experience of “lo-res” audio. Studying audio with reduced sample rate or bit-depth may help us better understand the relationship between audio fidelity and relevant parameters.
Dec. 20, 2021
Fortunately, investigating bit-depth and quantization doesn't require complicated and tiring ABX tests. In fact, it only takes a few minutes. Properly dithered quantization and word-length reduction add only noise to the original signal, which is called quantization noise. The only difference between bit-depths is that a higher bit-depth offers a lower noise floor. Therefore, we just have to listen to the quantization noise at a high volume setting in a quiet place. If we can’t hear the quantization noise at X bits, then we won’t hear it in music, and increasing the number of bits will not improve playback fidelity. That's all. Testing quantization (optimally dithered quantization) and bit-depth with music samples is pointless (unless we want to test how quantization noise is masked by the music at low bit-depths).
This quantization noise audibility test lets you check the audibility of quantization noise at different bit-depth settings directly from your browser. High-quality headphones (with a frequency response close to the Harman target curve) or loudspeakers are recommended. The built-in speakers of laptops and smartphones are not suitable for listening tests.
"What should I hear?" - First, let’s look at the case where the noise spectrum is white ('no noise shaping'). With 12 bits, the quantization noise should be clearly audible. You should also hear the quantization noise with 14 bits, but in this case it should be very quiet. When 16 bits is selected and the gain is set so that the peak SPL at the listening position is below 105 dB SPL (a typical value for a 'loud but not ear-bleeding' playback level), the quantization noise should be inaudible. Noise shaping lowers the audibility of the noise floor by 18 decibels, which means that the system gain can be turned up 18 decibels higher than with the unshaped version. 16 bits provide enough dynamic range to cover everything from the quietest sounds to the roar of a T-Rex in a cinema.
This audibility test can demonstrate:
- How dither sounds.
- The difference between 'flat' and shaped dither: how noise shaping can lower the loudness of the quantization noise and increase the subjective dynamic range.
- How the loudness of quantization noise changes as a function of bit-depth or system gain. This test clearly demonstrates that a bit-depth of 16 bits is more than enough for consumer delivery formats.
- That the noise floor of 14 bits can only be heard under extreme, almost unrealistic, conditions.
Quantization, dither and noise shaping
Quantization is the process of converting a sampled analog value to a discrete value (a binary number). There is another type of quantization, which is called requantization: changing the word-length (bit-depth) of the digital values. Word-length reduction (e.g. 24 bit to 16 bit conversion) is a type of requantization.
Dither is low-level noise added to an audio signal prior to word-length reduction (e.g. 24-bit to 16-bit conversion). Dither improves the dynamic range and low-level linearity of a digital system. A digital system with the 'right dither' behaves like an analog system with a well-defined noise floor.
In software, dither is created using a random number generator. The output of the random number generator is scaled and added to the original signal before truncation. A software implementation of dither consists of only a few lines of code.
Dither in Wikipedia ➚
TPDF dither is the most common type of dither. TPDF stands for triangular probability density function.
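To make the 'few lines of code' concrete, here is a minimal Python sketch of word-length reduction with TPDF dither. It is an illustration only, not the code used by the webplayer: the function name `requantize_tpdf`, the float input range of [-1.0, 1.0) and the choice of rounding to the nearest code are assumptions made for this example.

```python
import math
import random

def requantize_tpdf(samples, bits, rng=None):
    """Convert float samples in [-1.0, 1.0) to signed `bits`-bit integer codes."""
    if rng is None:
        rng = random.Random()
    scale = 2 ** (bits - 1)        # one LSB corresponds to 1 / scale
    lo, hi = -scale, scale - 1     # valid signed integer range
    codes = []
    for x in samples:
        # The sum of two independent uniform values has a triangular PDF
        # that spans +/-1 LSB.
        dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        # Add the dither, then round to the nearest integer code.
        code = math.floor(x * scale + dither + 0.5)
        codes.append(max(lo, min(hi, code)))   # clip to the available range
    return codes

# A constant input of 0.25 LSB would always become 0 without dither.
# With dither, the average of the output codes approaches 0.25 instead.
quiet = [0.25 / 2 ** 15] * 20000
codes = requantize_tpdf(quiet, bits=16, rng=random.Random(1))
print(sum(codes) / len(codes))
```

The usage lines show the low-level linearity mentioned above: a signal smaller than one quantization step is not lost, it is preserved on average underneath the noise.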
Perceptual noise shaping
Perceptual noise shaping shifts noise from our most sensitive hearing range (roughly 2 to 6 kHz) to frequencies above 15 kHz, where noise is not audible. At a 44.1 kHz sampling rate, noise shaping extends the subjective dynamic range by 18 decibels (3 bits). The quantization noise in a 16-bit/44.1 kHz file created with noise shaping is subjectively equivalent to the quantization noise of a 19-bit/44.1 kHz file created with standard dither (dither with a white noise spectrum).
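The mechanism behind noise shaping is error feedback: the quantization error of each sample is subtracted from the next sample before it is quantized. The sketch below uses the simplest possible first-order filter, which pushes noise toward high frequencies in general. It is not a perceptual shaper (those use higher-order filters fitted to hearing sensitivity) and it does not reproduce the 18 dB figure. The name `requantize_shaped` is hypothetical, and the input is assumed to leave enough headroom that no clipping is needed.

```python
import math
import random

def requantize_shaped(samples, bits, rng=None):
    """TPDF-dithered requantization with first-order error feedback."""
    if rng is None:
        rng = random.Random()
    scale = 2 ** (bits - 1)
    codes = []
    err = 0.0                      # quantization error of the previous sample
    for x in samples:
        # Subtract the previous error so that it cancels out over time.
        v = x * scale - err
        dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        code = math.floor(v + dither + 0.5)
        err = code - v             # total error made on this sample
        codes.append(code)         # no clipping: headroom is assumed
    return codes

# The output equals the input plus err[n] - err[n-1], so the errors
# telescope: the sum of the codes never drifts far from the sum of the input.
n = 5000
codes = requantize_shaped([0.3 / 2 ** 15] * n, bits=16, rng=random.Random(1))
print(sum(codes), 0.3 * n)
```

Because the noise passes through the difference filter 1 - z^-1, its low-frequency content is strongly attenuated while its high-frequency content grows. The total noise power actually increases; only its distribution over frequency changes.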
Dynamic range calculations
When the word length (bit-depth) is reduced without dither and noise shaping, and the signal doesn't contain noise, the lowest signal level and the dynamic range can be calculated with the well-known formula (DR = 6.02 * bits, and SNR = DR + 1.76, both in decibels). At 16 bits the level of the smallest signal is -96.32 dBFS. Everything below -96.32 dBFS will be converted to zero. This 96.32 dB figure is only valid when a pure tone is created with an audio editor and converted to 16 bits without dither; it does not apply to real-life content. The dynamic range of 16 bits is so high that the noise in recordings, even in modern digital recordings, can act as dither.
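The formula can be evaluated in a few lines of Python. The sketch below uses the exact expressions behind the rounded constants: 6.02 is 20 * log10(2) and 1.76 is 10 * log10(1.5). With the exact values, 16 bits gives about 96.33 dB rather than the 96.32 dB obtained from the rounded coefficient. The function names are chosen for this example only.

```python
import math

def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)

def sine_snr_db(bits):
    """SNR of a full-scale sine against undithered quantization noise."""
    return dynamic_range_db(bits) + 10 * math.log10(1.5)

for bits in (12, 14, 16, 24):
    print(f"{bits:2d} bits: DR = {dynamic_range_db(bits):6.2f} dB, "
          f"SNR = {sine_snr_db(bits):6.2f} dB")
```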
The well-known formula does not apply to dither and noise shaping either. When dither is used, the SNR is lowered, but the 'subjective dynamic range' increases and becomes higher than 6.02 * bits. Digital systems are limited by quantization noise, and the real question is the loudness of this noise. The dynamic range of a noise-limited system can only be determined with the following procedure: set the system gain to the highest value at which the quantization noise is still not audible; the SPL of a full-scale sinusoid (0 dBFS) at that gain then describes the dynamic range of the system.
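As a rough companion to this procedure, the following sketch estimates the wideband level of the dithered noise floor for a given playback gain. It assumes TPDF dither spanning +/-1 LSB, for which the total noise power is LSB^2/4 (LSB^2/12 from quantization plus LSB^2/6 from the dither), and it expresses levels relative to a full-scale sine. The function names are hypothetical. The result is an unweighted level only: whether the noise is audible also depends on its spectrum and on the listener, which is why the listening procedure above is still needed.

```python
import math

def tpdf_noise_db(bits):
    """RMS level of quantization noise plus TPDF dither, in dB relative
    to a full-scale sine (about -(6.02 * bits - 3.01) dB)."""
    # Full-scale sine power: (2**(bits-1) LSB)**2 / 2; noise power: LSB**2 / 4.
    return -10 * math.log10(2 ** (2 * bits - 1))

def noise_floor_spl(bits, full_scale_spl):
    """Wideband noise level in dB SPL when a 0 dBFS sine plays at `full_scale_spl`."""
    return full_scale_spl + tpdf_noise_db(bits)

for bits in (12, 14, 16):
    print(f"{bits} bits, 105 dB SPL peak: "
          f"noise at {noise_floor_spl(bits, 105.0):.1f} dB SPL")
```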
Csaba Horváth (article & webplayer)