Listening test:
Quantization noise & bit-depth
Last edited: Jan. 29, 2024
Fortunately, testing the effect of bit-depth reduction and quantization doesn't require complicated and tiring ABX tests. Quantization (word-length reduction) only adds noise to the original signal, called quantization noise, and a higher bit-depth only offers a lower quantization noise floor. Therefore, we only need to listen to the quantization noise at a high volume setting in a quiet room. If we don't hear the quantization noise at X bits, then we won't hear it in music either, and increasing the number of bits won't improve playback fidelity. That's all.
This quantization noise audibility test lets you check the audibility of quantization noise at different bit-depth settings directly from your browser. High-quality headphones or loudspeakers are recommended. The built-in speakers of laptops and smartphones are not suitable for listening tests.
"What should I hear?" - First, let's look at the case when the noise spectrum is white ('no noise shaping'). With 12 bits the quantization noise should be clearly audible. You should also hear the quantization noise with 14 bits, but in this case the noise should be very quiet. When 16-bit is selected and the gain is set so that the peak SPL at the listening position is lower than 105 dB SPL (a typical value for a 'loud but not ear bleeding' playback level), the quantization noise should be inaudible. Noise shaping lowers the audibility of the noise floor by 18 decibels, which means that the system gain can be turned up 18 decibels higher than with the 'no noise shaping' version. 16 bits provide enough range to cover everything from the quietest sounds to the roar of a T-Rex in a cinema.
This audibility test can demonstrate:
- The difference between 'flat' and shaped dither: how noise shaping can lower the loudness of the quantization noise and increase the subjective dynamic range.
- How the loudness of quantization noise changes as a function of bit-depth or system gain. This test clearly demonstrates that a bit-depth of 16 bits is more than enough for consumer delivery formats.
- That the noise floor of 14 bits can only be heard under extreme, almost unrealistic, conditions.
Quantization, dither and noise shaping
Quantization
Quantization is the process of converting a sampled analog value to a discrete value (a binary number). There is another type of quantization, which is called requantization: changing the word-length (bit-depth) of the digital values. Word-length reduction (e.g. 24 bit to 16 bit conversion) is a type of requantization.
Dither
Dither is low-level noise added to an audio signal prior to word-length reduction (e.g. 24 bit to 16 bit conversion). Dither improves the dynamic range and low-level linearity of a digital system. A digital system with the 'right dither' behaves like an analog system with a well-defined noise floor.
In software, dither is created using a random number generator. The output of the random number generator is scaled and added to the original signal before truncation. A software implementation of dither consists of only a few lines of code.
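As a rough illustration of those few lines of code, here is a minimal Python sketch of word-length reduction with and without dither. It is not the code used by this test: the function names (`tpdf_dither`, `reduce_word_length`), the float sample range of [-1.0, 1.0) and the choice of rounding to the nearest step instead of truncating are all assumptions made for the example. The dither is TPDF, built by adding two uniform random values.

```python
import math
import random


def tpdf_dither(lsb, rng):
    """Return one TPDF dither value spanning +/- 1 LSB (sum of two uniform values)."""
    return (rng.random() - 0.5) * lsb + (rng.random() - 0.5) * lsb


def reduce_word_length(samples, bits, dither=True, seed=0):
    """Requantize float samples in [-1.0, 1.0) to signed integer codes of `bits` bits."""
    rng = random.Random(seed)            # seeded so the sketch is repeatable
    lsb = 2.0 / (2 ** bits)              # step size (1 LSB) for a full scale of 2.0
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    codes = []
    for x in samples:
        if dither:
            x += tpdf_dither(lsb, rng)   # scaled random noise added before quantizing
        code = math.floor(x / lsb + 0.5)  # round to the nearest step (assumption)
        codes.append(max(lo, min(hi, code)))  # clip to the valid code range
    return codes


# A constant signal of 0.25 LSB: too small to survive plain requantization.
lsb16 = 2.0 / 2 ** 16
tiny = [0.25 * lsb16] * 20000
plain = reduce_word_length(tiny, 16, dither=False)
dithered = reduce_word_length(tiny, 16, dither=True)
print("without dither, mean code:", sum(plain) / len(plain))
print("with dither, mean code:   ", sum(dithered) / len(dithered))
```

Without dither every output code is zero, so the signal is lost. With dither the individual codes are noisy, but their average stays close to 0.25, which is the low-level linearity described above.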
Dither in Wikipedia ➚
Application of dither is not restricted to audio. In the following images, color depth is reduced to 16 colors with and without dither.
Color bit-depth reduction without dither:
Color bit-depth reduction with dither (dither gives a smoother color transition):
TPDF dither
TPDF dither is the most common type of dither. TPDF stands for triangular probability density function.
Perceptual noise shaping
Perceptual noise shaping shifts noise from our most sensitive hearing range (roughly 2 to 6 kHz) to the region above 15 kHz, where noise is not audible. Noise shaping at a 44.1 kHz sampling rate extends the subjective dynamic range by 18 decibels (about 3 bits). The quantization noise in a 16 bit/44.1 kHz file created with noise shaping is subjectively equivalent to the quantization noise of a 19 bit/44.1 kHz file created with standard dither (dither with a white noise spectrum).
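The perceptual filters used in real noise shapers are not described here, so the following Python sketch only shows the underlying mechanism in its simplest form: first-order error feedback, where the error made on one sample is subtracted from the next. This pushes quantization noise toward high frequencies, but it is not a perceptual (ear-weighted) shaper and will not give the 18 dB figure quoted above. The function name, the float sample range and the omission of dither and clipping are assumptions for the example.

```python
import math


def quantize_first_order_shaped(samples, bits):
    """Requantize floats in [-1.0, 1.0) using first-order error feedback.

    Simplified sketch: no dither and no clipping, not a perceptual shaper.
    """
    lsb = 2.0 / (2 ** bits)                  # step size (1 LSB)
    prev_error = 0.0
    out = []
    for x in samples:
        v = x - prev_error                   # subtract the previous quantization error
        q = math.floor(v / lsb + 0.5) * lsb  # round to the nearest step
        prev_error = q - v                   # error made on this sample
        out.append(q)
    return out


# 1 kHz sine at 44.1 kHz, requantized to 8 bits so the effect is easy to see.
sine = [0.3 * math.sin(2 * math.pi * 1000 * n / 44100) for n in range(4410)]
shaped = quantize_first_order_shaped(sine, 8)
running_error = sum(o - s for o, s in zip(shaped, sine))
print("accumulated output error in LSB:", running_error / (2.0 / 2 ** 8))
```

Because the output error is the difference between consecutive quantization errors, its running sum never exceeds half a step, however long the signal is. That is another way of saying that the low-frequency part of the noise has been strongly reduced and moved to higher frequencies.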
What is the dynamic range of 16 bits?
When the word length (bit-depth) is reduced without dither and the signal doesn't contain any noise, the lowest signal level and the dynamic range can be calculated with the well-known formulas DR = 6.02 × bits and SNR = DR + 1.76 (both in dB). At 16 bits the level of the smallest signal is -96.32 dBFS; everything below -96.32 dBFS is converted to zero. However, the 96.32 dB figure is only valid when a pure tone is created with an audio editor and converted to 16 bits without dither. It does not hold for real-life content: the dynamic range of 16 bits is so high that the noise in recordings, even in modern digital recordings, can function as dither.
The well-known formula does not apply to dither and noise shaping either. When dither is used, the SNR is lowered, but the 'subjective dynamic range' is increased and becomes higher than 6.02 × bits. Digital systems are limited by quantization noise, and the real question is the loudness of this noise. The "noise-free" dynamic range of a noise-limited system can be determined with the following procedure: set the system gain to the highest value at which the quantization noise is still not audible; the SPL of a full-scale sinusoid (0 dBFS) at that gain then describes the dynamic range of the system.
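The arithmetic behind these figures can be checked with a few lines of Python. The function names are made up for this example; the constants are the unrounded values that the text rounds to 6.02 and 1.76.

```python
import math

DB_PER_BIT = 20 * math.log10(2)         # about 6.02 dB per bit
SINE_OFFSET_DB = 10 * math.log10(1.5)   # about 1.76 dB


def dynamic_range_db(bits):
    """Dynamic range of undithered quantization, in dB."""
    return DB_PER_BIT * bits


def sine_snr_db(bits):
    """SNR of a full-scale sine with undithered quantization, in dB."""
    return dynamic_range_db(bits) + SINE_OFFSET_DB


for bits in (8, 12, 14, 16, 24):
    print(f"{bits:2d} bits: DR = {dynamic_range_db(bits):6.2f} dB, "
          f"SNR = {sine_snr_db(bits):6.2f} dB")
```

For 16 bits this gives about 96.33 dB with the unrounded constant, or 96.32 dB when 6.02 dB per bit is used as in the text.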
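To put a number on how much dither lowers the SNR, the sketch below compares noise powers in units of LSB². It relies on two standard results that are not stated in this article: undithered quantization error has a power of 1/12 LSB², and TPDF dither spanning ±1 LSB adds 1/6 LSB². The function name is hypothetical.

```python
import math

QUANTIZATION_POWER = 1 / 12   # LSB^2, undithered quantization error
TPDF_DITHER_POWER = 1 / 6     # LSB^2, TPDF dither spanning +/- 1 LSB


def tpdf_snr_penalty_db():
    """How many dB the noise floor rises when TPDF dither is added."""
    total = QUANTIZATION_POWER + TPDF_DITHER_POWER
    return 10 * math.log10(total / QUANTIZATION_POWER)


penalty = tpdf_snr_penalty_db()
undithered_snr_16 = 6.02 * 16 + 1.76   # formula from the text, in dB
print(f"SNR penalty of TPDF dither: {penalty:.2f} dB")
print(f"16-bit sine SNR with TPDF dither: {undithered_snr_16 - penalty:.2f} dB")
```

Under these assumptions the penalty is about 4.8 dB, which puts the measured SNR of a TPDF-dithered 16-bit system near 93.3 dB, even though its subjective behavior is better than the undithered case.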
A more detailed analysis can be found in the following article:
Quantization noise vs. other types of noise
Noise in a WAV or FLAC file is the 'sum' of the quantization noise and the noise already present in the recording (or captured during recording). The "other noise" can be room noise, another system's quantization noise (e.g. an AD converter's internal noise) or, in the case of digital versions of analog recordings, the noise of the tape recorder. If their levels are different, the noise floor is determined by the higher one.
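Assuming the noise sources are uncorrelated (an assumption, though a common one), their powers add, and the combined level can be estimated as below. The function name and the example levels are hypothetical.

```python
import math


def combined_noise_db(*levels_db):
    """Combined level of uncorrelated noise sources given in dB (power sum)."""
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)


# Recording noise at -80 dBFS plus quantization noise at -96 dBFS.
print(round(combined_noise_db(-80.0, -96.0), 2))
# Two sources at the same level.
print(round(combined_noise_db(-90.0, -90.0), 2))
```

With a 16 dB gap the louder source sets the floor almost alone (the total rises by only about 0.1 dB), while two equal sources raise it by about 3 dB.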
What are 12 bits enough for?
It's well known that 8 bits is enough for heavily compressed pop or rock music. However, it is not so well known that even 12 bits can provide a sufficiently low noise floor for jazz, blues or any kind of music with acoustic instruments. Analog recordings are a living example of that. The recording noise in analog recordings is equivalent to the noise floor of a 12-bit to 13-bit digital system (without noise shaping, of course). We may hear some background hiss at the beginning and at the end of the songs or during very quiet passages, but the rest of the noise is masked by the music.
Calculated bit-depth of some popular recordings:
Song | Year | Version | dBre (dB) | Bit-depth (bits) |
The Dave Brubeck Quartet - Take Five | 1959 | 2009 | 26.3 | 11.6 |
The Dave Brubeck Quartet - Take Five | 1959 | 2011 | 35.8 | 10 |
Led Zeppelin - Stairway To Heaven | 1971 | 2014 | 27.9 | 11.4 |
Dire Straits - Water of Love | 1978 | 1995 | 24.7 | 11.9 |
The Blues Brothers - She Caught The Katy | 1980 | 1995 | 21.2 | 12.5 |
The Blues Brothers - Sweet Home Chicago | 1980 | 1995 | 27.2 | 11.5 |
Hans Zimmer - Gladiator Soundtrack ¹ | 2000 | 2000 | 13.7 | 13.7 |
16bit/44.1kHz TPDF dither (reference) | - | - | 0 | 16 |
¹ This is a digital recording.
Calculation of bit-depth: the noise floor (recording noise) is compared to the noise floor of a 16 bit / 44.1 kHz digital system with TPDF dither. The difference, in decibels, is dBre. The bit-depth is calculated by converting this difference from decibels to bits.
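The conversion can be sketched in Python as follows. The formula, 16 minus dBre divided by 6.02 dB per bit, is inferred from the description above and from the dynamic range formula used earlier in this article; it is not quoted from the author, and the function name is hypothetical. It agrees with the table to within rounding.

```python
DB_PER_BIT = 6.02       # dB per bit, as in DR = 6.02 x bits
REFERENCE_BITS = 16     # reference: 16 bit / 44.1 kHz with TPDF dither


def equivalent_bit_depth(dbre):
    """Bit-depth whose noise floor is `dbre` dB above the 16-bit reference."""
    return REFERENCE_BITS - dbre / DB_PER_BIT


# dBre values taken from the table above.
for dbre in (26.3, 35.8, 27.9, 24.7, 21.2, 27.2, 13.7, 0.0):
    print(f"{dbre:5.1f} dBre -> {equivalent_bit_depth(dbre):5.2f} bits")

# The same conversion applied to the 18 dB gained by noise shaping.
print(f"18 dB is about {18 / DB_PER_BIT:.2f} bits")
```

The last line shows why 18 dB of noise shaping is counted as roughly 3 bits.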
Even 12 bits can provide excellent playback fidelity...
Csaba Horváth
See also:
Sampling rate controversy: simple and conclusive test methods 🔊 🎧
Recording noise measurements
Noise perception, detection threshold & dynamic range
High-resolution audio vs. 16 bit / 44.1kHz