Dynamic range of MP3 and AAC encoding
- Measurements of a fixed-point decoder and a floating-point decoder
- Factors that affect the dynamic range of MP3 and AAC encoding
- Calculation of the dynamic range using the requantization formula
August 9, 2018
The WAV files were generated with Audacity, then converted to MP3 with LAME 3.9.100 (using the RazorLAME front-end). The fixed-point decoder tests were done with the built-in MP3 decoder in Audacity (MAD decoder). The floating-point decoder tests were done with the converter in Foobar2000 (ffmpeg). To save the frequency response data as 'frd' files, I used the spectrum analysis in Audacity (Hann window, 1024 points, 0.4 s average).
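To illustrate what a Hann-windowed, 1024-point, averaged level measurement does, here is a minimal sketch in Python. It is not Audacity's implementation: the function names, the non-overlapping blocks and the single-bin DFT are assumptions chosen to keep the example short. The test tone is placed exactly on bin 23 (about 990.5 Hz at 44.1 kHz) so that the reading is not affected by leakage between bins.

```python
import cmath
import math

def hann(size):
    """Periodic Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / size) for i in range(size)]

def bin_level_db(samples, bin_index, size=1024):
    """Power-averaged level of one DFT bin, in dB relative to a full-scale sine.

    The signal is cut into consecutive, non-overlapping blocks of `size`
    samples (an assumption; real analyzers may overlap the blocks).
    """
    if len(samples) < size:
        raise ValueError("need at least one full block of samples")
    window = hann(size)
    coherent_gain = sum(window) / size  # 0.5 for a Hann window
    twiddle = [cmath.exp(-2j * math.pi * bin_index * i / size) for i in range(size)]
    powers = []
    for start in range(0, len(samples) - size + 1, size):
        block = samples[start:start + size]
        acc = sum(w * x * t for w, x, t in zip(window, block, twiddle))
        amplitude = 2 * abs(acc) / (size * coherent_gain)
        powers.append(amplitude ** 2)
    return 10 * math.log10(sum(powers) / len(powers))

# Usage: a -120 dBFS tone centred on bin 23, 0.4 s long at 44.1 kHz.
sample_rate = 44100
tone_freq = 23 * sample_rate / 1024
tone_amp = 10 ** (-120 / 20)
tone = [tone_amp * math.sin(2 * math.pi * tone_freq * i / sample_rate)
        for i in range(int(0.4 * sample_rate))]
print(round(bin_level_db(tone, 23), 2))
```

A 0.4 s segment at 44.1 kHz holds 17 full blocks of 1024 samples, so the reading is an average over 17 spectra.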
A 1 kHz sine wave was generated at a different level for each of the 16-bit, 24-bit and 32-bit tests. The amplitudes were -110 dBFS (16 bit), -120 dBFS (24 bit) and -150 dBFS (32 bit). The 16-bit test file was converted from 32 bits with shaped dither; the 24-bit version was converted with plain triangular dither.
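The test signals can be approximated with a few lines of Python. This is only a sketch of the idea, not what Audacity does internally: the function names, the 44.1 kHz sample rate and the fixed random seed are assumptions, and only plain triangular (TPDF) dither is shown, not the shaped dither used for the 16-bit file.

```python
import math
import random

def dbfs_to_amplitude(level_dbfs):
    """Convert a level in dBFS to a linear amplitude (full scale = 1.0)."""
    return 10 ** (level_dbfs / 20)

def sine(level_dbfs, freq_hz=1000.0, sample_rate=44100, seconds=1.0):
    """Floating-point sine wave at the given level."""
    amp = dbfs_to_amplitude(level_dbfs)
    count = int(sample_rate * seconds)
    return [amp * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(count)]

def quantize_tpdf(samples, bits, seed=0):
    """Quantize to signed integers of `bits` bits with triangular dither.

    The dither is the difference of two uniform random numbers, which
    has a triangular distribution spanning -1 to +1 LSB.
    """
    rng = random.Random(seed)
    scale = 2 ** (bits - 1)
    codes = []
    for x in samples:
        dither = rng.random() - rng.random()
        code = round(x * scale + dither)
        codes.append(max(-scale, min(scale - 1, code)))
    return codes

# Usage: the 24-bit test tone, 0.1 s long.
tone_24 = quantize_tpdf(sine(-120, seconds=0.1), bits=24)
print(len(tone_24), min(tone_24), max(tone_24))
```

At 16 bits, a -110 dBFS tone has a peak of only about 0.1 LSB, so without dither every sample would round to zero. At 24 bits, the -120 dBFS tone peaks at roughly 8.4 LSB.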
Fixed-point MP3 decoder test
These are the measurements of Audacity's internal MP3 decoder (MAD). Dither and noise shaping for the 16-bit output are performed by Audacity. A closer match between the input and the output means better decoding.
24-bit test:
Source wav resolution: 24 bits, dither: triangular dither
Decoder output resolution: 24 bits, dither: none
24-bit test
16-bit test (a):
Source wav resolution: 16 bits, dither: shaped dither
Decoder output resolution: 24 bits, dither: triangular dither
(The result would be the same if the input resolution were 24 bits, and the output resolution were 16 bits with the same shaped dither.)
16-bit test (a)
16-bit test (b):
Source wav resolution: 16 bits, dither: shaped dither
Decoder output resolution: 16 bits, dither: shaped dither and triangular dither
16-bit test (b)
Floating-point MP3 decoder test
These are the measurements of Foobar2000's internal MP3 decoder (ffmpeg).
32-bit test:
Source wav resolution: 32-bit float, dither: none
Decoder output resolution: 32-bit float, dither: none
32-bit test
24-bit test:
Source wav resolution: 24 bits, dither: triangular dither
Decoder output resolution: 24 bits, dither: none
24-bit test
Factors that affect the dynamic range of MP3 and AAC encoding
These factors affect the dynamic range of MP3 and AAC encoding:
- Encoder:
- Source file dynamic range (bit depth, type of dither)
- Encoder's internal data type (32-bit fixed-point vs. 32-bit float)
- Psychoacoustic model (absolute threshold of hearing)
- Decoder:
- Decoder output resolution and type of dither
- Decoder's internal data type (32-bit fixed-point vs. 32-bit float)
- Format restrictions:
- Dynamic range of global gain
- Huffman-code max. value (8206)
Some interesting facts:
- Dynamic range is independent of bitrate.
- LAME (like most MP3 encoders) leaves the original dynamic range of a 16-bit or 24-bit PCM source untouched.
- The maximum available dynamic range with 32-bit fixed-point encoders and decoders is approx. 150 dB.
- The maximum available dynamic range with 32-bit floating-point encoders and decoders is approx. 200 dB.
Calculation of the dynamic range
In the following calculation I take only the format restrictions into account and ignore the rest. The task is simple: we have to determine the highest and lowest numbers (MDCT values) that can be represented in an MP3 bitstream. To do this, we substitute the minimum and maximum values into the requantization formula. Because scalefactors only fine-tune the quantization in each scalefactor band (except in sfb21), I omit them from the calculation.
Highest value:
Highest computed global gain: 2048
Highest Huffman code value: 8206
Highest MDCT value = 2048 * 8206^(4/3) = 338978356
Lowest value:
Lowest computed global gain: 1.57009E-16
Lowest Huffman code value: 1 (the lowest non-zero integer)
Lowest MDCT value = 1.57009E-16 * 1^(4/3) = 1.57009E-16
The dynamic range of MP3/AAC is: 20 * log10(MAX/MIN) = 486.685 dB!
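The arithmetic above can be checked with a short Python sketch. The constants are the values quoted in this section. The function names are mine, and the formula is the simplified one used here (gain multiplied by the Huffman value raised to the power 4/3, with scalefactors omitted).

```python
import math

MAX_GAIN = 2048.0        # highest computed global gain (from the text)
MIN_GAIN = 1.57009e-16   # lowest computed global gain (from the text)
MAX_HUFFMAN = 8206       # highest Huffman code value
MIN_HUFFMAN = 1          # lowest non-zero Huffman code value

def mdct_value(gain, huffman_value):
    """Simplified requantization: gain * value^(4/3), scalefactors omitted."""
    return gain * huffman_value ** (4 / 3)

def dynamic_range_db(highest, lowest):
    """Ratio of two amplitudes expressed in decibels."""
    return 20 * math.log10(highest / lowest)

highest = mdct_value(MAX_GAIN, MAX_HUFFMAN)
lowest = mdct_value(MIN_GAIN, MIN_HUFFMAN)
print(f"highest MDCT value: {highest:.0f}")
print(f"lowest MDCT value:  {lowest:.5e}")
print(f"dynamic range:      {dynamic_range_db(highest, lowest):.3f} dB")
```

Within rounding, this reproduces the highest MDCT value of about 338978356 and the dynamic range of about 486.685 dB.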
This result is valid for pure tones (sine waves) and says nothing about the quality of the compression. For a given bitrate or quality, the dynamic range will be the same as the dynamic range of the global gain, which is 382.5 dB (1.5 dB * 255).
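The 1.5 dB step can be derived rather than taken on trust. The sketch below assumes that each global gain step multiplies the amplitude by 2^(1/4), as in the MP3 requantization formula; the variable names are mine. It also computes the span between the two gain limits used in the calculation above.

```python
import math

GAIN_STEPS = 255            # number of steps quoted in the text
STEP_RATIO = 2 ** (1 / 4)   # assumed amplitude ratio per global gain step

step_db = 20 * math.log10(STEP_RATIO)
span_db = GAIN_STEPS * step_db

# Span between the gain limits used in the calculation above.
limits_db = 20 * math.log10(2048 / 1.57009e-16)

print(f"one step:        {step_db:.3f} dB")
print(f"255 steps:       {span_db:.1f} dB")
print(f"2048 / 1.57e-16: {limits_db:.1f} dB")
```

One step comes out at about 1.505 dB, so 255 steps span about 383.8 dB, while the ratio of the two quoted gain limits is about 382.3 dB. Both are close to the rounded figure of 382.5 dB.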
Csaba Horváth
See also:
Lossy audio compression: principles, methods, misconceptions 🔊 🎧
High-resolution audio vs. 16 bit / 44.1kHz
Demonstration of sampling (interactive chart)
Noise perception, detection threshold & dynamic range
Audibility thresholds for SINAD / THD+N measurements