Dynamic range of MP3 and AAC encoding




Last updated: August 9, 2018

It is widely believed that MP3 encoding alters the dynamic range of music. This is false: MP3 does not compress dynamics, and neither does AAC. In fact, both formats can represent a wider dynamic range than 24 bit PCM (wav) files.

Why do so many people believe that MP3 and AAC kill dynamics? Even 'non digital nerds' know the relation between the resolution (bit depth) and the dynamic range of PCM audio: the lower the resolution of a PCM file, the lower the dynamic range. Because PCM audio is just a stream of samples (mainly integers), resolution and dynamic range are closely tied. The MP3 bitstream is organized quite differently, and this rule doesn't apply to it.

The key concept in MPEG audio formats is that the volume level is separated from the required bit depth. The bit depth (and with it the signal-to-noise ratio, SNR) is reduced during compression, while the actual volume level is stored in the global gain. Restoring the original volume level is just a multiplication in the decoder. This separation of volume from bit depth (among other things) makes it possible for individual MP3 frames to have a low SNR (approx. 40 dB) and still a high dynamic range.

Of course, this is a rather simplified view; MPEG encoders and decoders involve many complex calculations and transforms. Moreover, the relation between the volume level and the global gain is indirect: for a given waveform, the global gain will differ at different bitrates (compression levels).
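The separation of level from resolution can be illustrated with a toy model. The sketch below is not the MP3 algorithm: it simply stores one gain value per block plus coarse integer codes, and the names (quantize_with_gain, dequantize, snr_db) and the choice of 128 levels are assumptions made for illustration. It shows that the SNR stays roughly the same whether the tone is at 0 dBFS or -120 dBFS, because only the gain changes.

```python
import math

def sine(level_dbfs, n=1024, freq=1000.0, rate=44100.0):
    """Generate n samples of a sine wave at the given level (full scale = 1.0)."""
    amp = 10.0 ** (level_dbfs / 20.0)
    return [amp * math.sin(2.0 * math.pi * freq * i / rate) for i in range(n)]

def quantize_with_gain(samples, levels=128):
    """Toy block quantizer: one gain for the block plus small integer codes.

    The gain is a hypothetical stand-in for the global gain; it carries the
    level, while the integer codes carry the (coarse) waveform shape.
    """
    peak = max(abs(s) for s in samples)
    gain = peak / levels
    codes = [round(s / gain) for s in samples]
    return gain, codes

def dequantize(gain, codes):
    """Restoring the level is a single multiplication per sample."""
    return [gain * c for c in codes]

def snr_db(original, decoded):
    """Signal-to-noise ratio of the decoded block, in dB."""
    signal = sum(s * s for s in original)
    noise = sum((s - d) ** 2 for s, d in zip(original, decoded))
    return 10.0 * math.log10(signal / noise)

results = {}
for level in (0.0, -60.0, -120.0):
    x = sine(level)
    gain, codes = quantize_with_gain(x)
    results[level] = snr_db(x, dequantize(gain, codes))
    print(f"{level:7.1f} dBFS  gain={gain:.3e}  SNR={results[level]:.1f} dB")
```

The codes never exceed 128 in magnitude, yet the decoded tone can sit anywhere the gain value can reach. In this toy model the SNR is set by the number of levels, and the usable level range is set by the range of the gain.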

A few words about the organization of this article: the first part covers dynamic range measurements of two widespread MP3 decoders, while the last part derives the theoretical dynamic range.


About the tests

I generated the test wav files with Audacity, then converted them to MP3 with LAME 3.9.100 (and the RazorLAME front-end). The fixed-point decoder tests were done with the built-in MP3 decoder in Audacity (MAD decoder). The floating-point decoder tests were done with the converter in Foobar2000 (ffmpeg). To save the frequency response data as 'frd', I used the spectral analysis in Audacity (Hann window, 1024 points, 0.4 sec average).

A 1 kHz sine wave was generated at a different level for each of the 16 bit, 24 bit and 32 bit tests. The amplitudes: -110 dBFS (16 bit), -120 dBFS (24 bit) and -150 dBFS (32 bit). The 16 bit test file was converted from 32 bits with shaped dither; the 24 bit version was converted with normal triangular dither.
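To see why dither matters for these test files, note that -110 dBFS is roughly a tenth of one 16 bit quantization step. The sketch below is a minimal illustration, not the Audacity workflow used for the article: it assumes a 44.1 kHz sample rate (not stated above), uses plain triangular (TPDF) dither instead of the shaped dither used for the 16 bit file, and all function names are hypothetical. It shows that plain rounding erases the tone, while the dithered version still contains it at about the right level.

```python
import math
import random

RATE = 44100.0  # sample rate in Hz (an assumption; not given in the article)
FREQ = 1000.0   # test tone frequency from the article

def dbfs_to_amplitude(level_dbfs):
    """Convert a level in dBFS to a linear amplitude (full scale = 1.0)."""
    return 10.0 ** (level_dbfs / 20.0)

def sine(level_dbfs, n):
    """Generate n samples of the test tone at the given level."""
    amp = dbfs_to_amplitude(level_dbfs)
    return [amp * math.sin(2.0 * math.pi * FREQ * i / RATE) for i in range(n)]

def quantize(samples, bits, dither, seed=0):
    """Round to integer PCM codes, optionally adding TPDF dither first."""
    rng = random.Random(seed)      # fixed seed keeps the run repeatable
    scale = 2 ** (bits - 1)        # codes per unit of full scale
    codes = []
    for s in samples:
        v = s * scale
        if dither:
            # Difference of two uniform values: triangular PDF, +/- 1 LSB.
            v += rng.random() - rng.random()
        codes.append(round(v))
    return codes

def tone_level_dbfs(codes, bits):
    """Estimate the tone level by correlating with a reference sine."""
    n = len(codes)
    acc = sum(c * math.sin(2.0 * math.pi * FREQ * i / RATE)
              for i, c in enumerate(codes))
    amp = 2.0 * abs(acc) / n / 2 ** (bits - 1)
    return 20.0 * math.log10(amp) if amp > 0 else float("-inf")

tone = sine(-110.0, 4 * 44100)     # 4 seconds, an exact number of cycles
plain = quantize(tone, 16, dither=False)
dithered = quantize(tone, 16, dither=True)

print("undithered codes all zero:", all(c == 0 for c in plain))
print("estimated level with dither:", round(tone_level_dbfs(dithered, 16), 1), "dBFS")
```

The correlation in tone_level_dbfs averages the dither noise over the whole file, which is broadly what a long spectral average does for a single frequency bin.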


Fixed-point MP3 decoder test

These are the measurements of Audacity's internal MP3 decoder (MAD). Dither and noise shaping for the 16 bit output are performed by Audacity. A closer match between the input and output means better decoding.

24 bit test:
Source wav resolution: 24 bit, dither: triangular dither
Decoder output resolution: 24 bit, dither: none

Figure (24 bit test): MP3 dynamic range, source: 24 bit, encoder: LAME, decoder: Audacity, output: 24 bit

16 bit test (a):
Source wav resolution: 16 bit, dither: shaped dither
Decoder output resolution: 24 bit, dither: triangular dither
(The result would be the same if the input resolution was 24 bit, and the output resolution was 16 bit with the same shaped dither.)

Figure (16 bit test (a)): MP3 dynamic range, source: 16 bit shaped dither, encoder: LAME, decoder: Audacity, output: 24 bit

16 bit test (b):
Source wav resolution: 16 bit, dither: shaped dither
Decoder output resolution: 16 bit, dither: shaped dither and triangular dither

Figure (16 bit test (b)): MP3 dynamic range, source: 16 bit shaped dither, encoder: LAME, decoder: Audacity, output: 16 bit


Floating-point MP3 decoder test

These are the measurements of the Foobar2000 internal MP3 decoder (ffmpeg).

32 bit test:
Source wav resolution: 32 bit float, dither: none
Decoder output resolution: 32 bit float, dither: none

Figure (32 bit test): MP3 dynamic range, source: 32 bit, encoder: LAME, decoder: Foobar2000, output: 32 bit

24 bit test:
Source wav resolution: 24 bit, dither: triangular dither
Decoder output resolution: 24 bit, dither: none

Figure (24 bit test): MP3 dynamic range, source: 24 bit, encoder: LAME, decoder: Foobar2000, output: 24 bit


Factors that affect the dynamic range of MP3 and AAC encoding

These factors affect the dynamic range of MP3 and AAC encoding:

Some interesting facts:


Calculation of the dynamic range of MP3 and AAC encoding

In the following calculation I take into account only the format restrictions and ignore everything else. The task is simple: determine the highest and lowest numbers (MDCT values) that can be represented in an MP3 bitstream. To do this, we substitute the minimum and maximum values into the requantization formula. Because scalefactors only fine-adjust the quantization in each scalefactor band (except in sfb21), I omit them from the calculation.

Highest value:
Highest computed global gain: 2048
Highest Huffman code value: 8206
Highest MDCT value = 2048 * 8206^(4/3) = 338978356

Lowest value:
Lowest computed global gain: 1.57009E-16
Lowest Huffman code value: 1 (the lowest non-zero integer)
Lowest MDCT value = 1.57009E-16 * 1^(4/3) = 1.57009E-16

The dynamic range of MP3/AAC is: 20 * log10(MAX/MIN) = 486.685 dB!

This result is valid for pure tones (sine waves) without saying anything about the quality of the compression. For a given bitrate or quality the dynamic range will be the same as the dynamic range of the global gain, which is 382.5 dB (1.5 dB * 255).
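The arithmetic above can be checked with a few lines of Python. This is only a sketch that plugs the quoted extremes into the simplified requantization (gain times code to the power 4/3, scalefactors omitted). The constant and function names are hypothetical, and the step size of 2^(1/4) per global gain unit is an assumption used here to illustrate the "1.5 dB" figure.

```python
import math

# Extreme values as quoted in the article.
MAX_GAIN = 2048.0        # highest computed global gain (equals 2**11)
MIN_GAIN = 1.57009e-16   # lowest computed global gain (close to 2**-52.5)
MAX_CODE = 8206          # highest Huffman code value
MIN_CODE = 1             # lowest non-zero Huffman code value

def requantize(gain, code):
    """Simplified requantization: gain * code^(4/3), scalefactors omitted."""
    return gain * code ** (4.0 / 3.0)

def range_db(high, low):
    """Ratio of two amplitudes expressed in dB."""
    return 20.0 * math.log10(high / low)

max_mdct = requantize(MAX_GAIN, MAX_CODE)
min_mdct = requantize(MIN_GAIN, MIN_CODE)

format_range_db = range_db(max_mdct, min_mdct)   # whole format
gain_range_db = range_db(MAX_GAIN, MIN_GAIN)     # global gain alone

# Assumed size of one global gain step: a factor of 2**(1/4) in amplitude.
gain_step_db = 20.0 * math.log10(2.0 ** 0.25)

print(f"highest MDCT value : {max_mdct:.0f}")
print(f"lowest MDCT value  : {min_mdct:.5e}")
print(f"format range       : {format_range_db:.3f} dB")
print(f"global gain range  : {gain_range_db:.1f} dB")
print(f"one gain step      : {gain_step_db:.3f} dB")
```

The difference between the two ranges is the contribution of the Huffman code values, that is, 20 * log10(8206^(4/3)).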


Csaba Horvath
