Sampling rate controversy: simple and conclusive test methods


How to do a proper sample rate test? - Testing the limits of 44.1 kHz sampling rate with high-pass filtered audio samples and pure tones. The difference between minimum phase and linear phase resampling filters is also investigated. Poor quality resampling can lead to fake detection of different sample rates, especially in tests with high sensitivity.


Feb. 2, 2024


A short 'memo' (sampling & hearing)

Using a sampling frequency of 44.1 kHz, the analog curve can be reconstructed from the sampled data points up to 20 kHz with correct amplitude, phase and timing, and without any audible pre- or post-ringing (since the frequency of the ringing is higher than 20 kHz).

Thinking in relative levels, -20 dBFS (dBFS: decibels relative to full scale; 0 dBFS is the level of a full scale sine) is a good estimate of the highest "critical band level" that may occur in music in the 16 kHz - 25 kHz frequency range during a loud cymbal hit. However, such a high level is a very rare event, more of an extreme moment in music. Without a loud cymbal hit the peak level is approx. -35 dBFS in this range.
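To put these relative levels into numbers, here is a minimal sketch that converts a dBFS value into a peak sample amplitude. It assumes the convention used above (0 dBFS is a full scale sine, full scale is a peak amplitude of 1.0); the function name is only illustrative.

```python
def dbfs_to_amplitude(level_dbfs):
    """Peak amplitude of a sine at level_dbfs, with full scale defined as 1.0."""
    return 10.0 ** (level_dbfs / 20.0)


if __name__ == "__main__":
    # Levels mentioned in the text: loud cymbal hit, lower test limit, typical peak.
    for level in (-20.0, -30.0, -35.0):
        print(f"{level:6.1f} dBFS -> peak amplitude {dbfs_to_amplitude(level):.4f}")
```

So a -20 dBFS tone uses only one tenth of the available amplitude range.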

Therefore a crucial test for 44.1 kHz sampling rate is a listening test at 22 kHz with a single-frequency tone. The level should stay within the range of -30 dBFS to -20 dBFS (neither higher nor lower). The rest is just demonstration.

Since the top of human hearing range is 20 kHz, the answer is pretty obvious...

Furthermore, the devil lies in the details: the quality of resampling depends on the resampling process, more precisely the simplifications applied during resampling.


Read this before you test

Test tones:

Test requirements:

For single-frequency tones, two versions are included (except for the 22 kHz test). This way poor quality resampling can be identified and false positive results can be excluded. It is very unlikely that resampling causes any trouble nowadays, but it is worth keeping in mind, since poor quality resampling can lead to false positive detection of different sample rates.

How can we detect poor quality resampling? The result of bad quality resampling is distortion that increases with frequency. Thanks to the non-harmonic nature of the distortion, poor quality resampling can be easily identified by listening to a pure tone with a frequency higher than 10 kHz. Ideally, pure tones at any frequency should sound the same at 44.1 kHz and 48 kHz sampling rates. If high-frequency tones sound different at different sample rates and one version has a distorted sound, then a poor quality sample rate conversion was applied to that version. The other sample rate is likely free from resampling.

A different kind of false positive may occur when the frequency of the test tone is close to half the sampling rate. When a DAC's reconstruction filter has a slow roll-off and the image component is not attenuated well, the image component together with the test tone can generate difference tone distortion in amplifiers or speakers (usually this doesn't affect the fidelity of normal playback). A simple solution is to add a bit of masking noise between 1 kHz and 10 kHz.
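A rough sketch of where such a difference tone would land. It assumes the first image sits at the sampling rate minus the tone frequency, and that the distortion product of interest is the simple second-order difference between image and tone; real amplifiers and speakers may of course produce other products as well. The function name is illustrative.

```python
def image_and_difference(tone_hz, sample_rate_hz):
    """Return (first image frequency, difference-tone frequency) in Hz."""
    nyquist_hz = sample_rate_hz / 2.0
    if not 0.0 < tone_hz < nyquist_hz:
        raise ValueError("tone must lie between 0 Hz and half the sampling rate")
    image_hz = sample_rate_hz - tone_hz   # the image mirrors the tone around fs/2
    difference_hz = image_hz - tone_hz    # second-order difference tone: fs - 2*f
    return image_hz, difference_hz


if __name__ == "__main__":
    for fs, tone in ((44100.0, 20000.0), (48000.0, 22000.0)):
        image, diff = image_and_difference(tone, fs)
        print(f"fs={fs:.0f} Hz, tone={tone:.0f} Hz: image={image:.0f} Hz, difference={diff:.0f} Hz")
```

For 20 kHz at 44.1 kHz and for 22 kHz at 48 kHz the difference tone falls near 4 kHz, which is inside the 1 kHz to 10 kHz band covered by the masking noise.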

Clipping is another source of false positive results. (In the following test clipping can be ruled out.)

Structure of the single-frequency test tones: one second silence, followed by a two second test tone, followed by a 0.2 second silence.
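For readers who want to build comparable files themselves, the structure above can be sketched as follows. This is not the procedure used to create the files on this page: the 50 ms raised-cosine fade (added to avoid an onset click), the 16-bit output without dither, and all function and parameter names are assumptions made for the example.

```python
import math
import struct
import wave


def build_test_tone(freq_hz, sample_rate, level_dbfs,
                    lead_s=1.0, tone_s=2.0, tail_s=0.2, fade_s=0.05):
    """Silence, then a sine at level_dbfs, then silence; samples in the range -1..1."""
    amplitude = 10.0 ** (level_dbfs / 20.0)
    tone_len = round(tone_s * sample_rate)
    fade_len = round(fade_s * sample_rate)   # assumed fade length, not from the article
    tone = []
    for n in range(tone_len):
        gain = 1.0
        if n < fade_len:                      # raised-cosine fade-in
            gain = 0.5 * (1.0 - math.cos(math.pi * n / fade_len))
        elif n >= tone_len - fade_len:        # raised-cosine fade-out
            gain = 0.5 * (1.0 - math.cos(math.pi * (tone_len - 1 - n) / fade_len))
        tone.append(gain * amplitude * math.sin(2.0 * math.pi * freq_hz * n / sample_rate))
    lead = [0.0] * round(lead_s * sample_rate)
    tail = [0.0] * round(tail_s * sample_rate)
    return lead + tone + tail


def write_wav_16bit(path, samples, sample_rate):
    """Write mono 16-bit PCM (no dither, which is a simplification)."""
    frames = b"".join(struct.pack("<h", round(s * 32767)) for s in samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(frames)


if __name__ == "__main__":
    samples = build_test_tone(22000.0, 48000, -20.0)
    print(len(samples), "samples,", len(samples) / 48000, "seconds")
```

Calling write_wav_16bit("tone_22k.wav", samples, 48000) would then store the result; the file name is arbitrary.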

This is an online test. The validity of the test depends on your system, your system's settings and your expertise to identify possible errors.

Advice for loudspeaker listening: the volume level should be adjusted with music and should not be turned up when listening to these test files.


Single-frequency tone (14 kHz)

14 kHz is almost the top of the human hearing range and 2 kHz lower than the highest valid frequency at 32 kHz sampling frequency.

The signal level is -15 dBFS (15 dB below full scale).

Sampling rate: 44.1 kHz

Sampling rate: 48 kHz


Single-frequency tone (20 kHz)

Top of the human hearing range and a practical limit of 44.1 kHz sampling rate. 320 kbps MP3 files and YouTube audio tracks are also limited to 20 kHz.

Level: -20 dBFS. This value is a good estimate of the highest critical band level that may occur in music in this frequency region (during a cymbal crash).

Sampling rate: 44.1 kHz

Sampling rate: 48 kHz


Single-frequency tone (22 kHz)

A practical limit for 48 kHz sampling rate and a crucial test frequency for 44.1 kHz sampling rate.

The level is -20 dBFS and the sampling frequency is 48 kHz for both test files. The second version contains low level masking noise (12 bit dither). The function of this added noise is to mask any possible distortion byproduct between 1 kHz and 10 kHz. (This clever trick was used in some hearing tests.)

Without masking noise

With masking noise


An alternative sampling rate test method: high-pass filtered audio samples

Traditional song-based discrimination tests provide low sensitivity and rely heavily on auditory memory. Detecting sounds in silence is a much easier task than comparing "wideband" sounds with complex harmonic and temporal structure. The sensitivity of a test can be greatly increased by applying different filters, and thus "format discrimination tests" can be converted to "sound detection tests". Selection of audio samples (instrument sounds, signals) is still a critical step, since audio samples have a great influence on the sensitivity of the tests. Furthermore, though no test is free from false positive and false negative results, identifying false positive results is easier in sound detection type tests.

The simplest and fastest way to do a sample rate discrimination test is to remove frequencies below 16 kHz or 20 kHz with a high pass filter. This method has many advantages over a null test: it is definitely faster and gives more freedom. We can test sampling rate pairs that normally can't be tested with a null test: 32 kHz vs. 44.1 kHz, 44.1 kHz vs. 96 kHz. (Null testing 44.1k and 88.2k is simple, but null testing 44.1k and 96k is a nightmare, as it requires intermediate steps: conversion to a common frequency (14.112 MHz) and manual sample shifts.)
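The common frequency mentioned for the 44.1k vs. 96k null test is simply the least common multiple of the two rates. A small sketch of the arithmetic (the function name is illustrative):

```python
import math


def common_rate(rate_a, rate_b):
    """Return (common rate, upsampling factor for rate_a, upsampling factor for rate_b)."""
    common = rate_a * rate_b // math.gcd(rate_a, rate_b)   # least common multiple
    return common, common // rate_a, common // rate_b


if __name__ == "__main__":
    for a, b in ((44100, 88200), (44100, 96000)):
        common, up_a, up_b = common_rate(a, b)
        print(f"{a} Hz vs {b} Hz: common rate {common} Hz (x{up_a} and x{up_b})")
```

An integer-ratio pair needs only a factor of 2 on one side, while 44.1k vs. 96k needs upsampling by 320 and 147 to reach 14.112 MHz.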

Why does it work?

Because higher sampling rate only offers more bandwidth and nothing else. Downsampling a 96 kHz WAV/FLAC file to 44.1 kHz with a correct anti-aliasing filter (simple low pass filter) and reconstructing the downsampled signal with an anti-image filter (simple low pass filter) is equivalent to applying a low-pass filter at 20 kHz.


Bipolar pulse - audibility of filter ringing?

The "worst" kind of transient. Sampling rate is 44.1 kHz for this test.

Unfiltered:

High pass filtered at 16 kHz:

High pass filtered at 19.5 kHz:

A high pass filter creates the same "ringing" in a pulse as a low pass filter with the same cut-off frequency and filter length.
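One way to see this is spectral inversion: a linear phase high pass filter can be built by subtracting a low pass filter of the same length and cut-off from a unit impulse, so every tap except the centre one is the low pass tap with inverted sign. The sketch below assumes a Hann-windowed sinc design; the filters used for the samples on this page may be designed differently, and the function names are illustrative.

```python
import math


def lowpass_kernel(cutoff_hz, sample_rate, num_taps):
    """Hann-windowed sinc low pass with odd length and unity gain at DC."""
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd")
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / sample_rate              # cut-off in cycles per sample
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            ideal = 2.0 * fc
        else:
            ideal = math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * (n + 1) / (num_taps + 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def highpass_kernel(cutoff_hz, sample_rate, num_taps):
    """Spectral inversion: unit impulse minus the low pass kernel."""
    lowpass = lowpass_kernel(cutoff_hz, sample_rate, num_taps)
    highpass = [-t for t in lowpass]
    highpass[(num_taps - 1) // 2] += 1.0
    return highpass


if __name__ == "__main__":
    lp = lowpass_kernel(16000.0, 44100, 255)
    hp = highpass_kernel(16000.0, 44100, 255)
    mid = 127
    lp_ring = max(abs(t) for i, t in enumerate(lp) if i != mid)
    hp_ring = max(abs(t) for i, t in enumerate(hp) if i != mid)
    print("largest ringing tap, low pass :", lp_ring)
    print("largest ringing tap, high pass:", hp_ring)
```

Away from the centre tap the two impulse responses differ only in polarity, so the ringing has the same frequency, length and level.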

Why is pre-ringing in the impulse response not an issue at 44.1 kHz sampling rate?

The pre- and post-ringing in the impulse response can be considered a burst signal (a few periods of a sine signal with a smooth fade-in and fade-out). The frequency of the ringing is usually between 21 kHz and 22 kHz. So one answer to the question is that we can't hear the ringing in the impulse because we can't hear a short burst signal created from a 20 kHz or 21 kHz sine signal.

A different answer: in order for a resonance or burst signal to be heard, the time-integrated level of the resonance must be higher than the hearing threshold at the resonance frequency (both masked and absolute threshold, but now we can ignore masking). The level of the ringing in the impulse response of a resampling filter is about -50 dBFS (Audacity, Reaper, CoolEdit test; 100 ms integration time). Converting to SPL, and assuming playback where full scale reaches at most 105 dB SPL, this is 55 dB SPL at most, well below the absolute threshold at 20 kHz.
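A minimal sketch of what "time-integrated level" means here: the RMS level over a 100 ms window, expressed relative to a full scale sine. The 105 dB SPL full scale calibration is an assumption inferred from the two figures above (-50 dBFS corresponding to 55 dB SPL), the 1 ms burst is only an illustration and not a measured filter response, and the function names are hypothetical.

```python
import math


def integrated_level_dbfs(samples, sample_rate, window_s=0.1):
    """RMS level over the first window_s seconds, re a full scale sine (0 dBFS)."""
    window_len = round(window_s * sample_rate)
    chunk = list(samples[:window_len])        # samples beyond the window are ignored
    mean_square = sum(s * s for s in chunk) / window_len   # missing samples count as silence
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / 0.5)   # a full scale sine has mean square 0.5


def dbfs_to_spl(level_dbfs, full_scale_spl):
    """Convert dBFS to dB SPL for an assumed playback calibration."""
    return level_dbfs + full_scale_spl


if __name__ == "__main__":
    fs = 44100
    burst_len = round(0.001 * fs)             # illustrative 1 ms burst at 21.5 kHz
    burst = [
        0.5 * (1.0 - math.cos(2.0 * math.pi * n / (burst_len - 1)))
        * math.sin(2.0 * math.pi * 21500.0 * n / fs)
        for n in range(burst_len)
    ]
    level = integrated_level_dbfs(burst, fs)
    print(f"integrated level: {level:.1f} dBFS")
    print(f"at 105 dB SPL full scale: {dbfs_to_spl(level, 105.0):.1f} dB SPL")
```

Because the energy of a short burst is averaged over the whole window, its integrated level ends up far below its peak level.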


Crash cymbal audio sample (44.1 kHz)

Let's listen to something different: a famous cymbal crash. Sampling rate is 44.1 kHz for this test.

High pass filtered at 14 kHz:

High pass filtered at 16 kHz:

Original sample:
WARNING: LOUD! - Start with low volume.

Notes:


Minimum phase vs. linear phase resampling filter

All filters introduce some delay. In a linear phase filter's passband the delay is constant. In a minimum phase filter's passband the delay is not constant, but changes as a function of frequency. But can we hear the introduced excess delay?

The following sample contains eight pulses ordered in two groups. In the second group the pulses were filtered with a minimum phase low-pass filter, with 21 kHz cut-off frequency and 80 dB attenuation at 22 kHz. Sampling rate is 44.1 kHz for this test.



Can we hear the group delay distortion? No, it is too small. At a cut-off frequency of 20 kHz, the introduced excess group delay is about 100 µsec.

Csaba Horváth



