avfilter: add Dynamic Audio Normalizer filter

10 years ago · 21436b95dc
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -1544,6 +1544,164 @@ Optional. It should have a value much less than 1 (e.g. 0.05 or 0.02) and is
 used to prevent clipping.
@end table
@section dynaudnorm
 Dynamic Audio Normalizer.
 This filter applies a certain amount of gain to the input audio in order
 to bring its peak magnitude to a target level (e.g. 0 dBFS). However, in
 contrast to more "simple" normalization algorithms, the Dynamic Audio
 Normalizer *dynamically* re-adjusts the gain factor to the input audio.
 This allows for applying extra gain to the "quiet" sections of the audio
 while avoiding distortions or clipping the "loud" sections. In other words:
 The Dynamic Audio Normalizer will "even out" the volume of quiet and loud
 sections, in the sense that the volume of each section is brought to the
 same target level. Note, however, that the Dynamic Audio Normalizer achieves
 this goal *without* applying "dynamic range compressing". It will retain 100%
 of the dynamic range *within* each section of the audio file.
@table @option
@item f
 Set the frame length in milliseconds. In range from 10 to 8000 milliseconds.
 Default is 500 milliseconds.
 The Dynamic Audio Normalizer processes the input audio in small chunks,
 referred to as frames. This is required, because a peak magnitude has no
 meaning for just a single sample value. Instead, we need to determine the
 peak magnitude for a contiguous sequence of sample values. While a "standard"
 normalizer would simply use the peak magnitude of the complete file, the
 Dynamic Audio Normalizer determines the peak magnitude individually for each
 frame. The length of a frame is specified in milliseconds. By default, the
 Dynamic Audio Normalizer uses a frame length of 500 milliseconds, which has
 been found to give good results with most files.
 Note that the exact frame length, in number of samples, will be determined
 automatically, based on the sampling rate of the individual input audio file.
@item g
 Set the Gaussian filter window size. In range from 3 to 301, must be an odd
 number. Default is 31.
 Probably the most important parameter of the Dynamic Audio Normalizer is the
@code{window size} of the Gaussian smoothing filter. The filter's window size
 is specified in frames, centered around the current frame. For the sake of
 simplicity, this must be an odd number. Consequently, the default value of 31
 takes into account the current frame, as well as the 15 preceding frames and
 the 15 subsequent frames. Using a larger window results in a stronger
 smoothing effect and thus in less gain variation, i.e. slower gain
 adaptation. Conversely, using a smaller window results in a weaker smoothing
 effect and thus in more gain variation, i.e. faster gain adaptation.
 In other words, the more you increase this value, the more the Dynamic Audio
 Normalizer will behave like a "traditional" normalization filter. On the
 contrary, the more you decrease this value, the more the Dynamic Audio
 Normalizer will behave like a dynamic range compressor.
@item p
 Set the target peak value. In range from 0.0 to 1.0. This specifies the
 highest permissible magnitude
 level for the normalized audio input. This filter will try to approach the
 target peak magnitude as closely as possible, but at the same time it also
 makes sure that the normalized signal will never exceed the peak magnitude.
 A frame's maximum local gain factor is imposed directly by the target peak
 magnitude. The default value is 0.95 and thus leaves a headroom of 5%.
 It is not recommended to go above this value.
@item m
 Set the maximum gain factor. In range from 1.0 to 100.0. Default is 10.0.
 The Dynamic Audio Normalizer determines the maximum possible (local) gain
 factor for each input frame, i.e. the maximum gain factor that does not
 result in clipping or distortion. The maximum gain factor is determined by
 the frame's highest magnitude sample. However, the Dynamic Audio Normalizer
 additionally bounds the frame's maximum gain factor by a predetermined
 (global) maximum gain factor. This is done in order to avoid excessive gain
 factors in "silent" or almost silent frames. By default, the maximum gain
 factor is 10.0. For most inputs the default value should be sufficient and
 it usually is not recommended to increase this value. However, for input
 with an extremely low overall volume level, it may be necessary to allow even
 higher gain factors. Note, however, that the Dynamic Audio Normalizer does
 not simply apply a "hard" threshold (i.e. cut off values above the threshold).
 Instead, a "sigmoid" threshold function will be applied. This way, the
 gain factors will smoothly approach the threshold value, but never exceed that
 value.
@item r
 Set the target RMS. In range from 0.0 to 1.0. Default is 0.0 - disabled.
 By default, the Dynamic Audio Normalizer performs "peak" normalization.
 This means that the maximum local gain factor for each frame is defined
 (only) by the frame's highest magnitude sample. This way, the samples can
 be amplified as much as possible without exceeding the maximum signal
 level, i.e. without clipping. Optionally, however, the Dynamic Audio
 Normalizer can also take into account the frame's root mean square,
 abbreviated RMS. In electrical engineering, the RMS is commonly used to
 determine the power of a time-varying signal. The RMS is therefore considered
 a better approximation of the "perceived loudness" than the signal's peak
 magnitude alone. Consequently, by adjusting all
 frames to a constant RMS value, a uniform "perceived loudness" can be
 established. If a target RMS value has been specified, a frame's local gain
 factor is defined as the factor that would result in exactly that RMS value.
 Note, however, that the maximum local gain factor is still restricted by the
 frame's highest magnitude sample, in order to prevent clipping.
@item n
 Enable channel coupling. Enabled by default.
 By default, the Dynamic Audio Normalizer will amplify all channels by the same
 amount. This means the same gain factor will be applied to all channels, i.e.
 the maximum possible gain factor is determined by the "loudest" channel.
 However, in some recordings, it may happen that the volume of the different
 channels is uneven, e.g. one channel may be "quieter" than the other one(s).
 In this case, this option can be used to disable the channel coupling. This way,
 the gain factor will be determined independently for each channel, depending
 only on the individual channel's highest magnitude sample. This allows for
 harmonizing the volume of the different channels.
@item c
 Enable DC bias correction. Disabled by default.
 An audio signal (in the time domain) is a sequence of sample values.
 In the Dynamic Audio Normalizer these sample values are represented in the
 -1.0 to 1.0 range, regardless of the original input format. Normally, the
 audio signal, or "waveform", should be centered around the zero point.
 That means if we calculate the mean value of all samples in a file, or in a
 single frame, then the result should be 0.0 or at least very close to that
 value. If, however, there is a significant deviation of the mean value from
 0.0, in either positive or negative direction, this is referred to as a
 DC bias or DC offset. Since a DC bias is clearly undesirable, the Dynamic
 Audio Normalizer provides optional DC bias correction.
 With DC bias correction enabled, the Dynamic Audio Normalizer will determine
 the mean value, or "DC correction" offset, of each input frame and subtract
 that value from all of the frame's sample values which ensures those samples
 are centered around 0.0 again. Also, in order to avoid "gaps" at the frame
 boundaries, the DC correction offset values will be interpolated smoothly
 between neighbouring frames.
@item b
 Enable alternative boundary mode. Disabled by default.
 The Dynamic Audio Normalizer takes into account a certain neighbourhood
 around each frame. This includes the preceding frames as well as the
 subsequent frames. However, for the "boundary" frames, located at the very
 beginning and at the very end of the audio file, not all neighbouring
 frames are available. In particular, for the first few frames in the audio
 file, the preceding frames are not known. And, similarly, for the last few
 frames in the audio file, the subsequent frames are not known. Thus, the
 question arises which gain factors should be assumed for the missing frames
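 in the "boundary" region. The Dynamic Audio Normalizer implements two modes
 to deal with this situation. The default boundary mode assumes a gain factor
 of exactly 1.0 for the missing frames, resulting in a smooth "fade in" and
 "fade out" at the beginning and at the end of the input, respectively.
 The alternative boundary mode instead fills in the missing frames at the
 beginning with the gain factor of the first frame, which avoids the "fade in".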
@item s
 Set the compress factor. In range from 0.0 to 30.0. Default is 0.0.
 By default, the Dynamic Audio Normalizer does not apply "traditional"
 compression. This means that signal peaks will not be pruned and thus the
 full dynamic range will be retained within each local neighbourhood. However,
 in some cases it may be desirable to combine the Dynamic Audio Normalizer's
 normalization algorithm with a more "traditional" compression.
 For this purpose, the Dynamic Audio Normalizer provides an optional compression
 (thresholding) function. If (and only if) the compression feature is enabled,
 all input frames will be processed by a soft knee thresholding function prior
 to the actual normalization process. Put simply, the thresholding function is
 going to prune all samples whose magnitude exceeds a certain threshold value.
 However, the Dynamic Audio Normalizer does not simply apply a fixed threshold
 value. Instead, the threshold value will be adjusted for each individual
 frame.
 In general, smaller parameters result in stronger compression, and vice versa.
 Values below 3.0 are not recommended, because audible distortion may appear.
@end table
@section earwax
 Make audio easier to listen to on headphones.
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -67,6 +67,7 @@ OBJS-$(CONFIG_CHANNELSPLIT_FILTER)           += af_channelsplit.o
 OBJS-$(CONFIG_CHORUS_FILTER)                 += af_chorus.o generate_wave_table.o
 OBJS-$(CONFIG_COMPAND_FILTER)                += af_compand.o
 OBJS-$(CONFIG_DCSHIFT_FILTER)                += af_dcshift.o
 OBJS-$(CONFIG_DYNAUDNORM_FILTER)             += af_dynaudnorm.o
 OBJS-$(CONFIG_EARWAX_FILTER)                 += af_earwax.o
 OBJS-$(CONFIG_EBUR128_FILTER)                += f_ebur128.o
 OBJS-$(CONFIG_EQUALIZER_FILTER)              += af_biquads.o
--- a/libavfilter/af_dynaudnorm.c
+++ b/libavfilter/af_dynaudnorm.c
@@ -0,0 +1,734 @@
 /*
 * Dynamic Audio Normalizer
 * Copyright (c) 2015 LoRd_MuldeR <mulder2@gmx.de>. Some rights reserved.
 *
 * This file is part of FFmpeg.
 *
 * FFmpeg is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * FFmpeg is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with FFmpeg; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */
 /**
 * @file
 * Dynamic Audio Normalizer
 */
 #include <float.h>
 #include "libavutil/avassert.h"
 #include "libavutil/opt.h"
 #define FF_BUFQUEUE_SIZE 302
 #include "libavfilter/bufferqueue.h"
 #include "audio.h"
 #include "avfilter.h"
 #include "internal.h"
 typedef struct cqueue {
    double *elements;
    int size;
    int nb_elements;
    int first;
 } cqueue;
 typedef struct DynamicAudioNormalizerContext {
    const AVClass *class;
    struct FFBufQueue queue;
    int frame_len;
    int frame_len_msec;
    int filter_size;
    int dc_correction;
    int channels_coupled;
    int alt_boundary_mode;
    double peak_value;
    double max_amplification;
    double target_rms;
    double compress_factor;
    double *prev_amplification_factor;
    double *dc_correction_value;
    double *compress_threshold;
    double *fade_factors[2];
    double *weights;
    int channels;
    int delay;
    cqueue **gain_history_original;
    cqueue **gain_history_minimum;
    cqueue **gain_history_smoothed;
 } DynamicAudioNormalizerContext;
 #define OFFSET(x) offsetof(DynamicAudioNormalizerContext, x)
 #define FLAGS AV_OPT_FLAG_AUDIO_PARAM|AV_OPT_FLAG_FILTERING_PARAM
 static const AVOption dynaudnorm_options[] = {
    { "f", "set the frame length in msec",     OFFSET(frame_len_msec),    AV_OPT_TYPE_INT,    {.i64 = 500},   10,  8000, FLAGS },
    { "g", "set the filter size",              OFFSET(filter_size),       AV_OPT_TYPE_INT,    {.i64 = 31},     3,   301, FLAGS },
    { "p", "set the peak value",               OFFSET(peak_value),        AV_OPT_TYPE_DOUBLE, {.dbl = 0.95}, 0.0,   1.0, FLAGS },
    { "m", "set the max amplification",        OFFSET(max_amplification), AV_OPT_TYPE_DOUBLE, {.dbl = 10.0}, 1.0, 100.0, FLAGS },
    { "r", "set the target RMS",               OFFSET(target_rms),        AV_OPT_TYPE_DOUBLE, {.dbl = 0.0},  0.0,   1.0, FLAGS },
    { "n", "enable channel coupling",          OFFSET(channels_coupled),  AV_OPT_TYPE_INT,    {.i64 = 1},      0,     1, FLAGS },
    { "c", "enable DC correction",             OFFSET(dc_correction),     AV_OPT_TYPE_INT,    {.i64 = 0},      0,     1, FLAGS },
    { "b", "enable alternative boundary mode", OFFSET(alt_boundary_mode), AV_OPT_TYPE_INT,    {.i64 = 0},      0,     1, FLAGS },
    { "s", "set the compress factor",          OFFSET(compress_factor),   AV_OPT_TYPE_DOUBLE, {.dbl = 0.0},  0.0,  30.0, FLAGS },
    { NULL }
 };
 AVFILTER_DEFINE_CLASS(dynaudnorm);
 static av_cold int init(AVFilterContext *ctx)
 {
    DynamicAudioNormalizerContext *s = ctx->priv;
    if (!(s->filter_size & 1)) {
        av_log(ctx, AV_LOG_ERROR, "filter size %d is invalid. Must be an odd value.\n", s->filter_size);
        return AVERROR(EINVAL);
    }
    return 0;
 }
 static int query_formats(AVFilterContext *ctx)
 {
    AVFilterFormats *formats;
    AVFilterChannelLayouts *layouts;
    static const enum AVSampleFormat sample_fmts[] = {
        AV_SAMPLE_FMT_DBLP,
        AV_SAMPLE_FMT_NONE
    };
    int ret;
    layouts = ff_all_channel_layouts();
    if (!layouts)
        return AVERROR(ENOMEM);
    ret = ff_set_common_channel_layouts(ctx, layouts);
    if (ret < 0)
        return ret;
    formats = ff_make_format_list(sample_fmts);
    if (!formats)
        return AVERROR(ENOMEM);
    ret = ff_set_common_formats(ctx, formats);
    if (ret < 0)
        return ret;
    formats = ff_all_samplerates();
    if (!formats)
        return AVERROR(ENOMEM);
    return ff_set_common_samplerates(ctx, formats);
 }
 static inline int frame_size(int sample_rate, int frame_len_msec)
 {
    const int frame_size = round((double)sample_rate * (frame_len_msec / 1000.0));
    return frame_size + (frame_size % 2);
 }
 static void precalculate_fade_factors(double *fade_factors[2], int frame_len)
 {
    const double step_size = 1.0 / frame_len;
    int pos;
    for (pos = 0; pos < frame_len; pos++) {
        fade_factors[0][pos] = 1.0 - (step_size * (pos + 1.0));
        fade_factors[1][pos] = 1.0 - fade_factors[0][pos];
    }
 }
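The two tables computed above are complementary linear ramps, which the filter uses to move from the previous frame's value to the current one. A minimal sketch of that cross-fade for a single sample position, with the hypothetical name `crossfade_gain` and the weights computed on the fly instead of being read from a table:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: value applied to sample `pos` of a frame of
 * `frame_len` samples while moving from `prev_gain` to `next_gain`. */
static double crossfade_gain(double prev_gain, double next_gain,
                             size_t pos, size_t frame_len)
{
    double step, w_prev, w_next;

    assert(frame_len > 0 && pos < frame_len);
    step   = 1.0 / (double)frame_len;
    w_prev = 1.0 - step * (pos + 1.0); /* corresponds to fade_factors[0][pos] */
    w_next = 1.0 - w_prev;             /* corresponds to fade_factors[1][pos] */
    return w_prev * prev_gain + w_next * next_gain;
}
```

With a 4-sample frame and gains going from 1.0 to 2.0, the first sample gets 1.25 and the last sample gets exactly 2.0, so consecutive frames join without a step.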
 static cqueue *cqueue_create(int size)
 {
    cqueue *q;
    q = av_malloc(sizeof(cqueue));
    if (!q)
        return NULL;
    q->size = size;
    q->nb_elements = 0;
    q->first = 0;
    q->elements = av_malloc(sizeof(double) * size);
    if (!q->elements) {
        av_free(q);
        return NULL;
    }
    return q;
 }
 static void cqueue_free(cqueue *q)
 {
    av_free(q->elements);
    av_free(q);
 }
 static int cqueue_size(cqueue *q)
 {
    return q->nb_elements;
 }
 static int cqueue_empty(cqueue *q)
 {
    return !q->nb_elements;
 }
 static int cqueue_enqueue(cqueue *q, double element)
 {
    int i;
    av_assert2(q->nb_elements != q->size);
    i = (q->first + q->nb_elements) % q->size;
    q->elements[i] = element;
    q->nb_elements++;
    return 0;
 }
 static double cqueue_peek(cqueue *q, int index)
 {
    av_assert2(index < q->nb_elements);
    return q->elements[(q->first + index) % q->size];
 }
 static int cqueue_dequeue(cqueue *q, double *element)
 {
    av_assert2(!cqueue_empty(q));
    *element = q->elements[q->first];
    q->first = (q->first + 1) % q->size;
    q->nb_elements--;
    return 0;
 }
 static int cqueue_pop(cqueue *q)
 {
    av_assert2(!cqueue_empty(q));
    q->first = (q->first + 1) % q->size;
    q->nb_elements--;
    return 0;
 }
 static const double s_pi = 3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679;
 static void init_gaussian_filter(DynamicAudioNormalizerContext *s)
 {
    double total_weight = 0.0;
    const double sigma = (((s->filter_size / 2.0) - 1.0) / 3.0) + (1.0 / 3.0);
    double adjust;
    int i;
    // Pre-compute constants
    const int offset = s->filter_size / 2;
    const double c1 = 1.0 / (sigma * sqrt(2.0 * s_pi));
    const double c2 = 2.0 * pow(sigma, 2.0);
    // Compute weights
    for (i = 0; i < s->filter_size; i++) {
        const int x = i - offset;
        s->weights[i] = c1 * exp(-(pow(x, 2.0) / c2));
        total_weight += s->weights[i];
    }
    // Adjust weights
    adjust = 1.0 / total_weight;
    for (i = 0; i < s->filter_size; i++) {
        s->weights[i] *= adjust;
    }
 }
 static int config_input(AVFilterLink *inlink)
 {
    AVFilterContext *ctx = inlink->dst;
    DynamicAudioNormalizerContext *s = ctx->priv;
    int c;
    s->frame_len =
    inlink->min_samples =
    inlink->max_samples =
    inlink->partial_buf_size = frame_size(inlink->sample_rate, s->frame_len_msec);
    av_log(ctx, AV_LOG_DEBUG, "frame len %d\n", s->frame_len);
    s->fade_factors[0] = av_malloc(s->frame_len * sizeof(*s->fade_factors[0]));
    s->fade_factors[1] = av_malloc(s->frame_len * sizeof(*s->fade_factors[1]));
    s->prev_amplification_factor = av_malloc(inlink->channels * sizeof(*s->prev_amplification_factor));
    s->dc_correction_value = av_calloc(inlink->channels, sizeof(*s->dc_correction_value));
    s->compress_threshold = av_calloc(inlink->channels, sizeof(*s->compress_threshold));
    s->gain_history_original = av_calloc(inlink->channels, sizeof(*s->gain_history_original));
    s->gain_history_minimum = av_calloc(inlink->channels, sizeof(*s->gain_history_minimum));
    s->gain_history_smoothed = av_calloc(inlink->channels, sizeof(*s->gain_history_smoothed));
    s->weights = av_malloc(s->filter_size * sizeof(*s->weights));
    if (!s->prev_amplification_factor || !s->dc_correction_value ||
        !s->compress_threshold || !s->fade_factors[0] || !s->fade_factors[1] ||
        !s->gain_history_original || !s->gain_history_minimum ||
        !s->gain_history_smoothed || !s->weights)
        return AVERROR(ENOMEM);
    for (c = 0; c < inlink->channels; c++) {
        s->prev_amplification_factor[c] = 1.0;
        s->gain_history_original[c] = cqueue_create(s->filter_size);
        s->gain_history_minimum[c]  = cqueue_create(s->filter_size);
        s->gain_history_smoothed[c] = cqueue_create(s->filter_size);
        if (!s->gain_history_original[c] || !s->gain_history_minimum[c] ||
            !s->gain_history_smoothed[c])
            return AVERROR(ENOMEM);
    }
    precalculate_fade_factors(s->fade_factors, s->frame_len);
    init_gaussian_filter(s);
    s->channels = inlink->channels;
    s->delay = s->filter_size;
    return 0;
 }
 static int config_output(AVFilterLink *outlink)
 {
    outlink->flags |= FF_LINK_FLAG_REQUEST_LOOP;
    return 0;
 }
 static inline double fade(double prev, double next, int pos,
                          double *fade_factors[2])
 {
    return fade_factors[0][pos] * prev + fade_factors[1][pos] * next;
 }
 static inline double pow2(const double value)
 {
    return value * value;
 }
 static inline double bound(const double threshold, const double val)
 {
    const double CONST = 0.8862269254527580136490837416705725913987747280611935; //sqrt(PI) / 2.0
    return erf(CONST * (val / threshold)) * threshold;
 }
 static double find_peak_magnitude(AVFrame *frame, int channel)
 {
    double max = DBL_EPSILON;
    int c, i;
    if (channel == -1) {
        for (c = 0; c < frame->channels; c++) {
            double *data_ptr = (double *)frame->extended_data[c];
            for (i = 0; i < frame->nb_samples; i++)
                max = FFMAX(max, fabs(data_ptr[i]));
        }
    } else {
        double *data_ptr = (double *)frame->extended_data[channel];
        for (i = 0; i < frame->nb_samples; i++)
            max = FFMAX(max, fabs(data_ptr[i]));
    }
    return max;
 }
 static double compute_frame_rms(AVFrame *frame, int channel)
 {
    double rms_value = 0.0;
    int c, i;
    if (channel == -1) {
        for (c = 0; c < frame->channels; c++) {
            const double *data_ptr = (double *)frame->extended_data[c];
            for (i = 0; i < frame->nb_samples; i++) {
                rms_value += pow2(data_ptr[i]);
            }
        }
        rms_value /= frame->nb_samples * frame->channels;
    } else {
        const double *data_ptr = (double *)frame->extended_data[channel];
        for (i = 0; i < frame->nb_samples; i++) {
            rms_value += pow2(data_ptr[i]);
        }
        rms_value /= frame->nb_samples;
    }
    return FFMAX(sqrt(rms_value), DBL_EPSILON);
 }
 static double get_max_local_gain(DynamicAudioNormalizerContext *s, AVFrame *frame,
                                 int channel)
 {
    const double maximum_gain = s->peak_value / find_peak_magnitude(frame, channel);
    const double rms_gain = s->target_rms > DBL_EPSILON ? (s->target_rms / compute_frame_rms(frame, channel)) : DBL_MAX;
    return bound(s->max_amplification, FFMIN(maximum_gain, rms_gain));
 }
 static double minimum_filter(cqueue *q)
 {
    double min = DBL_MAX;
    int i;
    for (i = 0; i < cqueue_size(q); i++) {
        min = FFMIN(min, cqueue_peek(q, i));
    }
    return min;
 }
 static double gaussian_filter(DynamicAudioNormalizerContext *s, cqueue *q)
 {
    double result = 0.0;
    int i;
    for (i = 0; i < cqueue_size(q); i++) {
        result += cqueue_peek(q, i) * s->weights[i];
    }
    return result;
 }
 static void update_gain_history(DynamicAudioNormalizerContext *s, int channel,
                                double current_gain_factor)
 {
    if (cqueue_empty(s->gain_history_original[channel]) ||
        cqueue_empty(s->gain_history_minimum[channel])) {
        const int pre_fill_size = s->filter_size / 2;
        s->prev_amplification_factor[channel] = s->alt_boundary_mode ? current_gain_factor : 1.0;
        while (cqueue_size(s->gain_history_original[channel]) < pre_fill_size) {
            cqueue_enqueue(s->gain_history_original[channel], s->alt_boundary_mode ? current_gain_factor : 1.0);
        }
        while (cqueue_size(s->gain_history_minimum[channel]) < pre_fill_size) {
            cqueue_enqueue(s->gain_history_minimum[channel], s->alt_boundary_mode ? current_gain_factor : 1.0);
        }
    }
    cqueue_enqueue(s->gain_history_original[channel], current_gain_factor);
    while (cqueue_size(s->gain_history_original[channel]) >= s->filter_size) {
        av_assert0(cqueue_size(s->gain_history_original[channel]) == s->filter_size);
        const double minimum = minimum_filter(s->gain_history_original[channel]);
        cqueue_enqueue(s->gain_history_minimum[channel], minimum);
        cqueue_pop(s->gain_history_original[channel]);
    }
    while (cqueue_size(s->gain_history_minimum[channel]) >= s->filter_size) {
        av_assert0(cqueue_size(s->gain_history_minimum[channel]) == s->filter_size);
        const double smoothed = gaussian_filter(s, s->gain_history_minimum[channel]);
        cqueue_enqueue(s->gain_history_smoothed[channel], smoothed);
        cqueue_pop(s->gain_history_minimum[channel]);
    }
 }
 static inline double update_value(double new, double old, double aggressiveness)
 {
    av_assert0((aggressiveness >= 0.0) && (aggressiveness <= 1.0));
    return aggressiveness * new + (1.0 - aggressiveness) * old;
 }
 static void perform_dc_correction(DynamicAudioNormalizerContext *s, AVFrame *frame)
 {
    const double diff = 1.0 / frame->nb_samples;
    int is_first_frame = cqueue_empty(s->gain_history_original[0]);
    int c, i;
    for (c = 0; c < s->channels; c++) {
        double *dst_ptr = (double *)frame->extended_data[c];
        double current_average_value = 0.0;
        for (i = 0; i < frame->nb_samples; i++)
            current_average_value += dst_ptr[i] * diff;
        const double prev_value = is_first_frame ? current_average_value : s->dc_correction_value[c];
        s->dc_correction_value[c] = is_first_frame ? current_average_value : update_value(current_average_value, s->dc_correction_value[c], 0.1);
        for (i = 0; i < frame->nb_samples; i++) {
            dst_ptr[i] -= fade(prev_value, s->dc_correction_value[c], i, s->fade_factors);
        }
    }
 }
 static double setup_compress_thresh(double threshold)
 {
    if ((threshold > DBL_EPSILON) && (threshold < (1.0 - DBL_EPSILON))) {
        double current_threshold = threshold;
        double step_size = 1.0;
        while (step_size > DBL_EPSILON) {
            while ((current_threshold + step_size > current_threshold) &&
                   (bound(current_threshold + step_size, 1.0) <= threshold)) {
                current_threshold += step_size;
            }
            step_size /= 2.0;
        }
        return current_threshold;
    } else {
        return threshold;
    }
 }
 static double compute_frame_std_dev(DynamicAudioNormalizerContext *s,
                                    AVFrame *frame, int channel)
 {
    double variance = 0.0;
    int i, c;
    if (channel == -1) {
        for (c = 0; c < s->channels; c++) {
            const double *data_ptr = (double *)frame->extended_data[c];
            for (i = 0; i < frame->nb_samples; i++) {
                variance += pow2(data_ptr[i]);  // Assume that MEAN is *zero*
            }
        }
        variance /= (s->channels * frame->nb_samples) - 1;
    } else {
        const double *data_ptr = (double *)frame->extended_data[channel];
        for (i = 0; i < frame->nb_samples; i++) {
            variance += pow2(data_ptr[i]);      // Assume that MEAN is *zero*
        }
        variance /= frame->nb_samples - 1;
    }
    return FFMAX(sqrt(variance), DBL_EPSILON);
 }
 static void perform_compression(DynamicAudioNormalizerContext *s, AVFrame *frame)
 {
    int is_first_frame = cqueue_empty(s->gain_history_original[0]);
    int c, i;
    if (s->channels_coupled) {
        const double standard_deviation = compute_frame_std_dev(s, frame, -1);
        const double current_threshold  = FFMIN(1.0, s->compress_factor * standard_deviation);
        const double prev_value = is_first_frame ? current_threshold : s->compress_threshold[0];
        s->compress_threshold[0] = is_first_frame ? current_threshold : update_value(current_threshold, s->compress_threshold[0], (1.0/3.0));
        const double prev_actual_thresh = setup_compress_thresh(prev_value);
        const double curr_actual_thresh = setup_compress_thresh(s->compress_threshold[0]);
        for (c = 0; c < s->channels; c++) {
            double *const dst_ptr = (double *)frame->extended_data[c];
            for (i = 0; i < frame->nb_samples; i++) {
                const double localThresh = fade(prev_actual_thresh, curr_actual_thresh, i, s->fade_factors);
                dst_ptr[i] = copysign(bound(localThresh, fabs(dst_ptr[i])), dst_ptr[i]);
            }
        }
    } else {
        for (c = 0; c < s->channels; c++) {
            const double standard_deviation = compute_frame_std_dev(s, frame, c);
            const double current_threshold  = setup_compress_thresh(FFMIN(1.0, s->compress_factor * standard_deviation));
            const double prev_value = is_first_frame ? current_threshold : s->compress_threshold[c];
            s->compress_threshold[c] = is_first_frame ? current_threshold : update_value(current_threshold, s->compress_threshold[c], 1.0/3.0);
            const double prev_actual_thresh = setup_compress_thresh(prev_value);
            const double curr_actual_thresh = setup_compress_thresh(s->compress_threshold[c]);
            double *const dst_ptr = (double *)frame->extended_data[c];
            for (i = 0; i < frame->nb_samples; i++) {
                const double localThresh = fade(prev_actual_thresh, curr_actual_thresh, i, s->fade_factors);
                dst_ptr[i] = copysign(bound(localThresh, fabs(dst_ptr[i])), dst_ptr[i]);
            }
        }
    }
 }
 static void analyze_frame(DynamicAudioNormalizerContext *s, AVFrame *frame)
 {
    if (s->dc_correction) {
        perform_dc_correction(s, frame);
    }
    if (s->compress_factor > DBL_EPSILON) {
        perform_compression(s, frame);
    }
    if (s->channels_coupled) {
        const double current_gain_factor = get_max_local_gain(s, frame, -1);
        int c;
        for (c = 0; c < s->channels; c++)
            update_gain_history(s, c, current_gain_factor);
    } else {
        int c;
        for (c = 0; c < s->channels; c++)
            update_gain_history(s, c, get_max_local_gain(s, frame, c));
    }
 }
 static void amplify_frame(DynamicAudioNormalizerContext *s, AVFrame *frame)
 {
    int c, i;
    for (c = 0; c < s->channels; c++) {
        double *dst_ptr = (double *)frame->extended_data[c];
        double current_amplification_factor;
        cqueue_dequeue(s->gain_history_smoothed[c], &current_amplification_factor);
        for (i = 0; i < frame->nb_samples; i++) {
            const double amplification_factor = fade(s->prev_amplification_factor[c],
                                                     current_amplification_factor, i,
                                                     s->fade_factors);
            dst_ptr[i] *= amplification_factor;
            if (fabs(dst_ptr[i]) > s->peak_value)
                dst_ptr[i] = copysign(s->peak_value, dst_ptr[i]);
        }
        s->prev_amplification_factor[c] = current_amplification_factor;
    }
 }
 static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 {
    AVFilterContext *ctx = inlink->dst;
    DynamicAudioNormalizerContext *s = ctx->priv;
    AVFilterLink *outlink = inlink->dst->outputs[0];
    int ret = 0;
    if (!cqueue_empty(s->gain_history_smoothed[0])) {
        AVFrame *out = ff_bufqueue_get(&s->queue);
        amplify_frame(s, out);
        ret = ff_filter_frame(outlink, out);
    }
    analyze_frame(s, in);
    ff_bufqueue_add(ctx, &s->queue, in);
    return ret;
 }
 static int flush_buffer(DynamicAudioNormalizerContext *s, AVFilterLink *inlink,
                        AVFilterLink *outlink)
 {
    AVFrame *out = ff_get_audio_buffer(outlink, s->frame_len);
    int c, i;
    if (!out)
        return AVERROR(ENOMEM);
    for (c = 0; c < s->channels; c++) {
        double *dst_ptr = (double *)out->extended_data[c];
        for (i = 0; i < out->nb_samples; i++) {
            dst_ptr[i] = s->alt_boundary_mode ? DBL_EPSILON : ((s->target_rms > DBL_EPSILON) ? FFMIN(s->peak_value, s->target_rms) : s->peak_value);
            if (s->dc_correction) {
                dst_ptr[i] *= ((i % 2) == 1) ? -1 : 1;
                dst_ptr[i] += s->dc_correction_value[c];
            }
        }
    }
    s->delay--;
    return filter_frame(inlink, out);
 }
 static int request_frame(AVFilterLink *outlink)
 {
    AVFilterContext *ctx = outlink->src;
    DynamicAudioNormalizerContext *s = ctx->priv;
    int ret = 0;
    ret = ff_request_frame(ctx->inputs[0]);
    if (ret == AVERROR_EOF && !ctx->is_disabled && s->delay)
        ret = flush_buffer(s, ctx->inputs[0], outlink);
    return ret;
 }
 static av_cold void uninit(AVFilterContext *ctx)
 {
    DynamicAudioNormalizerContext *s = ctx->priv;
    int c;
    av_freep(&s->prev_amplification_factor);
    av_freep(&s->dc_correction_value);
    av_freep(&s->compress_threshold);
    av_freep(&s->fade_factors[0]);
    av_freep(&s->fade_factors[1]);
    for (c = 0; c < s->channels; c++) {
        cqueue_free(s->gain_history_original[c]);
        cqueue_free(s->gain_history_minimum[c]);
        cqueue_free(s->gain_history_smoothed[c]);
    }
    av_freep(&s->gain_history_original);
    av_freep(&s->gain_history_minimum);
    av_freep(&s->gain_history_smoothed);
    av_freep(&s->weights);
    ff_bufqueue_discard_all(&s->queue);
 }
 static const AVFilterPad avfilter_af_dynaudnorm_inputs[] = {
    {
        .name           = "default",
        .type           = AVMEDIA_TYPE_AUDIO,
        .filter_frame   = filter_frame,
        .config_props   = config_input,
        .needs_writable = 1,
    },
    { NULL }
 };
 static const AVFilterPad avfilter_af_dynaudnorm_outputs[] = {
    {
        .name          = "default",
        .type          = AVMEDIA_TYPE_AUDIO,
        .config_props  = config_output,
        .request_frame = request_frame,
    },
    { NULL }
 };
 AVFilter ff_af_dynaudnorm = {
    .name          = "dynaudnorm",
    .description   = NULL_IF_CONFIG_SMALL("Dynamic Audio Normalizer."),
    .query_formats = query_formats,
    .priv_size     = sizeof(DynamicAudioNormalizerContext),
    .init          = init,
    .uninit        = uninit,
    .inputs        = avfilter_af_dynaudnorm_inputs,
    .outputs       = avfilter_af_dynaudnorm_outputs,
    .priv_class    = &dynaudnorm_class,
 };
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -83,6 +83,7 @@ void avfilter_register_all(void)
    REGISTER_FILTER(CHORUS,         chorus,         af);
    REGISTER_FILTER(COMPAND,        compand,        af);
    REGISTER_FILTER(DCSHIFT,        dcshift,        af);
    REGISTER_FILTER(DYNAUDNORM,     dynaudnorm,     af);
    REGISTER_FILTER(EARWAX,         earwax,         af);
    REGISTER_FILTER(EBUR128,        ebur128,        af);
    REGISTER_FILTER(EQUALIZER,      equalizer,      af);