Diff-DEQ: Differentiable Dynamic Equalization for Studio-Quality Speech Processing

Year: 2025

Publisher: European Signal Processing Conference (EUSIPCO)

Source Title: European Signal Processing Conference

Abstract

We present Differentiable Dynamic Equalization (Diff-DEQ), a fully differentiable deep learning framework for speech equalization and enhancement that targets studio quality in audio post-production. Unlike fixed-rule equalization methods, Diff-DEQ adapts spectral components dynamically, responding to variations in the input signal to achieve precise, content-aware spectral shaping. The model combines a FiLM-modulated Temporal Convolutional Network (TCN) with a Bidirectional Gated Recurrent Unit (BiGRU) to predict per-band equalization parameters, conditioned on audio features for improved adaptability. The model is trained in a self-supervised manner, eliminating the need for paired input-target data. We evaluate Diff-DEQ against parametric equalization (PEQ) using objective metrics on the LibriTTS, DAPS, and VCTK datasets, and use non-intrusive speech quality assessment for subjective evaluation. Our results show that Diff-DEQ enhances speech intelligibility and perceived quality, making it well-suited for audio post-production.
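To illustrate the conditioning mechanism the abstract describes, the sketch below shows how FiLM (feature-wise linear modulation) scales and shifts TCN-style feature maps using parameters derived from audio-feature conditioning, before a head predicts per-band EQ gains. This is a minimal NumPy sketch, not the authors' implementation: all dimensions, the random projections standing in for learned layers, and the mean-pooling stand-in for the BiGRU are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def film(features, gamma, beta):
    # Feature-wise Linear Modulation: scale and shift each channel.
    # features: (channels, time); gamma, beta: (channels,)
    return gamma[:, None] * features + beta[:, None]

# Hypothetical dimensions (not specified in the abstract)
n_channels, n_frames, n_cond, n_bands = 8, 100, 4, 10

# Conditioning vector derived from audio features -- illustrative values
cond = rng.standard_normal(n_cond)

# Linear projections mapping conditioning to FiLM parameters
# (stand-ins for learned layers)
W_gamma = 0.1 * rng.standard_normal((n_channels, n_cond))
W_beta = 0.1 * rng.standard_normal((n_channels, n_cond))
gamma = 1.0 + W_gamma @ cond  # scale, initialized near identity
beta = W_beta @ cond          # shift

# Stand-in for TCN activations over time, modulated by FiLM
tcn_out = rng.standard_normal((n_channels, n_frames))
modulated = film(tcn_out, gamma, beta)

# Temporal pooling standing in for the BiGRU summary, followed by a
# linear head predicting per-band equalization gains (e.g. in dB)
pooled = modulated.mean(axis=1)                    # (n_channels,)
W_head = 0.1 * rng.standard_normal((n_bands, n_channels))
band_gains_db = W_head @ pooled                    # (n_bands,)
```

Because every operation here is differentiable, gradients from a loss on the equalized output can flow back through the EQ parameters into the conditioning network, which is what enables the self-supervised training the abstract mentions.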