Example audio from subjective evaluation experiments in which participants rated the "naturalness" of speech clips processed by audio-to-audio transforms optimized for various loss functions.
Audio label | Description of denoising model used to process input signal |
---|---|
Unprocessed Input | Noisy input signal consisting of clean speech superimposed on background noise (no denoising model) |
A123 | Wave-U-Net trained to minimize deep feature losses from three AudioSet-trained DNNs (the deep features that produced the highest subjective ratings) |
Random123 | Wave-U-Net trained to minimize deep feature losses from three untrained DNNs (random weights) |
GermainDeepFeatures | Wave-U-Net trained to minimize the deep feature loss proposed by Germain et al. (2018) |
CochlearModel (human) | Wave-U-Net trained to minimize loss derived from a cochlear model with human-like frequency tuning (ERB-spaced filter bank) |
CochlearModel (reverse) | Wave-U-Net trained to minimize loss derived from a cochlear model with altered frequency tuning (reverse-ERB-spaced filter bank) |
Waveform Wave-U-Net | Wave-U-Net trained to reconstruct the clean speech waveform |
Waveform WaveNet | WaveNet trained to reconstruct the clean speech waveform |
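
The difference between a waveform loss and a deep feature loss, as the labels above use the terms, can be illustrated with a minimal sketch. It assumes an L1 distance in both cases and uses a small random-weight network as a stand-in feature extractor, loosely in the spirit of the Random123 condition. The function names, layer sizes, and distance measure are all hypothetical and chosen for illustration; they are not the models or losses used in the experiments.

```python
import numpy as np


def random_features(waveform, weights):
    """Hypothetical feature extractor: stacked linear layers with ReLU.

    Returns the activations of every layer so the loss can compare
    representations at several depths.
    """
    activations = []
    x = waveform
    for w in weights:
        x = np.maximum(w @ x, 0.0)  # linear projection followed by ReLU
        activations.append(x)
    return activations


def waveform_loss(denoised, clean):
    """Mean absolute difference between the two waveforms (assumed L1)."""
    return float(np.mean(np.abs(denoised - clean)))


def deep_feature_loss(denoised, clean, weights):
    """Sum over layers of the mean absolute difference between activations."""
    feats_denoised = random_features(denoised, weights)
    feats_clean = random_features(clean, weights)
    return float(sum(np.mean(np.abs(d - c))
                     for d, c in zip(feats_denoised, feats_clean)))


# Toy signals: Gaussian noise stands in for real speech (illustrative only).
rng = np.random.default_rng(0)
clean = rng.standard_normal(256)
noisy = clean + 0.1 * rng.standard_normal(256)

# Untrained (random) weights for a two-layer stand-in network.
weights = [
    rng.standard_normal((64, 256)) / 16.0,
    rng.standard_normal((32, 64)) / 8.0,
]

print("waveform loss:", waveform_loss(noisy, clean))
print("deep feature loss:", deep_feature_loss(noisy, clean, weights))
```

In this sketch both losses are zero when the processed signal equals the clean target. They differ in where the comparison is made: directly on samples, or on the internal representations of a fixed network.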
Unprocessed Input | A123 | Random123 | GermainDeepFeatures | CochlearModel (human) | CochlearModel (reverse) | Waveform Wave-U-Net | Waveform WaveNet |
---|---|---|---|---|---|---|---|