Waveform Modeling Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension


 




System   Description   Demos
Input 8k   8 kHz input speech waveforms given to the BWE systems.

wav_1   wav_2   wav_3   wav_4   wav_5

VRNN   Vocoder-based BWE method using LSTM-based RNNs, with the logarithmic magnitude spectrum (LMS) as both the input and the output of the RNN.

wav_1   wav_2   wav_3   wav_4   wav_5

DCNN   Dilated CNN-based BWE method, which uses CNNs to model the waveforms directly.

wav_1   wav_2   wav_3   wav_4   wav_5

SRNN   Sample-level RNN-based BWE method, which models the waveforms point by point.

wav_1   wav_2   wav_3   wav_4   wav_5

HRNN   Hierarchical RNN-based BWE method, which uses hierarchical RNNs to model the waveforms.

wav_1   wav_2   wav_3   wav_4   wav_5

CHRNN   Conditional HRNN-based BWE method, which uses hierarchical RNNs to model the waveforms, with bottleneck (BN) features as additional conditions.

wav_1   wav_2   wav_3   wav_4   wav_5

Nature   Original 16 kHz speech recording.

wav_1   wav_2   wav_3   wav_4   wav_5